Config poisoning
2025–2026. Safety guardrails silently disabled.
A new class of attack is emerging against AI coding agents: prompt injection that targets the agent’s own configuration. Instead of stealing data or running destructive commands, the attacker tricks the agent into rewriting its safety settings. Once the config is modified, every future action runs without guardrails.
What happened
Multiple incidents have demonstrated this pattern:
- CurXecute (CVE-2025-54135): A prompt injection in a repository's README caused Cursor to rewrite its MCP configuration file. The modified config pointed to an attacker-controlled server, enabling remote code execution on every subsequent MCP tool call.
- Claude Code settings hijack (2026): A malicious repository included instructions in its code comments that caused an AI agent to modify .claude/settings.json, disabling the safety hooks that would normally block dangerous operations.
- Rules File Backdoor (2025): Researchers demonstrated that invisible Unicode characters hidden in .cursorrules and .github/copilot-instructions.md caused agents to silently generate backdoored code. The files looked normal to human reviewers.
- IDE launch configuration poisoning (IDEsaster, 2025): Thirty vulnerabilities were found across major AI IDEs. Several involved modifying VS Code launch.json files to execute arbitrary code when debugging was started.
In each case, the agent modified a configuration file that controlled its own behavior or the behavior of surrounding tools. The changes were silent and persistent.
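The Rules File Backdoor technique works precisely because the malicious characters render as nothing in an editor. They are easy to find mechanically, though: a short scan for zero-width and bidirectional-control codepoints catches the known tricks. The sketch below is illustrative, not Vectimus code, and the character list is a sample rather than an exhaustive inventory.

```python
import sys
import unicodedata

def find_invisibles(text: str):
    """Return (line, column, character name) for each invisible control
    character -- zero-width joiners, bidi overrides, and similar format
    characters (Unicode category "Cf") that reviewers cannot see."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":  # covers U+200B..U+200D, U+202A..U+202E, U+2066..U+2069, etc.
                hits.append((lineno, col, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return hits

if __name__ == "__main__":
    # e.g. python scan_invisibles.py .cursorrules .github/copilot-instructions.md
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            for lineno, col, name in find_invisibles(f.read()):
                print(f"{path}:{lineno}:{col}: {name}")
```

Running a check like this in CI over rules and instruction files turns an invisible payload into a visible diff failure.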
Why it works
AI agents treat configuration files as ordinary files. They have no concept of "this file controls my own safety." Prompted by malicious repository content, a compromised MCP server, or injected instructions, an agent will read and write config files just as readily as any source file.
The attack is especially dangerous because it is self-reinforcing. Once safety settings are disabled, the agent will no longer flag subsequent malicious actions.
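The countermeasure is to treat these paths as a protected category and check every agent-initiated write against a denylist before it happens, outside the agent's own control. A minimal sketch of the idea follows; the glob list and function name are illustrative assumptions, not the actual Vectimus policy.

```python
from fnmatch import fnmatch

# Illustrative denylist: files that control agent or IDE behavior.
PROTECTED_GLOBS = [
    ".claude/settings.json",
    ".claude/instructions.md",
    ".cursorrules",
    ".github/copilot-instructions.md",
    "*.mcp.json",
    ".vscode/launch.json",
    ".vscode/extensions.json",
]

def check_write(path: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an agent-initiated file write."""
    rel = path[2:] if path.startswith("./") else path
    for pattern in PROTECTED_GLOBS:
        # Match both at the repo root and nested one or more levels deep.
        if fnmatch(rel, pattern) or fnmatch(rel, "*/" + pattern):
            return False, (
                f"Blocked write to {rel!r} (matches protected pattern {pattern!r}). "
                "Configuration changes require human review."
            )
    return True, "ok"
```

The essential property is that this check runs in a layer the agent cannot edit; a guard stored in one of the very files it protects would fall to the same attack.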
Which rules block this
Four Vectimus rules prevent configuration poisoning:
- vectimus-fileint-004: Blocks writes to governance config files (.claude/settings.json, hook configurations). Agents cannot modify their own safety settings.
- vectimus-fileint-008: Blocks writes to MCP configuration files. Agents cannot redirect tool calls to attacker-controlled servers.
- vectimus-fileint-007: Blocks writes to VS Code launch.json and extensions.json. Agents cannot plant execution payloads in IDE configurations.
- vectimus-fileint-011: Blocks writes to agent instruction and rules files (.cursorrules, .github/copilot-instructions.md, .claude/instructions.md). Agents cannot modify the files that shape their own behavior.
The deny response tells the agent: “Configuration changes require human review. Suggest the change and let the developer apply it manually.”
What to learn from this
The most dangerous thing an agent can modify is the file that controls what it is allowed to do. Vectimus treats configuration files as a protected category. No agent action, regardless of the prompt, can modify governance settings, MCP configs, IDE launch configurations or agent instruction files. The safety layer protects itself. See the architecture overview for how file integrity policies work and the OWASP agentic mapping for how config poisoning maps to the OWASP top 10 for agentic AI.
Sources
- When Public Prompts Turn Into Local Shells: CurXecute, RCE in Cursor via MCP Auto-Start, AIM Security, CVE-2025-54135 disclosure
- CVE-2025-54135 and CVE-2025-54136, Vulnerabilities in Cursor, Tenable FAQ
- IDEsaster: A Novel Vulnerability Class in AI IDEs, Ari Marzouk, 30+ vulnerabilities across AI IDEs including launch.json poisoning
- New Vulnerability in GitHub Copilot and Cursor: How Hackers Can Weaponize Code Agents, Pillar Security, Rules File Backdoor research