What is a Confused Deputy Attack?

2 min read Updated

A confused deputy attack tricks a privileged AI agent into performing actions it shouldn't by exploiting its access to MCP tools. The agent becomes the 'confused deputy' — authorised but manipulated.

WHY IT MATTERS

The confused deputy problem is a classic security concept: a program with elevated privileges is tricked into misusing those privileges on behalf of an unauthorised party. AI agents are the quintessential confused deputy — they have broad tool access, they follow instructions, and they can be manipulated through their context.

In the MCP ecosystem, the agent holds credentials for multiple servers and tools. It's authorised to read databases, send messages, modify files, and call APIs. An attacker who can influence the agent's context — through tool poisoning, indirect injection, or a malicious server — can redirect these privileges toward harmful actions. The agent performs the actions using its own legitimate credentials, making attribution difficult.

The attack is especially potent because AI agents lack the human intuition to question suspicious requests in context. A human administrator might pause before deleting a production database because an email told them to. An agent processes the instruction, verifies it has the required tool access, and executes — all in milliseconds.

Every other attack vector in this glossary ultimately aims to create a confused deputy scenario. Tool poisoning, context poisoning, indirect injection — they're all techniques to manipulate the deputy. The confused deputy is the end state where the agent's own privileges become the weapon.

HOW POLICYLAYER USES THIS

Intercept directly addresses the confused deputy problem by separating authorisation from execution. The agent may be confused about its intent, but Intercept's YAML policies enforce ground-truth rules on what actions are permitted. Destructive operations require explicit policy allowance, argument validation constrains parameters to safe ranges, and tool denylists prevent access to dangerous operations regardless of the agent's reasoning. The deputy may be confused, but the policy layer is not.

FREQUENTLY ASKED QUESTIONS

How is the confused deputy problem different in AI agents vs traditional software?
Traditional confused deputy attacks exploit fixed code paths. AI agents are more vulnerable because their behaviour is determined by natural language context — which is far easier to manipulate than compiled code. The attack surface is the agent's entire context window.
Can reducing agent permissions prevent confused deputy attacks?
Partially. Principle of least privilege reduces the blast radius, but agents often need broad access to be useful. Policy enforcement at the proxy layer provides fine-grained control without removing necessary capabilities.
Is human-in-the-loop the only complete defence?
It's the strongest defence but impractical for high-throughput agent operations. Automated policy enforcement at the tool call layer provides comparable safety for well-defined operations, with human approval reserved for high-risk or ambiguous actions.

FURTHER READING

Enforce policies on every tool call

Intercept is the open-source MCP proxy that enforces YAML policies on AI agent tool calls. No code changes needed.

npx -y @policylayer/intercept
github.com/policylayer/intercept →
// GET IN TOUCH

Have a question or want to learn more? Send us a message.

Message sent.

We'll get back to you soon.