What is an MCP Rug Pull?

1 min read Updated

An attack where an MCP server silently modifies a tool's description or behaviour after the client has approved it, turning a previously trusted tool malicious without triggering a new approval flow.

WHY IT MATTERS

Most MCP clients approve tools once at connection time. After approval, tool calls flow through without re-evaluation. A rug pull exploits this by changing what a tool does after it's been trusted.

The server might add data exfiltration to a previously benign tool, or modify argument handling to redirect outputs. Because the client already approved the tool, these changes are invisible. Per-call enforcement — evaluating every invocation against policy, not just the first — is the only defence.

HOW POLICYLAYER USES THIS

Intercept evaluates every tool call against policy at invocation time, not at approval time. Even if a server changes a tool's behaviour, the policy still gates every call.

FREQUENTLY ASKED QUESTIONS

How common are rug pulls?
Documented in security research (Invariant Labs, Acuvity) but not yet widespread in the wild. As the MCP ecosystem grows and more agents handle sensitive operations, the incentive for rug pulls increases.
Can annotations prevent this?
No. Annotations are self-reported and can be changed alongside the tool behaviour. Independent, per-call enforcement is needed.

FURTHER READING

Let agents act without letting them run wild.

Deterministic policy on every MCP tool call. Per-identity grants. Full audit log.

// GET IN TOUCH

Have a question or want to learn more? Send us a message.

Message sent.

We'll get back to you soon.