What is a Content Injection Trap?
An agent trap that exploits the gap between human perception and machine parsing, using hidden text, dynamic rendering, or encoding tricks to inject instructions that the agent processes but humans cannot see.
WHY IT MATTERS
Agents parse raw HTML, metadata, and binary data that humans never see. Attackers embed instructions in CSS comments, invisible text, image metadata, or dynamically rendered content that appears only to machine parsers.
These traps are particularly dangerous because human reviewers can't detect them by looking at the page. The content looks normal to humans while containing a completely different set of instructions for the agent.
HOW POLICYLAYER USES THIS
Intercept's tool-level enforcement ensures that even if an agent processes injected content and attempts to act on it, the resulting tool calls are still evaluated against policy.