What are Agent Guardrails?
Safety mechanisms constraining AI agent behaviour within acceptable boundaries. Guardrails operate at multiple levels — from prompt instructions to infrastructure-level enforcement — to prevent agents from taking unauthorised or harmful actions through tool calls.
WHY IT MATTERS
Guardrails span three levels: prompt (natural language instructions telling the agent what not to do), application (code-level checks before executing tool calls), and infrastructure (external enforcement independent of the agent). Only infrastructure-level guardrails cannot be bypassed by prompt injection or agent reasoning.
Prompt-level guardrails are easily circumvented — a well-crafted prompt injection can override any natural language instruction. Application-level guardrails are stronger but require code changes for every agent and tool. Infrastructure-level guardrails are the gold standard — they operate outside the agent entirely.
The best systems layer all three: prompts set intent, application code provides first-pass validation, and infrastructure provides the hard stop that cannot be bypassed.
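An application-level guardrail, in its simplest form, is a validation function that runs before every tool call. The sketch below is illustrative only — the function and constant names (`check_tool_call`, `DENIED_TOOLS`, `MAX_TRANSFER`) are hypothetical, not part of any real framework:

```python
# Minimal sketch of an application-level guardrail: validate each
# proposed tool call in code before executing it. All names here are
# illustrative assumptions, not a real API.

DENIED_TOOLS = {"delete_database", "send_payment"}  # hypothetical denylist
MAX_TRANSFER = 1000                                 # hypothetical limit

def check_tool_call(tool_name: str, args: dict) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed tool call."""
    if tool_name in DENIED_TOOLS:
        return False, f"tool '{tool_name}' is on the denylist"
    if tool_name == "transfer_funds" and args.get("amount", 0) > MAX_TRANSFER:
        return False, f"amount exceeds limit of {MAX_TRANSFER}"
    return True, "ok"

allowed, reason = check_tool_call("transfer_funds", {"amount": 5000})
print(allowed, reason)  # → False amount exceeds limit of 1000
```

Note the limitation this illustrates: each check must be hand-written into the agent's own codebase, and a new tool or agent means new code — which is why the hard stop belongs at the infrastructure level.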
HOW POLICYLAYER USES THIS
Intercept is the infrastructure-level guardrail layer. As a transparent MCP proxy, it enforces YAML-defined policies on every tool call — independent of the agent's reasoning, immune to prompt injection, and requiring no code changes. If the policy says deny, the call is denied. The agent's LLM cannot override this decision.
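As a hedged illustration of what a YAML-defined policy might look like, consider the fragment below. The schema shown (`policies`, `match`, `action` keys and the glob-style tool pattern) is an assumption for explanatory purposes, not Intercept's actual policy format:

```yaml
# Hypothetical policy sketch — illustrative schema only,
# not the real Intercept policy language.
policies:
  - name: block-destructive-tools
    match:
      tool: "delete_*"      # glob over tool names (assumed syntax)
    action: deny
  - name: cap-transfers
    match:
      tool: transfer_funds
      args:
        amount: "> 1000"    # assumed argument-condition syntax
    action: deny
```

Because the proxy evaluates rules like these on every tool call before it reaches the downstream MCP server, the decision is made outside the agent: no prompt, injected or otherwise, can change it.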