What is Agent Drift?
Agent drift is the gradual divergence of an AI agent's behaviour from its intended purpose over time, caused by factors such as context accumulation, model updates, environmental changes, or an evolving tool landscape.
WHY IT MATTERS
Agent drift is subtle and insidious. Unlike a rogue agent that deviates sharply, a drifting agent changes gradually — each individual action seems reasonable, but the cumulative trajectory moves away from the intended purpose. It is the boiling frog problem applied to AI behaviour.
Several mechanisms cause drift. Context accumulation is the most common: as an agent processes more information over a session or across sessions, its decision-making shifts based on accumulated context that may include irrelevant, misleading, or adversarial content. The agent's effective system prompt evolves even though the actual system prompt has not changed.
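One mitigation for context-driven drift is to keep the intended system prompt pinned and cap how much accumulated history reaches the model. The sketch below assumes a hypothetical message format and a `build_effective_prompt` helper of my own naming; it is illustrative, not a real agent framework API.

```python
# Sketch: pin the original system prompt and trim accumulated turns, so the
# "effective prompt" the model sees stays anchored to the intended one.
# The message shape and helper name are illustrative assumptions.

def build_effective_prompt(system_prompt, history, max_turns=6):
    """Return the messages actually sent to the model: the original system
    prompt is always re-asserted, and only the most recent turns survive."""
    recent = history[-max_turns:]  # drop stale accumulated context
    return [{"role": "system", "content": system_prompt}] + recent

history = [{"role": "user", "content": f"turn {i}"} for i in range(20)]
msgs = build_effective_prompt("You are a billing assistant.", history)

assert msgs[0]["role"] == "system"  # intent is re-anchored on every call
assert len(msgs) == 7               # 1 system prompt + 6 most recent turns
```

Trimming is a blunt instrument (summarisation is a common alternative), but the key property is the same: the system prompt is re-asserted rather than left to be diluted by whatever the session accumulated.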
Model updates are another driver. When the underlying LLM is updated — even a minor version change — the agent's behaviour can shift in unexpected ways. A tool-call pattern that was reliable with one model version may behave differently with another. Without continuous monitoring, these shifts go unnoticed until they cause problems.
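A common way to catch update-driven shifts is a behavioural regression check: replay fixed prompts after a model update and diff the resulting tool-call sequences against a "golden" trace recorded with the previous version. The trace format below is a hypothetical sketch, not a prescribed harness.

```python
# Sketch: diff an agent's tool-call traces against a golden set recorded
# before a model update. The trace format (prompt -> list of tool names)
# is an illustrative assumption.

def diff_traces(golden, current):
    """Return (prompt, expected, actual) tuples where tool usage shifted."""
    drift = []
    for prompt, expected in golden.items():
        actual = current.get(prompt, [])
        if actual != expected:
            drift.append((prompt, expected, actual))
    return drift

golden = {"refund order 42": ["lookup_order", "issue_refund"]}
current = {"refund order 42": ["lookup_order", "send_email", "issue_refund"]}

regressions = diff_traces(golden, current)
assert len(regressions) == 1  # the updated model inserted an extra tool call
```

Even a coarse diff like this turns a silent behavioural shift into a visible test failure at upgrade time.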
Environmental drift matters too. The MCP servers an agent connects to evolve — tools are added, argument schemas change, response formats shift. An agent calibrated for one environment may behave incorrectly when the environment changes underneath it, invoking tools with stale assumptions about their behaviour.
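Environmental drift can be detected by fingerprinting the tool schemas an agent sees at session start and comparing them against the last known snapshot. The schema dicts below stand in for whatever a tool listing (such as an MCP server's tool inventory) actually returns; the shapes are assumptions for illustration.

```python
# Sketch: fingerprint each tool's argument schema with a stable hash, then
# compare snapshots to spot tools that changed underneath the agent.
import hashlib
import json

def fingerprint(tools):
    """Map tool name -> stable hash of its argument schema."""
    return {
        t["name"]: hashlib.sha256(
            json.dumps(t["schema"], sort_keys=True).encode()
        ).hexdigest()
        for t in tools
    }

old = fingerprint([{"name": "search", "schema": {"q": "string"}}])
new = fingerprint([{"name": "search", "schema": {"q": "string", "limit": "int"}}])

changed = [name for name in old if new.get(name) != old[name]]
assert changed == ["search"]  # the schema shifted; recalibrate before calling
```

A schema change does not prove the agent will misbehave, but it flags exactly the condition described above: stale assumptions about a tool's behaviour.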
HOW POLICYLAYER USES THIS
Intercept provides a fixed reference frame against which agent drift is measured and contained. Regardless of how the agent's behaviour evolves, YAML policies define hard boundaries that do not drift. An agent that gradually shifts towards more permissive tool usage is caught the moment it exceeds policy boundaries. Intercept's audit logs also enable trend analysis — tracking tool usage patterns over time to detect gradual shifts before they become policy violations. The policies themselves serve as documentation of intended behaviour, making it straightforward to identify when actual behaviour has diverged.
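The two roles described above — a hard boundary plus trend analysis over audit logs — can be sketched in a few lines. The policy shape and log format here are illustrative stand-ins, not Intercept's actual YAML schema or log format.

```python
# Sketch: a fixed policy boundary plus a simple usage trend over an audit
# log of attempted tool calls. Policy and log shapes are assumptions.
from collections import Counter

policy = {"allowed_tools": {"lookup_order", "issue_refund"}}

def enforce(tool_name):
    """Hard boundary: drift is contained the moment it crosses the policy."""
    return tool_name in policy["allowed_tools"]

# Attempted calls recorded in the audit log, including blocked ones.
audit_log = ["lookup_order", "lookup_order", "delete_record", "lookup_order"]
usage = Counter(audit_log)  # trend analysis over logged attempts

assert enforce("issue_refund") is True
assert enforce("delete_record") is False  # blocked however gradual the drift
assert usage["delete_record"] == 1        # and surfaced in the usage trend
```

The point of the counter is the second half of the claim: a denied call is the last line of defence, while a rising count of near-boundary attempts is the early-warning signal.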