What is Resource Exhaustion (Agent)?
Agent resource exhaustion is when an AI agent consumes excessive compute, memory, API calls, or tokens — either through manipulation or runaway behaviour — potentially causing cost overruns or outages.
WHY IT MATTERS
AI agents consume resources with every operation: LLM API tokens for reasoning, compute for tool execution, memory for context management, and API quota for external service calls. Resource exhaustion occurs when this consumption spirals beyond intended limits — either through deliberate manipulation or accidental runaway behaviour.
Manipulation scenarios include an attacker crafting inputs that cause the agent to enter infinite reasoning loops, tools returning responses that trigger exponential follow-up calls, or prompt injections that instruct the agent to perform unnecessary expensive operations. The attacker's goal might be financial damage (running up API bills), operational disruption (exhausting rate limits), or distraction (keeping the agent busy while another attack proceeds).
Accidental resource exhaustion is equally common. An agent in a retry loop after a transient error, an agentic workflow with a poorly defined termination condition, or a tool that returns paginated results the agent fetches exhaustively — all of these can consume resources far beyond expectations without any malicious intent.
The financial impact can be severe. LLM API calls are priced per token, and an agent processing millions of tokens in a runaway loop generates significant costs. External API calls may have per-request pricing. Compute resources in cloud environments scale with usage and billing. A single incident can produce thousands of pounds in unexpected charges.
HOW POLICYLAYER USES THIS
Intercept enforces resource boundaries through YAML policies that set rate limits, call count ceilings, and per-session budgets for tool calls. These limits operate independently of the agent's reasoning — even if the agent believes it should continue, Intercept blocks tool calls that exceed configured thresholds. The fail-closed design ensures that exhaustion of Intercept's own resources blocks operations rather than allowing unlimited pass-through.