What is Agent Rate Limiting?
Restricting the number or frequency of an agent's tool calls within a time window to prevent runaway loops, excessive resource consumption, and denial-of-service patterns against MCP servers.
WHY IT MATTERS
An unconstrained agent loop can fire hundreds of tool calls per minute. Rate limiting caps this velocity to prevent agents from overwhelming MCP servers, consuming excessive resources, or entering infinite retry loops.
Rate limiting complements tool-level permissions. An agent might be permitted to call search — but not 1,000 times per minute. Rate limits add the time dimension to access control.
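To make the "time dimension" concrete, here is a minimal sketch of a sliding-window limiter sitting behind a permission check. The names `SlidingWindowLimiter`, `ALLOWED_TOOLS`, and `authorize` are hypothetical and chosen for illustration; this is one common way to build such a check, not a description of any particular product's internals.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self._stamps = deque()  # timestamps of calls that were allowed

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


# Hypothetical permission list: which tools the agent may call at all.
ALLOWED_TOOLS = {"search"}


def authorize(tool: str, limiter: SlidingWindowLimiter, now: float) -> str:
    # Permission answers "may this tool be called?";
    # the limiter answers "may it be called right now?".
    if tool not in ALLOWED_TOOLS:
        return "denied: not permitted"
    if not limiter.allow(now):
        return "denied: rate limit exceeded"
    return "allowed"
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would likely pass `time.monotonic()`.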
Smart rate limiting adapts to context: normal operation at 60 calls per minute, automatic throttling when error rates spike, and a full halt when the rate exceeds a critical threshold (the circuit breaker pattern).
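The three modes above can be sketched as a small state machine. This is an illustration under stated assumptions: the class name `AdaptiveLimiter`, the throttled limit of 10, the critical threshold of 300 attempts, and the 50% error threshold are all hypothetical, and the sketch assumes the "critical threshold" refers to the rate of attempted calls.

```python
from collections import deque


class AdaptiveLimiter:
    """Normal limit, reduced limit on high error rate, halt on critical rate."""

    def __init__(self, normal=60, throttled=10, critical=300,
                 error_threshold=0.5, window=60.0):
        self.normal = normal              # calls per window in normal operation
        self.throttled = throttled        # calls per window while errors are high
        self.critical = critical          # attempts per window that trip the breaker
        self.error_threshold = error_threshold
        self.window = window
        self.halted = False
        self._attempts = deque()          # timestamps of every attempted call
        self._allowed = deque()           # timestamps of calls that were allowed
        self._results = deque()           # (timestamp, ok) outcomes

    def _trim(self, now):
        for q in (self._attempts, self._allowed):
            while q and now - q[0] >= self.window:
                q.popleft()
        while self._results and now - self._results[0][0] >= self.window:
            self._results.popleft()

    def record_result(self, now, ok):
        self._results.append((now, ok))

    def error_rate(self):
        if not self._results:
            return 0.0
        failures = sum(1 for _, ok in self._results if not ok)
        return failures / len(self._results)

    def allow(self, now):
        self._trim(now)
        self._attempts.append(now)
        # Circuit breaker: too many attempts trips a halt that persists
        # until reset() is called.
        if len(self._attempts) > self.critical:
            self.halted = True
        if self.halted:
            return False
        high_errors = self.error_rate() > self.error_threshold
        limit = self.throttled if high_errors else self.normal
        if len(self._allowed) >= limit:
            return False
        self._allowed.append(now)
        return True

    def reset(self):
        self.halted = False
```

Requiring an explicit `reset()` is a design choice in this sketch; many circuit breakers instead reopen automatically after a cool-down period.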
HOW POLICYLAYER USES THIS
Intercept enforces YAML-defined rate limits on MCP tool calls. Rate limits can be set per tool, per agent, or globally — for example, 60 calls per minute for read_file, 10 per minute for write_file, 100 per minute across all tools. When the limit is exceeded, Intercept denies the call and returns a structured error to the client.
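As a rough illustration of layered limits, the sketch below checks a per-tool limit first and a global limit second, and returns a structured denial. It does not reproduce Intercept's YAML schema or its actual error format: `LayeredLimiter`, the dictionary layout, and every field name in the returned error are assumptions made for this example.

```python
from collections import defaultdict, deque

# Hypothetical limits in calls per 60 seconds, mirroring the numbers above.
TOOL_LIMITS = {"read_file": 60, "write_file": 10}
GLOBAL_LIMIT = 100


class LayeredLimiter:
    def __init__(self, tool_limits, global_limit, window=60.0):
        self.tool_limits = tool_limits
        self.global_limit = global_limit
        self.window = window
        self._by_tool = defaultdict(deque)  # allowed-call timestamps per tool
        self._all = deque()                 # allowed-call timestamps, all tools

    def _trim(self, q, now):
        while q and now - q[0] >= self.window:
            q.popleft()

    def _deny(self, scope, tool, limit, q, now):
        # Illustrative error shape only; field names are assumptions.
        retry_after = q[0] + self.window - now if q else self.window
        return {
            "allowed": False,
            "error": "rate_limit_exceeded",
            "scope": scope,
            "tool": tool,
            "limit": limit,
            "retry_after_seconds": retry_after,
        }

    def check(self, tool, now):
        tool_q = self._by_tool[tool]
        self._trim(tool_q, now)
        self._trim(self._all, now)
        tool_limit = self.tool_limits.get(tool)  # None means no per-tool limit
        if tool_limit is not None and len(tool_q) >= tool_limit:
            return self._deny("tool", tool, tool_limit, tool_q, now)
        if len(self._all) >= self.global_limit:
            return self._deny("global", tool, self.global_limit, self._all, now)
        # Only allowed calls count toward the limits.
        tool_q.append(now)
        self._all.append(now)
        return {"allowed": True}
```

A per-agent scope could be added in the same way by keeping one more set of queues keyed by an agent identifier.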