Microsoft's Agent Governance Toolkit: What It Gets Right and What It Misses
Microsoft just open-sourced the Agent Governance Toolkit — nine packages covering policy enforcement, cryptographic identity, trust scoring, compliance automation, and SRE patterns for AI agents. It’s the clearest signal yet that agent governance is infrastructure, not a nice-to-have.
The toolkit is comprehensive. It’s also completely blind to MCP.
What the Toolkit Does
The Agent Governance Toolkit applies patterns from production infrastructure — service meshes, privilege rings, SRE error budgets — to autonomous agents. Nine packages, each independently installable:
Agent OS is a stateless policy engine. It intercepts agent actions before execution using pattern matching and semantic intent classification. Policies can be written in YAML, OPA Rego, or Cedar. It classifies actions as DESTRUCTIVE_DATA, DATA_EXFILTRATION, or PRIVILEGE_ESCALATION and blocks or downgrades trust accordingly.
Agent Mesh implements zero-trust identity through decentralised identifiers (DIDs) with Ed25519 cryptography. Agents get identities like did:mesh:data-analyst:a7f3b2... with human sponsor accountability and trust scores that decay over time without positive signals.
Agent Hypervisor borrows CPU privilege rings. Ring 0 (trust score ≥900) gets full system access. Ring 3 (<400) gets read-only sandboxed execution. Agents move between rings based on continuous trust assessment.
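As a rough illustration of score-based rings, the mapping can be sketched as a threshold lookup. Only the Ring 0 (≥900) and Ring 3 (<400) boundaries come from the description above; the 0 to 1000 scale and the 650 cut-off between Ring 1 and Ring 2 are assumptions made for this example, not values taken from the toolkit.

```python
def ring_for_score(score: int, mid_threshold: int = 650) -> int:
    """Map a trust score to a privilege ring (0 = most privileged).

    The 900 and 400 boundaries come from the article; `mid_threshold`
    is a hypothetical value chosen only to make the sketch complete.
    """
    if score >= 900:
        return 0  # full system access
    if score >= mid_threshold:
        return 1  # assumed intermediate ring
    if score >= 400:
        return 2  # assumed intermediate ring
    return 3      # read-only, sandboxed execution
```

Because the ring is recomputed from the current score on every check, a falling score demotes the agent without anyone editing a role assignment.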
Agent SRE adapts site reliability engineering to agent behaviour. Safety SLOs, error budgets, circuit breakers. When an agent’s safety SLI drops below 99%, the system automatically restricts capabilities until recovery. Nine chaos engineering templates for resilience testing.
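The SLI check described above could look something like the following minimal sketch. The class name, the rolling window of 1000 actions, and the idea of recording each action as safe or unsafe are assumptions for illustration; the source only states the 99% threshold.

```python
from collections import deque


class SafetyBudget:
    """Rolling safety SLI over the most recent agent actions (hypothetical)."""

    def __init__(self, target: float = 0.99, window: int = 1000):
        self.target = target
        # True = the action was judged safe. Old entries fall off the left.
        self.outcomes = deque(maxlen=window)

    def record(self, safe: bool) -> None:
        self.outcomes.append(safe)

    @property
    def sli(self) -> float:
        # With no observations yet, treat the agent as within budget.
        if not self.outcomes:
            return 1.0
        return sum(self.outcomes) / len(self.outcomes)

    @property
    def restricted(self) -> bool:
        # Capabilities are restricted while the SLI sits below the target.
        return self.sli < self.target
```

Recovery falls out of the rolling window: as unsafe actions age out and safe ones accumulate, the SLI climbs back above the target and the restriction lifts.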
Agent Compliance automates governance verification against OWASP Agentic Top 10, EU AI Act, NIST AI RMF, HIPAA, and SOC 2.
The remaining packages cover runtime supervision, reinforcement learning governance, a plugin marketplace with Ed25519 signing, and framework adapters for LangChain, CrewAI, AutoGen, Semantic Kernel, and others.
What It Gets Right
Microsoft is naming the problem correctly: agents operating without meaningful security controls is the norm, not the exception. The toolkit codifies what anyone running agents in production already knows — you need policy enforcement before tool execution, identity verification between agents, resource limits, and failure containment.
Three things stand out.
Trust decay is a strong primitive. Most access control systems grant permissions and forget about them. Trust scores that degrade over time without positive signals model reality better — an agent that hasn’t been observed behaving correctly for a while shouldn’t retain its privileges.
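One way to picture trust decay is exponential decay with a half-life, reset by positive observations. This is a sketch under stated assumptions: the source does not describe the toolkit's actual decay function, and the one-week half-life, the boost of 10 points, and the cap of 1000 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrustScore:
    value: float                        # score at the last positive signal
    last_positive_at: float             # timestamp (seconds) of that signal
    half_life_s: float = 7 * 24 * 3600  # assumed: score halves after one idle week

    def current(self, now: float) -> float:
        """Score after decaying for the time elapsed since the last positive signal."""
        idle = max(0.0, now - self.last_positive_at)
        return self.value * 0.5 ** (idle / self.half_life_s)

    def record_positive(self, now: float, boost: float = 10.0, cap: float = 1000.0) -> None:
        """Apply a positive observation: decay first, then add a capped boost."""
        self.value = min(cap, self.current(now) + boost)
        self.last_positive_at = now
```

The design point is that privileges are a function of recent evidence rather than a one-time grant: an agent with no fresh positive signals drifts toward lower trust on its own.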
Compliance mapping matters. Mapping agent behaviour to specific regulatory frameworks (EU AI Act, SOC 2, HIPAA) is work that every enterprise will need to do. Having pre-built mappings saves months.
The execution rings model is elegant. Tying agent capabilities to a continuous trust score rather than static role assignments creates a more responsive security posture. An agent that starts misbehaving gets demoted automatically.
What It Misses
The toolkit enforces policies by hooking into agent framework SDKs. LangChain callbacks, CrewAI decorators, Semantic Kernel filters. This means the enforcement point is inside the agent’s runtime — the agent code must be instrumented with the toolkit’s adapters.
This works when you own the agent code and it runs on a framework the toolkit ships an adapter for. It doesn’t work for MCP.
MCP agents connect to tool servers over a standardised protocol. The agent sends tools/call requests; the server executes them. The agent could be built with any framework, or no framework at all. Claude Desktop, Cursor, Windsurf, custom clients — they all speak MCP, and none of them are going to integrate Microsoft’s SDK hooks into their runtimes.
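For context, a tools/call request is a JSON-RPC 2.0 message whose params carry the tool name and its arguments. The sketch below shows how an intermediary might pull those two fields out of a raw message without knowing anything about the client that sent it. The helper name parse_tool_call and the currency argument are illustrative assumptions.

```python
import json


def parse_tool_call(raw: str):
    """Return (tool_name, arguments) for a tools/call request, else None."""
    msg = json.loads(raw)
    if msg.get("method") != "tools/call":
        return None  # some other MCP method, e.g. tools/list
    params = msg.get("params", {})
    return params.get("name"), params.get("arguments", {})


# Example request as any MCP client might send it, whatever framework built it.
example = json.dumps({
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {
        "name": "create_charge",
        "arguments": {"amount": 4200, "currency": "usd"},
    },
})
```

Everything a policy needs (which tool, which arguments) is visible on the wire, which is why the client's internals do not matter.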
The toolkit has no concept of:
- Transport-layer interception. MCP tool calls are structured requests with known schemas. You can enforce policies on them without touching agent code. The toolkit requires code-level integration.
- Tool visibility control. MCP’s tools/list response determines what tools an agent even knows about. Hiding destructive tools from the agent is a security primitive the toolkit doesn’t address.
- Protocol-native rate limiting. MCP tool calls carry structured arguments. You can rate-limit create_charge to 10 calls per hour by counting requests at the transport layer. The toolkit’s rate limiting operates inside the agent’s process.
- Stateful counters across sessions. An MCP proxy can maintain spending counters that persist across agent restarts and context window resets. SDK-level hooks lose state when the agent process dies.
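Two of the primitives in this list, tool hiding and transport-layer rate limiting, are small enough to sketch. This is an illustration of the idea, not Intercept's implementation: the function and class names are hypothetical, delete_customer is an invented tool name, and the counters live in memory here, whereas a real proxy would persist them to survive restarts.

```python
import copy
from collections import defaultdict, deque


def hide_tools(list_response: dict, hidden: set) -> dict:
    """Return a copy of a tools/list response with the named tools removed."""
    out = copy.deepcopy(list_response)
    if "result" not in out:
        return out  # error responses pass through untouched
    tools = out["result"].get("tools", [])
    out["result"]["tools"] = [t for t in tools if t.get("name") not in hidden]
    return out


class WindowLimiter:
    """Allow at most `limit` calls per tool within a sliding time window."""

    def __init__(self, limit: int, window_s: float = 3600.0):
        self.limit = limit
        self.window_s = window_s
        self.calls = defaultdict(deque)  # tool name -> timestamps of allowed calls

    def allow(self, tool: str, now: float) -> bool:
        q = self.calls[tool]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window_s:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

Usage would be along the lines of `hide_tools(response, {"delete_customer"})` on the way back to the client and `WindowLimiter(10).allow("create_charge", time.time())` on the way in. Neither needs any cooperation from the agent.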
Two Enforcement Points, One Gap
The toolkit operates at the agent layer — inside the runtime, evaluating intent before the agent acts. This is valuable for complex policy decisions that require semantic understanding. “Is this agent trying to exfiltrate data?” is a question that benefits from intent classification.
MCP enforcement operates at the transport layer — between the agent and the tool server, evaluating structured requests against deterministic rules. “Is args.amount greater than 50000?” doesn’t need semantic analysis. It needs a comparison operator.
Agent Layer (Microsoft Toolkit)          Transport Layer (Intercept)
┌─────────────────────────┐              ┌──────────────────────────┐
│ Semantic intent check   │              │ Deterministic rules      │
│ Trust scoring           │     MCP      │ Rate limiting            │
│ Framework SDK hooks     │ ──────────── │ Tool hiding              │
│ Compliance mapping      │   request    │ Stateful counters        │
│ Agent identity (DIDs)   │              │ Audit logging            │
└─────────────────────────┘              └──────────────────────────┘
Both layers are useful. But here’s the gap: the toolkit assumes it controls the agent’s runtime. For MCP, the runtime is a black box. You can’t instrument Claude Desktop. You can’t add SDK hooks to Cursor. The only reliable enforcement point is the transport — the protocol connection between agent and server.
Deterministic vs Semantic
The toolkit’s Agent OS uses semantic intent classification to detect threats. It analyses an action and classifies it as DESTRUCTIVE_DATA or PRIVILEGE_ESCALATION regardless of how the action is phrased. This is powerful for catching novel attack patterns the policy author didn’t anticipate.
But semantic classification is probabilistic. The same action, analysed twice, might get different classifications. For safety constraints with definitive answers — “is this charge over $500?”, “has this agent exceeded 5 issues per hour?” — deterministic evaluation is more reliable:
version: "1"
description: "Stripe MCP policies"
tools:
  create_charge:
    rules:
      - name: "max single charge"
        conditions:
          - path: "args.amount"
            op: "lte"
            value: 50000
        on_deny: "Single charge cannot exceed $500.00"
      - name: "daily spend cap"
        conditions:
          - path: "state.create_charge.daily_spend"
            op: "lte"
            value: 1000000
        on_deny: "Daily spending cap of $10,000.00 reached"
        state:
          counter: "daily_spend"
          window: "day"
          increment_from: "args.amount"
This policy produces the same result for the same input, every time. No classification variance, no model dependency, no probabilistic drift. The trade-off is obvious: you can’t catch novel attacks you didn’t write rules for. But for known constraints — spending limits, rate limits, tool access control — deterministic enforcement is the right tool.
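To make the determinism concrete, here is a minimal sketch of how rules of this shape could be evaluated. It is not Intercept's implementation. The policy is inlined as a Python dict mirroring the rules above, only the lte operator appears in the source (the others are assumptions), the daily window reset is omitted, and the sketch compares the counter as it stood before the current call, which may differ from the real semantics.

```python
import operator

# Only "lte" appears in the article's policy; the rest are assumed for illustration.
OPS = {"lte": operator.le, "lt": operator.lt, "gte": operator.ge,
       "gt": operator.gt, "eq": operator.eq}

POLICY = {
    "create_charge": [
        {"name": "max single charge",
         "conditions": [{"path": "args.amount", "op": "lte", "value": 50000}],
         "on_deny": "Single charge cannot exceed $500.00"},
        {"name": "daily spend cap",
         "conditions": [{"path": "state.create_charge.daily_spend",
                         "op": "lte", "value": 1000000}],
         "on_deny": "Daily spending cap of $10,000.00 reached",
         "state": {"counter": "daily_spend", "increment_from": "args.amount"}},
    ]
}


def lookup(ctx: dict, path: str, default=0):
    """Resolve a dotted path such as 'args.amount'; missing keys yield `default`."""
    node = ctx
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def evaluate(tool: str, args: dict, state: dict):
    """Return (allowed, message). `state` is updated only when the call is allowed."""
    rules = POLICY.get(tool, [])
    ctx = {"args": args, "state": state}
    for rule in rules:
        for cond in rule["conditions"]:
            if not OPS[cond["op"]](lookup(ctx, cond["path"]), cond["value"]):
                return False, rule["on_deny"]
    # Every rule passed: apply counter increments (window reset not modelled).
    for rule in rules:
        spec = rule.get("state")
        if spec:
            counters = state.setdefault(tool, {})
            amount = lookup(ctx, spec["increment_from"])
            counters[spec["counter"]] = counters.get(spec["counter"], 0) + amount
    return True, ""
```

Given the same arguments and the same counter values, evaluate returns the same verdict every time, since the only inputs are the request and the stored state.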
What This Means for MCP Teams
Microsoft validating agent governance as critical infrastructure is good for everyone building in this space. The toolkit raises awareness, establishes vocabulary, and gives enterprise teams a starting point.
If you’re running MCP agents, the practical question is: where does enforcement belong?
For identity, trust scoring, and compliance reporting — SDK-level toolkits like Microsoft’s make sense, assuming you control the agent runtime.
For tool-level spending controls, rate limiting, and access control on MCP servers — transport-layer enforcement is the only option that works regardless of which client your agents run on. You don’t need to instrument the agent. You configure a policy file and put a proxy in the connection path.
The two approaches are complementary. An enterprise might use Agent Governance Toolkit for broad agent lifecycle management and Intercept for deterministic MCP tool controls. The first governs the agent. The second governs the tools.
Ready to add deterministic policy enforcement to your MCP servers?
- Quick Start Guide — Production-ready in 5 minutes
- GitHub — Open source proxy
Protect your agent in 30 seconds
Scans your MCP config and generates enforcement policies for every server.
npx -y @policylayer/intercept init