Where the AI Act actually lands on AI-agent fleets — which duties are live today, which are conditional on a high-risk use case, and the deployer-controlled surface a gateway gives the assigned human overseer.
QUICK ANSWER: The AI Act regulates AI systems, not protocols, and most agent fleets (coding, DevOps, back-office) are not high-risk. For ordinary non-high-risk deployments the broadly applicable live duty is Art. 4 AI literacy (the Art. 5 prohibited-practice rules are also live where a use case falls into them). Heavy Art. 26 deployer duties bite only for Annex III high-risk uses, now dated 2 December 2027 under the Digital Omnibus. A gateway’s part: the call-level control surface and the deployer-held evidence behind those duties.
The Act binds providers and deployers of AI systems — your duties attach to your agents and, above all, to your risk tier, so get the tier right first: it decides which duties apply at all. Three things frame where an MCP fleet lands.
Building or branding an AI system makes you a provider; using one under your own authority makes you a deployer. Companies wiring agents to MCP servers are typically deployers. Routing through a gateway will not usually amount to a substantial modification by itself, though high-risk deployments should still assess whether any change alters intended purpose, performance or risk (Art. 25).
High-risk means Annex III: biometrics, critical infrastructure, education, employment, essential services, law enforcement, migration, justice. Coding assistants, DevOps agents and back-office automation are not on it. For them, only Art. 4 literacy applies across the board, with Art. 50 transparency and the Art. 5 prohibitions applying where relevant.
If an agent is pointed at an Annex III decision — screening applicants, scoring credit — Art. 26 deployer duties engage, now from 2 December 2027 under the Digital Omnibus (provisional agreement 7 May 2026, pending formal adoption — dates may shift). For that slice, call-level logging and an intervention surface are exactly what the deployer needs.
For each obligation: the question it raises for an agent fleet, the gap when there is no call-level control, and where the gateway fits. Read the tiers honestly — for non-high-risk fleets the broadly applicable live duty is Art. 4 (with the Art. 5 prohibitions applying where relevant); the Art. 26 family is conditional on a high-risk use case and now dated 2 December 2027.
Providers and deployers take measures to ensure, to their best extent, a sufficient level of AI literacy among staff operating AI systems (Art. 4). In force since February 2025. The Digital Omnibus would soften the wording to “take measures to support”, but that amendment is not yet formally adopted.
Do the staff running MCP-connected agents actually know what those agents can do — the real action surface, not just the read paths?
Teams wire agents to server bundles without an inventory of the tools they have just enabled, so literacy measures have nothing concrete to point at.
The catalogue enumerates every connected tool and its risk class — a concrete artefact that supports the literacy duty. It does not discharge it; training and process remain yours.
For ecosystem context: across the public catalogue, 31,002 tools on 4,628 servers — 7,607 write, 1,773 destructive, 1,649 execute, 154 financial. Connecting a server routinely grants far more than read.
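To put those catalogue figures in proportion, here is a minimal Python sketch that turns the counts quoted above into shares of the catalogue. It assumes the risk classes may overlap (the text does not say whether a tool can be both write and destructive, for example), so it reports each class separately and does not sum them. All names are illustrative.

```python
# Counts quoted in the text above. Classes may overlap, so they are not summed.
TOTAL_TOOLS = 31_002
TOTAL_SERVERS = 4_628
RISK_COUNTS = {
    "write": 7_607,
    "destructive": 1_773,
    "execute": 1_649,
    "financial": 154,
}


def share_of_catalogue(count: int, total: int = TOTAL_TOOLS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: share_of_catalogue(n) for name, n in RISK_COUNTS.items()}
tools_per_server = round(TOTAL_TOOLS / TOTAL_SERVERS, 1)

for name, pct in shares.items():
    print(f"{name:<12} {pct:>5}% of catalogued tools")
print(f"average tools per server: {tools_per_server}")
```

On these figures roughly a quarter of catalogued tools can write, and an average server exposes between six and seven tools.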
Deployers assign human oversight of a high-risk system to natural persons with the necessary competence, training and authority, and with support (Art. 26(2)).
Does the person you assign to oversee a high-risk agent have an actual surface to oversee — something to see and something to pull?
Raw MCP gives the overseer nothing: no live view of what the agent invoked, no lever to stop it short of editing configs.
The gateway equips the assigned human — live call visibility, deny rules, per-person revocation. It supports oversight designed under Art. 14; it is not the oversight itself.
Deployers monitor operation of the high-risk system against its instructions for use, inform the provider or authorities where risks arise, and suspend use where appropriate (Art. 26(5)).
Can you see what a high-risk agent actually invoked — and stop it without ripping out configs across your fleet?
There is no deployer-side record of what the agent called, and no suspend lever short of tearing down the connection by hand.
Every call is evaluated live; disabling a grant or flipping a tool to deny centrally is the practical suspension mechanism the duty asks for.
Deployers keep logs automatically generated by the high-risk system, to the extent the logs are under their control, for at least six months (Art. 26(6)).
When the authority asks for six months of logs under your control, do you hold anything to retain?
In a default setup the deployer holds nothing — the agent-to-server traffic is ephemeral and unrecorded.
The audit log — grant, tool, argument keys, deciding rule, verdict — is a deployer-controlled automatic log, with retention configurable to the duty.
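As an illustration of what such a record and its retention check might look like, here is a minimal Python sketch. The `AuditRecord` shape, the field names and the helper functions are hypothetical, not the gateway's actual schema. It reads "argument keys" as argument names without their values, and it approximates six months as 183 days; both are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: "at least six months" approximated as 183 days.
MIN_RETENTION = timedelta(days=183)


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical record shape mirroring the fields named in the text."""
    timestamp: datetime
    grant: str
    tool: str
    argument_keys: tuple[str, ...]
    deciding_rule: str
    verdict: str  # "allow" or "deny"


def make_record(grant, tool, args, deciding_rule, verdict, now=None):
    """Build a record that keeps argument names but drops argument values."""
    return AuditRecord(
        timestamp=now or datetime.now(timezone.utc),
        grant=grant,
        tool=tool,
        argument_keys=tuple(sorted(args)),
        deciding_rule=deciding_rule,
        verdict=verdict,
    )


def may_purge(record, now):
    """True only once the record is older than the minimum retention period."""
    return now - record.timestamp > MIN_RETENTION


logged_at = datetime(2026, 1, 10, tzinfo=timezone.utc)
rec = make_record(
    "grant-alice", "update_record",
    {"status": "final", "id": "42"},
    "deny_if[0]", "deny", now=logged_at,
)
print(rec.argument_keys)                                # ('id', 'status')
print(may_purge(rec, logged_at + timedelta(days=30)))   # False
print(may_purge(rec, logged_at + timedelta(days=200)))  # True
```

Keeping only the argument names is one way to hold a useful trail without copying personal data from the call payload into the log; whether that is enough detail for your duty is a judgement for your own programme.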
Under Art. 12, high-risk systems must technically allow automatic recording of events over their lifetime, enabling traceability, including the monitoring required under Art. 26(5). This is primarily a duty on the provider who builds the system.
Where high-risk applies, can the events your agent generated be traced — and does the deployer hold a usable call-level record?
Without a mediation point there is no call-level record to make traceability or the Art. 26(5) monitoring practical.
The gateway produces the call-level record that makes Art. 12-style traceability and Art. 26(6) retention practical for the deployer. It never satisfies Art. 12, which sits with the system’s provider.
Under Art. 14, high-risk systems are designed so natural persons can effectively oversee them (intervene, interrupt via a stop control, override or disregard output), guarding against automation bias.
Can the assigned human actually intervene in a running agent, or only watch after the fact?
A purely automated gate is a technical control, not human oversight — and on its own it gives the human no place to step in.
On top of the automated gate, the gateway adds a real intervention surface: a deny rule, a revoked grant, a central stop. These operationalise the human’s oversight; they are not the oversight itself.
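The three levers named above (a deny rule, a revoked grant, a central stop) can be sketched as a small in-memory model. This is a hypothetical illustration of the idea, not the gateway's implementation: the class, method names and reason strings are all assumptions.

```python
class InterventionSurface:
    """Hypothetical in-memory model of the levers an overseer can pull."""

    def __init__(self) -> None:
        self.revoked_grants: set[str] = set()
        self.denied_tools: set[str] = set()
        self.stopped = False

    def revoke_grant(self, grant: str) -> None:
        """Cut off one person's or agent's grant."""
        self.revoked_grants.add(grant)

    def deny_tool(self, tool: str) -> None:
        """Flip one tool to deny for everyone."""
        self.denied_tools.add(tool)

    def central_stop(self) -> None:
        """Suspend all calls until a human lifts the stop."""
        self.stopped = True

    def check(self, grant: str, tool: str) -> tuple[bool, str]:
        """Return (allowed, reason); the broadest lever is checked first."""
        if self.stopped:
            return False, "central stop"
        if grant in self.revoked_grants:
            return False, "grant revoked"
        if tool in self.denied_tools:
            return False, "tool denied"
        return True, "no intervention active"


surface = InterventionSurface()
print(surface.check("grant-alice", "update_record"))  # (True, 'no intervention active')
surface.revoke_grant("grant-alice")
print(surface.check("grant-alice", "update_record"))  # (False, 'grant revoked')
print(surface.check("grant-bob", "update_record"))    # (True, 'no intervention active')
```

The point of the sketch is that each lever is pulled by a person and takes effect on the next call, which is what distinguishes an intervention surface from a purely automated gate.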
Illustrative policies — not complete compliance controls on their own.
Encode the reviewed tool set as the allowed envelope; everything outside it stays denied by default. This keeps the agent operating within its instructions for use — and the deny verdicts are themselves deployer-held records.
{
  "version": "1",
  "default": "deny",
  "tools": {
    "list_candidates": {},
    "get_candidate_profile": {},
    "search_records": {}
  }
}
Routine reads and updates run; updates to finalised records are denied, and destructive or payment tools are simply not granted. The deny verdicts and the withheld grants are both deployer-held records, and the policy is the overseer’s intervention surface.
{
  "version": "1",
  "default": "deny",
  "tools": {
    "get_record": {},
    "update_record": {
      "deny_if": [
        {
          "conditions": [
            { "path": "args.status", "op": "eq", "value": "final" }
          ]
        }
      ]
    }
  }
}
See Writing policies for the policy format, operators, and quota shapes.
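To make the deny-by-default behaviour of the second policy concrete, here is a minimal Python sketch of how such a policy could be evaluated. It is not the gateway's evaluator. It assumes that a tool missing from `tools` falls back to `default`, that a `deny_if` rule fires when all of its conditions match, and that any firing rule denies the call. Only the `eq` operator shown above is handled; the helper names are hypothetical.

```python
import json

POLICY = json.loads("""
{
  "version": "1",
  "default": "deny",
  "tools": {
    "get_record": {},
    "update_record": {
      "deny_if": [
        {"conditions": [{"path": "args.status", "op": "eq", "value": "final"}]}
      ]
    }
  }
}
""")

_MISSING = object()  # sentinel for a path that does not exist in the call


def resolve(path, call):
    """Walk a dotted path such as 'args.status' through nested dicts."""
    node = call
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def condition_matches(cond, call):
    """Evaluate one condition; only 'eq' is handled in this sketch."""
    if cond["op"] != "eq":
        raise ValueError(f"operator not handled in this sketch: {cond['op']}")
    return resolve(cond["path"], call) == cond["value"]


def evaluate(policy, tool, args):
    """Return 'allow' or 'deny' for one tool call."""
    entry = policy["tools"].get(tool)
    if entry is None:
        return policy.get("default", "deny")  # tool not granted
    call = {"args": args}
    for rule in entry.get("deny_if", []):
        if all(condition_matches(c, call) for c in rule["conditions"]):
            return "deny"
    return "allow"


print(evaluate(POLICY, "get_record", {"id": "7"}))             # allow
print(evaluate(POLICY, "update_record", {"status": "draft"}))  # allow
print(evaluate(POLICY, "update_record", {"status": "final"}))  # deny
print(evaluate(POLICY, "delete_record", {"id": "7"}))          # deny
```

The last call shows the property the text relies on: a destructive tool needs no deny rule of its own, because it was never granted.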
Where a high-risk use case applies, an authority can request deployer-held proof. The artefact a gateway deployment hands over for each:
| What the auditor asks for | What the gateway exports |
|---|---|
| Six months of system logs under deployer control (Art. 26(6)) | Audit log export — grant, tool, argument keys, deciding rule, verdict — retained to the period. |
| The oversight assignment and the surface the overseer uses (Art. 26(2)) | The dashboard, deny rules and revocation records the assigned human acts through. |
| Monitoring records and a suspension capability (Art. 26(5)) | The live verdict stream plus central disable — the evidence that monitoring and suspension exist. |
| Operation kept within the instructions for use (Art. 26(1)) | The versioned policy encoding the reviewed operating envelope, with its change history. |
| Tool and system inventory supporting literacy measures (Art. 4) | The catalogue of connected tools with risk classes — the artefact your literacy programme references. |
The Act regulates AI systems, not protocols. MCP is the wire; the agent is the AI system, and you are its deployer. Which duties apply depends on your risk tier: AI literacy (Art. 4) and the Art. 5 prohibited-practice rules are live now, transparency duties (Art. 50) apply from 2 August 2026, and the Art. 26 high-risk deployer duties bite only for Annex III use cases, now dated 2 December 2027 under the Digital Omnibus.
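The tiering in that answer can be sketched as a small date check. This is a rough triage aid under stated assumptions, not a legal determination: it uses the dates given in this guide (the 2 December 2027 date is provisional and may shift), takes the February 2025 start as 2 February 2025, and ignores nuances such as the Art. 6(3) exemption or a deployer becoming a provider under Art. 25. The function and constant names are hypothetical.

```python
from datetime import date

# Dates as stated in this guide. The high-risk date reflects a provisional
# agreement and may change before formal adoption.
ART_4_AND_5_LIVE = date(2025, 2, 2)
ART_50_FROM = date(2026, 8, 2)
ART_26_HIGH_RISK_FROM = date(2027, 12, 2)


def duties_in_play(on: date, annex_iii_use: bool) -> list[str]:
    """List the duty families that are live on a given date for a deployer."""
    duties = []
    if on >= ART_4_AND_5_LIVE:
        duties.append("Art. 4 AI literacy")
        duties.append("Art. 5 prohibited practices")
    if on >= ART_50_FROM:
        duties.append("Art. 50 transparency")
    if annex_iii_use and on >= ART_26_HIGH_RISK_FROM:
        duties.append("Art. 26 high-risk deployer duties")
    return duties


print(duties_in_play(date(2026, 6, 1), annex_iii_use=False))
# ['Art. 4 AI literacy', 'Art. 5 prohibited practices']
print(duties_in_play(date(2028, 1, 1), annex_iii_use=True))
# ['Art. 4 AI literacy', 'Art. 5 prohibited practices', 'Art. 50 transparency', 'Art. 26 high-risk deployer duties']
```

Art. 5 and Art. 50 appear in the list whenever they are live, but each only bites where a use case actually falls within it.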
If you build, brand or place an AI system on the market you are a provider; if you use one under your own authority you are a deployer — which is most companies running agents. You can become a provider via Art. 25 by rebranding, substantially modifying a high-risk system, or re-purposing a system into a high-risk use. Routing your traffic through a gateway will not usually be a substantial modification by itself, but high-risk deployments should assess whether any change alters intended purpose, performance or risk.
Most agent fleets are almost certainly not high-risk. High-risk is the closed Annex III list: biometrics, critical infrastructure, education, employment, essential services, law enforcement, migration, justice. A coding assistant, DevOps agent or back-office bot is not on it. It becomes high-risk only if pointed at an Annex III decision such as screening candidates or scoring credit, and even Annex III systems can escape high-risk under Art. 6(3) where they perform narrow procedural tasks without materially influencing the outcome.
Mandated logging applies only to high-risk systems: automatic event logging by the system (Art. 12, a provider duty) and deployer retention of at least six months (Art. 26(6)). For everything else there is no mandated logging. A call-level audit trail is still the practical evidence base — and the thing you would already hold if your risk tier ever changed.
The Act is extraterritorial, so it can reach companies outside the EU. Art. 2 catches third-country providers and deployers where the output of the AI system is used in the Union. GPAI model-provider duties under Chapter V, by contrast, sit with the model providers (OpenAI, Anthropic, Google), not with deployers and not with PolicyLayer.
| Default setup | Through the gateway |
|---|---|
| One shared upstream API key on every laptop | Per-person scoped grant tokens, revocable individually |
| No record of what agents called | Per-call audit log: grant, tool, argument keys, rule, verdict |
| Every tool on a server is callable | Deny-by-default — each tool and argument explicitly granted |
| Access rules scattered across client configs | One central, version-controlled policy |
PolicyLayer doesn’t certify your organisation — it gives your compliance team enforceable controls and exportable evidence for the MCP slice of the audit.
Last reviewed 04-06-2026 by the PolicyLayer research team. This guide maps how the framework intersects with MCP deployments — it is not legal advice.