How AI-agent tool calls fall under the Trust Services Criteria — what auditors ask, where default setups fall short, and the controls that produce the evidence.
QUICK ANSWER
SOC 2 has no AI module. MCP tool calls are tested under the existing Common Criteria: logical access (CC6.1–6.3), monitoring (CC7.2–7.3) and change management (CC8.1). The recurring finding: privileged actions must trace to an identifiable person, and a shared API key is not attribution.
A protocol can’t “be SOC 2 compliant” any more than HTTP can — the audit tests your controls over your deployment: the AI clients, the servers they reach, and everything in between. Three things put that traffic in scope.
A tools/call reaches the database, repo or payment API behind the tool. That is CC6 territory: access must be mediated, inventoried and attributable.
The 2022 points of focus direct auditors to evaluate all actor types, not just employees. Agents get the same provisioning, least-privilege and review expectations as privileged service accounts.
Current auditor commentary favours deployments where every identity is registered, every call mediated by policy, every verdict recorded — what a gateway produces by construction.
For each criterion: what it requires, the question asked of your AI traffic, the default-setup gap, and the control that closes it.
CC6.1: Logical access security software and architectures over protected information assets, starting with identifying, inventorying and classifying those assets.
Each tool call is logical access to the upstream system. Is that access mediated by access-control software, and are the tools inventoried and classified by what they can do?
A shared API key in client config means no access-control layer between agent and upstream — and no inventory or risk classification of the tools exposed.
The gateway sits inline as the access-control layer; credentials stay in central custody, out of client configs; the catalogue provides the asset inventory with risk classification.
31,002 tools across 4,628 servers carry a risk classification — 19,718 Read · 7,607 Write · 1,773 Destructive · 1,649 Execute · 154 Financial. The classification is the CC6.1 inventory artefact.
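
As a rough illustration of what a classified tool inventory might look like in code, the sketch below models each tool as a row with a risk tier and derives a per-tier summary. The `ToolEntry` and `Risk` types, the server names, and the `delete_repository` and `create_refund` tools are all hypothetical; this is not the catalogue's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"
    EXECUTE = "execute"
    FINANCIAL = "financial"


@dataclass(frozen=True)
class ToolEntry:
    server: str  # hypothetical server name
    tool: str    # tool name as exposed by that server
    risk: Risk   # tier assigned when the tool was catalogued


# Hypothetical inventory rows; a real catalogue would be far larger.
INVENTORY = [
    ToolEntry("github", "get_file_contents", Risk.READ),
    ToolEntry("github", "merge_pull_request", Risk.WRITE),
    ToolEntry("github", "delete_repository", Risk.DESTRUCTIVE),
    ToolEntry("payments", "create_refund", Risk.FINANCIAL),
]


def risk_summary(inventory):
    """Count tools per risk tier."""
    return Counter(entry.risk for entry in inventory)


def high_risk(inventory):
    """Return entries in the tiers that usually warrant explicit approval."""
    flagged = {Risk.DESTRUCTIVE, Risk.EXECUTE, Risk.FINANCIAL}
    return [entry for entry in inventory if entry.risk in flagged]
```

Keeping the tier on the inventory row itself means the same data can drive both the auditor-facing summary and the least-privilege decisions made elsewhere.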
CC6.2: Users are registered and authorised before credentials are issued and access is granted.
Who authorised this person — or this agent — to call this tool? The auditor wants a record that access was granted deliberately, per identity, before first use.
A shared key has no registration step: whoever holds it has access, with no per-identity authorisation record.
Per-person scoped grants — each identity is registered and explicitly authorised to a tool set before any call. The grant issuance log is the provisioning evidence.
CC6.3: Access is authorised, modified and removed based on roles and responsibilities, considering least privilege and segregation of duties.
Is each agent’s tool access scoped to what its role needs — and is access revoked promptly when someone leaves? Auditors routinely sample leavers and ask for timestamped revocation proof.
A shared key is all-or-nothing: no least privilege, and revoking it cuts off every user at once — so in practice it never gets revoked.
Grants scope tool access per identity; one person’s grant is revoked without touching anyone else’s, leaving a clean, timestamped deprovisioning trail.
1,773 destructive, 1,649 execute and 154 financial tools in the catalogue are exactly what least-privilege scoping keeps away from read-only agents.
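
The provisioning and deprovisioning trail described for CC6.2 and CC6.3 can be sketched as a small in-memory registry. Everything here is an assumption made for illustration (the `Grant` and `GrantRegistry` names, the fields, the log tuple shape); a real gateway would persist this and authenticate the issuer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Grant:
    identity: str        # person or agent the grant belongs to
    tools: frozenset     # tool names this identity may call
    issued_by: str       # who approved the grant
    issued_at: datetime
    revoked_at: Optional[datetime] = None

    @property
    def active(self):
        return self.revoked_at is None


@dataclass
class GrantRegistry:
    grants: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # append-only evidence trail

    def issue(self, identity, tools, issued_by):
        now = datetime.now(timezone.utc)
        self.grants[identity] = Grant(identity, frozenset(tools), issued_by, now)
        self.log.append(("issued", identity, issued_by, now.isoformat()))

    def revoke(self, identity, revoked_by):
        now = datetime.now(timezone.utc)
        self.grants[identity].revoked_at = now
        self.log.append(("revoked", identity, revoked_by, now.isoformat()))

    def may_call(self, identity, tool):
        grant = self.grants.get(identity)
        return bool(grant and grant.active and tool in grant.tools)


registry = GrantRegistry()
registry.issue("alice", {"search_code", "get_file_contents"}, issued_by="sec-admin")
registry.issue("bob", {"search_code"}, issued_by="sec-admin")
registry.revoke("alice", revoked_by="sec-admin")  # bob's grant is untouched
```

Because each identity holds its own grant, revoking one leaver is a single timestamped log entry rather than a key rotation that affects everyone.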
CC6.6: Logical access security measures against threats from outside the system boundary.
Where is the trust boundary for AI traffic, and what enforces it? How are external clients authenticated and constrained at the perimeter?
Every client connecting directly to every server with its own copy of the key means there is no defined boundary — the secrets live on every laptop.
The gateway is the single enforced boundary. Upstream credentials never leave it; clients hold only their own revocable grant token.
4,628 catalogued servers consolidate behind one boundary instead of N direct client-to-server connections.
CC7.2: System components are monitored for anomalies indicative of malicious acts or errors, explicitly including unauthorised actions by authorised users and use of compromised credentials.
A read-scoped agent attempting a destructive call, a spike in call volume, off-pattern arguments — the anomaly surface of AI traffic. Can you see it?
Default MCP produces no call-level telemetry. Anomalies are invisible because nothing records the calls.
Every call is logged with its grant, tool, argument keys, the rule that decided, and the verdict. Denied-call records are the tool-call analogue of the failed-login logs auditors sample; rate limits and spend caps bound the blast radius.
Risk classification lets monitoring prioritise what matters: calls touching the 1,773 destructive and 154 financial tools in the corpus.
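
A per-call audit record with the fields listed above (grant, tool, argument keys, deciding rule, verdict) might be built as in this sketch. The field names, the grant identifier and the rule label are hypothetical, not a documented log schema.

```python
import json
from datetime import datetime, timezone


def audit_record(grant_id, tool, args, rule, verdict):
    """Build one per-call log entry.

    Only argument keys are recorded, so argument values stay out of the log.
    """
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "grant": grant_id,
        "tool": tool,
        "arg_keys": sorted(args),  # iterating a dict yields its keys
        "rule": rule,
        "verdict": verdict,        # "allow" or "deny"
    }


record = audit_record(
    grant_id="grant-alice-01",             # hypothetical grant identifier
    tool="merge_pull_request",
    args={"base": "main", "pull_number": 42},
    rule="merge_pull_request.deny_if[0]",  # hypothetical rule label
    verdict="deny",
)
print(json.dumps(record))
```

Logging keys rather than values is a trade-off: the record shows which parameters a call carried without copying potentially sensitive data into the audit store.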
CC7.3: Detected events are evaluated to determine whether they are security incidents that could impair objectives.
When a high-risk call is denied or a policy trips, can the organisation triage it as a potential security event?
Without verdict records there is nothing to evaluate — events pass unexamined.
Each allow/deny verdict carries the deciding rule, giving a reviewable event record that feeds incident evaluation (and response under CC7.4).
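
One way to turn verdict records into a triage queue is to flag denied calls that targeted high-risk tools. The sketch below assumes a hypothetical risk lookup and hypothetical log entries, and it is only a first-pass filter, not an incident-evaluation process in itself.

```python
HIGH_RISK = {"destructive", "execute", "financial"}


def needs_review(record, risk_of):
    """Flag denied calls to high-risk tools as candidate security events."""
    return record["verdict"] == "deny" and risk_of.get(record["tool"]) in HIGH_RISK


# Hypothetical risk lookup and log entries.
risk_of = {"search_code": "read", "delete_repository": "destructive"}
log = [
    {"grant": "g1", "tool": "search_code", "rule": "default", "verdict": "deny"},
    {"grant": "g2", "tool": "delete_repository", "rule": "default", "verdict": "deny"},
    {"grant": "g3", "tool": "search_code", "rule": "tools.search_code", "verdict": "allow"},
]

review_queue = [record for record in log if needs_review(record, risk_of)]
```

Each queued record still carries its deciding rule, so the reviewer can see why the call was denied before deciding whether it is an incident.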
CC8.1: Changes to infrastructure, data, software and procedures are authorised, designed, tested, approved and documented.
Agent actions that change production systems need an authorisation trail — and the policies and grant scopes governing them are themselves changes to control procedures.
On a shared key, agents make changes with no approval workflow and no record of who sanctioned the capability.
Deny-by-default policy routes privileged actions through explicit, documented allow rules — with argument conditions for fine-grained gates. Policy and scope changes are versioned and attributable.
Illustrative policies — not complete compliance controls on their own.
Allow reads and listings. Deny everything else by default — destructive and execute paths never open unless a rule explicitly opens them.

    {
      "version": "1",
      "default": "deny",
      "tools": {
        "list_repositories": {},
        "get_file_contents": {},
        "search_code": {}
      }
    }

Allow merges, except into main, which stays blocked so the change goes through your normal human approval path. The deny verdict lands in the audit log with the rule that decided.

    {
      "version": "1",
      "default": "deny",
      "tools": {
        "create_pull_request": {},
        "merge_pull_request": {
          "deny_if": [
            {
              "conditions": [
                { "path": "args.base", "op": "eq", "value": "main" }
              ]
            }
          ]
        }
      }
    }

See Writing policies for the policy format, operators, and quota shapes.
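
To make the deny-by-default behaviour concrete, here is a minimal evaluator for the merge policy shown above. It is an illustrative sketch, not the gateway's engine, and its semantics are assumptions: a tool listed under `tools` is allowed unless a `deny_if` block fires, every condition in a block must match for it to fire, and only the `eq` operator is handled. The rule labels it returns are made up for the example.

```python
import json

POLICY = json.loads("""
{
  "version": "1",
  "default": "deny",
  "tools": {
    "create_pull_request": {},
    "merge_pull_request": {
      "deny_if": [
        {"conditions": [{"path": "args.base", "op": "eq", "value": "main"}]}
      ]
    }
  }
}
""")


def lookup(path, call):
    """Resolve a dotted path such as 'args.base' against the call."""
    node = call
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def matches(condition, call):
    # Only the 'eq' operator from the example is handled in this sketch.
    if condition["op"] != "eq":
        raise ValueError(f"unsupported op: {condition['op']}")
    return lookup(condition["path"], call) == condition["value"]


def evaluate(policy, tool, args):
    """Return (verdict, deciding_rule) for one tool call."""
    rule = policy["tools"].get(tool)
    if rule is None:
        # Tool not listed: fall through to the policy default.
        return policy["default"], "default"
    call = {"args": args}
    for i, block in enumerate(rule.get("deny_if", [])):
        # Assumption: all conditions in a block must match for it to fire.
        if all(matches(c, call) for c in block["conditions"]):
            return "deny", f"{tool}.deny_if[{i}]"
    return "allow", f"tools.{tool}"
```

Under these assumptions, `evaluate(POLICY, "merge_pull_request", {"base": "main"})` yields a deny with the rule that decided, while the same call against another base branch is allowed and any unlisted tool falls through to the default.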
The standard requests for the access and monitoring criteria, and the artefact a gateway deployment hands over for each.
| What the auditor asks for | What the gateway exports |
|---|---|
| Full access listing — who can reach what (auditor samples ~25 identities) | Grant roster: every person and agent identity with its scoped tool set, per server. |
| Access-request and approval records (CC6.2) | Grant issuance log — identity, scope, issuer, timestamp. |
| Periodic access-review evidence (CC6.3) | Exportable grant review: active grants by person, scope and age. |
| Deprovisioning proof for sampled leavers (CC6.3) | Timestamped grant-revocation records, per identity. |
| Logs of successful and failed access attempts (CC7.2) | Per-call audit log: grant, tool, argument keys, deciding rule, allow/deny verdict. |
| Change approval trails for production-affecting actions (CC8.1) | Versioned policy history plus the per-call record of which rule authorised each privileged action. |
| Agent identity inventory (emerging 2026 ask) | The grant roster doubles as the non-human-identity inventory, classified by tool risk tier. |
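
The periodic access-review export in the table could be as simple as a CSV of active grants with their age. The sketch below assumes hypothetical grant tuples and column names; a real export would read from the grant store.

```python
import csv
import io
from datetime import date

# Hypothetical active grants: (identity, scope, issued_on)
GRANTS = [
    ("ci-agent", "github:write", date(2026, 1, 20)),
    ("alice", "github:read", date(2025, 11, 3)),
]


def review_export(grants, as_of):
    """Render active grants as CSV with age in days, oldest first."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["identity", "scope", "issued_on", "age_days"])
    for identity, scope, issued in sorted(grants, key=lambda g: g[2]):
        writer.writerow([identity, scope, issued.isoformat(), (as_of - issued).days])
    return buf.getvalue()


report = review_export(GRANTS, as_of=date(2026, 2, 1))
```

Sorting oldest first puts long-lived grants, the ones a reviewer is most likely to question, at the top of the sheet.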
No protocol can be SOC 2 compliant: a SOC 2 report attests to an organisation’s controls over a defined system, not a wire protocol. The right question is whether your MCP deployment has the access-control, monitoring and change-management controls the criteria test, which depends on your architecture, not on MCP itself.
There is no separate AI module as of 2026. Auditors evaluate agent tool calls under the existing Common Criteria — chiefly logical access (CC6.1–6.3), monitoring (CC7.2–7.3) and change management (CC8.1). The recurring new finding is attribution: privileged actions must trace to an accountable identity, which autonomous agents on shared keys break.
A shared API key is the textbook gap. Used by many agents, it provides no attribution: it fails the CC6.1/CC6.2 expectation that access is per-identity and authorised, and it leaves no call-level audit trail for CC7.2.
Replace shared keys with per-person scoped identities, mediate every tool call through deny-by-default policy, keep upstream credentials in central custody rather than client configs, and log every call with its verdict. Current auditor commentary favours sanctioned, proxied, logged deployments.
No tool confers compliance — your auditor attests your controls across the whole system. What the gateway contributes is the control infrastructure and evidence for the MCP slice: scoped grants, deny-by-default enforcement, and the per-call audit trail your auditor samples. Your other systems, and your upstream vendors’ own reports, remain separate questions.
| Default setup | Through the gateway |
|---|---|
| One shared upstream API key on every laptop | Per-person scoped grant tokens, revocable individually |
| No record of what agents called | Per-call audit log: grant, tool, argument keys, rule, verdict |
| Every tool on a server is callable | Deny-by-default — each tool and argument explicitly granted |
| Access rules scattered across client configs | One central, version-controlled policy |
PolicyLayer doesn’t certify your organisation — it gives your compliance team enforceable controls and exportable evidence for the MCP slice of the audit.
Last reviewed 04-06-2026 by the PolicyLayer research team. This guide maps how the framework intersects with MCP deployments — it is not legal advice.