tool_risk_score

Compute 0-100 risk score for any tool + input combination. 0=minimal risk (read-only), 100=critical (payment/irreversible). Detects secrets, injection attempts, high-value amounts. Use before deciding whether to proceed with a tool call.

Part of the Agentguard server.

tool_risk_score can permanently delete data in Agentguard, with no limits today. PolicyLayer puts allow, deny, and rate-limit rules on every call. Live in minutes.

Free to start. No card required.

Without a policy, an AI agent could call tool_risk_score in a loop, permanently removing or destroying resources in Agentguard, and destructive operations cannot be undone. PolicyLayer blocks destructive tools such as this one by default and only enables them when a human explicitly approves the action.

Destructive tools permanently remove data. Block by default. Only enable with explicit approval workflows.

policy.json
{
  "version": "1",
  "default": "deny",
  "hide": [
    "tool_risk_score"
  ]
}
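
To make the effect of this policy concrete, here is a minimal Python sketch of how such a file might be evaluated. It assumes that `hide` removes a tool from the list an agent sees and that `default` applies to every tool without its own rule; both function names are hypothetical and this is not PolicyLayer's actual implementation or API.

```python
import json

# The policy shown in policy.json above, loaded as a dict.
POLICY = json.loads("""
{
  "version": "1",
  "default": "deny",
  "hide": ["tool_risk_score"]
}
""")


def visible_tools(policy, tools):
    """Return the tools an agent would see, assuming `hide` filters the listing."""
    hidden = set(policy.get("hide", []))
    return [t for t in tools if t not in hidden]


def is_allowed(policy, tool):
    """Decide a call: hidden tools are always denied, everything else gets the default."""
    if tool in policy.get("hide", []):
        return False
    return policy.get("default", "deny") == "allow"
```

Under these assumptions, `visible_tools(POLICY, ["tool_risk_score", "other_tool"])` leaves only `other_tool` in the listing, and `is_allowed` denies both tools because the default is `deny`.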

See the full Agentguard policy for all 24 tools.

Get this rule live on your own Agentguard server in minutes. PolicyLayer enforces it on every call, before it runs.

These attack patterns abuse exactly the kind of access tool_risk_score gives an agent. Each links to the full case and the policy that stops it:

Every attack above starts with a tool call. PolicyLayer checks each one against your policy first, so tool_risk_score only ever does what you allow.

Other destructive tools across the catalogue. The same approach applies to each: deny by default, or require human approval.

What does the tool_risk_score tool do?

It computes a 0-100 risk score for any tool and input combination: 0 means minimal risk (read-only) and 100 means critical (payment or irreversible). It detects secrets, injection attempts, and high-value amounts, and is meant to be used before deciding whether to proceed with a tool call. It is categorised as a Destructive tool in the Agentguard MCP Server, which means it can permanently delete or destroy data. Block by default and require explicit approval.

How do I enforce a policy on tool_risk_score?

Register the Agentguard MCP server in PolicyLayer and add a rule for tool_risk_score: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Agentguard. Nothing to install.

What risk level is tool_risk_score?

tool_risk_score is a Destructive tool with critical risk. Critical-risk tools should be blocked by default and only enabled with explicit human approval.

Can I rate-limit tool_risk_score?

Yes. Add a rate_limit block to the tool_risk_score rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
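
The behaviour described here can be sketched in Python as a per-session limiter. This is only an illustration: it assumes `window` is measured in seconds and uses a sliding window, which may differ from how PolicyLayer actually counts and resets calls. The class name `SessionRateLimiter` and its `clock` parameter are hypothetical.

```python
import time
from collections import defaultdict, deque


class SessionRateLimiter:
    """Sliding-window limiter: at most `max_calls` per `window` seconds per session."""

    def __init__(self, max_calls=10, window=60, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.calls = defaultdict(deque)  # session id -> timestamps of accepted calls

    def allow(self, session_id):
        now = self.clock()
        stamps = self.calls[session_id]
        # Drop timestamps that have aged out of the window (the automatic reset).
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.max_calls:
            return False
        stamps.append(now)
        return True
```

With `max_calls=10` and `window=60`, the eleventh call from one session inside a minute is rejected, while a different session still has its full allowance.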

How do I block tool_risk_score completely?

Set action: deny in the PolicyLayer policy for tool_risk_score. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
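
As a rough Python sketch of what a deny rule with a reason could look like from the agent's side: only the `action` and `reason` fields come from the answer above, while the `PolicyViolation` class, the `RULES` layout, and the `enforce` function are hypothetical names chosen for illustration.

```python
class PolicyViolation(Exception):
    """Raised when a rule denies a tool call (hypothetical error type)."""


# Hypothetical rule layout; only `action` and `reason` are taken from the text.
RULES = {
    "tool_risk_score": {
        "action": "deny",
        "reason": "Blocked pending human approval",
    },
}


def enforce(rules, tool):
    """Raise PolicyViolation if the tool's rule is `deny`; otherwise return None."""
    rule = rules.get(tool, {})
    if rule.get("action") == "deny":
        reason = rule.get("reason", "no reason given")
        raise PolicyViolation(f"{tool} denied by policy: {reason}")
```

Calling `enforce(RULES, "tool_risk_score")` raises an error that carries the configured reason, so the agent or its operator can see why the call was refused.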

What MCP server provides tool_risk_score?

tool_risk_score is provided by the Agentguard MCP server (https://feedoracle.io/guard-oracle/mcp/). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.

Enforce policy on every Agentguard tool call.

Deterministic rules across all 24 Agentguard tools. Per-identity grants. Full audit log. Live in minutes. Nothing to install.

4,600+ MCP servers and 31,000+ tools scanned and risk-classified.
