grade_agent_run

Grade a single agent run on both outcome quality (task success, regressions, time) and process quality (recon, risk, tests, gates, learnings). Combines deterministic grading from the task bank

Server: Nodebench
Source: nodebench-mcp
Risk class: Low
Category: Read
Parameters: 0 (0 required)
Recommended: Allowed (see the rule below)
Registry record: Grade F, identity unverified

This record as markdown: /tools/io-github-homenshum-nodebench/grade-agent-run.md

WHAT IT DOES

What grade_agent_run does on Nodebench

AI agents call grade_agent_run to score a single agent run on Nodebench. It is classed as a read operation: it reads run data and returns a grade, and there is no clear evidence that it modifies anything. It typically serves as an evaluation or reporting step before the agent takes action elsewhere.

RISK

Why grade_agent_run is rated Low

Grading an agent run is primarily an evaluation operation: it reads run data and produces a score. It does not appear to create, modify, or delete data, execute code, or move money. It could write a grade record to storage, but without clear evidence of side effects, Read is the most appropriate category.

From the tool's definition: "Grade a single agent run on both outcome quality (task success, regressions, time) and process quality (recon, risk, tests, gates, learnings). Combines deterministic grading from the task bank"


RECOMMENDED RULE

The rule that runs grade_agent_run safely

PolicyLayer is an MCP gateway: it sits between your AI agents and Nodebench, and checks every tool call against a rule you set before the call runs. Nothing changes on the server itself. For grade_agent_run, this is the rule to start with:

grade_agent_run Allowed

grade_agent_run is read-only, so it stays allowed. Everything else on the server is denied unless you say otherwise.


policy.json

{
  "version": "1",
  "default": "deny",
  "tools": {
    "grade_agent_run": {}
  }
}
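To make the default-deny behaviour concrete, here is a minimal Python sketch of how a gateway might evaluate this policy. It is an illustration under stated assumptions, not PolicyLayer's implementation: the `decide` function, the reading of an empty rule (`{}`) as "allow", and the tool name `some_other_tool` are all hypothetical.

```python
import json

# The policy document from policy.json, verbatim.
POLICY = json.loads("""
{
  "version": "1",
  "default": "deny",
  "tools": {
    "grade_agent_run": {}
  }
}
""")


def decide(policy: dict, tool_name: str) -> str:
    """Hypothetical evaluator: return "allow" or "deny" for one tool call."""
    rule = policy.get("tools", {}).get(tool_name)
    if rule is None:
        # Tool is not listed, so the policy-wide default applies.
        return policy.get("default", "deny")
    # Assumption: a listed tool with an empty rule ({}) is allowed.
    return rule.get("action", "allow")


print(decide(POLICY, "grade_agent_run"))  # allow
print(decide(POLICY, "some_other_tool"))  # deny
```

The check runs before the call is forwarded, so an unlisted tool is rejected without the request ever reaching the server.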

ALLOW ONLY THIS TOOL → Instant setup, no code required.

The button opens the PolicyLayer dashboard: create your workspace, connect Nodebench, apply this rule, and every grade_agent_run call is checked against it from then on.

FAQ

Questions about grade_agent_run

What does the grade_agent_run tool do?

Grade a single agent run on both outcome quality (task success, regressions, time) and process quality (recon, risk, tests, gates, learnings). Combines deterministic grading from the task bank. It is categorised as a Read tool in the Nodebench MCP Server, which means it retrieves data without modifying state.

How do I enforce a policy on grade_agent_run?

Register the Nodebench MCP server in PolicyLayer and add a rule for grade_agent_run: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Nodebench. Nothing to install.

What risk level is grade_agent_run?

grade_agent_run is a Read tool with low risk. Read-only tools are generally safe to allow by default.

Can I rate-limit grade_agent_run?

Yes. Add a rate_limit block to the grade_agent_run rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
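The limit described above (max: 10, window: 60, tracked per session) can be sketched in Python as a fixed-window counter. This is only an illustration of the idea: the source does not say whether PolicyLayer uses a fixed or a sliding window, and the class name, method names, and session ids here are hypothetical.

```python
class FixedWindowLimiter:
    """Illustrative per-session limiter; not PolicyLayer's implementation."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        # session id -> (window start time, calls counted in that window)
        self._windows = {}

    def allow(self, session_id: str, now: float) -> bool:
        start, count = self._windows.get(session_id, (now, 0))
        if now - start >= self.window_seconds:
            # The window has elapsed, so the counter resets automatically.
            start, count = now, 0
        if count >= self.max_calls:
            return False
        self._windows[session_id] = (start, count + 1)
        return True


limiter = FixedWindowLimiter(max_calls=10, window_seconds=60)
decisions = [limiter.allow("session-a", now=0.0) for _ in range(11)]
print(decisions.count(True), decisions[-1])   # 10 False
print(limiter.allow("session-a", now=60.0))   # True (new window)
print(limiter.allow("session-b", now=30.0))   # True (separate session)
```

Passing `now` explicitly keeps the sketch deterministic; a real limiter would read a monotonic clock instead.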

How do I block grade_agent_run completely?

Set action: deny in the PolicyLayer policy for grade_agent_run. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
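As a rough Python sketch of what the agent experiences, the example below evaluates a rule carrying `action: deny` and a `reason`, and raises an error with that reason. The placement of `action` and `reason` inside the tool's rule, the `PolicyViolation` class, the `check_call` helper, and the reason text are assumptions made for illustration; the actual error format is not described in the source.

```python
class PolicyViolation(Exception):
    """Hypothetical error surfaced to the agent when a call is blocked."""


# Assumed rule shape: "action" and "reason" sit inside the tool's rule.
POLICY = {
    "version": "1",
    "default": "deny",
    "tools": {
        "grade_agent_run": {
            "action": "deny",
            "reason": "Grading is disabled in this workspace",  # example text
        }
    },
}


def check_call(policy: dict, tool_name: str) -> None:
    """Raise PolicyViolation if the policy blocks this tool; else return."""
    rule = policy.get("tools", {}).get(tool_name)
    if rule is None:
        # Unlisted tool: fall back to the policy-wide default.
        action, reason = policy.get("default", "deny"), "tool is not listed"
    else:
        # Assumption: a rule without an explicit action means "allow".
        action, reason = rule.get("action", "allow"), rule.get("reason", "")
    if action == "deny":
        raise PolicyViolation(f"{tool_name} blocked by policy: {reason}")


try:
    check_call(POLICY, "grade_agent_run")
except PolicyViolation as err:
    print(err)  # grade_agent_run blocked by policy: Grading is disabled in this workspace
```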

What MCP server provides grade_agent_run?

grade_agent_run is provided by the Nodebench MCP server (nodebench-mcp). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.
