Can an AI agent delete data through the Nodebench MCP server?

Yes. The Nodebench server exposes 10 destructive tools, including abandon_cycle, archiveDocument, and cleanup_stale_runs. These delete, revoke, or abandon resources, in most cases with no undo (archiveDocument is a soft-delete, and the catalogue also lists a restoreDocument tool). PolicyLayer blocks destructive tools by default, so calls to them never reach the upstream server.
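A minimal sketch of what "blocked by default" means for the ten destructive tools in the catalogue. This is an illustration only, not PolicyLayer's rule format or API: the function name is_call_allowed and the allowed_destructive parameter are hypothetical.

```python
# Hypothetical sketch: default-deny for destructive tools.
# The tool names are the ten destructive tools listed in the Nodebench catalogue;
# the function and its parameters are illustrative, not a PolicyLayer API.
DESTRUCTIVE_TOOLS = frozenset({
    "abandon_cycle",
    "archiveDocument",
    "cleanup_stale_runs",
    "deeptrace_revoke_passport",
    "delete_learning",
    "delete_sandbox_policy",
    "deleteAgentMemory",
    "removeDocumentFromFolder",
    "share_revoke_packet_link",
    "unload_toolset",
})


def is_call_allowed(tool_name: str,
                    allowed_destructive: frozenset = frozenset()) -> bool:
    """Deny destructive tools unless explicitly allowed; pass everything else."""
    if tool_name in DESTRUCTIVE_TOOLS:
        return tool_name in allowed_destructive
    return True
```

With no overrides, is_call_allowed("delete_learning") is False, while a non-destructive tool such as createDocument passes this particular check.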

How do I prevent bulk modifications through Nodebench?

The Nodebench server has 192 write tools, including accept_shared_task, ack_shared_context, and add_forecast_evidence. Set a rate limit in your policy: for example, a limit of 10 calls per hour caps an agent at 10 modifications in any one-hour window, however many it attempts. PolicyLayer enforces this at the gateway, before calls reach Nodebench.
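One way to picture a "10 calls per hour" limit is a sliding-window counter. The sketch below is an assumption about how such a limit could behave, not PolicyLayer's implementation; the class name SlidingWindowLimiter and its parameters are hypothetical, and the clock is injectable so the behaviour can be checked without waiting an hour.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Hypothetical sliding-window limiter: at most max_calls per window."""

    def __init__(self, max_calls: int = 10, window_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._timestamps: Deque[float] = deque()

    def allow(self) -> bool:
        """Return True and record the call if it fits in the current window."""
        now = self._clock()
        # Drop calls that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_calls:
            return False  # limit reached: the call would be rejected
        self._timestamps.append(now)
        return True
```

A gateway would call allow() once per write-tool call and reject the call when it returns False. A fixed-window counter would also satisfy "10 per hour"; the sliding window is chosen here only because it avoids bursts at window boundaries.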

How many tools does the Nodebench MCP server expose?

824 tools across 4 categories: Destructive, Execute, Read, Write. 512 are read-only. 312 can modify, create, or delete data.
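The totals can be cross-checked against the per-category counts shown in the catalogue headings (Destructive 10, Execute 110, Write 192). A short arithmetic sketch, with variable names chosen only for illustration:

```python
# Per-category counts as stated in the catalogue headings.
TOTAL_TOOLS = 824
MUTATING_COUNTS = {"Destructive": 10, "Execute": 110, "Write": 192}

# Tools that can modify, create, or delete data.
mutating_total = sum(MUTATING_COUNTS.values())  # 10 + 110 + 192 = 312

# Everything else is read-only.
read_only_total = TOTAL_TOOLS - mutating_total  # 824 - 312 = 512
```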

How do I enforce a policy on Nodebench?

Register the Nodebench MCP server in PolicyLayer, apply the suggested rules above (adjusting the limits to your use case), and point your AI client at the PolicyLayer proxy URL instead of at the server directly. Your agents keep the same tools; PolicyLayer evaluates every call against policy before it executes. There is nothing to install, and you can be live in minutes.
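The ordering described here, policy first and upstream second, can be sketched as a small wrapper. Everything in this example is hypothetical (make_gateway, the call dictionary shape, and the stand-in policy and upstream functions); it only illustrates that a denied call is never forwarded.

```python
from typing import Any, Callable, Dict

ToolCall = Dict[str, Any]  # hypothetical shape: {"name": ..., "arguments": ...}


def make_gateway(policy: Callable[[ToolCall], bool],
                 upstream: Callable[[ToolCall], Any]
                 ) -> Callable[[ToolCall], Dict[str, Any]]:
    """Wrap an upstream handler so every call is checked against policy first."""

    def handle(call: ToolCall) -> Dict[str, Any]:
        if not policy(call):
            # Denied calls stop here; upstream is never invoked.
            return {"allowed": False, "result": None}
        return {"allowed": True, "result": upstream(call)}

    return handle


# Usage with stand-in functions: block one destructive tool, forward the rest.
forwarded = []


def fake_upstream(call: ToolCall) -> str:
    forwarded.append(call["name"])
    return "ok"


gateway = make_gateway(lambda call: call["name"] != "delete_learning", fake_upstream)
denied = gateway({"name": "delete_learning", "arguments": {}})
passed = gateway({"name": "createDocument", "arguments": {}})
```

After these two calls, only createDocument has reached the stand-in upstream, which mirrors the claim that agents keep the same tools while each call is evaluated before it executes.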

Nodebench MCP Policy · 824 Tools

FULL CATALOGUE

All 824 Nodebench tools

DESTRUCTIVE 10 tools

abandon_cycle: Abandon an active verification cycle that will not be completed. Use this to clean up orphaned or stale cycles
archiveDocument: Archive (soft-delete) a document and all its children recursively.
cleanup_stale_runs: Clean up orphaned eval runs stuck in
deeptrace_revoke_passport: Revoke an agent
delete_learning: Delete a learning by key. Use when a learning is outdated or incorrect.
delete_sandbox_policy: Delete a sandbox policy. Fails if any active sessions are using it.
deleteAgentMemory: Delete stored memory by key
removeDocumentFromFolder: Remove a document from a folder.
share_revoke_packet_link: Revoke a local share link so it no longer counts as active.
unload_toolset: Remove a dynamically loaded toolset from the current session to free up context. Cannot unload toolsets from t

EXECUTE 110 tools

benchmark_models: Run the same prompt against multiple LLM providers and compare responses. Returns side-by-side results with la
bootstrap_parallel_agents: Detect whether a target project repo has parallel agent infrastructure and, if not, scaffold everything needed
build_banking_packet: Build a banker-readiness packet from the canonical company packet.
build_company_profile_starter: Build a starter PitchBook/Crunchbase-like company profile.
build_founder_operating_model: Build the complete founder operating model: execution order, queue topology, packet routing, source trust poli
build_research_digest: Generate a digest of new (unseen) articles from RSS feeds. Compares against previously seen articles via SQLit
build_shared_context_subscription: Build the exact pull/subscription manifest an agent client should use to watch a packet or packet scope.
build_shared_context_subscription_manifest: Build a filtered snapshot/events/pull manifest for one peer, packet class, producer, scope, or subject so clie
build_submission_export: Build a generic submission export from the canonical company packet.
build_temporal_graph: Build a temporal relationship graph for an entity.
burst_capture: Capture N sequential screenshots at fixed intervals using Playwright.
call_driver_tool: Invoke a tool on a connected MCP driver. This proxies the call to the external MCP server (e.g. playwright-mcp
call_llm: Call an LLM model directly and get the response with metrics (tokens, latency). Uses available API keys: Gemin
call_openclaw_skill: Run an OpenClaw skill safely through security checks.
call_webmcp_tool: Invoke a WebMCP tool on a connected origin. The tool is executed in the browser page context via page.evaluate
capture_surface_stats: Capture Android SurfaceFlinger stats and logcat for jank analysis (Layer 0 only). Returns janky frame counts,
capture_ui_screenshot: Capture a screenshot of a URL using headless Playwright. Returns the screenshot as an inline image that multim
compile_decision_packet: Compile entity intelligence into a decision-ready packet.
compile_scenarios: Generate 3-7 future scenario branches for an entity or decision.
compile_tension_model: Model explicit tensions between forces for a decision or entity.
compute_ssim_analysis: Compute block-based SSIM analysis on a set of frame images. Uses 8x8 blocks with parallel ProcessPoolExecutor.
connect_mcp_driver: Connect to an external MCP server and make its tools available through nodebench-mcp. Predefined drivers:
connect_openclaw: Connect to an OpenClaw agent with a security policy applied.
connect_webmcp_origin: Connect to a WebMCP-enabled website via Playwright. Navigates to the URL, intercepts navigator.modelContext to
convex_pre_deploy_gate: Run a comprehensive pre-deployment quality gate. Checks: convex/ directory structure, schema.ts validity, depr
convex_quality_gate: Run a configurable quality gate across all stored audit results. Like SonarQube
delta_self_dogfood: Dogfood NodeBench Delta on itself. Verifies runtime health, setup friction, distribution surfaces, and compoun
disconnect_driver: Disconnect from an external MCP driver and shut down its child process. Use this to clean up or to reconnect w
disconnect_openclaw: Disconnect an OpenClaw session and generate a safety summary.
disconnect_webmcp_origin: Disconnect from a WebMCP origin and close the browser page. Use this to clean up resources or to reconnect wit
dive_auto_discover: Scan the current page DOM and auto-register components in the dive tree. Discovers semantic landmarks (nav, he
dive_fix_verify: After fixing a bug, verify the fix by re-navigating to the affected route, comparing before/after state, and u
dive_interaction_test: Define and track a structured interaction test for a component. Provide preconditions and a sequence of test s
dive_reexplore: Re-traverse a route after code changes to detect regressions and verify fixes. Compares the current state agai
end_openclaw_session: End an OpenClaw session and generate a safety summary.
enforce_merge_gate: Pre-merge validation combining git state, verification cycles, eval runs, test results, and quality gates. Ret
extract_video_frames: Record screen and extract key frames from an Android device (Layers 1+2). Uses adb screenrecord then ffmpeg sc
founder_local_weekly_reset: One-call convenience: gathers all local context and produces a complete
get_path_replay: Replay a session
gtm_script_builder: Build a starter GTM script for the current founder wedge.
invoke_openclaw_skill: Run an OpenClaw tool safely through security checks.
invoke_view_tool: Invoke a per-view tool on the current or specified view.
judge_request_retry: Request a retry, re-plan, escalation, or stop for a failed subtask.
judge_tool_output: Run the 7-criterion LLM judge on a tool
load_toolset: Dynamically load a toolset into the current session. After loading, the tools become immediately available for
log_interaction: Log and optionally auto-execute an interaction step. If the built-in Playwright browser is active (launched by
manipulate_screenshot: Manipulate a screenshot using sharp (image processing). Supports crop (extract a region), resize, and annotate
navigate_to_view: Navigate to a specific view in the NodeBench AI frontend.
nb_start_agent: Start a new agent conversation with an optional initial message.
nb_switch_research_tab: Switch the research hub tab.
nodebench_ask_agent: Send a question to the NodeBench AI agent. Returns a structured response with reasoning and sources.
nodebench_navigate: Navigate to a specific view. Returns the target view
nodebench.claims.verify: Re-run deterministic public/private boundary and source-evidence verification for a stored public claim.
nodebench.research_role: Run or reuse public role research, store public hiring or market claims, and return a compact role dossier.
nodebench.research_run: Run adaptive, evidence-backed research across one or more subjects. Automatically resolves entities, infers sc
open_dive_dashboard: Open the NodeBench UI Dive dashboard in a browser. Shows the full flywheel cycle:
open_local_dashboard: Start the local Daily Brief dashboard server if needed, and return the URL. The dashboard shows Brief metrics,
project_financials: Build 5-year financial projections based on historical data and industry assumptions. Projects: - Revenue wit
readiness_scan: Run a founder readiness scan against the progression and diligence model.
render_flow_visualization: Render flow visualization with colored bounding boxes for each flow group. Supports overlay on a rendered page
request_execution_approval: Request a human approval gate for a risky execution-trace action. Approval state is written onto the live run
run_autonomous_loop: Execute autonomous verification loop with stop conditions. Implements Ralph Wiggum pattern with checkpoints, i
run_benchmark_batch: Run a longitudinal benchmark batch. N=1 is a smoke test (1 founder, 1 session).
run_browserstack_benchmark_lane: Return a BrowserStack/browser-automation benchmark lane payload.
run_closed_loop: Track a compile-lint-test-debug closed loop iteration. Record the result of each step. Never present changes w
run_code_analysis: Static analysis on code or text content for security issues, secrets, homograph attacks, ANSI injections, susp
run_competitor_signal_benchmark: Return a competitor-signal-to-response benchmark lane payload.
run_deep_sim: Run a multi-agent scenario simulation with bounded branching and budget controls. Instantiates agents with per
run_dogfood_batch_with_judge: Execute the priority 3 dogfood scenarios with automatic LLM judge validation.
run_entity_intelligence_mission: Run a full DeepTrace entity intelligence mission with optional bounded research cell. Unifies relationship map
run_flicker_detection: Run full 4-layer Android UI flicker detection pipeline: SurfaceFlinger stats + logcat (L0), screenrecord (L1),
run_founder_autonomy_benchmark: Run the weekly founder reset autonomy benchmark lane.
run_graphify: Generate a knowledge graph from a folder of code, docs, papers, or images.
run_judge_loop: Execute a full judge-fix-verify loop: calls a tool, judges the output, and if it fails,
run_mandatory_flywheel: Enforce the mandatory 6-step AI Flywheel verification after any non-trivial change. All 6 steps must pass befo
run_oracle_comparison: Compare actual output against a known-good oracle reference. Based on Anthropic
run_packet_to_implementation_benchmark: Return a packet-to-implementation benchmark lane payload.
run_quality_gate: Evaluate content or code against a set of boolean rules. Returns pass/fail with specific failures listed. The
run_recon: Start a reconnaissance research session. Use this at the start of Phase 1 (Context Gathering) to organize rese
run_research_cell: Run a bounded re-analysis cell for a DeepTrace entity investigation. Queries existing DeepTrace state through
run_self_directed_delivery_loop: Run a local-first autonomous delivery loop across exploratory research, planning, implementation commands, dog
run_self_heal: Autonomous self-healing for detected drift issues. Fixes orphaned verification cycles
run_self_maintenance: Run autonomous self-maintenance cycle. Checks TypeScript compilation, documentation sync, tool counts, test co
run_signal_sweep: Run a live signal sweep across all data sources (HackerNews, GitHub Trending, Yahoo Finance, ProductHunt). Ret
run_sync_bridge_flush: Open the outbound websocket bridge, pair or resume the local device, and flush pending approved operations to
run_tests_cli: Execute a shell test command with timeout, capture stdout/stderr, and return structured results. Useful for ru
run_visual_qa_suite: End-to-end visual QA pipeline: burst capture → SSIM stability analysis →
runNewsroomPipeline: Trigger the full DRANE newsroom pipeline (Scout > Historian > Analyst > Publisher) for a given topic or entity
runSpotFixScan: Scan for common operational issues: stale missions, blocked tasks with met deps, old sniff checks.
sandbox_batch: Execute multiple commands, index all outputs, and run multiple search queries - all in ONE call. This is the h
sandbox_execute: Run a shell command, automatically index the output into the sandbox, and return only a summary. The raw stdou
scrapling_crawl: Start a multi-page spider crawl with extraction. Crawls from start URLs, follows links matching a CSS selector
scrapling_crawl_stop: Stop a running crawl session. Pass the session_id from scrapling_crawl. Items collected so far are preserved.
self_implement: Self-implement missing agent infrastructure. Generates implementation plan and code templates for: agent_loop,
simulate_decision_paths: Run Monte Carlo simulation for founder decisions. Generates multiple random paths to visualize possible future
smart_select_tools: LLM-powered tool selection: sends your task description + a compact tool catalog to a fast model (Gemini 3 Fla
spawn_openclaw_agent: Start a secure OpenClaw session with safety rules applied.
start_autonomy_benchmark: Start an autonomous capability benchmark. Defines a complex build challenge and tracks agent progress through
start_dogfood_session: Start a new dogfood session for one of the 3 canonical loops (weekly_reset, pre_delegation, company_search). R
start_eval_run: Start a new eval run. Define the test batch upfront with test cases (input, intent, expected behavior), then r
start_execution_run: Start a live Convex-backed execution trace run for a workflow. Creates a task session and trace together so la
start_ui_dive: Initialize a UI/UX Full Dive session. Auto-launches a headless Playwright browser if installed (zero setup). N
start_verification_cycle: Start a new 6-phase verification cycle for a non-trivial implementation. Returns the cycle ID and Phase 1 inst
switchTab: Switch between research hub tabs
thompson_pipeline: End-to-end Thompson Protocol pipeline orchestrator. Takes a complex topic and runs it through all 4 agents (Wr
transcribe_audio_file: Transcribe a local audio file (MP3/WAV/etc) to text using faster-whisper via Python. Deterministic, no network
trigger_batch_run: Run a scheduled task right now instead of waiting.
trigger_investigation: When an eval run shows regression, trigger a new verification cycle to investigate. This is how the outer loop
triple_verify: Run triple verification on agent implementation. V1: Internal codebase analysis. V2: External authoritative so
validate_agent_compatibility: Run the agent validation harness - simulates how AI agents (Claude Code,

WRITE 192 tools

accept_shared_task: Accept a proposed shared-context task.
ack_shared_context: Acknowledge that a peer received and accepted a context packet.
add_forecast_evidence: Add evidence to a forecast
add_rss_source: Register an RSS or Atom feed URL for monitoring. Stored in SQLite for persistent tracking. Validates the feed
addDocumentToFolder: Add a document to a folder.
archive_content: Save generated content to the archive for deduplication and theme tracking. Prevents the engine from regenerat
assign_agent_role: Assign a specialized role to the current agent session. Roles define focus area and behavioral instructions. P
attach_execution_evidence: Attach evidence to a live execution trace. Use for URLs, uploaded files, screenshots, render outputs, and trut
bind_local_account: Record explicit local pairing permission so this device can map durable local context to a specific web user a
bootstrap_project: Register or update your project
broadcast_agent_update: Broadcast a status update to all active agents. Unlike send_agent_message (point-to-point), this creates a mes
build_before_after_memo: Build a memo showing the before and after path plus the validation rationale.
build_causal_chain: Construct a causal chain from temporal observations. Nodes must be in chronological order. Each node represent
build_company_packet: Build the canonical company readiness packet.
build_slack_onepager: Build a Slack-friendly one-page founder report.
claim_agent_task: Claim a task lock so other parallel agents know you
claimTask: Claim a pending task for execution. Returns the task if successful, null if already claimed or dependencies no
compile_environment_spec: Generate a simulation environment specification from entity intelligence.
complete_autonomy_benchmark: Finalize an autonomy benchmark run. Computes final score, duration, tool usage stats, and comparison against r
complete_eval_run: Finalize an eval run and compute aggregate scores. Returns pass rate, average score, failure patterns, and imp
complete_execution_run: Finish a live execution run by updating session status, trace status, and optional usage metrics. Use this aft
complete_shared_task: Complete a shared-context task and attach the output packet if one was produced.
compress_or_expand_text: Precisely compress or expand academic text by a target word count. Compress mode: remove filler words, convert
compute_dimension_profile: Recompute and persist the DeepTrace dimension profile for an entity. Use after new company evidence, relations
configure_channel_preferences: Set your messaging preferences: which channels to use first,
configure_sandbox_policy: Create or update a security policy for OpenClaw agent sessions.
convex_audit_transaction_safety: Audit Convex functions for transaction safety: multiple ctx.runMutation calls in actions (separate transaction
convex_generate_plan: Generate a Convex-specific implementation plan for missing code signatures. Takes the gap analysis from convex
convex_generate_rules_md: Generate or update a Convex rules markdown file from the current gotcha database, recent audit results, and pr
convex_record_gotcha: Record a Convex-specific gotcha, edge case, or pattern for future reference. Stored persistently and searchabl
convex_schema_migration_plan: Compare two schema snapshots and generate a migration checklist. Shows added/removed tables, index changes, an
create_forecast: Create a new forecast with a question, resolution date, and criteria. Optionally set initial probability, base
create_proof_pack: Assemble an immutable proof pack for verification. Bundles a checklist (pass/fail items), optional metrics (to
create_task_bank: Create or add to a fixed task bank for controlled agent evaluation. Each task defines: initial state (repo sna
create_visual_pr: End-to-end PR creation: exports screenshots, generates a rich markdown PR body with visual evidence (before/af
createDocument: Create a new rich-text document with a title and optional initial content blocks.
createDocumentWithContent: Create a document with pre-built ProseMirror JSON content string.
createFolder: Create a new folder for organizing documents.
createMission: Create a new mission with title, type, success criteria, output contract, and budget.
createPlan: Create an explicit task plan as a markdown document with steps marked as pending/in_progress/completed
createSpreadsheet: Create a new spreadsheet with a name.
decide_re_update: Decide whether to update existing instructions or create new files. Implements
deeptrace_create_evidence_pack: Bundle multiple evidence chunks into a named evidence pack. Evidence packs are the unit of provenance - they c
deeptrace_create_passport: Create or update an agent passport - the scoped identity that defines what an agent can read, spend, sign, and
deeptrace_ingest_evidence: Ingest a piece of evidence (article, filing, data point) into the evidence store. Returns a content-addressed
deeptrace_log_receipt: Log a tamper-evident action receipt. Every agent action should produce a receipt recording what was done, what
delegate_founder_issue: Create a bounded shared task handoff for a founder issue packet so the weak angle becomes assigned work.
delta_handoff: Generate a delegation packet for handing off work to another agent or teammate. Produces a delta.handoff packe
delta_memo: Create a decision-ready memo artifact. Produces a delta.memo packet with recommendation, variables, scenarios,
delta_retain: Preserve context for future sessions. Produces a delta.retain packet storing important notes, decisions, meeti
dismiss_alert: Dismiss an important change alert so it no longer appears in proactive alerts. Sets the status to
dive_changelog: Record a change made to fix a bug, design issue, or improve a component. Links before/after screenshots to sho
dive_design_issue: Tag a design inconsistency found during the dive. Covers visual problems like color mismatches, spacing deviat
dive_generate_tests: Generate Playwright regression test code from dive findings. Creates test cases from: bugs (verify the fix hol
dive_link_backend: Link a UI component to its backend dependencies. Connect components to API endpoints, Convex queries/mutations
dive_record_test_step: Record the actual result of a test step after executing it via the MCP Bridge. Compare expected vs actual, att
dive_save_screenshot: Save a screenshot during a dive session. Pass base64 image data (from bridge
draft_email_reply: Structure an email thread for reply drafting. Parses the thread, extracts context (from, subject, date), and b
duplicateDocument: Duplicate a document. Clones content, icon, type. Resets visibility and favorite.
end_component_flow: Complete a component
end_dogfood_session: End a dogfood session with summary metrics: time-to-first-useful-output, delegation success, packet export sta
escalate_shared_task: Escalate a shared-context task when the assignee cannot complete it cleanly.
export_artifact_packet: Formats a Founder Artifact Packet or memo for export to a specific audience and format.
flag_important_change: Flag a detected important change with impact scoring, affected entities, and optional suggested action. Used b
founder_local_synthesize: Takes gathered local context and synthesizes a complete Founder Artifact Packet.
generate_academic_caption: Generate academic figure or table captions following top-venue conventions. Handles Title Case for noun phrase
generate_flicker_report: Generate visual flicker report from existing analysis data. Produces SSIM timeline chart (1200x400 PNG, PIL-ba
generate_grid_collage: Tile N screenshot images into a single grid collage PNG for visual inspection.
generate_implementation_plan: Generate a structured implementation plan for missing code signatures. Takes the gap analysis from verify_conc
generate_parallel_agents_md: Generate a portable, framework-agnostic AGENTS.md section for parallel agent coordination. Designed to be drop
generate_plan_delegation_packet: Convert a FeaturePlan into an agent-ready delegation packet
generate_pr_report: Generate a rich markdown PR body from a UI Dive session. Compiles visual changes (before/after screenshot comp
generate_proposal_memo: Render a FeaturePlan as a human-readable proposal memo.
generate_report: Compile structured findings, eval results, and quality gate data into a formatted markdown report. Useful for
generate_self_instructions: Generate self-instructions for the agent in various formats: skills_md (SKILL.md), rules_md (RULES.md), guidel
generate_team_install_plan: Generate a practical install and rollout plan for a founder, solo developer, or small team using NodeBench MCP
generate_voice_scaffold: Generate starter code for a voice bridge. Returns file contents, setup instructions, and dependency lists for
generate_zero_draft: Auto-draft an artifact (slack message, email, spec doc, PR draft, architecture note, career plan, or content b
graphify_import_to_subconscious: Import a graphify knowledge graph into NodeBench
ingest_dive_screenshots: Scan a directory for PNG/JPG screenshot files and bulk-import them into the dive session
ingest_temporal_observation: Ingest a raw observation into the temporal substrate (timeSeriesObservations). Supports numeric, categorical,
ingest_upload: Ingest uploaded file content into the NodeBench entity intelligence system.
install_nodebench_plugin: Generate or write a starter .mcp.json entry for NodeBench MCP so a local team member can install the preset qu
judge_session: Score a dogfood session on 6 dimensions (1-5 each): truth, compression, anticipation, output, delegation, trus
link_durable_objects: Create a durable relationship such as screen -> action, workflow -> run, run -> artifact, or outcome -> eviden
log_benchmark_milestone: Record completion of a benchmark milestone. Tracks which milestones the agent achieved, time taken, tools used
log_gap: Record a gap found during Phase 2 (Gap Analysis). Gaps are categorized by severity: CRITICAL (protocol violati
log_phase_findings: Record findings for the current phase of a verification cycle. Advances the cycle to the next phase if the cur
log_recon_finding: Record a finding from reconnaissance research. Link it to a recon session and categorize it. Use for both exte
log_test_result: Record a test result for Phase 4 (Testing & Validation). Tests are organized by layer: static, unit, integrati
manage_implementation_packets: Create and manage implementation packets - structured instructions for Claude Code or other coding agents.
manage_task_list: Manage the workspace task list. Add, update, complete, delete, or list tasks.
merge_compose_output: Judge-gated merge of subtask artifacts into a composed output.
merge_research_results: Merge parallel sub-agent research results into a unified dataset. Takes arrays of records from multiple source
nb_open_approval_queue: Open the approval queue for held actions.
nodebench_create_document: Create a new document in NodeBench. Supports markdown content with optional metadata tags.
nodebench.capture: Persist a messy event capture into the active NodeBench event workspace without live paid search. Uses event c
nodebench.claims.submit_public: Submit a sourced public claim. The verifier rejects private sources, raw email/resume/private artifact text, a
nodebench.link_private_signal_to_public_entity: Link an app-private signal to a public entity by hash only. Raw Gmail, resume, inbox, and private artifact tex
nodebench.notebook_append: Append reviewed text into a NodeBench report notebook through the same Convex-backed report notebook persisten
nodebench.report_export_complete: Complete a previously previewed NodeBench report export after review. Writes the export completion event to th
nodebench.submit_public_claim: Submit a sourced public claim to NodeBench. The verifier rejects private email/resume-derived text and non-pub
nodebench.watch_entity: Create a watch request for an entity. Current MVP returns the entity and recommended refresh cadence.
open_core_boundary_advisor: Advise what should stay open-core versus proprietary.
plan_decompose_mission: Decompose a mission into subtasks with verifiability routing.
polish_academic_text: Deep-polish academic text for top-venue quality (NeurIPS, ICLR, ICML, ACL). Handles English and Chinese papers
promote_to_eval: Take findings from a completed verification cycle and promote them into eval test cases. This is how the inner
propose_shared_task: Propose a task handoff between peers with input contexts and required output packet shape.
publish_founder_issue_packet: Turn the weakest founder-direction angle into a durable shared-context issue packet with lineage, proof links,
publish_shared_context: Publish a structured shared-context packet with subject, claims, evidence refs, freshness, permissions, and li
publish_to_queue: Push content to the LinkedIn content queue on the Convex platform. Content goes through the engagement gate an
queue_sync_operation: Queue an explicit outbound sync operation when a custom workflow needs to push metadata, receipts, or approval
rate_packet_usefulness: Rate a packet
record_dogfood_telemetry: Record a full telemetry row for a dogfood run. Captures surface, scenario, user role, prompt, tool usage, toke
record_eval_result: Record the actual result for a specific eval case. Include what happened, the verdict (pass/fail/partial), and
record_event: Record a typed event to the causal event ledger. Supports causal linking via causedByEventId and correlation g
record_execution_decision: Record a structured decision on a live execution trace without storing raw hidden reasoning. Use for rankings,
record_execution_step: Record a structured execution step receipt on a live execution trace. Use this for meaningful actions like fil
record_execution_verification: Record a verification result on a live execution trace. Use for render checks, formula checks, diff checks, ar
record_fix_attempt: Record a fix attempt with replay proof and regression protection description. Links to a failure case.
record_learning: Store an edge case, gotcha, pattern, or regression discovered during verification. Learnings are searchable vi
record_manual_correction: Track a human correction to agent output. Every correction is evidence of a system gap - the system should hav
record_openclaw_gotcha: Record a discovered OpenClaw pitfall or security finding.
record_path_step: Record a navigation/exploration step in the user
record_provenance_receipt: Persist a durable execution receipt for a tool call, approval, verification, or other meaningful action.
record_repeated_question: Track a question the user asked that NodeBench should have already known. This is the core failure signal - re
record_state_diff: Record a before/after state change on an entity. Tracks what changed, which fields, and why.
record_sync_artifact: Persist a local artifact with verification state so it can be replayed, reviewed, and optionally synced to the
record_sync_outcome: Persist an outcome with user value, stakeholder value, evidence, and status so the system always resolves work
recordCanaryRun: Record a new canary benchmark run with throughput and quality scores.
register_component: Register a UI component in the dive tree. Components form a hierarchy: page → section → form/modal/list → butt
register_shared_context_peer: Register a scoped peer with product, role, surface, capabilities, and heartbeat metadata for shared-context co
register_skill: Register a skill (rule/memory .md file) with its source documents, update triggers,
reject_shared_task: Reject a proposed shared-context task with a reason.
release_agent_task: Release a task lock after completing work. Updates status and optionally records a progress note for the next
remove_ai_signatures: Detect and remove AI-generated writing signatures from academic text. First runs pattern matching for known AI
report: Produce a human-readable artifact for either a research topic or a decision. If recommendation inputs are prov
resolve_forecast: Resolve a forecast with an outcome. Auto-computes Brier and log scores for binary forecasts. Ambiguous outcome
resolve_founder_issue: Invalidate a founder issue packet and optionally publish a resolution packet so the issue lifecycle stays expl
resolve_gap: Mark a gap as resolved after implementing the fix. Returns remaining gap counts by severity.
resolveRoutingRecommendation: Accept, reject, or expire a routing recommendation.
resolveSniffCheck: Resolve a pending sniff check with approved, rejected, or needs_revision.
restoreDocument: Restore an archived document and its children.
retention_register_connection: Register a retention.sh team connection in local MCP state so QA findings and token savings can flow into foun
retention_sync: Sync data between NodeBench Delta and retention.sh. Pushes delta packets as team context and pulls QA findings
retention_sync_findings: Sync retention.sh QA findings, scores, and token savings into local MCP state.
sandbox_ingest: Index arbitrary text into the context sandbox (FTS5). Raw content stays in SQLite - only a compact reference e
save_research_resource: Save a research resource with URL, source citation, tags, and notes.
save_session_note: Persist a critical finding, decision, or progress note to the filesystem. Notes survive context compaction - c
scaffold_directory: Scaffold directory structure following OpenClaw patterns. Creates organized subdirectories and placeholder fil
scaffold_nodebench_project: Create a complete project template pre-configured for nodebench-mcp. Generates: package.json, AGENTS.md, .mcp.
scaffold_openclaw_project: Generate a starter project for OpenClaw + NodeBench.
scaffold_openclaw_sandbox: Generate Docker/WSL2/Podman setup files for running OpenClaw in an isolated environment.
scaffold_research_pipeline: Generate a complete, standalone Node.js project for an automated research digest pipeline. Creates: package.js
send_agent_message: Send a message to another agent by session ID or role. Enables asynchronous inter-agent communication for task
send_email: Send an email via SMTP over TLS. Requires EMAIL_USER and EMAIL_PASS env vars. Defaults to Gmail SMTP (smtp.gma
send_openclaw_message: Send a message through any connected channel.
send_peer_message: Send a direct structured message to a peer without routing everything through a central orchestrator.
sendMessage: Send a message to the live chat run
set_watchdog_config: Configure the background watchdog that continuously monitors system health.
setup_operator_profile: Set up your profile to customize how the AI assistant works for you.
share_create_packet_link: Create a durable local share link record for a packet or founder memo so it can be rendered or synced later.
sniff_record_human_review: Record a human sniff-check for a subtask or merge output.
start_component_flow: Claim a component for traversal by a specific subagent. Marks it as
summarize: Turn raw context into a compact brief with key points and optional persistence. This is the fast human-readabl
sync_company: Push a company profile into NodeBench AI from Claude Code. Extracts company truth from a summary you provide (
sync_daily_brief: Sync daily brief + narrative data from Convex to local SQLite. Requires CONVEX_SITE_URL and MCP_SECRET environ
sync_operator_profile: Sync the Operator Profile to the local filesystem at ~/.nodebench/USER.md.
sync_report: Push a report artifact from Claude Code into NodeBench AI. The report is saved locally and published as a shar
sync_skill: Resync a stale skill after applying updates. Recomputes source hashes, updates
synthesize_extension_plan: Synthesize a plan for extending or deepening an existing feature.
synthesize_feature_plan: Synthesize a phased feature implementation plan conditioned on founder context,
synthesize_integration_proposal: Synthesize an integration plan for an external tool, API, or framework.
Write synthesize_recon_to_learnings Convert recon findings into persistent learnings. Recon findings are ephemeral research notes; learnings are t Write tag_ui_bug Tag a bug to a specific component (and optionally a specific interaction). Bugs are categorized by severity (c Write thompson_write Transform complex content into Thompson Protocol format — plain English mandate, intuition-before-mechanics, a Write track Add, check, remove, or list tracked entities with one workflow tool. The default path optimizes for watched en Write track_action Record any significant action with before/after state, reasoning, and temporal metadata. Auto-captures session Write track_intent Track a user intent that should survive context window compaction. On Write track_milestone Record a significant milestone (phase complete, deploy, ship, launch, pivot, decision) with optional evidence Write update_agents_md Read, append, or update sections in the AGENTS.md file. This file contains instructions for AI agents working Write update_company_truth Update a subconscious memory block with new information. Write update_forecast_probability Update a forecast Write updateDocument Update document fields: title, content, icon, visibility, favorite status. Write updatePlanStep Update the status or notes of a specific step in a task plan Write upsert_durable_object Register or update a durable local object so views, tools, workflows, runs, artifacts, and outcomes share one Write upsertEntity Create or update a canonical entity record (company, person, fund, etc.). Write watchlist_add_entity Add an entity to the local founder watchlist with alert preferences and optional strategic-angle linkage. Write watchlist_refresh_entities Refresh watchlist timestamps and optionally attach change summaries for watched entities. Write write_workspace_file Create or update a file in the agent workspace (~/.nodebench/workspace/). Write writeAgentMemory Store intermediate results or data for later retrieval. 
Use this to avoid context window overflow. Write zip_extract_file Extract a single file from a local ZIP archive to a local output directory (zip-slip safe). Deterministic, no

READ 512 tools

