JUDGES PANEL TOOLS

78 tools from the Judges Panel MCP Server, categorised by risk level.

EXECUTE 63 tools
ai-code-review: Optimized for reviewing AI-generated code
benchmark_gate: Run the benchmark suite and check results against quality thresholds. Returns pass/fail with metric details...
boilerplate-express: Standard Express.js boilerplate patterns
ci-friendly: Optimized for CI pipelines with critical-only findings
Compliance: Focus on compliance, data security, sovereignty, and privacy judges.
Django: Tuned for Django apps - emphasizes template security, ORM misuse, CSRF, admin security.
error-handling-gaps: Async code without error handling (common AI omission)
evaluate_app_builder_flow: Run a 3-step app-builder workflow: tribunal review, plain-language risk translation, and prioritized remedi...
evaluate_batch: Evaluate multiple code files in a single call. Returns per-file verdicts with scores and findings, plus agg...
evaluate_code: Submit code to the full Judges Panel for evaluation. Handles ALL code types including application code, inf...
evaluate_code_single_judge: Submit code to a specific judge for targeted domain analysis. Handles ALL code types including application ...
evaluate_code_streaming: Submit code for streaming evaluation - returns per-judge results as each judge completes, with running aggr...
evaluate_diff: Evaluate only the changed lines in a code diff. Runs all ${JUDGES.length} judges on the full file but filte...
evaluate_focused: Run a focused evaluation using only the specified judges. Use this after an initial full evaluation to re-c...
evaluate_git_diff: Evaluate code changes from a git diff. Parses the unified diff from a git repository, identifies changed fi...
evaluate_policy_aware: Run policy-aware tribunal evaluation with named policy profiles (startup, regulated, healthcare, fintech, p...
evaluate_project: Submit multiple files for project-level analysis. All ${JUDGES.length} judges evaluate each file, plus cros...
evaluate_public_repo_report: Clone a public repository URL, run the full judges panel across source files, and generate a consolidated m...
evaluate_then_fix: Evaluate code and automatically generate fix patches for all findings that have auto-fix support. Returns t...
evaluate_with_progress: Evaluate code with progressive judge-by-judge reporting. Returns intermediate counts as each judge complete...
evaluate-code: Evaluate a code snippet or file for issues across all judge categories
evaluate-diff: Evaluate a code diff (PR or commit) for introduced issues
example-domains: Example domains/placeholder URLs from training data
excessive-inline-comments: Line-by-line explanatory comments (AI teaching style)
execute_sql: Execute any SQL query on the database
explain_finding: Explain a Judges Panel finding in plain language. Provides OWASP/CWE references, risk context, and remediat...
explain-finding: Provide detailed explanation of a specific finding
Express: Tuned for Express.js APIs - emphasizes middleware security, authentication, CORS, and rate limiting.
FastAPI: Tuned for Python FastAPI - focuses on input validation, async patterns, and API security.
Fintech: For financial services - PCI DSS compliance, cryptography, authentication,
fix_code: Evaluate code with the Judges Panel and automatically apply all available auto-fix patches. Returns the fix...
generic-naming: Generic variable names (data, result, response, temp, item, value)
Government: For government and public sector - FedRAMP/NIST compliance, data sovereignty,
Healthtech: For healthcare applications - HIPAA compliance, data sovereignty, encryption at rest,
Kubernetes: Tuned for Kubernetes manifests - security contexts, RBAC, resource limits, network policies.
Lenient: Only critical and high severity findings. Good for early development.
minimal: Minimal configuration with only critical findings
missing-tests: Complex implementation file without corresponding test references
Next.js: Tuned for Next.js - covers both server and client security, API routes, SSR/ISR patterns.
onboarding: Gentle review for new team members
Onboarding: Smart defaults for first-time adoption - suppresses noisy absence-based rules,
performance: Focus on performance issues
Performance: Focus on performance, caching, scalability, and concurrency judges.
placeholder-credentials: Placeholder API keys/tokens from AI training data
pr-review: Balanced review for pull requests
Rails: Tuned for Ruby on Rails - emphasizes mass assignment, CSRF, SQL injection, strong params.
re_evaluate_with_context: Re-evaluate code with developer-provided context from a multi-turn conversation. Accepts disputed findings,...
React: Tuned for React/Next.js apps - enables accessibility, XSS protection, disables backend-only judges.
record_feedback: Record user feedback on a finding - mark it as a true positive (tp), false positive (fp), or won
review-project: Full project-level review with cross-file analysis
run_benchmark: Run the full benchmark suite and return a detailed dashboard with per-judge, per-category, and per-difficul...
run_command: Execute shell command on the server
SaaS: For multi-tenant SaaS platforms - tenant isolation, rate limiting, scalability,
scaffold_plugin: Generate a starter plugin template for the Judges Panel. Creates a self-contained plugin file with custom r...
security-audit: Deep security review with all severity levels
security-focused: Focus on security vulnerabilities and best practices
spawn_agent: Spawn a new agent to handle a subtask
strict: Strict mode with all judges enabled and low severity threshold
Strict: All judges, all severities. No findings tolerated. Best for production code reviews.
suggest-fix: Generate fix suggestions for detected findings
Terraform: Tuned for Terraform/OpenTofu IaC - focuses on infrastructure security, cloud-readiness, compliance.
todo-placeholder: TODO/FIXME placeholders common in AI-generated code
uniform-comments: Uniform JSDoc/docstring style on every function

The managed route: connect Judges Panel through the PolicyLayer gateway — every tool call above is checked against your policy before it runs, with a full audit log.

DIRECT INSTALL (UNMANAGED)

npx -y @kevinrabun/judges

How many tools does the Judges Panel MCP server have?

The Judges Panel MCP server exposes 78 tools across 4 categories: Read, Write, Destructive, Execute.

How do I enforce policies on Judges Panel tools?

Route the Judges Panel server through the PolicyLayer gateway. Define allow, deny, or approval rules per tool in the dashboard — they are enforced on every call before it reaches the server.
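
As a rough illustration of what per-tool allow, deny, and approval rules mean in practice, the sketch below checks a tool call against a rule table before forwarding it and records each decision. It is not PolicyLayer's API or configuration format: the names `Decision`, `RULES`, `check_call`, and `gate` are hypothetical, the rule values are arbitrary examples, and denying unlisted tools by default is an assumption.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVAL = "approval"


# Hypothetical per-tool rules. The tool names appear in the Execute list;
# the decisions are illustrative choices, not recommended settings.
RULES = {
    "evaluate_code": Decision.ALLOW,
    "execute_sql": Decision.APPROVAL,
    "run_command": Decision.DENY,
}


def check_call(tool_name, rules=RULES, default=Decision.DENY):
    """Return the decision for a tool call; unlisted tools get the default."""
    return rules.get(tool_name, default)


def gate(tool_name, forward, audit_log):
    """Check the policy, record the decision, and forward only allowed calls."""
    decision = check_call(tool_name)
    audit_log.append((tool_name, decision.value))
    if decision is Decision.ALLOW:
        return forward(tool_name)
    # Denied calls and calls awaiting approval never reach the server here.
    return None
```

In this sketch a call needing approval is simply held back; a real gateway would presumably queue it for a human decision.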

What risk categories do Judges Panel tools fall into?

Judges Panel tools are categorised as Read (10), Write (4), Destructive (1), Execute (63). Each category has a recommended default policy.
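
The category counts can be checked against the stated total, and a category can serve as a fallback when a tool has no rule of its own. The sketch below does both. The counts come from the text above; the default policy per category is an assumption for illustration only, since the page does not say what the recommended defaults are.

```python
# Tool counts per risk category, as stated on this page.
CATEGORY_COUNTS = {"Read": 10, "Write": 4, "Destructive": 1, "Execute": 63}

# ASSUMED defaults for illustration; the actual recommended defaults
# are not given in the source.
ASSUMED_DEFAULTS = {
    "Read": "allow",
    "Write": "approval",
    "Destructive": "deny",
    "Execute": "approval",
}


def total_tools(counts=CATEGORY_COUNTS):
    """Sum the per-category counts."""
    return sum(counts.values())


def default_policy(category, defaults=ASSUMED_DEFAULTS):
    """Look up the fallback policy for a category; unknown categories are denied."""
    return defaults.get(category, "deny")
```

Summing the four counts gives 10 + 4 + 1 + 63 = 78, which matches the total quoted for the server.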

Let agents act without letting them run wild.

Route your MCP servers through PolicyLayer and every tool call is checked against your policy before it runs — allow, deny, or require approval. Per-identity grants. Full audit log. Live in minutes.

Free to start. No card required.

4,600+ MCP servers and 31,000+ tools scanned and risk-classified.
