run_evaluation_tests

// WHEN AI AGENTS USE THIS TOOL

AI agents invoke run_evaluation_tests to trigger a new CircleCI pipeline that runs evaluation tests through the CircleCI MCP Server. Execute operations can have side effects beyond the immediate call: triggering builds, sending notifications, or starting workflows. Rate limits and argument validation are essential to prevent runaway execution.

// WHY ENFORCE A POLICY ON RUN_EVALUATION_TESTS

run_evaluation_tests can trigger processes with real-world consequences. An uncontrolled agent might start dozens of builds, send mass notifications, or kick off expensive compute jobs. Intercept enforces rate limits and validates arguments to keep execution within safe bounds.

// RECOMMENDED POLICY

Execute tools trigger processes. Rate-limit and validate arguments to prevent unintended side effects.

circleci-public-mcp-server-circleci.yaml

tools:
  run_evaluation_tests:
    rules:
      - action: allow
        rate_limit:
          max: 10      # at most 10 calls...
          window: 60   # ...per 60-second window (10 calls per minute)
        validate:
          required_args: true

See the full CircleCI MCP Server policy for all 16 tools.

// DETAILS

Tool Name: run_evaluation_tests

Category: Execute

MCP Server: CircleCI MCP Server

Risk Level: High

// MORE CIRCLECI MCP SERVER TOOLS

Execute: rerun_workflow
Execute: run_pipeline
Execute: run_rollback_pipeline
Write: create_prompt_template
Read: analyze_diff
Read: config_helper
Read: download_usage_api_data
Read: find_flaky_tests

// WHAT CAN GO WRONG

Agents calling execute-class tools like run_evaluation_tests have been implicated in attack patterns catalogued in the MCP Attack Database, which gives the full case and prevention policy for each.

// OTHER EXECUTE TOOLS ACROSS MCP SERVERS

Other tools in the Execute risk category across the catalogue. The same policy patterns (rate-limit, validate) apply to each.

Atv: build_stake_tx
Atv: build_unstake_tx
Acdp: start_local
Pentagonal: pentagonal_audit
Pentagonal: pentagonal_compile
ACR — Agent Composition Records: orient_me

// IN CONTEXT OF HIGH RISK

run_evaluation_tests is one of the high-risk operations in CircleCI MCP Server. For a severity-focused view (only the high-risk tools, with their recommended policies), see the breakdown for this server, or browse all high-risk tools across every MCP server.

// FAQ

What does the run_evaluation_tests tool do?

This tool allows users to run evaluation tests on a CircleCI pipeline. They can be referred to as "Prompt Tests" or "Evaluation Tests". The tool generates an appropriate CircleCI configuration file, triggers a new pipeline using this temporary configuration, and returns the URL to monitor its progress. The tool will return the project slug.

Input options (EXACTLY ONE of these THREE options must be used):

Option 1 - Project Slug and branch (BOTH required):
- projectSlug: The project slug obtained from listFollowedProjects tool (e.g., "gh/organization/project")
- branch: The name of the branch (required when using projectSlug)

Option 2 - Direct URL (provide ONE of these):
- projectURL: The URL of the CircleCI project in any of these formats:
  * Project URL with branch: https://app.circleci.com/pipelines/gh/organization/project?branch=feature-branch
  * Pipeline URL: https://app.circleci.com/pipelines/gh/organization/project/123
  * Workflow URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def
  * Job URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def/jobs/xyz

Option 3 - Project Detection (ALL of these must be provided together):
- workspaceRoot: The absolute path to the workspace root
- gitRemoteURL: The URL of the git remote repository
- branch: The name of the current branch

Test Files:
- promptFiles: Array of prompt template file objects from the ./prompts directory, each containing:
  * fileName: The name of the prompt template file
  * fileContent: The contents of the prompt template file

Pipeline Selection:
- If the project has multiple pipeline definitions, the tool will return a list of available pipelines
- You must then make another call with the chosen pipeline name using the pipelineChoiceName parameter
- The pipelineChoiceName must exactly match one of the pipeline names returned by the tool
- If the project has only one pipeline definition, pipelineChoiceName is not needed

Additional Requirements:
- Never call this tool with incomplete parameters
- If using Option 1, make sure to extract the projectSlug exactly as provided by listFollowedProjects
- If using Option 2, the URLs MUST be provided by the user; do not attempt to construct or guess URLs
- If using Option 3, ALL THREE parameters (workspaceRoot, gitRemoteURL, branch) must be provided
- If none of the options can be fully satisfied, ask the user for the missing information before making the tool call

Returns:
- A URL to the newly triggered pipeline that can be used to monitor its progress.

It is categorised as an Execute tool in the CircleCI MCP Server, which means it can trigger actions or run processes. Use rate limits and argument validation.
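
To make the "exactly one of three options" rule more concrete, the three argument shapes might look roughly like the sketch below. It is written in YAML purely for readability. The top-level labels (option_1, option_2, option_3) are illustrative and not part of the tool's schema. The workspace path, git remote URL, and prompt file name are hypothetical placeholders. Any wrapper object the server may expect around these arguments is not shown.

```yaml
# Illustrative argument sets only. Use exactly ONE of these per call.
# The option_N labels are hypothetical groupings, not tool parameters.

option_1:                                   # project slug and branch (both required)
  projectSlug: "gh/organization/project"    # exactly as returned by listFollowedProjects
  branch: "feature-branch"
  promptFiles:
    - fileName: "example.prompt.yml"        # hypothetical file name from ./prompts
      fileContent: "<contents of the prompt template file>"

option_2:                                   # direct URL, supplied by the user
  projectURL: "https://app.circleci.com/pipelines/gh/organization/project?branch=feature-branch"
  promptFiles:
    - fileName: "example.prompt.yml"
      fileContent: "<contents of the prompt template file>"

option_3:                                   # project detection (all three required)
  workspaceRoot: "/path/to/workspace"       # placeholder absolute path
  gitRemoteURL: "https://github.com/organization/project.git"  # placeholder remote
  branch: "feature-branch"
  promptFiles:
    - fileName: "example.prompt.yml"
      fileContent: "<contents of the prompt template file>"

# Only on a follow-up call, if the tool returned a list of pipelines:
# pipelineChoiceName: "<name exactly as returned by the tool>"
```

Mixing fields from different options (for example projectSlug together with projectURL) would break the exactly-one rule, which is the kind of call argument validation is meant to catch.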

How do I enforce a policy on run_evaluation_tests?

Add a rule in your Intercept YAML policy under the tools section for run_evaluation_tests. You can allow, deny, rate-limit, or validate arguments. Then run Intercept as a proxy in front of the CircleCI MCP Server.

What risk level is run_evaluation_tests?

run_evaluation_tests is an Execute tool with high risk. Execute tools should be rate-limited and have argument validation enabled.

Can I rate-limit run_evaluation_tests?

Yes. Add a rate_limit block to the run_evaluation_tests rule in your Intercept policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
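
As a sketch of how these values might be adapted, the rule below tightens the limit to 2 calls per five minutes. It assumes window is measured in seconds (consistent with window: 60 being described as one minute) and that the rule nests the same way as in the recommended policy. The specific numbers are illustrative, not a recommendation from the catalogue.

```yaml
tools:
  run_evaluation_tests:
    rules:
      - action: allow
        rate_limit:
          max: 2        # illustrative: at most 2 calls...
          window: 300   # ...per 300-second window (assumes seconds)
        validate:
          required_args: true
```

Because each call triggers a new pipeline, a lower ceiling like this may suit projects where pipeline runs are slow or expensive.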

How do I block run_evaluation_tests completely?

Set action: deny in the Intercept policy for run_evaluation_tests. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
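
A minimal sketch of such a rule, assuming it nests like the recommended policy and that reason sits alongside action in the rule (the exact placement of reason is an assumption, and the message text is only an example):

```yaml
tools:
  run_evaluation_tests:
    rules:
      - action: deny
        # Placement of `reason` next to `action` is assumed, not confirmed.
        reason: "Evaluation pipelines must be triggered manually in this project."
```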

What MCP server provides run_evaluation_tests?

run_evaluation_tests is provided by the CircleCI MCP Server (CircleCI-Public/mcp-server-circleci). Intercept sits as a proxy in front of this server to enforce policies before tool calls reach it.