replay_test_suite

WHEN AGENTS USE THIS TOOL

AI agents call replay_test_suite to retrieve information from Keploy without modifying any data. This is common in research, monitoring, and reporting workflows where the agent needs context before taking action. Because read operations don't change state, they are generally safe to allow without restrictions -- but you may still want rate limits to control API costs.

Even though replay_test_suite only reads data, uncontrolled read access can leak sensitive information or rack up API costs. An agent caught in a retry loop could make thousands of calls per minute. A rate limit gives you a safety net without blocking legitimate use.

RECOMMENDED POLICY

Read-only tools are safe to allow by default. No rate limit needed unless you want to control costs.

policy.json

{
  "version": "1",
  "default": "deny",
  "tools": {
    "replay_test_suite": {}
  }
}

See the full Keploy policy for all 103 tools.

MORE KEPLOY TOOLS

Destructive bulkDeleteTestSuites Destructive delete_mock Destructive delete_recording Destructive delete_test_suite Destructive deleteApp Destructive deleteMock Destructive deleteTestSuite Destructive devloop_mutation_demo

View all 103 tools →

WHAT CAN GO WRONG

These attack patterns abuse exactly the kind of access replay_test_suite gives an agent. Each links to the full case and the policy that stops it:

Browse the full MCP Attack Database →

FAQ

What does the replay_test_suite tool do? +

Replay an existing test suite live against the dev's LOCAL APP (no mocks, no docker spin-up). Returns a playbook that delegates to the enterprise CLI keploy test-suite, which walks each suite's steps, fires HTTP requests at base_path, evaluates assertions, and uploads per-suite results to api-server. The CLI prints a final pass/fail summary table plus a "Report:" URL to stdout. Output produces a TEST SUITE REPORT — it answers "does the suite hold up against the actual current system?". ═══════════════════════════════════════════════════════════════════ DISAMBIGUATION — pick this tool vs. replay_sandbox_test: ═══════════════════════════════════════════════════════════════════ USE replay_test_suite (THIS TOOL) when the dev says: * "run the test suite" / "run my test suites" * "execute test suite X" / "run suite 810d3ebe…" * "test the suite again" / "rerun the suite" * "validate the suite changes" (after editing a suite) * "smoke test against the live app" Default reading: bare verbs "run" / "execute" / "test" applied to "the suite" mean LIVE-APP execution, NOT replay against captured mocks. USE replay_sandbox_test INSTEAD when the dev says: * "run my sandbox tests" / "replay my sandbox tests" * "integration-test my app" / "check if my mocks still match" * "replay the captured tests" / "run against the recorded mocks" Trigger keyword: "sandbox" / "replay" / "mocks" / "integration-test" — explicit signal that the dev wants captured-mock replay, not live-app. After a record_sandbox_test run, the natural next step is replay_sandbox_test (replay against the freshly captured mocks). After create_test_suite / update_test_suite, the natural next step is replay_test_suite (validate the new/edited suite against the live app). When the dev's verb is bare ("run the suite") and the prior turn was create/update, prefer THIS tool. When the prior turn was record, ASK the dev if unsure — the verbs overlap and silently picking sandbox-replay can mask code-change failures with mock-replay noise. USE THIS for: re-running previously-created suites against a running local app — verifying a regression after a code change, smoke-testing a branch, re-validating after editing a suite. DO NOT USE this for: validating a NEW suite that hasn't been inserted yet (use create_test_suite — it runs the suite twice as part of validation), or for running suites against the captured-mock copy of the app (use replay_sandbox_test — captured-mock replay flow). ═══════════════════════════════════════════════════════════════════ DISCOVERY — when the dev hands you a bare suite_id with no app_id / branch_id: ═══════════════════════════════════════════════════════════════════ Suites live on a (app_id, branch_id) tuple. A bare suite_id has no on-disk hint about which app or branch holds it; you have to RESOLVE both before calling this tool. Walk these steps in order — STOP as soon as getTestSuite returns 200: 1. Detect the dev's git branch: Bash git rev-parse --abbrev-ref HEAD in app_dir. If exit non-zero / output is "HEAD" → not a git repo / detached HEAD; ASK the dev for the Keploy branch name (don't invent one). 2. Resolve candidate apps via the cwd basename: Bash basename $(pwd) → call listApps with q=<basename> (case-insensitive substring match). Usually 1–2 candidates (e.g. "orderflow" → matches "orderflow" and "orderflow.producer"). If 0 → ASK the dev for the app_id; if >1 → walk every candidate in step 4. 3. For each candidate app, call list_branches({app_id}) and find the branch whose name matches the git branch from step 1. That gives you {branch_id, status}. If no match → that app's not the owner; try the next candidate. If status is closed/merged → ask the dev whether to use this branch anyway. 4. Verify with getTestSuite({app_id, suite_id, branch_id=<from step 3>}). 200 → resolved; 404 → wrong app, try next candidate. 5. If steps 2–4 exhaust without a hit, the suite is on a branch whose name doesn't match the git branch (the dev created it with a custom name, or it's on main). Then: call list_branches on each candidate app and try every OPEN branch's branch_id with getTestSuite, then try main (branch_id omitted). If still nothing → ASK the dev for the {app_id, branch_id} pair. The reverse "look up suite_id globally" path doesn't exist — auditing is branch-scoped, so resolution starts from a branch context. After resolving once in a session, REUSE the {app_id, branch_id} for any subsequent suite-targeting call (delete_test_suite / update_test_suite / replay_test_suite); don't re-walk discovery for every action. ═══════════════════════════════════════════════════════════════════ INPUTS ═══════════════════════════════════════════════════════════════════ * app_id (required) — Keploy app ID. Same value used for create_test_suite / list_branches. * branch_id (required) — Keploy branch UUID. Resolve via the explicit two-step flow BEFORE calling: (1) Bash git rev-parse --abbrev-ref HEAD in app_dir; (2) call create_branch tool with {app_id, name: <git branch>} — find-or-create returns {branch_id, ...}; pass it here. Direct main writes are blocked. * base_path (required) — base URL of the dev's local app, e.g. http://localhost:8080. Each suite step's relative path is appended to this. * suite_ids (optional) — list of suite IDs to run. Omit / empty = run every suite registered for app_id on the branch. * header (optional) — single header to inject into every request, e.g. "Cookie: session=…". Same shape as the CLI's -H flag. * app_dir (optional) — absolute path to the dev's repo root (where the app is running). Defaults to '.' (cwd). The CLI invocation cd's here. ═══════════════════════════════════════════════════════════════════ HOW THIS TOOL WORKS ═══════════════════════════════════════════════════════════════════ This tool DOES NOT execute the suite itself. It returns a "playbook" — a small array of shell steps for you (Claude) to walk via Bash. The playbook spawns the enterprise CLI keploy test-suite in the foreground; the CLI: 1. Validates the branch exists + is writable (fails fast with a clear message if not). 2. Loads suites from api-server (filtered by --suite-id when supplied; otherwise every suite on the branch). 3. For each suite: fires step requests at base_path, evaluates assertions, records per-step results. 4. Uploads a TestSuiteRun + TestSuiteReport entry to api-server (?branch_id=<uuid>). 5. Prints a summary table to stdout, exits 0 on all-pass / 1 on any failure. Walk the playbook in order. Surface the CLI's stdout to the dev — the table shows which suites passed / failed / were "buggy" (suite-level verdict separate from individual step failures). PREREQUISITES the playbook assumes: * The dev's app is up and reachable at base_path. * keploy binary is on PATH. If missing, install before calling this tool: curl --silent -O -L https://keploy.io/install.sh && source install.sh. * Either ~/.keploy/cred.yaml exists (API key) or KEPLOY_API_KEY is exported.. It is categorised as a Read tool in the Keploy MCP Server, which means it retrieves data without modifying state.

How do I enforce a policy on replay_test_suite? +

Register the Keploy MCP server in PolicyLayer and add a rule for replay_test_suite: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Keploy. Nothing to install.

What risk level is replay_test_suite? +

replay_test_suite is a Read tool with low risk. Read-only tools are generally safe to allow by default.

Can I rate-limit replay_test_suite? +

Yes. Add a rate_limit block to the replay_test_suite rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.

How do I block replay_test_suite completely? +

Set action: deny in the PolicyLayer policy for replay_test_suite. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.

What MCP server provides replay_test_suite? +

replay_test_suite is provided by the Keploy MCP server (https://api.keploy.io/client/v1/mcp). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.

WHEN AGENTS USE THIS TOOL

RECOMMENDED POLICY

MORE KEPLOY TOOLS

WHAT CAN GO WRONG

OTHER READ TOOLS

RELATED READING

FAQ

Enforce policy on every Keploy tool call.

replay_test_suite

// WHEN AGENTS USE THIS TOOL

// RECOMMENDED POLICY

// MORE KEPLOY TOOLS

// WHAT CAN GO WRONG

// OTHER READ TOOLS

// RELATED READING

// FAQ

Enforce policy on every Keploy tool call.

WHEN AGENTS USE THIS TOOL

RECOMMENDED POLICY

MORE KEPLOY TOOLS

WHAT CAN GO WRONG

OTHER READ TOOLS

RELATED READING

FAQ