Low Risk

get_session_report

Fetch the report for a completed run. ONE tool, THREE report kinds — the response's top-level kind field discriminates which kind it is (rerecord / sandbox_run / test_suite_run) and which question the report answers (see core glossary's "three reports"). Read kind first, then pick the matching re...

Risk signals: Bulk/mass operation (affects multiple targets)

Part of the Keploy server.

get_session_report is read-only, but an agent in a loop can still rack up calls and cost. PolicyLayer caps every call before it runs. Live in minutes.

SECURE KEPLOY →

Free to start. No card required.

AI agents call get_session_report to retrieve information from Keploy without modifying any data. This is common in research, monitoring, and reporting workflows where the agent needs context before taking action. Because read operations don't change state, they are generally safe to allow without restrictions, but you may still want rate limits to control API costs.

Even though get_session_report only reads data, uncontrolled read access can leak sensitive information or rack up API costs. An agent caught in a retry loop could make thousands of calls per minute. A rate limit gives you a safety net without blocking legitimate use.

Read-only tools are generally safe to allow by default. A rate limit is optional; add one if you want to cap API costs or guard against retry loops.

policy.json
{
  "version": "1",
  "default": "deny",
  "tools": {
    "get_session_report": {}
  }
}

See the full Keploy policy for all 103 tools.

Get this rule live on your own Keploy server in minutes. PolicyLayer enforces it on every call, before it runs.

What does the get_session_report tool do?

Fetch the report for a completed run. ONE tool, THREE report kinds: the response's top-level kind field discriminates which kind it is (rerecord / sandbox_run / test_suite_run) and which question the report answers (see core glossary's "three reports"). Read kind first, then pick the matching reading rules below; do NOT assume the kind from how you got here.

Call this as the final step of the playbook, AFTER you read the terminal NDJSON event (phase=done) and confirmed data.ok=true. Pass app_id and test_run_id. Extract test_run_id from data.test_run_id on the phase=done line of the progress_file returned by record_sandbox_test or replay_sandbox_test (for replay_test_suite, the CLI prints test_run_id to stdout instead).

===== OUTPUT SHAPE =====

(Conditional verbosity so the dev isn't drowned in noise on a green run.)

* Always includes totals at the SUITE level only (total_suites / passed_suites / failed_suites) and a per_suite array where each entry carries suite_id, suite_name, total_steps, passed_steps, failed_steps. Aggregate step counts across suites are intentionally omitted because they hide where damage actually is.
* PER-KIND READING of passed_steps / failed_steps (same column names, different meaning per kind):
  - RERECORD (kind=rerecord): passed_steps = steps whose auto-replay byte-comparison matched the live capture. failed_steps = steps that diverged on auto-replay. EVEN IF every suite shows passed_steps == total_steps, the rerecord is only successful when every suite is also linked=true (a sandbox test got produced). Always check linked; the step counts alone do not indicate "did the rerecord work".
  - SANDBOX_RUN (kind=sandbox_run): passed_steps = steps whose assertions held under captured-mock replay. failed_steps = assertion failures or response diffs against the captured baseline.
  - TEST_SUITE_RUN (kind=test_suite_run): passed_steps = steps whose assertions held against the live app. failed_steps = same against live, no mocks involved. No linkage to report.
* Top-level kind discriminates the report: "rerecord" for record_sandbox_test runs (rerecord report; answers "did the sandbox test get created and linked?"), "sandbox_run" for replay_sandbox_test runs (sandbox run report; answers "does the suite still hold up against its captured baseline?"), "test_suite_run" for replay_test_suite runs (test suite report; live execution, no mocks; answers "does the suite hold up against the actual current system?"). Use kind to pick the right reading; do NOT mix them in one response.
* RERECORD runs (kind="rerecord") carry a linked bool + test_set_id string on every per_suite[] entry. linked=true means the rerecord produced a sandbox test for the suite (replay-ready). linked=false means rerecord did NOT produce a sandbox test for the suite; it cannot be replayed until rerecord succeeds. ALWAYS surface this on rerecord output: even when every step's capture passed at the wire level, a suite without a sandbox test is a real failure. For the per-suite table, add a "Linked" column (yes/no from per_suite[].linked). For the one-line all-green reply, report "N/N suites passed, L/N have a sandbox test (test_run_id=<id>)".
* When any suite has failures (or verbose=true), also includes failed_steps[] with per-step diagnostics (suite, step name, method+url, diff excerpt, error, mock_mismatches, assertion_failures, mock_mismatch_failure, authored_assertions, authored_response_body) PLUS mock_mismatch_failed_steps (count) and mock_mismatch_dominant (bool: true when the majority of failed steps have unconsumed recorded mocks, which points at a keploy-side egress-hook issue rather than dev app breakage). On RERECORD, failed_steps[] also carries linked (whether the owning suite has a sandbox test after this rerecord) and the mock_mismatch_* fields are suppressed (irrelevant in rerecord context).
* authored_assertions / authored_response_body: the SUITE's authored contract for the failing step (the assert array and response.body as defined when the suite was created/updated). Surfaced inline so route B vs route C can be decided without a second getTestSuite round-trip. KEY DECISION POINT: if any authored_assertions entry is pinned to the value the diff shows as "expected" (e.g. assert {path: "$.order.status", expected: "created order"} and the diff says "expected 'created order', got 'created'"), route C is MANDATORY, because re-record alone leaves that assertion stuck on the old contract and the next rerecord/replay will gate-1-fail on the same step. If authored_assertions is empty/absent (suite asserts nothing structural on that field), route B or route-C-without-assertion-edit may suffice.
* When everything passes and verbose is false, failed_steps is omitted.

===== HOW TO RESPOND TO THE DEV =====

* status == "all_passed" AND kind == "sandbox_run" → ONE-LINER: "<passed_suites>/<total_suites> suites passed (test_run_id=<id>)". Do not dump the JSON, do not list per-suite rows unless asked.
* status == "all_passed" AND kind == "test_suite_run" → ONE-LINER: "<passed_suites>/<total_suites> suites passed live (test_run_id=<id>)". No mocks involved, no linkage to report.
* status == "all_passed" AND kind == "rerecord" → ONE-LINER including linkage: "<passed_suites>/<total_suites> suites passed, <linked>/<total> linked (test_run_id=<id>)" where <linked> = count of per_suite[] entries with linked=true. If linked < total, ALSO list the unlinked suite names so the dev knows which ones are silently broken (skip sandbox replay on them, or investigate the linking failure). Never drop linkage reporting on rerecord even when it's all green.
* status == "has_failures" → response MUST contain (in order, no collapsing rows even when failures look homogeneous; the dev needs the full inventory):
  1. per-suite table: one row per suite in per_suite (passing suites included), columns = Suite name | passed/total steps.
  2. failed-steps table: ONE ROW per entry in failed_steps[], columns = Suite | Step name | Method + URL | Expected → Actual status | mock_mismatch y/n.
  3. Diagnosis + Recommendation (rules below).
  Do NOT print aggregate step totals across suites.

Frame the diagnosis from the glossary: a mock mismatch IS the signal that the sandbox test has drifted from current app behavior. The three routes below (SKIP / FIX-CODE / FIX-TEST-RERECORD) are not separate buckets; they're three possible SOURCES of that drift:

* keploy proxy didn't replay correctly → drift is artificial, no real change → route A (SKIP).
* app regressed → drift is unintended, fix the code → route B.
* contract changed on purpose → drift is intentional, refresh the sandbox test → route C.

Your repo inspection picks which source applies; the routes are the prescription for that source.

DIAGNOSE WITH THE REPO, NOT THE DEV. Before recommending anything on a failing run, inspect the source tree yourself (git log / git diff against the last green run or main, read the failing handler + its downstream call sites). DO NOT ask the dev "did you change X since the last green run"; you have the repo, find the answer. Only come back with a concrete conclusion.

* mock_mismatch_dominant == true → failure signature is "keploy didn't intercept the app's egress traffic". Use git to check whether the failing endpoints or their dependency wiring have been modified recently:
  (a) NO relevant changes → tell the dev this is almost certainly a KEPLOY-SIDE issue and ask them to file a keploy issue with test_run_id. Do NOT ask them to re-record.
  (b) Relevant changes EXIST → name them (file:line or commit hash), explain how each plausibly caused the failure, say whether the change looks intended or accidental, and tell the dev exactly what to fix.
* status == "has_failures" AND mock_mismatch_dominant == false → same discipline: identify the commit(s) / diff hunks that most likely caused each failure, state whether they look intended, and prescribe a fix (rerecord, revert, patch the handler). Don't hand the investigation back to the dev.

===== HANDLING "FIX IT" FOLLOW-UPS =====

(After the dev has seen the analysis and as...

It is categorised as a Read tool in the Keploy MCP Server, which means it retrieves data without modifying state.
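The all-green reply rules in the description above can be sketched in Python. This is a minimal illustration, not Keploy's or PolicyLayer's code: the function name one_line_summary is hypothetical, the report is assumed to be an already-parsed dict with status, kind, passed_suites, total_suites and per_suite at the top level, and test_run_id is passed in separately because the description does not say whether the report echoes it back.

```python
def one_line_summary(report: dict, test_run_id: str) -> str:
    """Build the one-line reply for an all-green report, reading kind first."""
    if report.get("status") != "all_passed":
        raise ValueError("one-liner applies only when status == 'all_passed'")

    kind = report["kind"]
    passed = report["passed_suites"]
    total = report["total_suites"]

    if kind == "sandbox_run":
        return f"{passed}/{total} suites passed (test_run_id={test_run_id})"

    if kind == "test_suite_run":
        # Live execution: no mocks involved, no linkage to report.
        return f"{passed}/{total} suites passed live (test_run_id={test_run_id})"

    if kind == "rerecord":
        # Linkage must always be surfaced on rerecord, even when all green.
        suites = report.get("per_suite", [])
        unlinked = [s["suite_name"] for s in suites if not s.get("linked", False)]
        linked = len(suites) - len(unlinked)
        line = (
            f"{passed}/{total} suites passed, "
            f"{linked}/{total} linked (test_run_id={test_run_id})"
        )
        if unlinked:
            # Name the suites that cannot be replayed yet.
            line += "; unlinked: " + ", ".join(unlinked)
        return line

    raise ValueError(f"unknown report kind: {kind!r}")
```

Branching on kind before touching any counts mirrors the "read kind first" rule: the same passed/total numbers mean different things per kind, and only the rerecord branch looks at linked.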

How do I enforce a policy on get_session_report?

Register the Keploy MCP server in PolicyLayer and add a rule for get_session_report: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Keploy. Nothing to install.

What risk level is get_session_report?

get_session_report is a Read tool with low risk. Read-only tools are generally safe to allow by default.

Can I rate-limit get_session_report?

Yes. Add a rate_limit block to the get_session_report rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
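As a rough sketch of the rule described above, the following Python builds a policy document with a rate_limit block and prints it as JSON. The version, default and tools keys come from the policy.json shown on this page; the exact nesting of rate_limit under the tool rule, and window being expressed in seconds, are assumptions inferred from the "10 calls per minute" example rather than a confirmed schema.

```python
import json

# Assumed shape: rate_limit nests under the tool's rule.
# max = allowed calls, window = length of the window in seconds (assumption).
policy = {
    "version": "1",
    "default": "deny",
    "tools": {
        "get_session_report": {
            "rate_limit": {"max": 10, "window": 60},
        },
    },
}

# Serialise to the JSON you would place in policy.json.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Check the generated JSON against the PolicyLayer policy reference before using it, since field placement here is illustrative.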

How do I block get_session_report completely?

Set action: deny in the PolicyLayer policy for get_session_report. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
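A deny rule along the lines described above could be generated as follows. The action and reason field names come from the answer above; the helper deny_rule, the reason text, and the placement of both fields directly under the tool's rule are hypothetical and only meant to show the idea.

```python
import json


def deny_rule(reason: str) -> dict:
    """Return a rule that blocks a tool and records why (assumed shape)."""
    return {"action": "deny", "reason": reason}


policy = {
    "version": "1",
    "default": "deny",
    "tools": {
        # Illustrative reason text; use whatever explains the block to your team.
        "get_session_report": deny_rule("Session reports are restricted here"),
    },
}

# Serialise to the JSON you would place in policy.json.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```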

What MCP server provides get_session_report?

get_session_report is provided by the Keploy MCP server (https://api.keploy.io/client/v1/mcp). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.

Enforce policy on every Keploy tool call.

Deterministic rules across all 103 Keploy tools. Per-identity grants. Full audit log. Live in minutes. Nothing to install.

4,600+ MCP servers and 31,000+ tools scanned and risk-classified.