Part of the Invarium MCP server. Enforce policies on this tool with Intercept, the open-source MCP proxy.
AI agents use invarium_generate_tests to create or modify resources in Invarium. Write operations carry medium risk because an autonomous agent could trigger bulk unintended modifications. Rate limits prevent a single agent session from making hundreds of changes in rapid succession. Argument validation ensures the agent passes expected values.
Without a policy, an AI agent could call invarium_generate_tests repeatedly, creating or modifying resources faster than any human could review. Intercept's rate limiting ensures write operations happen at a controlled pace, and argument validation catches malformed or unexpected inputs before they reach Invarium.
Write tools can modify data. A rate limit prevents runaway bulk operations from AI agents.
tools:
  invarium_generate_tests:
    rules:
      - action: allow
        rate_limit:
          max: 30
          window: 60

See the full Invarium policy for all 19 tools.
Agents calling write-class tools like invarium_generate_tests have been implicated in these attack patterns. Read the full case and prevention policy for each:
Other tools in the Write risk category across the catalogue share the same policy patterns: rate-limiting and argument validation apply to each.
Generate a behavioral test scenario with test cases for your agent. Creates exactly ONE scenario containing multiple test cases, then returns a generation_id. Use invarium_get_tests to check status and retrieve results.

IMPORTANT: Do NOT call this tool immediately. Before generating, ALWAYS:
1. Review the agent's blueprint (tools, constraints, workflows) to understand what scenarios would be most valuable.
2. Present the user with a suggested test plan — propose scenario categories based on the agent's tools (e.g., "happy path for search_products", "edge case for payment processing", "guardrail test for PII handling").
3. Ask the user to confirm or adjust the parameters:
   - scenario_description: a SHORT, high-level description of the behavior area to test. Do NOT list individual test cases — the AI generator creates specific test cases automatically from the description.
   - test_cases: how many test cases in the scenario (1-25)
   - complexity: simple / moderate / complex / adversarial / edge_case
   - failure_category: optional failure category to target:
     - knowledge_failure (hallucinations, outdated info, self-contradictions)
     - reasoning_failure (logic errors, calculation mistakes, planning failures)
     - context_failure (lost conversation context, misinterpreted references)
     - instruction_failure (constraint violations, partial execution)
     - tool_usage_failure (wrong tool, bad params, sequence violations)
     - safety_failure (prompt injection, guardrail bypass, unauthorized actions)
     - communication_failure (unhelpful, unclear, or inappropriate responses)
     - operational_failure (timeouts, rate limits, integration failures)
     - coordination_failure (multi-agent deadlocks, lost handoffs)
   - persona: optional user persona — novice / expert / frustrated / confused / adversarial
4. Only call this tool after the user explicitly confirms the plan.

WHEN TO USE: Phase 2, Step 7 — only after the user has picked which scenarios to generate and confirmed the parameters for each.
AFTER THIS: Share the generation ID and ask: "Want to work on the next scenario while this generates, or wait for results?"

Args:
- agent_name: Name of the agent to generate tests for. Must have a blueprint uploaded first.
- scenario_description: A short, high-level description of the behavior area to test (1-2 sentences). The AI generator creates specific test cases from this. Do NOT enumerate individual test cases here.
  - Good: "Happy path tests covering all tools with valid inputs"
  - Good: "Guardrail bypass attempts on tour scheduling"
  - Bad: "1) Search in SF, 2) Get details for PROP001, 3) Calculate mortgage..."
- test_cases: Number of test cases to include in the scenario (default: 5, max: 25).
- complexity: Scenario complexity — "simple", "moderate", "complex", "adversarial", or "edge_case" (default: "moderate").
- failure_category: Optional failure category to target. One of: "knowledge_failure" (hallucinations, outdated info, self-contradictions), "reasoning_failure" (logic errors, calculation mistakes, planning failures), "context_failure" (lost context, positional bias, misinterpreted references), "instruction_failure" (constraint violations, partial execution, priority conflicts), "tool_usage_failure" (wrong tool, parameter errors, sequence violations), "safety_failure" (prompt injection, guardrail bypass, unauthorized actions), "communication_failure" (unhelpful, unclear, or inappropriate responses), "operational_failure" (timeouts, rate limits, non-determinism), "coordination_failure" (multi-agent deadlocks, lost handoffs, conflicting actions).
- persona: Optional user persona — "novice", "expert", "frustrated", "confused", or "adversarial".

invarium_generate_tests is categorised as a Write tool in the Invarium MCP Server, which means it can create or modify data. Consider rate limits to prevent runaway writes.
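As an illustration, a confirmed plan might translate into an argument set like the following. The agent name and scenario text are hypothetical examples; only the parameter names come from the documentation above:

```yaml
# Hypothetical invarium_generate_tests arguments (names and values are
# illustrative; the parameter keys match the Args list above).
agent_name: "realty-assistant"   # assumed example agent with a blueprint uploaded
scenario_description: "Guardrail bypass attempts on tour scheduling"
test_cases: 8
complexity: "adversarial"
failure_category: "safety_failure"
persona: "adversarial"
```

The call returns a generation_id, which you then pass to invarium_get_tests to poll for results.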
Add a rule in your Intercept YAML policy under the tools section for invarium_generate_tests. You can allow, deny, rate-limit, or validate arguments. Then run Intercept as a proxy in front of the Invarium MCP server.
invarium_generate_tests is a Write tool with medium risk. Write tools should be rate-limited to prevent accidental bulk modifications.
Yes. Add a rate_limit block to the invarium_generate_tests rule in your Intercept policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
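Under those assumptions, the rule described above would look like this in the Intercept policy (a minimal sketch using only the fields shown elsewhere on this page):

```yaml
tools:
  invarium_generate_tests:
    rules:
      - action: allow
        rate_limit:
          max: 10     # at most 10 calls
          window: 60  # per 60-second window, tracked per agent session
```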
Set action: deny in the Intercept policy for invarium_generate_tests. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
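A minimal deny rule might look like the following sketch; the reason text is an illustrative example:

```yaml
tools:
  invarium_generate_tests:
    rules:
      - action: deny
        reason: "Test generation is disabled for this agent session."  # example message
```

The agent receives a policy violation error containing this reason instead of reaching the Invarium server.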
invarium_generate_tests is provided by the Invarium MCP server (invarium-ai/invarium). Intercept sits as a proxy in front of this server to enforce policies before tool calls reach the server.
Open source. One binary. Zero dependencies.
npx -y @policylayer/intercept