Generate a behavioral test scenario with test cases for your agent. Creates exactly ONE scenario containing multiple test cases, then returns a generation_id. Use invarium_get_tests to check status and retrieve results. IMPORTANT: Do NOT call this tool immediately. Before generating, ALWAYS: 1. R...
Part of the Invarium server.
Free to start. No card required.
AI agents use invarium_generate_tests to create or modify resources in Invarium. Write operations carry medium risk because an autonomous agent could trigger bulk unintended modifications. Rate limits prevent a single agent session from making hundreds of changes in rapid succession. Argument validation ensures the agent passes expected values.
Without a policy, an AI agent could call invarium_generate_tests repeatedly, creating or modifying resources faster than any human could review. PolicyLayer's rate limiting ensures write operations happen at a controlled pace, and argument validation catches malformed or unexpected inputs before they reach Invarium.
Write tools can modify data. A rate limit prevents runaway bulk operations from AI agents.
{
  "version": "1",
  "default": "deny",
  "tools": {
    "invarium_generate_tests": {
      "limits": [
        {
          "counter": "invarium_generate_tests_rate",
          "window": "minute",
          "max": 30,
          "scope": "grant"
        }
      ]
    }
  }
}
See the full Invarium policy for all 19 tools.
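To make the effect of the limit in the policy above concrete, here is a minimal sketch of a fixed-window counter that allows at most 30 calls per 60 seconds for each grant. It is an illustration only, not PolicyLayer's implementation: the class name, the `allow` method, the injectable clock and the choice of a fixed (rather than sliding) window are all assumptions.

```python
import time


class FixedWindowLimiter:
    """Hypothetical limiter: at most max_calls per window_seconds per scope key."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._windows = {}  # scope key -> (window start time, calls counted so far)

    def allow(self, scope_key):
        """Return True and count the call if the scope is under its limit."""
        now = self.clock()
        start, count = self._windows.get(scope_key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.max_calls:
            self._windows[scope_key] = (start, count)
            return False
        self._windows[scope_key] = (start, count + 1)
        return True


# Mirrors "max": 30, "window": "minute", "scope": "grant" from the policy.
limiter = FixedWindowLimiter(max_calls=30, window_seconds=60)
```

Each grant gets its own counter, so one agent exhausting its budget does not block calls made under a different grant.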
These attack patterns abuse exactly the kind of access invarium_generate_tests gives an agent. Each links to the full case and the policy that stops it:
Other write tools across the catalogue. The same approach applies to each: rate-limit and validate the arguments.
Generate a behavioral test scenario with test cases for your agent. Creates exactly ONE scenario containing multiple test cases, then returns a generation_id. Use invarium_get_tests to check status and retrieve results.

IMPORTANT: Do NOT call this tool immediately. Before generating, ALWAYS:
1. Review the agent's blueprint (tools, constraints, workflows) to understand what scenarios would be most valuable.
2. Present the user with a suggested test plan: propose scenario categories based on the agent's tools (e.g., "happy path for search_products", "edge case for payment processing", "guardrail test for PII handling").
3. Ask the user to confirm or adjust the parameters:
   - scenario_description: a SHORT, high-level description of the behavior area to test. Do NOT list individual test cases; the AI generator creates specific test cases automatically from the description.
   - test_cases: how many test cases in the scenario (1-25)
   - complexity: simple / moderate / complex / adversarial / edge_case
   - failure_category: optional failure category to target:
     knowledge_failure (hallucinations, outdated info, self-contradictions)
     reasoning_failure (logic errors, calculation mistakes, planning failures)
     context_failure (lost conversation context, misinterpreted references)
     instruction_failure (constraint violations, partial execution)
     tool_usage_failure (wrong tool, bad params, sequence violations)
     safety_failure (prompt injection, guardrail bypass, unauthorized actions)
     communication_failure (unhelpful, unclear, or inappropriate responses)
     operational_failure (timeouts, rate limits, integration failures)
     coordination_failure (multi-agent deadlocks, lost handoffs)
   - persona: optional user persona: novice / expert / frustrated / confused / adversarial
4. Only call this tool after the user explicitly confirms the plan.

WHEN TO USE: Phase 2, Step 7, only after the user has picked which scenarios to generate and confirmed the parameters for each.

AFTER THIS: Share the generation ID and ask: "Want to work on the next scenario while this generates, or wait for results?"

Args:
agent_name: Name of the agent to generate tests for. Must have a blueprint uploaded first.
scenario_description: A short, high-level description of the behavior area to test (1-2 sentences). The AI generator creates specific test cases from this. Do NOT enumerate individual test cases here.
  Good: "Happy path tests covering all tools with valid inputs"
  Good: "Guardrail bypass attempts on tour scheduling"
  Bad: "1) Search in SF, 2) Get details for PROP001, 3) Calculate mortgage..."
test_cases: Number of test cases to include in the scenario (default: 5, max: 25).
complexity: Scenario complexity: "simple", "moderate", "complex", "adversarial", or "edge_case" (default: "moderate").
failure_category: Optional failure category to target. One of:
  "knowledge_failure" (hallucinations, outdated info, self-contradictions),
  "reasoning_failure" (logic errors, calculation mistakes, planning failures),
  "context_failure" (lost context, positional bias, misinterpreted references),
  "instruction_failure" (constraint violations, partial execution, priority conflicts),
  "tool_usage_failure" (wrong tool, parameter errors, sequence violations),
  "safety_failure" (prompt injection, guardrail bypass, unauthorized actions),
  "communication_failure" (unhelpful, unclear, or inappropriate responses),
  "operational_failure" (timeouts, rate limits, non-determinism),
  "coordination_failure" (multi-agent deadlocks, lost handoffs, conflicting actions).
persona: Optional user persona: "novice", "expert", "frustrated", "confused", or "adversarial".

It is categorised as a Write tool in the Invarium MCP Server, which means it can create or modify data. Consider rate limits to prevent runaway writes.
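The argument constraints listed in the tool description lend themselves to a simple pre-call check. The sketch below is one way such validation might look. The allowed values, defaults and the 1-25 range come from the description; the function name, the error messages and the treatment of agent_name and scenario_description as required non-empty strings are assumptions.

```python
COMPLEXITIES = frozenset({"simple", "moderate", "complex", "adversarial", "edge_case"})
FAILURE_CATEGORIES = frozenset({
    "knowledge_failure", "reasoning_failure", "context_failure",
    "instruction_failure", "tool_usage_failure", "safety_failure",
    "communication_failure", "operational_failure", "coordination_failure",
})
PERSONAS = frozenset({"novice", "expert", "frustrated", "confused", "adversarial"})


def validate_generate_tests_args(args: dict) -> list[str]:
    """Hypothetical validator: return a list of problems (empty means valid)."""
    errors = []

    # Assumption: both string arguments are required and must be non-empty.
    for name in ("agent_name", "scenario_description"):
        value = args.get(name)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{name} must be a non-empty string")

    # Default 5, allowed range 1-25; bool is rejected even though it subclasses int.
    test_cases = args.get("test_cases", 5)
    if isinstance(test_cases, bool) or not isinstance(test_cases, int) \
            or not 1 <= test_cases <= 25:
        errors.append("test_cases must be an integer from 1 to 25")

    complexity = args.get("complexity", "moderate")
    if not isinstance(complexity, str) or complexity not in COMPLEXITIES:
        errors.append("complexity is not one of the allowed values")

    # Optional enumerated arguments are only checked when present.
    for name, allowed in (("failure_category", FAILURE_CATEGORIES),
                          ("persona", PERSONAS)):
        value = args.get(name)
        if value is not None and (not isinstance(value, str) or value not in allowed):
            errors.append(f"{name} is not one of the allowed values")

    return errors
```

Returning every problem at once, rather than raising on the first, lets a proxy or agent report all malformed arguments in a single response.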
Register the Invarium MCP server in PolicyLayer and add a rule for invarium_generate_tests: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Invarium. Nothing to install.
invarium_generate_tests is a Write tool with medium risk. Write tools should be rate-limited to prevent accidental bulk modifications.
Yes. Add a rate_limit block to the invarium_generate_tests rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
Set action: deny in the PolicyLayer policy for invarium_generate_tests. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
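As a rough sketch of how a deny rule and a default-deny policy might be evaluated, the function below looks up a tool and returns a decision. The `action` and `reason` field names and the `default` and `tools` keys appear on this page; where `action` and `reason` sit inside the tool entry, the function name, the shape of the returned decision, and the rule that a listed tool without an explicit action is allowed are assumptions.

```python
def evaluate(policy: dict, tool_name: str) -> dict:
    """Hypothetical evaluation of a policy document for one tool call."""
    rule = policy.get("tools", {}).get(tool_name)
    if rule is None:
        # Unlisted tools fall back to the policy-wide default.
        action = policy.get("default", "deny")
        reason = "tool not listed in policy"
    else:
        # Assumption: a listed tool with no explicit action is allowed.
        action = rule.get("action", "allow")
        reason = rule.get("reason", "")
    return {"allowed": action == "allow", "action": action, "reason": reason}


policy = {
    "version": "1",
    "default": "deny",
    "tools": {
        "invarium_generate_tests": {
            "action": "deny",
            "reason": "test generation is paused",  # illustrative reason text
        }
    },
}
decision = evaluate(policy, "invarium_generate_tests")
```

With this policy, `decision["allowed"]` is false and the reason string can be passed back to the agent as part of the policy violation error.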
invarium_generate_tests is provided by the Invarium MCP server (invarium-ai/invarium). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.
Deterministic rules across all 19 Invarium tools. Per-identity grants. Full audit log. Live in minutes. Nothing to install.
4,600+ MCP servers and 31,000+ tools scanned and risk-classified.