11 tools from the Claude Token Saver MCP Server, categorised by risk level.
cost_dashboard: View cumulative cost savings and model usage statistics.
get_metrics: Get server metrics in Prometheus text format or JSON. Includes request counts, latency, queue stats, cost savings, and health status.
list_loaded_models: List all models currently loaded in VRAM with usage details. Shows VRAM usage, expiry time, and available slots.
recommend_model: Recommend the best local LLM model for a given task category based on system specs and installed models. Returns prioritized list with installation...
auto_setup: Automate the full model setup flow: recommend the best model for a task category, download it if needed, and preload it into VRAM, all in one step. (risk 2/5)
batch_offload: Submit multiple coding tasks as a batch to the local LLM. Tasks are processed sequentially or in parallel. Supports partial failure. (risk 2/5)
compress_context: Compress/summarize large text content using a local LLM to reduce cloud token usage. Use for summarizing logs, large files, or verbose context befo... (risk 2/5)
configure_model_selector: View or modify model selector settings at runtime. Manage blocked models, license filters, and custom model recommendations. (risk 2/5)
offload_work: Offload coding/text tasks to a local LLM (Ollama) to save Claude API tokens. Use for code generation, refactoring, formatting, boilerplate, and oth... (risk 2/5)
preload_model: Preload a model into VRAM for warm inference. Sends an empty chat request with keep_alive to keep the model loaded during the session. (risk 2/5)
pull_model: Download a model from the Ollama registry to local storage. Use this to install recommended models before preloading them into VRAM. (risk 2/5)

The Claude Token Saver MCP server exposes 11 tools across 2 categories: Read, Write.
Use Intercept, the open-source MCP proxy. Write YAML rules for each tool — rate limits, argument validation, or deny rules — then run Intercept in front of the Claude Token Saver server.
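The three rule types mentioned above (deny rules, rate limits, argument validation) can be illustrated with a small conceptual sketch in Python. This is not Intercept's YAML schema or its implementation: `ToolRule`, `check_call`, and the `model` argument name are hypothetical names chosen for illustration, and the tool names are taken from the list on this page.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ToolRule:
    """Hypothetical per-tool rule; NOT Intercept's actual schema."""
    deny: bool = False
    max_calls_per_minute: Optional[int] = None
    validate_args: Optional[Callable[[dict], bool]] = None
    _calls: list = field(default_factory=list)  # timestamps of allowed calls


def check_call(rules, tool, args, now=None):
    """Return (allowed, reason) for a single tool call."""
    now = time.monotonic() if now is None else now
    rule = rules.get(tool)
    if rule is None:
        return True, "no rule"
    if rule.deny:
        return False, "denied by rule"
    if rule.validate_args is not None and not rule.validate_args(args):
        return False, "invalid arguments"
    if rule.max_calls_per_minute is not None:
        # Sliding window: keep only calls made in the last 60 seconds.
        rule._calls = [t for t in rule._calls if now - t < 60.0]
        if len(rule._calls) >= rule.max_calls_per_minute:
            return False, "rate limit exceeded"
        rule._calls.append(now)
    return True, "ok"


# Example policy (assumed values, for illustration only).
rules = {
    "pull_model": ToolRule(deny=True),
    "offload_work": ToolRule(max_calls_per_minute=2),
    # "model" is an assumed argument name, not confirmed by the server docs.
    "preload_model": ToolRule(
        validate_args=lambda a: isinstance(a.get("model"), str)
    ),
}
```

In this sketch a proxy would call `check_call` before forwarding each request to the server, and tools without a rule pass through unchanged. A real deployment would express the same ideas in Intercept's own rule format.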
Claude Token Saver tools are categorised as Read (4), Write (7). Each category has a recommended default policy.
Open source. One binary. Zero dependencies.
npx -y @policylayer/intercept