latent-tools

Your LLM already knows most of its tools. Stop re-teaching it every request.

The Problem

Every API request to Claude resends all tool definitions -- hundreds of tools, full JSON schemas, detailed descriptions. For a setup with 170+ MCP tools, that's ~26K tokens of redundant schema per request.

Text compression proxies (tamp, wet, squeezr) focus on compressing tool results. The real waste is tool definitions.

The Numbers

In our production workload with 170+ tools:

Approach                           Tokens saved
Text compression (results)            8M
Schema compression (definitions)    279M

You're optimizing the wrong thing.

Methodology: Data collected over 8 days of real Claude Code usage (10,893 requests, 173 MCP tools). Token counting via @anthropic-ai/tokenizer. The 8M text compression comparison comes from running the same traffic through tamp's text-only stages. Tool call counts, error rates, and state transitions from the self-learning state file (~/.latent-tools/state.json).

Quick Start

npx latent-tools                                    # Start proxy
export ANTHROPIC_BASE_URL=http://localhost:7778      # Point Claude Code
claude                                               # Done.

How It Works

latent-tools uses an adaptive state machine that progressively strips tool descriptions based on observed success rates. Each tool transitions independently:

NEW ──→ LIGHT ──→ CANARY ──→ NUKED
          ↑           │
          └── PROTECTED ←┘
                (error spike)

NEW -- Full schema sent as-is. No compression. The tool is being observed.

LIGHT -- Description truncated to its first sentence. Parameters and types remain intact.

CANARY -- All descriptions removed. The tool is in an observation period, tracking whether the model still uses it correctly.

NUKED -- All descriptions removed, stable. The model has proven it can use this tool from its training data alone. This is a terminal state -- the only exit is a model version change (which demotes NUKED → LIGHT for re-validation).

PROTECTED -- Rolled back from CANARY after an error spike. Enters a 30-day cooldown before re-attempting compression (returns to LIGHT).
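The per-tool transition logic above can be sketched roughly as follows. State names come from the diagram; the promotion and rollback thresholds (`PROMOTE_AFTER`, `ERROR_SPIKE`) are illustrative placeholders, not the proxy's actual tuning:

```typescript
type ToolState = "NEW" | "LIGHT" | "CANARY" | "NUKED" | "PROTECTED";

interface ToolRecord {
  state: ToolState;
  successes: number;
  errors: number;
  protectedUntil?: number; // epoch ms marking the end of the 30-day cooldown
}

// Illustrative thresholds -- the real proxy's values may differ.
const PROMOTE_AFTER = 50;                       // successful calls before promotion
const ERROR_SPIKE = 0.05;                       // error rate that triggers rollback
const COOLDOWN_MS = 30 * 24 * 60 * 60 * 1000;   // 30-day PROTECTED cooldown

function step(rec: ToolRecord, now: number = Date.now()): ToolState {
  const total = rec.successes + rec.errors;
  const errorRate = total > 0 ? rec.errors / total : 0;

  switch (rec.state) {
    case "NEW":
      if (rec.successes >= PROMOTE_AFTER) rec.state = "LIGHT";
      break;
    case "LIGHT":
      if (rec.successes >= PROMOTE_AFTER) rec.state = "CANARY";
      break;
    case "CANARY":
      if (errorRate >= ERROR_SPIKE) {
        rec.state = "PROTECTED";                 // rollback on error spike
        rec.protectedUntil = now + COOLDOWN_MS;
      } else if (rec.successes >= PROMOTE_AFTER) {
        rec.state = "NUKED";                     // terminal until a model change
      }
      break;
    case "PROTECTED":
      if (rec.protectedUntil !== undefined && now >= rec.protectedUntil) {
        rec.state = "LIGHT";                     // cooldown over, retry compression
      }
      break;
    case "NUKED":
      break; // only a model version change demotes this (NUKED → LIGHT)
  }
  return rec.state;
}
```

The key property is that each tool carries its own record, so an error spike on one tool rolls back only that tool's compression.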

Seed List

Claude Code's built-in tools (Read, Write, Bash, Grep, etc.) and well-known MCP tools (GitHub, Gmail, Google Calendar) start at NUKED. Claude has strong training data for these -- there's no reason to send their descriptions.
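Conceptually, the seed list is just an initial-state assignment. A minimal sketch (tool names are the examples from the text; the MCP server names are illustrative, and the real shipped list is longer):

```typescript
// Tools with strong training-data coverage start fully compressed.
const SEEDED_NUKED: Set<string> = new Set([
  "Read", "Write", "Bash", "Grep",       // Claude Code built-ins
  "github", "gmail", "google-calendar",  // well-known MCP tools (illustrative names)
]);

function initialState(toolName: string): "NUKED" | "NEW" {
  return SEEDED_NUKED.has(toolName) ? "NUKED" : "NEW";
}
```

Everything not on the seed list starts at NEW and must earn compression through the state machine.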

Self-Learning

The proxy intercepts tool_result blocks in follow-up requests to track success and failure per tool. Successful calls advance the tool through the state machine. Errors trigger rollbacks. No external feedback loop required.
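A rough sketch of that interception, assuming the `tool_result` block shape from the Anthropic Messages API (`is_error` is a real field on those blocks; the fallback error-pattern regex here is an illustrative heuristic, not the proxy's actual pattern list):

```typescript
interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  is_error?: boolean;
  content: string;
}

interface Outcome { successes: number; errors: number; }

// Illustrative heuristic for results that carry errors without an is_error flag.
const ERROR_PATTERN = /\b(error|exception|not found|invalid)\b/i;

function recordOutcome(
  block: ToolResultBlock,
  stats: Map<string, Outcome>,
  toolName: string,
): void {
  const entry = stats.get(toolName) ?? { successes: 0, errors: 0 };
  const failed = block.is_error === true || ERROR_PATTERN.test(block.content);
  if (failed) entry.errors += 1;
  else entry.successes += 1;
  stats.set(toolName, entry);
}
```

These per-tool counters are what drive the state-machine promotions and rollbacks, and they persist across sessions in the state file.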

Production Metrics

Metric                          Value
Requests processed              10,893
Avg tokens saved/request        25,662
Total tokens saved              279.5M
Tools tracked                   173
Tools at max compression        115 (66%)
Tool calls after compression    1,529,820
Success rate                    98.15%*
Rollbacks                       0

* Success rate measures explicit error responses (is_error flags and error-pattern matching in tool results). It does not capture tool selection correctness -- if the model silently chooses a different approach instead of using a compressed tool, that miss is invisible to the proxy. This is the primary silent failure risk.

CLI Options

Usage: latent-tools [options]

Options:
  --port, -p <port>     Port (default: 7778)
  --upstream, -u <url>  Upstream API (default: https://api.anthropic.com)
  --verbose, -v         Enable logging
  --help, -h            Help
  --version             Version

Endpoints

Method  Path           Description
POST    /v1/messages   Proxy with schema compression
GET     /health        Health check
GET     /stats         Token savings statistics

Chainability

latent-tools compresses schemas; text compression proxies compress results. They stack:

Claude Code → latent-tools (:7778) → tamp (:7779) → api.anthropic.com

Set --upstream http://localhost:7779 to chain through tamp or any other proxy.

Compatibility

  • Targets Anthropic Messages API (POST /v1/messages)
  • Handles gzip, deflate, br, and zstd request encoding
  • Bodies over 50MB are passed through uncompressed
  • Non-Messages API paths (except /health and /stats, which are handled locally) are proxied unchanged to the upstream
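The routing rules above reduce to a small dispatch. A sketch (the handler labels are placeholders for this README, not the proxy's internal names):

```typescript
type Route = "compress" | "local" | "passthrough";

function route(method: string, path: string): Route {
  // /health and /stats are answered by the proxy itself.
  if (method === "GET" && (path === "/health" || path === "/stats")) return "local";
  // Only the Messages API gets schema compression.
  if (method === "POST" && path === "/v1/messages") return "compress";
  // Everything else is forwarded to the upstream unchanged.
  return "passthrough";
}
```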

Privacy and Data

  • Learning state and stats are persisted to ~/.latent-tools/
  • No data is sent anywhere except your configured upstream API
  • The learning file contains only tool names and call counts, never conversation content

Limitations

Silent failure risk. When a tool schema is heavily compressed, the model may not recognize when to use that tool. Unlike parameter errors (which produce error responses the self-learning system can detect), tool selection misses are invisible -- the model simply uses a different approach. This is an inherent trade-off of latent knowledge exploitation.

Model version sensitivity. Different model versions have different latent knowledge. When latent-tools detects a model version change, it automatically demotes tools for re-validation: learned NUKED tools are demoted to LIGHT, PROTECTED tools are demoted to NEW, and LIGHT/CANARY tools are left unchanged. Seeded (well-known) tools remain at NUKED regardless of model changes.

Seed list assumptions. The built-in seed list assumes Claude has strong training data for common tools (Read, Write, Bash, GitHub MCP, etc.). If Anthropic changes how these tools are handled in a future model, the seed list may need updating.

Seed list bypass. Seeded tools (Claude Code built-ins, common MCP tools) start at NUKED and bypass the self-learning state machine entirely. If a future model loses familiarity with a seeded tool, the proxy won't auto-detect the problem. The denylist prevents destructive-sounding tools (names matching delete, remove, destroy, etc.) from ever being fully compressed, capping them at LIGHT.
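The denylist cap can be sketched as a name-pattern check that limits how far compression may go (the pattern covers only the names listed above; the actual list presumably includes more, per the "etc."):

```typescript
// Destructive-sounding tools are never fully compressed.
const DESTRUCTIVE = /delete|remove|destroy/i;

// Maximum compression state a tool is ever allowed to reach.
function maxState(toolName: string): "LIGHT" | "NUKED" {
  return DESTRUCTIVE.test(toolName) ? "LIGHT" : "NUKED";
}
```

Capping at LIGHT means a destructive tool always keeps its parameter schema and at least the first sentence of its description.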

Acknowledgements

latent-tools was born as a patch inside tamp. Thanks to @sliday for building the proxy foundation that inspired this project.

License

MIT
