Your LLM already knows most of its tools. Stop re-teaching it every request.
Every API request to Claude resends all tool definitions -- hundreds of tools, full JSON schemas, detailed descriptions. For a setup with 170+ MCP tools, that's ~26K tokens of redundant schema per request.
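To make the redundancy concrete, here is a hypothetical sketch of what one entry in the `tools` array looks like on the wire. The tool name, description text, and schema fields below are illustrative, not taken from a real workload:

```typescript
// One tool definition as sent in the Messages API `tools` array.
// Every request resends the whole array -- for 170+ tools, this
// payload repeats on every single turn.
const tools = [
  {
    name: "jira_search_issues", // hypothetical MCP tool
    description:
      "Search Jira issues using JQL. Supports pagination, field " +
      "selection, and ordering. Returns issue keys, summaries, and status.",
    input_schema: {
      type: "object",
      properties: {
        jql: { type: "string", description: "JQL query string" },
        max_results: { type: "integer", description: "Page size (1-100)" },
      },
      required: ["jql"],
    },
  },
  // ...plus every other tool definition, resent verbatim each request
];

const bytes = JSON.stringify(tools[0]).length;
console.log(`~${bytes} bytes of schema for a single tool, every request`);
```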
Text compression proxies (tamp, wet, squeezr) focus on compressing tool results. The real waste is tool definitions.
In our production workload with 170+ tools:
| Approach | Tokens saved |
|---|---|
| Text compression (results) | 8M |
| Schema compression (definitions) | 279M |
You're optimizing the wrong thing.
Methodology: Data collected over 8 days of real Claude Code usage (10,893 requests, 173 MCP tools). Token counting via `@anthropic-ai/tokenizer`. The 8M text-compression comparison comes from running the same traffic through tamp's text-only stages. Tool call counts, error rates, and state transitions come from the self-learning state file (`~/.latent-tools/state.json`).
```shell
npx latent-tools                                  # Start proxy
export ANTHROPIC_BASE_URL=http://localhost:7778   # Point Claude Code
claude                                            # Done.
```

latent-tools uses an adaptive state machine that progressively strips tool descriptions based on observed success rates. Each tool transitions independently:
```
NEW ──→ LIGHT ──→ CANARY ──→ NUKED
          ↑          │
          └── PROTECTED ←─┘
              (error spike)
```
- **NEW** -- Full schema sent as-is. No compression. The tool is being observed.
- **LIGHT** -- Description truncated to its first sentence. Parameters and types remain intact.
- **CANARY** -- All descriptions removed. The tool is in an observation period, tracking whether the model still uses it correctly.
- **NUKED** -- All descriptions removed, stable. The model has proven it can use this tool from its training data alone. This is a terminal state; the only exit is a model version change (which demotes NUKED → LIGHT for re-validation).
- **PROTECTED** -- Rolled back from CANARY after an error spike. Enters a 30-day cooldown before re-attempting compression (returns to LIGHT).
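The transitions above can be sketched as a pure function over the state and the latest observed outcome. This is a simplified model: the real success thresholds and the 30-day cooldown bookkeeping are internal to latent-tools, so `ok` and `cooldownOver` here stand in for them:

```typescript
type ToolState = "NEW" | "LIGHT" | "CANARY" | "NUKED" | "PROTECTED";

// Simplified transition logic; promotion thresholds and cooldown
// tracking are illustrative, not latent-tools' actual internals.
function nextState(state: ToolState, ok: boolean, cooldownOver = false): ToolState {
  if (!ok) {
    // An error spike during CANARY rolls the tool back to PROTECTED;
    // errors in other states leave the tool where it is (sketch).
    return state === "CANARY" ? "PROTECTED" : state;
  }
  if (state === "NEW") return "LIGHT";           // truncate to first sentence
  if (state === "LIGHT") return "CANARY";        // strip descriptions, observe
  if (state === "CANARY") return "NUKED";        // proven stable: terminal
  if (state === "PROTECTED" && cooldownOver) return "LIGHT"; // retry after cooldown
  return state; // NUKED is terminal; PROTECTED waits out its cooldown
}

console.log(nextState("CANARY", true));  // "NUKED"
console.log(nextState("CANARY", false)); // "PROTECTED"
```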
Claude Code's built-in tools (Read, Write, Bash, Grep, etc.) and well-known MCP tools (GitHub, Gmail, Google Calendar) start at NUKED. Claude has strong training data for these -- there's no reason to send their descriptions.
The proxy intercepts tool_result blocks in follow-up requests to track success and failure per tool. Successful calls advance the tool through the state machine. Errors trigger rollbacks. No external feedback loop required.
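The per-tool tracking can be sketched as follows. The block shape mirrors the Messages API `tool_result` content block (`type`, `tool_use_id`, `is_error`, `content`), simplified to a string `content`; the error-pattern regex is an illustrative heuristic, not latent-tools' actual pattern set:

```typescript
// Sketch of success/failure tracking from tool_result blocks in
// follow-up requests. Field names follow the Anthropic Messages API;
// real `content` can also be an array of blocks (simplified here).
interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  is_error?: boolean;
  content?: string;
}

const ERROR_PATTERN = /\b(error|exception|failed)\b/i; // illustrative heuristic

function isFailure(block: ToolResultBlock): boolean {
  if (block.is_error) return true;              // explicit error flag
  return ERROR_PATTERN.test(block.content ?? ""); // error-pattern match
}

// Per-tool tallies; successes advance the state machine, errors roll back.
const tally = new Map<string, { ok: number; err: number }>();

function record(toolName: string, block: ToolResultBlock): void {
  const t = tally.get(toolName) ?? { ok: 0, err: 0 };
  if (isFailure(block)) t.err++; else t.ok++;
  tally.set(toolName, t);
}
```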
| Metric | Value |
|---|---|
| Requests processed | 10,893 |
| Avg tokens saved/request | 25,662 |
| Total tokens saved | 279.5M |
| Tools tracked | 173 |
| Tools at max compression | 115 (66%) |
| Tool calls after compression | 1,529,820 |
| Success rate | 98.15%* |
| Rollbacks | 0 |
* Success rate measures explicit error responses (`is_error` flags and error-pattern matching in tool results). It does not capture tool selection correctness -- if the model silently chooses a different approach instead of using a compressed tool, that miss is invisible to the proxy. This is the primary silent failure risk.
```
Usage: latent-tools [options]

Options:
  --port, -p <port>       Port (default: 7778)
  --upstream, -u <url>    Upstream API (default: https://api.anthropic.com)
  --verbose, -v           Enable logging
  --help, -h              Help
  --version               Version
```
| Method | Path | Description |
|---|---|---|
| POST | /v1/messages | Proxy with schema compression |
| GET | /health | Health check |
| GET | /stats | Token savings statistics |
latent-tools compresses schemas; text compression proxies compress results. They stack:

```
Claude Code → latent-tools (:7778) → tamp (:7779) → api.anthropic.com
```

Set `--upstream http://localhost:7779` to chain through tamp or any other proxy.
- Targets the Anthropic Messages API (`POST /v1/messages`)
- Handles gzip, deflate, br, and zstd request encoding
- Bodies over 50MB are passed through uncompressed
- Non-Messages API paths (except `/health` and `/stats`, which are handled locally) are proxied unchanged to the upstream
- Learning state and stats are persisted to `~/.latent-tools/`
- No data is sent anywhere except your configured upstream API
- The learning file contains tool names and call counts, no conversation content
Silent failure risk. When a tool schema is heavily compressed, the model may not recognize when to use that tool. Unlike parameter errors (which produce error responses the self-learning system can detect), tool selection misses are invisible -- the model simply uses a different approach. This is an inherent trade-off of latent knowledge exploitation.
Model version sensitivity. Different model versions have different latent knowledge. When latent-tools detects a model version change, it automatically demotes tools for re-validation: learned NUKED tools are demoted to LIGHT, PROTECTED tools are demoted to NEW, and LIGHT/CANARY tools are left unchanged. Seeded (well-known) tools remain at NUKED regardless of model changes.
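The demotion rules described above can be sketched as a small function. The state names come from the state machine; the `seeded` flag marks the well-known tools that keep their NUKED status across model changes:

```typescript
type ToolState = "NEW" | "LIGHT" | "CANARY" | "NUKED" | "PROTECTED";

// Sketch of re-validation demotion on a model version change.
function demoteOnModelChange(state: ToolState, seeded: boolean): ToolState {
  if (seeded) return state;           // seeded tools stay NUKED regardless
  switch (state) {
    case "NUKED":     return "LIGHT"; // learned compression must re-validate
    case "PROTECTED": return "NEW";   // restart observation from scratch
    default:          return state;   // LIGHT and CANARY are unchanged
  }
}
```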
Seed list assumptions. The built-in seed list assumes Claude has strong training data for common tools (Read, Write, Bash, GitHub MCP, etc.). If Anthropic changes how these tools are handled in a future model, the seed list may need updating.
Seed list bypass. Seeded tools (Claude Code built-ins, common MCP tools) start at NUKED and bypass the self-learning state machine entirely. If a future model loses familiarity with a seeded tool, the proxy won't auto-detect the problem. The denylist prevents destructive-sounding tools (names matching delete, remove, destroy, etc.) from ever being fully compressed, capping them at LIGHT.
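The denylist cap can be sketched as a name check; the pattern below covers only the names the text mentions (delete, remove, destroy), and the real list in latent-tools may be broader:

```typescript
// Illustrative denylist: destructive-sounding tools are capped at LIGHT
// (description truncated, parameters intact) and never fully compressed.
const DENYLIST = /(delete|remove|destroy)/i; // "etc." in the real list

function maxCompression(toolName: string): "LIGHT" | "NUKED" {
  return DENYLIST.test(toolName) ? "LIGHT" : "NUKED";
}

console.log(maxCompression("gh_delete_repo")); // "LIGHT"
console.log(maxCompression("search_issues"));  // "NUKED"
```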
latent-tools was born as a patch inside tamp. Thanks to @sliday for building the proxy foundation that inspired this project.
MIT