mr-beaver/tokencost

TokenCost

Know exactly what you spend on AI, and automatically spend less.

A local proxy that sits between your tools and the LLM APIs. It tracks every token, calculates real costs, and optimizes requests on the fly. Works with Claude Code, VS Code, Claude Desktop, OpenClaw, and 15+ API providers (OpenAI, Groq, Mistral, DeepSeek, xAI, Perplexity, Cerebras, Together, Fireworks, Cohere, OpenRouter, Ollama, Amazon Bedrock, Google Gemini, and more).


Why

If you use Claude Code or VS Code AI tools heavily, you're probably spending $2–10/day without knowing exactly where it goes. This tool:

  • Shows you a real-time breakdown by tool, model, session, and task type
  • Automatically reduces your bill by 50–80% via prompt caching and smart model routing
  • Works locally: the proxy and its data stay on your machine, and requests go only to the LLM provider you already use

What it does

📊 Live Dashboard

Full spend analytics at http://localhost:8082/dashboard:

  • Cost by day / week / month: trend chart with daily breakdown
  • By task type: Coding vs Shell vs Agent vs Planning vs Web
  • By model: Claude Haiku / Sonnet / Opus / GPT-4o / etc.
  • By source: Claude Code, Claude Desktop, VS Code Extensions, OpenClaw, GitHub Copilot (usage only), API providers
  • Cache analytics: hit rate, money saved vs. what you'd pay without caching
  • Session view: every conversation with tokens in/out, cost, tools used
  • RAW Logs: last 500 requests with full prompt preview
  • Health grade A–F: scores your efficiency and tells you what to fix

🍎 macOS Menu Bar Widget

Always-visible cost tracker in your menu bar. Click to see:

  • Today's spend + health grade
  • 7-day trend chart with hover tooltips
  • Breakdown by task, model, cache, tools
  • Smart Routing optimizer stats
  • One-click sync from local logs

⚑ Automatic Optimizations

The proxy applies these silently on every request:

| Optimization | What it does | Typical savings |
|---|---|---|
| Prompt Cache | Auto-tags large system prompts and user messages for caching | 60–90% on repeat reads |
| Smart Routing | Routes simple requests to Haiku instead of Sonnet/Opus | 5–25× cheaper per request |
| Thinking Budget | Caps budget_tokens for extended thinking based on complexity | 80–90% on thinking tokens |
| Message Trim | Removes old messages when context exceeds 50k tokens | Prevents runaway costs |
| Session Cap | Limits history to 40 messages per session | Prevents VS Code "prompt too long" errors |
| Deduplication | Returns cached response for identical requests within 5s | 100% on accidental double-sends |
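To make two of these rows concrete, here is a minimal sketch of how a session cap and a short-window deduplication cache could work. It is not the code in optimizer.py: the names `cap_history` and `DedupCache` are hypothetical, and only the 40-message and 5-second limits come from the table above.

```python
import hashlib
import json
import time

DEDUP_WINDOW_S = 5.0  # from the table: identical requests within 5s
SESSION_CAP = 40      # from the table: 40 messages per session


def cap_history(messages, cap=SESSION_CAP):
    """Keep only the most recent `cap` messages (hypothetical helper)."""
    return messages[-cap:]


class DedupCache:
    """Return a stored response for an identical request seen recently."""

    def __init__(self, window_s=DEDUP_WINDOW_S, clock=time.monotonic):
        self.window_s = window_s
        self.clock = clock
        self._seen = {}  # request hash -> (timestamp, response)

    @staticmethod
    def _key(request_body):
        # Canonical JSON so that key order does not change the hash.
        raw = json.dumps(request_body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, request_body):
        """Return the cached response, or None if absent or too old."""
        entry = self._seen.get(self._key(request_body))
        if entry is not None and self.clock() - entry[0] <= self.window_s:
            return entry[1]
        return None

    def put(self, request_body, response):
        """Remember the response for this exact request body."""
        self._seen[self._key(request_body)] = (self.clock(), response)
```

A proxy handler would call `get` before forwarding a request and `put` after receiving the upstream response. The injectable `clock` only exists so the time window can be exercised without sleeping.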

Smart Routing scores each prompt 0–10 (length, keywords, code presence) and silently downgrades the model when complexity is low. The goal is an equivalent answer for simple prompts at a lower cost.


Install

⚠️ macOS only: this tool requires macOS 12 (Monterey) or later and the launchd system. Linux and Windows are not currently supported.

Requirements: Python 3.9+

Step 1 (first time only):

cd ~ && git clone https://github.com/mr-beaver/tokencost && cd tokencost && bash onbording.sh

The setup script installs everything and adds a tokencost command to your terminal.

Step 2 (every time after, to restart or update):

Open Terminal and type:

tokencost

First install adds this command automatically. Open a new terminal tab after step 1 for it to appear.

The setup script:

  1. Creates a Python virtualenv and installs dependencies
  2. Imports your full Claude usage history from local logs
  3. Sets ANTHROPIC_BASE_URL=http://localhost:8082 in ~/.zshrc and macOS launchd
  4. Adds tokencost alias for quick restarts
  5. Starts the proxy and opens the dashboard

After install, all your Claude Code and VS Code requests flow through the proxy automatically.

macOS Menu Bar Widget (optional)

The pre-built app is included in the repo; onbording.sh installs and launches it automatically.

To install manually:

cp -R menubar/TokenCostBar.app ~/Applications/
open ~/Applications/TokenCostBar.app

Build from source (requires Xcode Command Line Tools):

cd menubar && bash build.sh
mv TokenCostBar.app ~/Applications/
open ~/Applications/TokenCostBar.app

Supported Sources

Local applications (auto-tracked from logs, synced every 5 minutes)

  • Claude Code / Claude CLI: from ~/.claude/projects/**/*.jsonl
  • Claude Desktop: from ~/Library/.../local-agent-mode-sessions/**/*.jsonl
  • OpenClaw: from ~/.openclaw/agents/**/*.jsonl
  • VS Code Extensions (Cline, Roo Code, Kilo Code, IBM Bob): from VSCode/workspaceStorage/*/tasks/ui_messages.json
  • GitHub Copilot: usage tracking from VS Code logs (model + requests, pricing unavailable)

API Providers (real-time via proxy; set base URL)

Set the base URL for each provider you want to track:

export ANTHROPIC_BASE_URL=http://localhost:8082            # Claude (auto-set by onbording.sh)
export OPENAI_BASE_URL=http://localhost:8082/openai        # OpenAI
export GROQ_API_BASE=http://localhost:8082/groq            # Groq
export MISTRAL_API_BASE=http://localhost:8082/mistral      # Mistral
export DEEPSEEK_API_BASE=http://localhost:8082/deepseek    # DeepSeek

Supported: Anthropic · OpenAI · Groq · Mistral · DeepSeek · xAI · Perplexity · Cerebras · Together · Fireworks · Cohere · OpenRouter · Ollama · Amazon Bedrock · Google Gemini

218 models in the pricing database.


How it works

Your tool (Claude Code / VS Code / etc.)
        ↓
  localhost:8082  ← proxy runs here
        ↓  scores complexity, applies optimizations, logs tokens+cost
  api.anthropic.com / api.openai.com / etc.
        ↓
  Response back through proxy → your tool

The proxy is transparent: your tools see it as the real API. Zero changes to your workflow.
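The "logs tokens+cost" step comes down to simple arithmetic on the token counts a provider reports. The sketch below shows one way to compute it. The model names and prices are placeholders for illustration only, not values from the project's pricing database, and `request_cost` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per one million tokens, split by direction."""
    input_per_mtok: float
    output_per_mtok: float


# Placeholder prices for illustration only; NOT real provider pricing.
PRICES = {
    "example-small": Price(input_per_mtok=1.0, output_per_mtok=5.0),
    "example-large": Price(input_per_mtok=15.0, output_per_mtok=75.0),
}


def request_cost(model, input_tokens, output_tokens, prices=PRICES):
    """Return the USD cost of one request from its token counts."""
    price = prices[model]
    total = (input_tokens * price.input_per_mtok
             + output_tokens * price.output_per_mtok)
    return total / 1_000_000
```

With the placeholder table, a request with 2,000 input tokens and 1,000 output tokens on `example-small` costs (2000 × 1.0 + 1000 × 5.0) / 1,000,000 = 0.007 USD.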


Smart Routing details

When enabled (onbording.sh → option 1), the proxy scores the last user message:

| Score | Original model | Routes to | Savings |
|---|---|---|---|
| 0–2 | Sonnet | Haiku | ~5× cheaper |
| 0–2 | Opus | Haiku | ~25× cheaper |
| 3–5 | Opus | Sonnet | ~5× cheaper |
| 6–10 | any | unchanged | none |

Simple questions (what is X, explain Y), short messages, and tool-chain intermediates score 0–2. Long coding tasks with keywords like implement, refactor, debug score 6+.
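The routing table can be expressed as a small function. In the sketch below, `route` follows the table directly, while `score_prompt` is only a guess at a scoring heuristic: the README names the inputs (length, keywords, code presence) but not the weights, so every number in it is an assumption, and both function names are hypothetical.

```python
import re

# Keywords the README names as signs of a complex task.
COMPLEX_KEYWORDS = ("implement", "refactor", "debug")
FENCE = "`" * 3  # Markdown code-fence marker


def score_prompt(text):
    """Assumed 0-10 heuristic: length, keywords, and code presence."""
    score = min(len(text) // 200, 4)  # length: up to 4 points (assumed weight)
    lowered = text.lower()
    # Each keyword found adds 2 points (assumed weight).
    score += 2 * sum(1 for kw in COMPLEX_KEYWORDS if kw in lowered)
    # Code presence adds 2 points (assumed weight).
    if FENCE in text or re.search(r"\bdef \w+\(|\bclass \w+", text):
        score += 2
    return min(score, 10)


def route(original_tier, score):
    """Apply the routing table; tiers are 'haiku', 'sonnet', 'opus'."""
    if score <= 2 and original_tier in ("sonnet", "opus"):
        return "haiku"
    if 3 <= score <= 5 and original_tier == "opus":
        return "sonnet"
    return original_tier  # score 6-10, or no cheaper tier applies
```

For example, `route("opus", score_prompt("what is X"))` returns `"haiku"`, because the short question scores 0 under these assumed weights.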


Stats API

# Quote the URLs: zsh (the macOS default shell) treats an unquoted ? as a glob character
curl "http://localhost:8082/stats?period=today"   # today's stats (JSON)
curl "http://localhost:8082/stats?period=7d"      # last 7 days
curl "http://localhost:8082/stats?period=30d"     # last 30 days
curl "http://localhost:8082/raw-logs?limit=100"   # recent requests
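The same endpoints can be read from a script. This is a minimal standard-library sketch: the helper names `stats_url` and `fetch_stats` are hypothetical, and because the shape of the returned JSON is not documented here, the code only decodes it without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8082"  # proxy address used throughout this README


def stats_url(period, base_url=BASE_URL):
    """Build the /stats URL for a period such as 'today', '7d', or '30d'."""
    query = urllib.parse.urlencode({"period": period})
    return f"{base_url}/stats?{query}"


def fetch_stats(period="today", base_url=BASE_URL, timeout=5.0):
    """Fetch and decode the JSON stats; the proxy must be running."""
    url = stats_url(period, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the proxy running, `print(json.dumps(fetch_stats("7d"), indent=2))` pretty-prints the last seven days of stats.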

Data & Privacy

  • Everything stored locally in tracker.db (SQLite)
  • Nothing is sent to any external service
  • The proxy records token usage metadata, not the full content of your conversations (prompt previews are stored locally and never transmitted)
  • Stop the proxy anytime: bash onbording.sh → option 2
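Because tracker.db is a plain SQLite file, you can inspect it with any SQLite client. The schema is not documented here, so the sketch below does not assume table names. It only lists whatever tables exist, opening the file read-only so the proxy's data cannot be modified. `list_tables` is a hypothetical helper, and the path to tracker.db depends on where you cloned the repo.

```python
import sqlite3


def list_tables(db_path):
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro makes SQLite refuse writes and fail if the file is missing.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()
```

From there, `PRAGMA table_info(<table>)` on any listed table shows its columns.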

Stop / Uninstall

bash onbording.sh   # choose option 2: stops proxy, removes env vars

⚠️ Don't kill the proxy process manually if you're using Claude Code: Claude routes through it and will crash with exit 143. Always use onbording.sh.


Project structure

tokencost/
├── proxy.py           - FastAPI proxy, port 8082
├── optimizer.py       - Request optimizations (cache, routing, trim)
├── db.py              - SQLite schema, cost calculations, analytics
├── import_history.py  - Import historical logs from Claude CLI, Desktop, OpenClaw
├── projects.py        - Project/session tracking
├── dashboard.html     - Web dashboard UI
├── onbording.sh       - Setup / start / stop script
└── menubar/           - macOS SwiftUI menu bar app
    ├── build.sh
    └── Sources/TokenCostBar/
        ├── MenuBarView.swift
        └── StatsModel.swift

License

MIT

About

See exactly what you spend on Claude CLI, plus cost-optimization functions.
