ATOM00blue/tokenlens

🔍 TokenLens

Where are your AI agent's tokens going? Find out in 30 seconds.


tokenlens scan

→ Walks your local Claude Code / Codex / Cursor logs, aggregates token spend per session, file, and tool call, and prints a sorted report.


Why this exists

March-May 2026 has been brutal for AI coding budgets. Anthropic and OpenAI both tightened usage limits without warning. Reddit threads in r/ClaudeAI, r/cursor, and r/ChatGPTCoding are full of complaints like:

"Limits used to last me a workday. Now an hour."

Most agents ship without any visibility into what's burning tokens. Was it the Read tool being called on node_modules again? An assistant that re-explained a 5,000-line file three times? You can't optimize what you can't measure.

TokenLens reads the log files your agents already write and attributes spend to:

  • Sessions — which conversation cost what
  • Files — which paths the agent kept re-reading
  • Tools — Read / Grep / Bash / Edit ranked by tokens
  • Direction — input vs output split

Zero cloud. Zero telemetry. Reads files you already have.


Quick start

Install

go install github.com/ATOM00blue/tokenlens/cmd/tokenlens@latest

Or download a prebuilt binary from Releases.

Use

# Auto-detect ~/.claude, ~/.codex, ~/.cursor
tokenlens scan

# Specific file
tokenlens scan --input ~/.claude/projects/my-app/conversation.jsonl

# Specific directory of logs
tokenlens scan --input ./agent-logs/

# JSON output for scripting
tokenlens scan --format=json > spend.json

# What does TokenLens auto-detect on this machine?
tokenlens list-sources

Sample output

TokenLens Report
================================================================
Sessions:        12
Total messages:  348
Total tool calls: 1,430
Input tokens:    2.4M
Output tokens:   312.8k
TOTAL TOKENS: 2.71M

Top files by token spend:
  node_modules/typescript/lib/typescript.js     412.3k  (8)
  src/main.ts                                   89.1k   (24)
  package-lock.json                             67.2k   (3)
  tsconfig.json                                 12.4k   (15)

Top tools by token spend:
  Read                                          1.8M    (847)
  Grep                                          612.5k  (291)
  Bash                                          143.2k  (89)
  Edit                                          88.4k   (203)

Sessions (top 10 by total tokens):
  session                            tokens      msgs     tools
  ------------------------------------------------------------
  refactor-api                       485.1k        67       234
  initial-spec                       342.8k        45       178
  ...

The Read row above shows that roughly two thirds of total spend (1.8M of 2.71M tokens) went to Read calls. The first file row reveals the agent re-reading typescript.js (a generated artifact) 8 times. Both are typical waste signatures that TokenLens makes obvious.


Commands

tokenlens scan          Analyze logs and print a report
tokenlens report        Alias for scan --format=text
tokenlens list-sources  Show auto-detected agent log directories
tokenlens version
tokenlens help

Flags

Flag                       Meaning
-i, --input <path>         Log file or directory (repeatable; default: auto-detect)
-s, --source <name>        Override source label: claude-code, codex, cursor, jsonl
-f, --format <text|json>   Output format (default: text)
--no-color                 Disable ANSI colors

Auto-detection

TokenLens looks for log files in these locations:

Source        Path
Claude Code   ~/.claude/projects, ~/.claude/sessions, ~/.claude/logs
Codex CLI     ~/.codex/sessions, ~/.codex/logs
Cursor        ~/.cursor/logs

Run tokenlens list-sources to see which exist on your machine.


How TokenLens estimates tokens

When the log already includes usage.input_tokens / usage.output_tokens fields (Claude Code's logs do), TokenLens uses those values, so the counts are exact.

When tokens aren't in the log (older formats, plain-text logs), TokenLens uses a fast character-based heuristic calibrated against cl100k_base:

  • ~4 chars per token, with adjustments for whitespace runs and symbol density
  • Within ~10% of ground truth on mixed prose+code samples
  • Sessions with estimated counts are flagged with ~ in the output

We deliberately don't bundle a real tokenizer (tiktoken etc.) because:

  1. They require model-specific BPE tables (large data files, often ~2MB+)
  2. They add CGO or large deps
  3. For visualization-quality answers, ±10% is fine

If you need exact counts, pass JSON output through a real tokenizer.


How it compares

Tool                      What it does                                       Open Source          Local-only         Free
TokenLens (this)          Per-session/file/tool token attribution from logs  MIT                  Yes                Yes
Helicone                  LLM observability SaaS                             Yes (self-hostable)  Optional           Freemium
Langfuse                  LLM trace platform                                 Yes (MIT)            Yes (self-hosted)  Free OSS
Claude Code's /cost       Per-session total only                             No                   Yes                Free
OpenAI's usage dashboard  Cloud, account-level only                          No                   No                 Free

The wedge: TokenLens runs against logs you already have on disk. No proxy, no SDK, no setup. Five seconds from go install to first report.


Limitations (honest about them)

  • We don't intercept live traffic. TokenLens analyzes logs after the fact. For real-time, you need a proxy like Helicone or Portkey.
  • Heuristic token counts are approximate. Counts are exact when the log includes usage fields and within roughly 10% otherwise.
  • Cursor doesn't expose all conversations to disk in the same way Claude Code does, so coverage there is partial.
  • Aggregation is per-file-path string match. If the agent reads the same file via two different paths (relative vs absolute), they'll show as separate rows.
  • No call-graph or causal analysis — we tell you what cost tokens, not why a particular conversation went sideways. Pair with raw log reading for that.
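
The duplicate-path limitation above could be reduced by normalizing paths before aggregating. The sketch below is one possible approach, not something TokenLens does today: `normalizePath` and the project root it takes are hypothetical, and it does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// normalizePath resolves p against root (a hypothetical project root) so that
// relative and absolute spellings of the same file share one aggregation key.
func normalizePath(root, p string) string {
	if !filepath.IsAbs(p) {
		p = filepath.Join(root, p)
	}
	return filepath.Clean(p)
}

func main() {
	root := "/home/dev/my-app" // example root, assumed for illustration
	paths := []string{
		"src/main.ts",
		"./src/main.ts",
		"/home/dev/my-app/src/main.ts",
		"src/../src/main.ts",
	}
	counts := map[string]int{}
	for _, p := range paths {
		counts[normalizePath(root, p)]++
	}
	// On Unix-like systems all four spellings collapse into a single key.
	fmt.Println(counts)
}
```

The hard part in practice is knowing the right root for each session, since logs may record paths relative to different working directories.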

Roadmap

  • v0.1: scan, JSON / text reports, file & tool attribution
  • v0.2: tokenlens watch — live monitor as your agent runs
  • v0.2: HTML dashboard output
  • v0.3: Smart pruning hints ("file x.ts was read 8 times — consider pinning it in CLAUDE.md")
  • v0.3: Histogram visualization in terminal
  • v0.4: Real-time SDK proxy mode

Programmatic API

package main

import (
    "fmt"
    "log"

    "github.com/ATOM00blue/tokenlens/internal/aggregator"
    "github.com/ATOM00blue/tokenlens/internal/parser"
    "github.com/ATOM00blue/tokenlens/internal/types"
)

func main() {
    // Go only allows internal/ packages to be imported from inside the tokenlens module.
    events, err := parser.Parse("session.jsonl", types.SourceClaude)
    if err != nil {
        log.Fatal(err)
    }
    sessions := aggregator.Aggregate(events)
    report := aggregator.Report(sessions)
    fmt.Printf("Total: %d tokens across %d sessions\n",
        report.Total.TotalTokens, len(sessions))
}

Contributing

git clone https://github.com/ATOM00blue/tokenlens.git
cd tokenlens
go test ./...
go build -o tlens ./cmd/tokenlens
./tlens list-sources

What we want most:

  • Format adapters for less-common agents (Aider, OpenCode, Goose)
  • Better path normalization so duplicate paths roll up correctly
  • Tokenizer-accurate estimates as an opt-in build-tag

See CONTRIBUTING.md.


License

MIT

About

Where are your AI agent's tokens going? Per-session/file/tool token attribution from local Claude Code / Codex / Cursor logs. Single static Go binary, zero runtime deps, no telemetry.
