etr/claude_boot_stats

claude_boot_stats

claude_boot_stats is a transparent HTTP proxy that captures a Claude Code first-turn request and tokenizes each component (system prompt blocks, tools, message blocks, claudeMd sub-sections) with the Qwen3 BPE tokenizer. It calibrates those counts against the token usage reported by the server and writes a breakdown as CSV (default) and/or markdown. Two extra drill-down tables attribute the Agent tool description and the skills-catalog message block to their originating plugins.
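To make the per-component idea concrete, here is a minimal sketch of how a captured request body could be split into countable rows. It is not the project's actual code: the function name, the row labels, and the `count` callable (standing in for the Qwen3 tokenizer) are assumptions, and claudeMd sub-section splitting is left out. The only thing taken as given is the general shape of a Messages API request (`system`, `tools`, `messages`).

```python
import json
from typing import Callable, Dict


def component_token_counts(body: dict, count: Callable[[str], int]) -> Dict[str, int]:
    """Return one token count per request component (labels are hypothetical)."""
    rows: Dict[str, int] = {}

    # `system` may be a plain string or a list of text blocks.
    system = body.get("system", [])
    if isinstance(system, str):
        system = [{"type": "text", "text": system}]
    for i, block in enumerate(system):
        rows[f"system[{i}]"] = count(block.get("text", ""))

    # Each tool is counted as its serialized JSON definition (an assumption).
    for tool in body.get("tools", []):
        rows[f"tool:{tool.get('name', '?')}"] = count(json.dumps(tool, sort_keys=True))

    # Message content may be a string or a list of content blocks.
    for m_idx, message in enumerate(body.get("messages", [])):
        content = message.get("content", "")
        if isinstance(content, str):
            content = [{"type": "text", "text": content}]
        for b_idx, block in enumerate(content):
            if block.get("type") == "text":
                text = block.get("text", "")
            else:
                # Non-text blocks are counted as serialized JSON.
                text = json.dumps(block, sort_keys=True)
            rows[f"message[{m_idx}].block[{b_idx}]"] = count(text)
    return rows
```

Any callable that maps a string to an integer can be passed as `count`, so the splitting logic can be exercised without downloading a tokenizer.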

Install

git clone <this-repo>
cd claude_boot_stats
pip install -e .

The Qwen3 tokenizer (~20 MB of config + vocab) is downloaded from Hugging Face on first run; subsequent runs are offline.

Use

# default: proxy on 127.0.0.1:8788, CSV output trio (main + agents + skills)
python -m claude_boot_stats --csv ./cbs.csv

# markdown only
python -m claude_boot_stats --md ./cbs.md

# both
python -m claude_boot_stats --csv ./cbs.csv --md ./cbs.md

Then launch Claude Code pointing at the proxy:

ANTHROPIC_BASE_URL=http://127.0.0.1:8788 claude

Reports are overwritten on each captured request. Pass --once to write only the first report; the proxy keeps forwarding requests afterwards.

Outputs

  • --csv <path> writes three files: <path> (main breakdown), <path-stem>.agents.csv, <path-stem>.skills.csv. Each agent / skill row carries its originating plugin (prefix before the first : in the identifier; core when there is no prefix).
  • --md <path> writes one markdown file containing all three tables + subtotals.
  • If neither is given, the default is ./claude_boot_stats.csv.
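The file-naming and plugin-attribution rules above can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: `csv_output_paths` and `plugin_of` are hypothetical names, and the identifiers in the usage note are made up.

```python
from pathlib import Path
from typing import Dict


def csv_output_paths(csv_path: str) -> Dict[str, Path]:
    """Derive the three CSV paths from the value passed to --csv."""
    main = Path(csv_path)
    return {
        "main": main,
        # <path-stem>.agents.csv and <path-stem>.skills.csv sit next to the main file.
        "agents": main.with_name(main.stem + ".agents.csv"),
        "skills": main.with_name(main.stem + ".skills.csv"),
    }


def plugin_of(identifier: str) -> str:
    """Plugin = prefix before the first ':'; 'core' when there is no prefix."""
    prefix, sep, _rest = identifier.partition(":")
    return prefix if sep else "core"
```

For example, `csv_output_paths("./cbs.csv")` yields `cbs.csv`, `cbs.agents.csv`, and `cbs.skills.csv`, while `plugin_of("someplugin:reviewer")` returns `someplugin` and `plugin_of("reviewer")` returns `core`.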

Flags

| flag | default | meaning |
| --- | --- | --- |
| --port | 8788 | local proxy port |
| --upstream | https://api.anthropic.com | where to forward requests |
| --csv | default on if --md absent → ./claude_boot_stats.csv | CSV output stem |
| --md | unset | markdown output path |
| --min-body-bytes | 10000 | ignore small bodies (quota pings etc.) |
| --once | false | write one report then keep proxying |

How calibration works

Every row in the report has a qwen_tokens column: the raw Qwen3 BPE count for that component. The proxy also captures claude_total, the sum usage.input_tokens + cache_creation_input_tokens + cache_read_input_tokens, from the first message_start SSE event (or from the .usage field of a non-streaming JSON response). When usage is available (cal_source = auto), each row also gets a claude_tokens column equal to qwen_tokens × (claude_total / qwen_total), where qwen_total is the sum of qwen_tokens over all rows. The claude_tokens column therefore sums to the server-reported total and gives a proportional per-row allocation.
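As a rough illustration of the usage capture described above, the sketch below pulls the three input-token fields out of either a streamed or a non-streamed response. The function names are hypothetical and the parsing is deliberately simplified (one `data:` line per event, no multi-line SSE payloads); it is not the proxy's actual parser.

```python
import json
from typing import Iterable, Optional

USAGE_KEYS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def total_input_tokens(usage: dict) -> int:
    """Sum the three input-side counters, treating missing or null as 0."""
    return sum(int(usage.get(key) or 0) for key in USAGE_KEYS)


def claude_total_from_sse(lines: Iterable[str]) -> Optional[int]:
    """Return the input-token total from the first message_start event, if any."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blank separators
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore payloads that are not JSON
        if isinstance(event, dict) and event.get("type") == "message_start":
            return total_input_tokens(event.get("message", {}).get("usage", {}))
    return None  # corresponds to cal_source = none


def claude_total_from_json(body: dict) -> Optional[int]:
    """Same total for a non-streaming response body."""
    usage = body.get("usage")
    return total_input_tokens(usage) if usage else None
```

Returning `None` rather than 0 when no usage is found keeps "usage missing" distinguishable from "zero tokens".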

When usage is not captured (cal_source = none), the report shows the qwen_tokens column only — no multiplier is fabricated. Qwen3 and Claude's tokenizer are close but not identical, and the ratio depends heavily on content mix (JSON tool schemas vs. prose), so there is no single "safe" default to substitute.
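The proportional allocation, including the no-usage fallback, can be sketched in a few lines. The `calibrate` function and its row layout are assumptions made for illustration; how the real tool rounds or formats values is not stated here, so the sketch leaves claude_tokens as unrounded floats.

```python
from typing import Dict, Optional


def calibrate(
    qwen_tokens: Dict[str, int], claude_total: Optional[int]
) -> Dict[str, Dict[str, object]]:
    """Scale per-row Qwen counts so they sum to the server-reported total."""
    qwen_total = sum(qwen_tokens.values())

    # Without usage (or with nothing to scale) report raw Qwen counts only.
    if claude_total is None or qwen_total == 0:
        return {
            name: {"qwen_tokens": n, "cal_source": "none"}
            for name, n in qwen_tokens.items()
        }

    return {
        name: {
            "qwen_tokens": n,
            # Equivalent to qwen_tokens * (claude_total / qwen_total).
            "claude_tokens": n * claude_total / qwen_total,
            "cal_source": "auto",
        }
        for name, n in qwen_tokens.items()
    }
```

With made-up counts of 3000, 6000, and 1000 Qwen tokens and a reported total of 12500, the ratio is 1.25 and the rows receive 3750, 7500, and 1250 claude_tokens, which add up to 12500.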
