claude_boot_stats

Transparent HTTP proxy that captures a Claude Code first-turn request, tokenizes each component (system prompt blocks, tools, message blocks, claudeMd sub-sections) with the Qwen3 BPE tokenizer, calibrates the counts against the token usage the server actually reports, and writes a breakdown as CSV (default) and/or markdown. Two extra drill-down tables attribute the Agent tool description and the skills-catalog message block to their originating plugins.

Install

git clone <this-repo>
cd claude_boot_stats
pip install -e .

The Qwen3 tokenizer (~20 MB of config + vocab) is downloaded from Hugging Face on first run; subsequent runs are offline.

Use

# default: proxy on 127.0.0.1:8788, CSV output trio (main + agents + skills)
python -m claude_boot_stats --csv ./cbs.csv

# markdown only
python -m claude_boot_stats --md ./cbs.md

# both
python -m claude_boot_stats --csv ./cbs.csv --md ./cbs.md

Then launch Claude Code pointing at the proxy:

ANTHROPIC_BASE_URL=http://127.0.0.1:8788 claude

Reports are overwritten on each captured request. Pass --once to write only the first report; the proxy keeps forwarding requests afterwards.

Outputs

--csv <path> writes three files: <path> (main breakdown), <path-stem>.agents.csv, <path-stem>.skills.csv. Each agent / skill row carries its originating plugin (prefix before the first : in the identifier; core when there is no prefix).
--md <path> writes one markdown file containing all three tables + subtotals.
If neither is given, the default is ./claude_boot_stats.csv.
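The naming and attribution rules above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: `output_paths` and `plugin_of` are hypothetical helper names, and the sketch assumes the stem is the path with its final extension removed.

```python
from pathlib import Path


def output_paths(csv_path: str) -> dict[str, Path]:
    """Derive the three CSV paths from the --csv argument (assumed naming)."""
    main = Path(csv_path)
    # with_suffix replaces only the final extension: cbs.csv -> cbs.agents.csv
    return {
        "main": main,
        "agents": main.with_suffix(".agents.csv"),
        "skills": main.with_suffix(".skills.csv"),
    }


def plugin_of(identifier: str) -> str:
    """Return the text before the first ':' or 'core' when there is no ':'."""
    prefix, sep, _ = identifier.partition(":")
    return prefix if sep else "core"
```

Under these assumptions, `plugin_of("myplugin:reviewer")` gives `myplugin`, while an identifier without a colon is attributed to `core`.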

Flags

flag	default	meaning
`--port`	8788	local proxy port
`--upstream`	`https://api.anthropic.com`	where to forward requests
`--csv`	`./claude_boot_stats.csv` when `--md` is absent, otherwise unset	CSV output path (main file; the `.agents.csv` and `.skills.csv` files share its stem)
`--md`	unset	markdown output path
`--min-body-bytes`	10000	ignore small bodies (quota pings etc.)
`--once`	false	write one report then keep proxying

How calibration works

Every row in the report has a qwen_tokens column: the raw Qwen3 BPE count for that component. The proxy reads usage.input_tokens + cache_creation_input_tokens + cache_read_input_tokens from the response's first message_start SSE event (or from the .usage field of a non-streaming JSON response); call that sum claude_total, and call the sum of all qwen_tokens values qwen_total. When usage is available (cal_source = auto), each row also gets a claude_tokens column equal to qwen_tokens × (claude_total / qwen_total), so the claude_tokens column sums to the server-reported total and allocates it to rows in proportion to their Qwen3 counts.
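As a rough sketch of the capture step, the sum described above could be read from a buffered response like this. The function names are hypothetical, and the sketch assumes the usage object sits under `message.usage` in the `message_start` payload, as in the Messages streaming format; the proxy's real parsing may differ.

```python
import json
from typing import Optional


def usage_total(usage: dict) -> int:
    """Sum the three input-side counters; missing or null fields count as 0."""
    keys = ("input_tokens", "cache_creation_input_tokens", "cache_read_input_tokens")
    return sum(int(usage.get(k) or 0) for k in keys)


def total_from_sse(stream_text: str) -> Optional[int]:
    """Return the input total from the first message_start event, if any."""
    for line in stream_text.splitlines():
        if not line.startswith("data:"):
            continue
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # skip payloads that are not JSON
        if isinstance(event, dict) and event.get("type") == "message_start":
            return usage_total(event.get("message", {}).get("usage", {}))
    return None


def total_from_json(body: str) -> Optional[int]:
    """Non-streaming case: read the top-level usage object."""
    usage = json.loads(body).get("usage")
    return usage_total(usage) if usage else None
```

Both helpers return `None` when no usage is found, which corresponds to the cal_source = none case.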

When usage is not captured (cal_source = none), the report shows the qwen_tokens column only — no multiplier is fabricated. Qwen3 and Claude's tokenizer are close but not identical, and the ratio depends heavily on content mix (JSON tool schemas vs. prose), so there is no single "safe" default to substitute.
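The two calibration cases can be illustrated with a minimal sketch. `calibrate` is a hypothetical function, the component names in the usage note are made up, and whether the real tool rounds claude_tokens is not stated here, so the values are left as floats.

```python
from typing import Optional


def calibrate(
    qwen_counts: dict[str, int], claude_total: Optional[int]
) -> tuple[str, dict[str, dict]]:
    """Return (cal_source, rows) using proportional allocation."""
    qwen_total = sum(qwen_counts.values())
    if not claude_total or qwen_total == 0:
        # No usage captured: report raw Qwen3 counts only, no invented ratio.
        return "none", {name: {"qwen_tokens": n} for name, n in qwen_counts.items()}
    ratio = claude_total / qwen_total
    rows = {
        name: {"qwen_tokens": n, "claude_tokens": n * ratio}
        for name, n in qwen_counts.items()
    }
    return "auto", rows
```

For example, with Qwen3 counts of 600, 300 and 100 for three components and a server-reported total of 1200, the ratio is 1.2 and the rows receive roughly 720, 360 and 120 claude_tokens, which add up to 1200.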
