Problem
There's currently no way to see how many tokens Luna has consumed, how full the context window is, or whether any rate limits are approaching, short of digging through logs manually.
Proposal
Add an owner-only /usage slash command (intercepted by the harness, not Claude) that returns a snapshot of current token and rate-limit state.
Sample output:
📊 Usage snapshot — 2026-05-11 06:02 UTC
Context window: ~320K / 1M tokens (32%)
Session length: 4h 12m (since last restart)
API usage (rolling 24h):
  Input tokens: 1,842,300
  Output tokens: 214,500
  Total: 2,056,800
Rate limits:
  Requests/min: 12 / 50
  Tokens/min: 48,200 / 200,000
Subagents spawned today: 5
  └ trends-uzbekistan: 20,045 tokens
  └ trends (global): 18,108 tokens
  └ github-watch: 14,225 tokens
  └ ...
Backups: 3 prompt backups in data/prompt_backups/
Memory: self/learnings.md — 14.2 KiB / 64 KiB
Implementation notes
- Context window % can be derived from the usage field in Claude API responses (input_tokens / max_context)
- Rolling 24h API usage requires the harness to accumulate token counts from each CC response
- Rate limit state is available from Anthropic API response headers (anthropic-ratelimit-*)
- Subagent token usage is already returned in task notifications — harness should aggregate these
- Memory file sizes can be read from disk at command time
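A minimal sketch of how the harness might accumulate this state, assuming each response's usage field is available as a dict and its headers as a plain mapping. UsageTracker, record, rolling_totals and context_percent are hypothetical names, and the 1M default for max_context is taken from the sample output rather than from any API field.

```python
import time
from collections import deque

RATELIMIT_PREFIX = "anthropic-ratelimit-"
WINDOW_SECONDS = 24 * 60 * 60  # rolling 24h window


class UsageTracker:
    """Hypothetical harness-side accumulator for the /usage snapshot."""

    def __init__(self, max_context=1_000_000):
        # Assumption: the context limit is configured by the harness.
        self.max_context = max_context
        self.events = deque()        # (timestamp, input_tokens, output_tokens)
        self.last_input_tokens = 0   # input size of the most recent request
        self.rate_limits = {}        # header suffix -> raw header value

    def record(self, usage, headers, now=None):
        """Store one response's token counts and any rate-limit headers."""
        now = time.time() if now is None else now
        inp = int(usage.get("input_tokens", 0))
        out = int(usage.get("output_tokens", 0))
        self.events.append((now, inp, out))
        self.last_input_tokens = inp
        for name, value in headers.items():
            key = name.lower()
            if key.startswith(RATELIMIT_PREFIX):
                self.rate_limits[key[len(RATELIMIT_PREFIX):]] = value

    def rolling_totals(self, now=None):
        """Sum tokens over the last 24h, dropping older events."""
        now = time.time() if now is None else now
        cutoff = now - WINDOW_SECONDS
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()
        inp = sum(e[1] for e in self.events)
        out = sum(e[2] for e in self.events)
        return {"input": inp, "output": out, "total": inp + out}

    def context_percent(self):
        """input_tokens / max_context, as a percentage."""
        return 100.0 * self.last_input_tokens / self.max_context
```

Keeping the events in a deque means the 24h window can be trimmed cheaply at read time, so /usage needs no background job. The counts would be lost on restart unless the harness also persists them.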
Access
Owner-only (same gate as /health, /audit). The harness should intercept the command before it reaches Luna, so it works even when Luna's context is full.
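The interception described above could look roughly like this. It is a sketch only: intercept, OWNER_ONLY_COMMANDS, the sender_id/owner_id comparison and the refusal message are all assumptions, since the existing gate for /health and /audit is not shown here.

```python
# Assumption: /health and /audit would be routed through the same set.
OWNER_ONLY_COMMANDS = {"/usage"}


def intercept(text, sender_id, owner_id, render_snapshot):
    """Return a reply if the harness handles the message itself.

    Returns None when the message should be forwarded to Luna as usual.
    """
    parts = text.strip().split()
    if not parts or parts[0] not in OWNER_ONLY_COMMANDS:
        return None  # not a harness command
    if sender_id != owner_id:
        # Hypothetical refusal text; the real gate may stay silent instead.
        return "This command is owner-only."
    # Built entirely by the harness, so no model call is needed.
    return render_snapshot()
```

Because the reply is produced before any model call, the command still answers when Luna's context is full.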
Related
- /health already returns some state; /usage is a deeper drill-down focused on tokens and cost visibility
- /usage gives the signal needed to decide when to compact