fix: skip synthetic zero-usage assistant entries in billing by jverre · Pull Request #23 · comet-ml/opik-claude-code-plugin

jverre · 2026-06-12T09:14:37Z

Details

Second phantom-token mechanism found after the compaction fix (#22), exposed by the Σ lanes vs usage discrepancy that #22 deliberately left visible.

Claude Code writes locally fabricated assistant entries into the transcript when an API call errors or is interrupted: model: "<synthetic>", isApiErrorMessage: true, and a complete usage object where every field is zero. These were never billed by the API. llmCallsInTurn only checked Usage != nil, so each one was treated as a real LLM call whose "request" (the entire prior conversation) got reconciled against a zero-token measured prompt: estimated pieces scaled to zero and the usage-derived pieces overflowed the positional cut into the fresh-input tier.

Two such entries in a real session produced 665,727 phantom input tokens (lane input summed to 781,745 vs 115,337 actually billed - ~$6.7 of phantom spend at fable-5 input rates).

Fix: skip calls whose usage sums to zero. They bill nothing, so nothing is lost - and this covers any future unbilled-entry variant, not just <synthetic>.

Testing

TestBillingSkipsSyntheticZeroUsageCalls: a zero-usage synthetic entry between two real calls is excluded from llm_calls and Σ lanes still equals usage
Dry-run on the real session that exposed the bug: all four lane columns now reconcile to API usage token-for-token (input 115,337 / cache_read 328,138,781 / cache_creation 3,004,255 / output 733,462) - including output, whose previous small drift also traced to the synthetic blocks
Full suite, gofmt, go vet clean; binaries rebuilt

🤖 Generated with Claude Code

Claude Code writes locally fabricated assistant entries (model "<synthetic>", isApiErrorMessage: true) with an all-zero usage object when an API call errors or is interrupted. They were never billed, but llmCallsInTurn only checked Usage != nil, so each one became a "call" whose layout reconciled against a zero-token prompt: estimates scaled to zero and the usage-derived pieces dumped into the fresh-input tier. Two such entries in a real session produced 665,727 phantom input tokens (lane input 781,745 vs 115,337 actually billed). Skip calls whose usage sums to zero — they bill nothing, so nothing is lost. On the same session all four lane columns now reconcile to API usage token-for-token, including output (the synthetic blocks were also the source of a small output drift).

jverre merged commit 1fe842e into main Jun 12, 2026
1 check passed

jverre mentioned this pull request Jun 12, 2026

feat: flag token-count inconsistencies — cc.billing.reconciliation + feedback score #24

Merged

jverre deleted the jacques/OPIK-6873-skip-synthetic-calls branch June 12, 2026 21:15


