fix(langfuse): avoid double-counting cache and reasoning tokens in usage #108

Merged
flore2003 merged 2 commits into main from fix/langfuse-usage-double-counting on Apr 14, 2026
Conversation

@flore2003
Member

Summary

  • Langfuse sums all usageDetails keys containing "input" into "Input usage" and all keys containing "output" into "Output usage". Since input already included cache tokens and output already included reasoning tokens, those tokens were double-counted in the UI breakdown.
  • input now reports inputTokens - cacheReadTokens - cacheWriteTokens (non-cached portion only)
  • output now reports outputTokens - reasoningTokens (non-reasoning portion only)
  • Dropped the explicit total key — Langfuse derives it correctly by summing all keys
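The mapping described above can be sketched as follows. This is a minimal illustration, not the actual patch: the token field names (inputTokens, cacheReadTokens, cacheWriteTokens, outputTokens, reasoningTokens) come from the PR summary, but the function name and the usageDetails key names are hypothetical.

```typescript
// Illustrative token counts as reported by a provider; field names
// follow the PR summary, everything else is assumed.
interface Usage {
  inputTokens: number;       // includes cached input tokens
  outputTokens: number;      // includes reasoning tokens
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
  reasoningTokens?: number;
}

// Langfuse sums every usageDetails key containing "input" into
// "Input usage" and every key containing "output" into "Output usage",
// so each key must carry a disjoint portion of the totals.
function toUsageDetails(u: Usage): Record<string, number> {
  const cacheRead = u.cacheReadTokens ?? 0;
  const cacheWrite = u.cacheWriteTokens ?? 0;
  const reasoning = u.reasoningTokens ?? 0;
  return {
    // non-cached portion only; cache tokens get their own keys
    input: u.inputTokens - cacheRead - cacheWrite,
    input_cache_read: cacheRead,
    input_cache_write: cacheWrite,
    // non-reasoning portion only
    output: u.outputTokens - reasoning,
    output_reasoning: reasoning,
    // no explicit "total" key: Langfuse derives it by summing all keys
  };
}
```

With disjoint keys, summing every "input" key reproduces inputTokens exactly, and summing every "output" key reproduces outputTokens, so nothing is counted twice.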

Test plan

  • All 9 existing tests updated and passing

Langfuse sums all keys containing "input" for Input usage and "output"
for Output usage. Since inputTokens/outputTokens already included
cache/reasoning tokens, they were double-counted. Now input/output
report only their non-overlapping portion, and total is omitted so
Langfuse derives it correctly by summing all keys.
@flore2003 flore2003 merged commit fb9162a into main Apr 14, 2026
3 checks passed
@flore2003 flore2003 deleted the fix/langfuse-usage-double-counting branch April 14, 2026 20:14