feat: capture provider cost details with three-tier dispatch #443
Open
harrisony wants to merge 4 commits into
…r total cost

The previous implementation used a single safeCost() call wrapping a ?? chain:

safeCost(details.total_cost ?? usage?.cost ?? usage?.estimated_cost)

This had two problems:
- upstream_inference_cost was not considered at all
- The entire ?? chain was validated as one, so safeCost() could not distinguish which source provided the value

Restructure into an explicit three-step fallback, each validated independently:
1. cost_details.total_cost (LLM gateway, most detailed)
2. usage.cost (direct cost for OpenRouter-style providers)
3. cost_details.upstream_inference_cost (upstream OpenRouter-style cost)

Steps 2→3 use || (falsy coalescing) to handle the OpenRouter quirk where usage.cost and/or upstream_inference_cost may be populated depending on the upstream provider and key type used (e.g. BYOK keys report 0 for usage.cost, but upstream_inference_cost carries the actual provider cost).
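The three-step fallback described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: resolveTotalCost, the CostDetails and Usage shapes, and the validation rule inside safeCost are all assumptions; only the field names come from the commit message.

```typescript
// Hypothetical shapes: only the field names come from the commit message.
interface CostDetails {
  total_cost?: unknown;
  upstream_inference_cost?: unknown;
}

interface Usage {
  cost?: unknown;
}

// Assumed validator: accept only finite, non-negative numbers.
function safeCost(value: unknown): number | undefined {
  return typeof value === "number" && Number.isFinite(value) && value >= 0
    ? value
    : undefined;
}

// Hypothetical helper name; each source is validated on its own.
function resolveTotalCost(
  details: CostDetails | undefined,
  usage: Usage | undefined,
): number | undefined {
  // Step 1: in this sketch, a valid gateway total wins outright.
  const total = safeCost(details?.total_cost);
  if (total !== undefined) return total;

  // Steps 2 and 3 are validated separately, then combined.
  const direct = safeCost(usage?.cost);
  const upstream = safeCost(details?.upstream_inference_cost);

  // `||` lets a direct cost of 0 (the BYOK case) fall through to the upstream
  // cost; `??` keeps a genuine 0 when no upstream cost is available.
  return direct || (upstream ?? direct);
}
```

With these assumptions, a BYOK-style payload such as resolveTotalCost({ upstream_inference_cost: 0.04 }, { cost: 0 }) resolves to the upstream value rather than 0, while an invalid total_cost no longer masks a valid usage.cost.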
… from standard fields

Previously upstream_inference_prompt_cost was aliased directly into input_cost. However, these fields have different semantics:

upstream_inference_prompt_cost = input_cost + cached_input_cost
(i.e., the combined prompt cost including cached tokens)

Aliasing it into input_cost caused applyUsageCostDetails to zero out costCached, silently merging the cached portion into costInput.

Changes:
- Stop aliasing upstream_inference_prompt_cost → input_cost and upstream_inference_completions_cost → output_cost
- Add same-tier aliasing: upstream_inference_input_cost → upstream_inference_prompt_cost and upstream_inference_output_cost → upstream_inference_completions_cost (same semantics, different provider naming conventions)
- Update extraction tests to assert upstream fields stay separate
- Add tests for BYOK fallback, Responses API input/output variants, and LLM Gateway field priority
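A minimal sketch of the same-tier aliasing, assuming a plain object of raw provider fields. The names extractSplit, num, and ExtractedSplit are hypothetical and do not reflect the actual extractUsageCostDetails signature; the point is only that the upstream fields never leak into input_cost or output_cost.

```typescript
// Raw provider payload; the field names are the ones listed in the commit message.
type RawCostDetails = Record<string, unknown>;

// Hypothetical result shape that keeps the two tiers apart.
interface ExtractedSplit {
  input_cost?: number;
  output_cost?: number;
  upstream_inference_prompt_cost?: number;
  upstream_inference_completions_cost?: number;
}

// Assumed validation: finite, non-negative numbers only.
function num(value: unknown): number | undefined {
  return typeof value === "number" && Number.isFinite(value) && value >= 0
    ? value
    : undefined;
}

function extractSplit(raw: RawCostDetails): ExtractedSplit {
  return {
    // Standard fields are read only from their own keys.
    input_cost: num(raw.input_cost),
    output_cost: num(raw.output_cost),
    // Same-tier aliases: the input/output spellings feed the prompt/completions
    // fields, but never the standard fields above.
    upstream_inference_prompt_cost:
      num(raw.upstream_inference_prompt_cost) ??
      num(raw.upstream_inference_input_cost),
    upstream_inference_completions_cost:
      num(raw.upstream_inference_completions_cost) ??
      num(raw.upstream_inference_output_cost),
  };
}
```

Under these assumptions, a payload that carries only upstream_inference_input_cost and upstream_inference_output_cost leaves input_cost and output_cost undefined, so a later step can still tell the combined prompt cost apart from a true per-bucket input cost.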
Previously applyUsageCostDetails only had two branches: superset (per-bucket breakdown) and minimal (proportional distribution). Now that extractUsageCostDetails no longer aliases normal-tier upstream_inference_prompt_cost into input_cost, the normal-tier case needs explicit handling.

Three tiers:
1. Superset: input_cost/cached_input_cost/cache_write_input_cost present → use per-bucket breakdown directly
2. Normal: upstream_inference_prompt_cost/completions_cost present but no input-side superset fields → use upstream prompt/completions split, then distribute the prompt portion by Plexus's own cache ratio
3. Minimal: no breakdown at all → proportional distribution from previously calculated costs
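The three tiers can be illustrated with a small dispatcher. This is a sketch under stated assumptions, not the applyUsageCostDetails implementation: dispatchCostTiers, distribute, the CostBuckets shape, and the choice to treat the previously calculated per-bucket costs as the "cache ratio" weights are all hypothetical.

```typescript
// Hypothetical bucket shape for the per-request cost breakdown.
interface CostBuckets {
  input: number;
  cached: number;
  cacheWrite: number;
  output: number;
}

// Field names follow the commit message; everything is optional.
interface Details {
  input_cost?: number;
  cached_input_cost?: number;
  cache_write_input_cost?: number;
  output_cost?: number;
  upstream_inference_prompt_cost?: number;
  upstream_inference_completions_cost?: number;
}

// Split `amount` in proportion to `weights`; if every weight is zero,
// assign the whole amount to the first bucket (an assumption of this sketch).
function distribute(amount: number, weights: number[]): number[] {
  const sum = weights.reduce((a, b) => a + b, 0);
  if (sum <= 0) return weights.map((_, i) => (i === 0 ? amount : 0));
  return weights.map((w) => (amount * w) / sum);
}

function dispatchCostTiers(
  details: Details,
  total: number,
  estimated: CostBuckets, // previously calculated costs, used only as weights
): CostBuckets {
  // Tier 1 (superset): any input-side per-bucket field is present.
  if (
    details.input_cost !== undefined ||
    details.cached_input_cost !== undefined ||
    details.cache_write_input_cost !== undefined
  ) {
    return {
      input: details.input_cost ?? 0,
      cached: details.cached_input_cost ?? 0,
      cacheWrite: details.cache_write_input_cost ?? 0,
      output: details.output_cost ?? 0,
    };
  }

  // Tier 2 (normal): upstream prompt/completions split, prompt side
  // redistributed by the locally estimated input/cached/cache-write ratio.
  const prompt = details.upstream_inference_prompt_cost;
  const completions = details.upstream_inference_completions_cost;
  if (prompt !== undefined || completions !== undefined) {
    const [input, cached, cacheWrite] = distribute(prompt ?? 0, [
      estimated.input,
      estimated.cached,
      estimated.cacheWrite,
    ]);
    return { input, cached, cacheWrite, output: completions ?? 0 };
  }

  // Tier 3 (minimal): spread the total across all buckets proportionally.
  const [input, cached, cacheWrite, output] = distribute(total, [
    estimated.input,
    estimated.cached,
    estimated.cacheWrite,
    estimated.output,
  ]);
  return { input, cached, cacheWrite, output };
}
```

For example, with an upstream prompt cost of 0.5 and estimated input and cached costs in a 1:3 ratio, this sketch assigns 0.125 to input and 0.375 to cached, so the cached portion is no longer merged into the input bucket.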
Add tests for: zero-cost non-BYOK requests, OpenRouter markup (cost >> upstream sum), zero prompt tokens, heavy-cache-hit ratio split, end-to-end BYOK and non-BYOK extract+apply flows.

Also normalise scientific notation literals and add missing upstream_inference_* fields to existing superset fixtures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- Add upstream_inference_cost fallback for total cost so BYOK requests (where usage.cost=0) report the real spend
- Stop aliasing upstream_inference_prompt_cost → input_cost; keep upstream fields separate since upstream_inference_prompt_cost = input_cost + cached_input_cost (aliasing silently zeroed out costCached)
- Three-tier dispatch in applyUsageCostDetails: superset (per-bucket gateway fields) → normal (upstream prompt/completions split with cache ratio) → minimal (proportional fallback)

Test plan
- bun run test:force-all
- bunx biome lint .
- git rebase --exec

🤖 Generated with Claude Code