feat(BUB-213): cross-provider USD cost normalization #132
…mission
- packages/cli/src/runtime/shared/pricing.ts: new pricing table (8 models),
computeCost(model, usage) returning {input_usd, output_usd, total_usd} or
undefined for unknown models; SPANORY_PRICING_<MODEL_SAFE> env override support
- packages/cli/src/runtime/shared/usage.ts: usageAttributes gains optional model
param; emits gen_ai.cost.{input_usd,output_usd,total_usd,model} when cost is
computable
- packages/cli/src/runtime/shared/turn.ts: pass latestModel / assistant.model to
usageAttributes at all 5 call sites
- telemetry/field-spec.yaml: register 4 new gen_ai.cost.* fields
- packages/cli/test/unit/pricing.spec.ts: 10 unit tests covering known models,
unknown model → undefined, env override, version suffix stripping, zero tokens
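The core calculation described above can be sketched as follows. This is a minimal illustration, not the contents of pricing.ts: the ModelPrice and Usage shapes, the field names inputPerMTok / outputPerMTok, and the "example-model" entry with its rates are all assumptions made for the example. The env override and version-suffix stripping are left out because their exact format is not shown here.

```typescript
// Hypothetical shapes; the real pricing.ts may use different names.
interface ModelPrice {
  inputPerMTok: number; // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

interface Usage {
  input_tokens: number;
  output_tokens: number;
}

interface Cost {
  input_usd: number;
  output_usd: number;
  total_usd: number;
}

// Placeholder entry with made-up rates, for illustration only.
const PRICING = new Map<string, ModelPrice>([
  ["example-model", { inputPerMTok: 2, outputPerMTok: 10 }],
]);

function computeCost(
  model: string,
  usage: Usage,
  table: Map<string, ModelPrice> = PRICING,
): Cost | undefined {
  const price = table.get(model);
  if (price === undefined) return undefined; // unknown model: no cost emitted

  // Rates are per 1M tokens, so scale the token counts down first.
  const input_usd = (usage.input_tokens / 1_000_000) * price.inputPerMTok;
  const output_usd = (usage.output_tokens / 1_000_000) * price.outputPerMTok;
  return { input_usd, output_usd, total_usd: input_usd + output_usd };
}
```

Returning undefined rather than a zero cost keeps "unknown model" distinguishable from "known model, zero tokens" for downstream consumers.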
Bububuger
left a comment
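After review, this is not ready to approve; the following issues need to be fixed first:
- packages/cli/src/runtime/shared/pricing.ts:23-25
  The OpenAI pricing table contains a definite error. The current official price for gpt-5.4 is $2.50 / 1M input and $15.00 / 1M output, but the code has 10 / 40. This inflates GPT-5.4 cost by up to roughly 4x (4x on input, about 2.7x on output). References: https://openai.com/api/pricing/ and https://developers.openai.com/api/docs/models/gpt-5.4.
- packages/cli/src/runtime/shared/pricing.ts:19-27
  The existing Codex data in this repo actually uses gpt-5.3-codex, not gpt-5.4, but the pricing table has no entry for that model. See packages/cli/test/unit/normalize.spec.ts:24,38 and packages/cli/test/fixtures/codex/sessions/session-a.jsonl:4,16. As a result, current Codex sessions hit computeCost(...) === undefined and emit no gen_ai.cost.* fields at all, which defeats the purpose of this PR. OpenAI's current gpt-5.3-codex price is $1.75 / 1M input and $14.00 / 1M output: https://developers.openai.com/api/docs/models/gpt-5.3-codex.
- packages/cli/src/runtime/shared/pricing.ts:107-112
  thinking_tokens is unconditionally added to output_tokens. That may hold for Anthropic extended thinking, but it does not hold for OpenAI/Codex: in OpenAI's official usage object, output_tokens_details.reasoning_tokens is a breakdown of output_tokens, and max_output_tokens is explicitly described as "including visible output tokens and reasoning tokens". As written, this code double-bills OpenAI/Codex. Reference: https://platform.openai.com/docs/guides/reasoning?api-mode=chat&lang=python.
Additional note: zero tokens are covered, but negative tokens or a negative env override can still produce negative USD values. I suggest at least clamping to 0 or skipping the calculation outright, and adding tests.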
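One way to act on the clamp-to-zero suggestion is sketched below. The helper names clampNonNegative and usdFor are hypothetical and do not come from the PR; the sketch also treats non-finite values (NaN, Infinity) as 0, which goes slightly beyond what the review asks for.

```typescript
// Hypothetical helper: maps negative, NaN, and infinite inputs to 0.
function clampNonNegative(value: number): number {
  return Number.isFinite(value) && value > 0 ? value : 0;
}

// USD cost for a token count at a per-1M-token rate, never negative.
// Both the token count and the rate are clamped, so a negative env
// override cannot flip the sign of the result.
function usdFor(tokens: number, usdPerMTok: number): number {
  return (clampNonNegative(tokens) / 1_000_000) * clampNonNegative(usdPerMTok);
}
```

Clamping keeps the cost attributes present with a value of 0, whereas skipping would omit them entirely; which is preferable depends on whether downstream dashboards should see bad input as "free" or as "missing".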
…g_tokens double-count
Issue 1: gpt-5.4 pricing was wrong (10/40 → 2.50/15.00 per 1M tokens)
Issue 2: gpt-5.3-codex missing from pricing table (actual model used in Codex sessions)
Issue 3: thinking_tokens double-counted for OpenAI/Codex: reasoning_tokens is already included in output_tokens; only Anthropic models report thinking separately. Added thinking_in_output flag to ModelPrice to distinguish providers.
Also updates golden fixture and pricing.spec.ts with 5 new regression tests.
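The effect of the thinking_in_output flag can be sketched roughly as below. Only the flag name and ModelPrice come from the commit message; the other field names, the Usage shape, and the billableOutputTokens helper are assumptions, as is the reading that true means "output_tokens already includes reasoning tokens".

```typescript
// Hypothetical shape; only thinking_in_output is named in the commit message.
interface ModelPrice {
  input_per_mtok: number;
  output_per_mtok: number;
  // Assumed meaning: true when the provider already counts reasoning
  // tokens inside output_tokens (the OpenAI/Codex case).
  thinking_in_output: boolean;
}

interface Usage {
  input_tokens: number;
  output_tokens: number;
  thinking_tokens?: number;
}

// Number of tokens to bill at the output rate.
function billableOutputTokens(price: ModelPrice, usage: Usage): number {
  const thinking = usage.thinking_tokens ?? 0;
  return price.thinking_in_output
    ? usage.output_tokens // already includes thinking: do not add it again
    : usage.output_tokens + thinking; // reported separately: add it
}
```

Keeping the flag on the price entry rather than branching on a provider name means a newly added model only needs a table row to get the right behavior.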
Summary
- packages/cli/src/runtime/shared/pricing.ts with an 8-model pricing table and computeCost(model, usage) that returns {input_usd, output_usd, total_usd} or undefined for unknown models
- usageAttributes in usage.ts gains optional model param; emits gen_ai.cost.* OTLP attributes when cost is computable
- turn.ts passes latestModel / assistant.model to usageAttributes at all 5 call sites
- thinking_tokens billed at output rate (Anthropic extended thinking)
- SPANORY_PRICING_<MODEL_SAFE> env var override for custom pricing
- telemetry/field-spec.yaml registers the 4 new gen_ai.cost.* fields
- exec-plan → docs/exec-plans/bub-213-usd-cost.md
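As a rough illustration of the attribute emission summarized above, the sketch below maps a computed cost to the four gen_ai.cost.* keys. The attribute names are taken from the commit message; the costAttributes helper, the Attributes type, and the Cost shape are hypothetical and are not the actual usageAttributes signature.

```typescript
// Hypothetical types for illustration.
type Attributes = Record<string, string | number>;

interface Cost {
  input_usd: number;
  output_usd: number;
  total_usd: number;
}

// Returns the cost attributes, or an empty object when no model was
// supplied or the cost could not be computed (e.g. unknown model).
function costAttributes(model: string | undefined, cost: Cost | undefined): Attributes {
  if (model === undefined || cost === undefined) return {};
  return {
    "gen_ai.cost.input_usd": cost.input_usd,
    "gen_ai.cost.output_usd": cost.output_usd,
    "gen_ai.cost.total_usd": cost.total_usd,
    "gen_ai.cost.model": model,
  };
}
```

Returning an empty object lets a caller spread the result into its existing attribute map without a separate branch for the "no cost" case.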
Test plan
- npm run check: 0 errors (903 pre-existing warnings)
- npm test: 244 tests pass (40 test files), including 10 new pricing tests
- npm run telemetry:check: ok, 0 validation errors