From 94373e19d4c78029ee07d3586a654832961535c6 Mon Sep 17 00:00:00 2001
From: Sunny Purewal
Date: Fri, 5 Jun 2026 13:19:07 -0400
Subject: [PATCH] Trim llm-cost READMEs: drop deprecated calibrate section, cut attribution by ~1/3

- llm-cost-estimation: remove the deprecated `calibrate` API section (calibrateCoverage is the documented replacement).
- llm-cost-attribution: condense prose and redundant code examples to cut the README from 388 to 276 lines (~29%) without dropping any documented API, use-case table, or valid import.

Co-Authored-By: Claude Opus 4.8
---
 packages/llm-cost-attribution/README.md | 198 +++++------------------
 packages/llm-cost-estimation/README.md  |   5 -
 2 files changed, 43 insertions(+), 160 deletions(-)

diff --git a/packages/llm-cost-attribution/README.md b/packages/llm-cost-attribution/README.md
index 15c3fd7..6abc360 100644
--- a/packages/llm-cost-attribution/README.md
+++ b/packages/llm-cost-attribution/README.md
@@ -19,15 +19,13 @@ CODEX (4 sessions) Models: gpt-5-codex Turns: 340
 Quota (pro, 345 samples): 5h 58%→64% (peak 64%) 7d 56%→57% (peak 57%)
 ```

-Reading that block: **cache read** is tokens the provider served from its prompt cache (cheap, and usually most of the total); **output (reasoning)** is the model's hidden thinking tokens, billed separately from the **visible** answer; **Quota** is how much of your Codex plan's two rolling rate-limit windows — a 5-hour and a 7-day one — these sessions used.
-
-Requires Node 20+. Zero runtime dependencies.
+Reading that block: **cache read** is tokens served from the provider's prompt cache (cheap, usually most of the total); **output (reasoning)** is hidden thinking tokens, billed separately from the **visible** answer; **Quota** is how much of your Codex plan's two rolling rate-limit windows (5-hour and 7-day) these sessions used. Requires Node 20+, zero runtime dependencies.

 ## How it works

-Both CLIs persist every run as JSONL — Claude Code in `~/.claude/projects//.jsonl` (`` is just the run's working directory with `/` and `.` rewritten to `-`), Codex in `~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl` — and each file records, per turn, the provider-reported token counts (the same numbers your account is billed against) plus, for Codex, its rate-limit usage. This package walks both directories, keeps the sessions whose **working directory** matches the issue ID you ask for, and adds them up.
+Both CLIs persist every run as JSONL — Claude Code under `~/.claude/projects/`, Codex under `~/.codex/sessions/` — recording per turn the provider-reported token counts (the same numbers you're billed against) plus, for Codex, rate-limit usage. This package walks both directories, keeps the sessions whose **working directory** matches the issue ID you ask for, and adds them up.

-How does a session get matched to an issue? By its **working directory** (`cwd`). Under [Symphony](https://github.com/openai/symphony/blob/main/SPEC.md)'s spec — Symphony being an orchestrator that runs coding agents one issue at a time — each agent runs in a directory dedicated to its issue (`/`), so the issue ID is already baked into every transcript's path; no custom pipeline needed. The default `--cwd-pattern` (the regex that pulls the issue ID out of that path) matches both the spec default (`/symphony_workspaces/`) and the common in-repo layout (`/.symphony/workspaces/`). For any other layout, pass your own regex with one capture group around the ID:
+Sessions match an issue by their **working directory** (`cwd`). Under [Symphony](https://github.com/openai/symphony/blob/main/SPEC.md), each agent runs in a per-issue directory, so the issue ID is already baked into every transcript path. The default `--cwd-pattern` matches the Symphony spec default and the in-repo `.symphony/workspaces/` layout. For any other layout, pass your own regex with one capture group around the ID:

 ```bash
 llm-cost FOO-12 --cwd-pattern '-([A-Z]+-\d+)$' # ../repo-worktrees/
@@ -77,26 +75,21 @@ llm-cost EPAC-1940 --from-usage ~/llm-cost-history.jsonl
 rm -rf ~/.claude/projects ~/.codex/sessions # once numbers verified
 ```

-| | Before | After |
-|---|---:|---:|
-| Disk | 5.0 GB | 125 MB (40× smaller) |
-| Query time | ~3 min | ~0.3 s |
-
-The bake is lossless for everything the analysis uses (quota windows, Claude cache tiers, Codex reasoning/visible split, totals, models, timestamps, workspace provenance). The format follows the [Symphony Cost Telemetry Extension spec](https://github.com/RiddimSoftware/groove/blob/main/specs/symphony-cost-telemetry-extension/SPEC.md), so a conformant orchestrator can emit `usage.jsonl` directly and skip the bake — optional interop, not required.
+Typical result: ~5 GB → ~125 MB (40× smaller), ~3 min → ~0.3 s per query. The bake is lossless for everything the analysis uses (quota windows, cache tiers, reasoning/visible split, totals, models, timestamps, provenance) and follows the [Symphony Cost Telemetry Extension spec](https://github.com/RiddimSoftware/groove/blob/main/specs/symphony-cost-telemetry-extension/SPEC.md), so a conformant orchestrator can emit `usage.jsonl` directly and skip the bake.

 ## Is the forecast trustworthy? (`calibrate`)

-A **P80** is the 80th-percentile cost — the number 80% of comparable issues come in at or below. Claiming "P80 = 12K tokens" is only honest if, on issues the forecaster never saw, the real cost actually lands under 12K about 80% of the time; otherwise it's a horoscope. `calibrate` checks exactly that against a local `usage.jsonl` whose records are **estimate-tagged** (each one carries the issue's size estimate). It sorts the records into **cells** — groups of past issues sharing the same `{ size, model }` — holds out a reproducible slice of each cell (`--seed` makes the split repeatable), forecasts from what's left, and measures how often the held-out actuals really fell at or below the predicted P80. Any cell whose hit-rate drifts from 80% by more than `--threshold` is flagged ⚠. On a small dataset the coverage figures are themselves noisy — a cell with only a few held-out issues can read 0% or 100% by luck — so treat per-cell flags as directional until cells are well-populated.
+A **P80** is only honest if, on issues the forecaster never saw, the real cost lands at or below it about 80% of the time. `calibrate` checks exactly that against a local estimate-tagged `usage.jsonl`: it groups past issues into `{ size, model }` **cells**, holds out a reproducible slice per cell (`--seed`), forecasts from the rest, and measures how often the held-out actuals fell under the predicted P80. Cells drifting from 80% by more than `--threshold` are flagged ⚠. On small datasets per-cell coverage is itself noisy, so treat flags as directional until cells fill in.

 ```bash
 llm-cost calibrate ~/backfill.out --seed 1 --holdout 0.2
 ```

-Read-only and local — the input is never written back or committed (point it at a gitignored file). Committed tests use only synthetic fixtures (`test/forecast-recovers-known-dist.test.mjs`).
+Read-only and local — the input is never written back (point it at a gitignored file).

 ## What drives your cost? (`cost-drivers`)

-`cost-drivers` runs an end-to-end correlation analysis: it reads your LLM cost records, reads diff statistics from a local git repo, joins them by issue key, and prints Spearman rank correlation, linear Pearson, log-log Pearson, and a decile table. The goal is to understand which attributes of an issue predict how much it costs — using your own data, not anyone else's benchmarks.
+`cost-drivers` runs an end-to-end correlation analysis: it reads your LLM cost records, reads diff statistics from a local git repo, joins them by issue key, and prints Spearman, linear Pearson, and log-log Pearson correlations plus a decile table — so you can see which attributes of an issue predict its cost, using your own data.

 **Minimal inputs:** a local git repo whose commit subjects include issue keys, and transcripts (or a `usage.jsonl`) for the same issues.

@@ -109,34 +102,20 @@ llm-cost cost-drivers --repo ~/code/my-project --from-usage ~/llm-cost-history.j
 Example readout (synthetic numbers — for illustration only):

 ```
-════════════════════════════════════════════════════════════════════════
 COST DRIVERS — diff churn vs tokens
-════════════════════════════════════════════════════════════════════════
-Join strategy: issue
-Source: ~/code/my-project
 n = 42 pairs unjoined: 3 usage, 5 diffs unmatched commits: 11

-Correlations:
-  Spearman 0.34
-  Pearson(linear) 0.21
-  Pearson(log-log) 0.40
+Correlations: Spearman 0.34 Pearson(linear) 0.21 Pearson(log-log) 0.40

 Decile table:
 Decile Feature range n Median cost
-────────────────────────────────────────────────────────────────────────
 1 14 – 87 4 58.3K
 2 91 – 210 4 72.1K
-3 215 – 380 4 91.4K
-4 384 – 510 4 103.2K
-5 512 – 740 5 128.7K
-6 744 – 1.1K 4 145.3K
-7 1.1K – 1.6K 4 189.6K
-8 1.6K – 2.4K 4 224.1K
-9 2.5K – 4.1K 5 301.8K
+…
 10 4.2K – 9.3K 4 512.4K
 ```

-Reading that block: **Feature range** is diff churn (additions + deletions) in lines; **Median cost** is the median token count for issues in that churn decile. The three correlation coefficients tell the same story from different angles — see "Reading the output" below.
+Reading that block: **Feature range** is diff churn (additions + deletions) in lines; **Median cost** is the median token count for issues in that churn decile. See "Reading the output" below.

 ### Join model

@@ -144,54 +123,35 @@ Reading that block: **Feature range** is diff churn (additions + deletions) in l

 | Strategy | How it joins | When to use |
 |---|---|---|
-| `issue` (default) | Extracts issue keys (e.g. `ABC-123`) from commit subjects and from each cost record's `issueIdentifier` / workspace path | Works out of the box with Symphony's per-issue worktree convention and squash-merge commit messages |
-| `worktree` | Joins on the cost record's workspace path vs. the diff record's key | Useful when your diff records carry workspace paths instead of issue keys |
-| `time` | Attributes each cost record to the next commit within `--window` (e.g. `30m`, `2h`, `1d`) | Label-free fallback when commit subjects don't contain keys; inherently approximate |
+| `issue` (default) | Issue keys (e.g. `ABC-123`) from commit subjects vs. each record's `issueIdentifier` / workspace path | Symphony per-issue worktrees + squash-merge commits |
+| `worktree` | Cost record's workspace path vs. the diff record's key | Diff records carry workspace paths, not keys |
+| `time` | Each cost record to the next commit within `--window` (`30m`, `2h`, `1d`) | Label-free fallback; approximate |

 ```bash
-# explicit strategies
-llm-cost cost-drivers --repo ~/code/my-project --join-by issue # default
 llm-cost cost-drivers --repo ~/code/my-project --join-by worktree
 llm-cost cost-drivers --repo ~/code/my-project --join-by time --window 2h
-
-# override the key-extraction regex if your project uses a different format
-llm-cost cost-drivers --repo ~/code/my-project --key-pattern 'TICKET-\d+'
+llm-cost cost-drivers --repo ~/code/my-project --key-pattern 'TICKET-\d+' # custom key format
 ```

-The `keyOfUsage`, `keyOfDiff`, and `join` overrides are available via the library API (`joinCostWithFeature`) for cases the CLI flags don't cover — for example joining on a custom field, or implementing a fully custom reconciliation.
+The `keyOfUsage` / `keyOfDiff` / `join` overrides are available via the library API (`joinCostWithFeature`) for cases the CLI flags don't cover.

 #### Escape hatch: join externally with `dump-* → correlate`

-If none of the built-in strategies fit, emit the two streams and join them yourself:
+If no built-in strategy fits, dump the two streams, join them yourself, and feed back a `{ feature, cost }` CSV/JSON:

 ```bash
-# 1. dump the cost stream
 llm-cost dump-usage > usage.jsonl
-
-# 2. dump the diff stream
 llm-cost dump-diffs --repo ~/code/my-project > diffs.jsonl
-
-# 3. join them however you like, then feed back a { feature, cost } CSV
-llm-cost correlate --pairs my-pairs.csv # CSV: feature,cost[,key]
+llm-cost correlate --pairs my-pairs.csv # CSV: feature,cost[,key] — same readout as cost-drivers
 ```

-`correlate --pairs` accepts `.csv` (header `feature,cost`) or `.json` (array of `{feature, cost}` objects) and produces the same readout as `cost-drivers`.
-
 ### Reading the output

-**Three correlation views, not one.** LLM cost is heavy-tailed — a handful of expensive issues can dominate a linear average. `cost-drivers` therefore reports:
+**Three correlation views, not one.** LLM cost is heavy-tailed, so a handful of expensive issues can dominate a linear average. **Spearman** (rank) captures monotonic relationships without being skewed by outliers; **Pearson (linear)** is the raw-value correlation and can read near zero on heavy-tailed data even when Spearman is meaningful; **Pearson (log-log)** is the right view when both axes span orders of magnitude. A large Spearman/linear gap signals a real but nonlinear relationship, not an absent one.

-- **Spearman** (rank correlation): captures monotonic relationships without being skewed by outliers. If big issues generally cost more than small ones, Spearman will pick that up even when the raw values vary wildly.
-- **Pearson (linear)**: the standard linear correlation on raw values. On heavy-tailed data it can read near zero even when Spearman is meaningful; it is sensitive to a few extreme issues.
-- **Pearson (log-log)**: Pearson on log₁₀-transformed values, the right view when both axes span orders of magnitude. If cost and diff size both grow geometrically, this is the coefficient that captures it.
+**Always check `n`.** Below ~20 pairs the coefficients are unreliable and decile buckets are thin — treat the output as directional.

-A large gap between Spearman and linear Pearson is a signal that the relationship is real but nonlinear or that a few outliers are suppressing the linear view — not that the relationship is absent.
-
-**Always check `n`.** With a small sample (say n < 20) the coefficients are unreliable and the decile table will have very few rows per bucket. Treat the output as directional until you have more history.
-
-**Diff size is output, not effort.** A feature that happens to touch many files will show high churn whether or not it was the most complex work. Churn is the most readily available proxy; other features (issue estimate, turn count) may or may not track cost better on your workload.
-
-**Local-git limits.** `readGitDiffs` only sees commits already in your local checkout — run `git fetch` or `git pull` first if you want remote-only commits. For the default `issue` strategy, commits must also carry issue keys in their subjects (the default pattern matches `ABC-123`-style keys; override with `--key-pattern`).
+**Diff size is output, not effort**, and `readGitDiffs` only sees commits already in your local checkout (run `git fetch` first for remote-only commits); for the default `issue` strategy, commit subjects must carry issue keys (override the pattern with `--key-pattern`).

 ## Library

@@ -212,52 +172,31 @@ Pass `{ cwdPattern, claudeProjectsDir, codexSessionsDir }` to override defaults.

 ### Bring your own sources and sinks

-The wrappers above read local Claude/Codex transcripts. If your sessions or usage
-records live somewhere else (a database, an HTTP API, a test fixture), wire your
-own ports into `createAttributionWorkflow` instead — the same core workflows, with
-no filesystem assumptions:
+The wrappers above read local transcripts. If your sessions or usage records live elsewhere (a database, an HTTP API, a test fixture), wire your own ports into `createAttributionWorkflow` — same core workflows, no filesystem assumptions:

 ```js
 import { createAttributionWorkflow } from 'llm-cost-attribution';

 const attribution = createAttributionWorkflow({
   sessionSource: { async *listSessions() { /* yield ParsedSession objects */ } },
-  issueMatcher: { issueIdentifierForSession: (s) => /* 'EPAC-1940' | null */,
-                  worktreePathForSession: (s) => s.cwd },
+  issueMatcher: { issueIdentifierForSession: (s) => /* id | null */, worktreePathForSession: (s) => s.cwd },
   usageRecordSource: { async *readUsageRecords() { /* yield usage.jsonl records */ } },
   usageRecordSink: { async writeUsageRecords(records) { /* persist them */ } },
 });

 await attribution.computeIssueCost('EPAC-1940'); // from sessions
-await attribution.computeWorktreeCost('/path/to/worktree');
-await attribution.computeIssueCostFromUsage('EPAC-1940'); // from usage records
 await attribution.backfillUsage(); // sessions → sink
 ```

-The four ports are `SessionSource`, `IssueMatcher`, `UsageRecordSource`, and
-`UsageRecordSink` (see `src/attribution-ports.mjs`). The real transcript/usage
-adapters are also exported (`transcriptSessionSource`, `cwdIssueMatcher`,
-`usageJsonlRecordSource`, `appendingUsageRecordSink`) if you want the built-in
-sources with a custom workflow. Only supply the ports a given call needs.
+The four ports — `SessionSource`, `IssueMatcher`, `UsageRecordSource`, `UsageRecordSink` (see `src/attribution-ports.mjs`) — each have a built-in adapter you can also import (`transcriptSessionSource`, `cwdIssueMatcher`, `usageJsonlRecordSource`, `appendingUsageRecordSink`). Only supply the ports a given call needs.

 ### Diff-size feature records

-`readGitDiffs(repoPath, { revRange, keyPattern })` reads local `git log --numstat`
-output and yields one aggregated record per issue key found in commit subjects:
-
-```js
-for await (const diff of readGitDiffs('/path/to/repo')) {
-  console.log(diff.key, diff.additions + diff.deletions, diff.changedFiles);
-}
-```
-
-It is local-first: no GitHub token, network, or API calls. The tradeoff is that it
-sees only history already present in the checkout, and commits must carry issue
-keys in their subjects, as with squash-merge subjects like `[ABC-12]: add widget`.
+`readGitDiffs(repoPath, { revRange, keyPattern })` reads local `git log --numstat` and yields one aggregated record per issue key found in commit subjects (`diff.key`, `diff.additions`, `diff.deletions`, `diff.changedFiles`). Local-first — no GitHub token or network — so it sees only history in the checkout, and commits must carry issue keys (e.g. squash subjects like `[ABC-12]: add widget`).

 ### Use cases and extension ports

-Beyond the rollup workflows above, the package surfaces a second tier of named application-layer use cases for forecasting and correlation. Each one declares its own ports so callers can inject custom data, alternate pricing, or their own quota model without touching the core. The full per-use-case contract — including the rollup-style `ComputeIssueCost` / `BackfillUsage` / `CreateAttributionWorkflow` group already covered above — lives in [`docs/architecture/use-case-catalog.md`](../../docs/architecture/use-case-catalog.md).
+Beyond the rollup workflows above, the package surfaces a second tier of named application-layer use cases for forecasting and correlation, each declaring its own injectable ports. The full per-use-case contract lives in [`docs/architecture/use-case-catalog.md`](../../docs/architecture/use-case-catalog.md).

 | Use case | What it does for callers | Extension ports |
 |---|---|---|
@@ -269,108 +208,57 @@ Beyond the rollup workflows above, the package surfaces a second tier of named a

 ### Inject a custom usage source

-`forecastIssueCost`, `forecastProjectCost`, and the join helpers all accept any iterable, async iterable, or object exposing `records()` / `iterate()` for their cost input. That lets you forecast straight from an in-memory array, a database stream, or a synthetic generator — no `~/.claude/projects` / `~/.codex/sessions` reads, no API tokens.
+`forecastIssueCost`, `forecastProjectCost`, and the join helpers accept any iterable, async iterable, or object exposing `records()` / `iterate()` for their cost input — an in-memory array, a database stream, or a synthetic generator, with no transcript reads or API tokens:

 ```js
 import { forecastIssueCost, syntheticUsageRecords } from 'llm-cost-attribution';

 // 50 spec-shaped records with a known log-normal P50/P80 distribution.
-const records = syntheticUsageRecords({
-  p50: 1_000_000, p80: 1_800_000,
-  n: 50, seed: 1,
-  size: 'L', model: 'claude-sonnet-4-6',
-});
+const records = syntheticUsageRecords({ p50: 1_000_000, p80: 1_800_000, n: 50, seed: 1, size: 'L', model: 'claude-sonnet-4-6' });

-const forecast = await forecastIssueCost(
-  { size: 'L', model: 'claude-sonnet-4-6' },
-  records,
-);
+const forecast = await forecastIssueCost({ size: 'L', model: 'claude-sonnet-4-6' }, records);
 // → { tokens: { p50, p80, n: 50 }, turns, dollars, quota, lowConfidence, empty }
 ```

-Any object exposing `records()` works too — useful when wrapping a query, a stream, or a fixture loader:
-
-```js
-const inMemorySource = {
-  async *records() {
-    for (const record of myDataset) yield record;
-  },
-};
-await forecastIssueCost({ size: 'M', model: 'claude-sonnet-4-6' }, inMemorySource);
-```
-
-The same `inMemorySource` shape is what `joinCostWithFeature({ usage, diffs })` consumes for its `usage` argument, so a single custom source feeds both the forecaster and the correlator.
+The same shape (an object with `records()`) is what `joinCostWithFeature({ usage, diffs })` consumes for `usage`, so one custom source feeds both the forecaster and the correlator.

 ### Inject a custom pricing or quota model

-The `PricingTable` and `QuotaModel` ports are injected through `forecastIssueCost`'s options. The library defaults (`DEFAULT_PRICING_TABLE`, `DEFAULT_QUOTA_MODEL`) wrap `pricing.mjs` and `quota.mjs`; substitute your own for an alternate provider, an enterprise rate card, or a synthetic test.
+The `PricingTable` (`priceFor(model, buckets)`) and `QuotaModel` (`quotaFractionFor(record)`) ports are injected through `forecastIssueCost`'s options. The library defaults (`DEFAULT_PRICING_TABLE`, `DEFAULT_QUOTA_MODEL`) wrap `pricing.mjs` and `quota.mjs`; substitute your own for an alternate provider, an enterprise rate card, or a synthetic test:

 ```js
 import { forecastIssueCost } from 'llm-cost-attribution';

-// Flat-rate $2 per million tokens, regardless of bucket split.
-// `buckets` is the spec §5.2.3 TokenBuckets shape:
-// { inputUncached, inputCached, cacheCreate5m, cacheCreate1h, outputVisible, outputReasoning }.
-const flatRatePricing = {
-  priceFor(_model, buckets) {
-    const total =
-      buckets.inputUncached + buckets.inputCached +
-      buckets.cacheCreate5m + buckets.cacheCreate1h +
-      buckets.outputVisible + buckets.outputReasoning;
-    return total * 0.000_002;
-  },
-};
-
-// Treat each issue's wall-clock as a fraction of a 5-minute SLO budget.
-const sloQuotaModel = {
-  quotaFractionFor(record) {
-    const elapsedMs = Date.parse(record.endedAt) - Date.parse(record.startedAt);
-    return elapsedMs / (5 * 60 * 1000);
-  },
-};
-
-await forecastIssueCost(
-  { size: 'L', model: 'flat-rate-1' },
-  records,
-  { pricingTable: flatRatePricing, quotaModel: sloQuotaModel },
-);
+// Flat-rate $2 per million tokens. `buckets` is the spec §5.2.3 TokenBuckets shape.
+const flatRatePricing = { priceFor: (_model, b) => Object.values(b).reduce((a, n) => a + n, 0) * 0.000_002 };
+
+await forecastIssueCost({ size: 'L', model: 'flat-rate-1' }, records, { pricingTable: flatRatePricing });
 ```

-`forecastProjectCost` takes the same `PricingTable` so a project rollup quotes dollars off whichever rate card you injected per-issue.
+`forecastProjectCost` takes the same `PricingTable`, so a project rollup quotes dollars off whichever rate card you injected per-issue.

 ### Compose `JoinCostWithFeature` with `correlateCostWithFeature`

 The pluggable join produces `{ key, feature, cost: { tokens, turns } }` pairs; the correlator consumes `{ feature, cost }` after picking a single metric:

 ```js
-import {
-  joinCostWithFeature,
-  correlateCostWithFeature,
-  readUsageRecords,
-  readGitDiffs,
-} from 'llm-cost-attribution';
+import { joinCostWithFeature, correlateCostWithFeature, readUsageRecords, readGitDiffs } from 'llm-cost-attribution';

-const usage = readUsageRecords('./usage.jsonl');
-const diffs = readGitDiffs('./my-repo');
-
-const { pairs, unjoined } = await joinCostWithFeature({
-  usage,
-  diffs,
+const { pairs } = await joinCostWithFeature({
+  usage: readUsageRecords('./usage.jsonl'),
+  diffs: readGitDiffs('./my-repo'),
   strategy: 'issue-key', // or 'worktree' | 'time'
 });

-const tokenPairs = pairs.map((p) => ({ feature: p.feature, cost: p.cost.tokens }));
-const result = correlateCostWithFeature(tokenPairs);
+const result = correlateCostWithFeature(pairs.map((p) => ({ feature: p.feature, cost: p.cost.tokens })));
 // → { n, spearman, pearsonLinear, pearsonLogLog, pearsonLogLogDropped, deciles }
 ```

-For workflows the built-in strategies don't cover, supply a caller-defined `keyOfUsage` / `keyOfDiff` (custom-key join) or a full `join(usage, diffs) → pairs` (escape hatch). Both replace the strategy entirely and are validated against the `{ feature, cost: { tokens, turns } }` contract.
+For cases the built-in strategies don't cover, supply a caller-defined `keyOfUsage` / `keyOfDiff` or a full `join(usage, diffs) → pairs`; both replace the strategy and are validated against the `{ feature, cost: { tokens, turns } }` contract.

 ### Ready vs. planned APIs

-Every export named in the use-case table above is implemented and stable, as are the rollup workflows reached through `createAttributionWorkflow`. The package has no planned, unsupported, or deprecated public APIs in this release — story-point–aware enrichment lives in the sibling [`llm-cost-estimation`](../llm-cost-estimation) package, which re-exports `forecastIssueCost`, `forecastProjectCost`, and `calibrateCoverage` and feeds them through the same `EstimateTaggedUsageSource` / `PricingTable` / `QuotaModel` ports rather than reaching across package boundaries.
-
-The convenience wrappers shipped with this package (`computeIssueCost`, `backfillUsageFromTranscripts`, `iterateUsageFromTranscripts`) still read `~/.claude/projects` / `~/.codex/sessions` locally and make no network call — no API tokens, no telemetry pipeline. Custom-port workflows are opt-in; the transcript and `usage.jsonl` paths stay key-free.
+Every export in the use-case table above is implemented and stable, as are the rollup workflows reached through `createAttributionWorkflow`; the package has no planned, unsupported, or deprecated public APIs in this release. Story-point–aware enrichment lives in the sibling [`llm-cost-estimation`](../llm-cost-estimation) package, which re-exports `forecastIssueCost`, `forecastProjectCost`, and `calibrateCoverage` through the same ports. The convenience wrappers (`computeIssueCost`, `backfillUsageFromTranscripts`, `iterateUsageFromTranscripts`) read transcripts locally with no network call; custom-port workflows are opt-in and the transcript / `usage.jsonl` paths stay key-free.

 ## What it doesn't (and can't) do

@@ -381,7 +269,7 @@ The convenience wrappers shipped with this package (`computeIssueCost`, `backfil

 ## Pricing

-`llm-cost` shows API-equivalent dollar cost per bucket from a built-in rate table ([Anthropic](https://www.anthropic.com/pricing), [OpenAI](https://platform.openai.com/docs/pricing)). **This is a counterfactual, not your actual spend:** on a subscription plan (Claude Max, Codex Pro) it's what the same tokens would cost pay-as-you-go — your real marginal cost is the quota readout, not the dollar total. The CLI warns when the table is >90 days old; `--no-pricing` suppresses the block.
+`llm-cost` shows API-equivalent dollar cost per bucket from a built-in rate table ([Anthropic](https://www.anthropic.com/pricing), [OpenAI](https://platform.openai.com/docs/pricing)). **This is a counterfactual, not your actual spend:** on a subscription plan it's what the same tokens would cost pay-as-you-go — your real marginal cost is the quota readout. The CLI warns when the table is >90 days old; `--no-pricing` suppresses the block.

 ## License

diff --git a/packages/llm-cost-estimation/README.md b/packages/llm-cost-estimation/README.md
index e27a89a..99a4b7a 100644
--- a/packages/llm-cost-estimation/README.md
+++ b/packages/llm-cost-estimation/README.md
@@ -150,11 +150,6 @@ or below the predicted P80 about 80% of the time? Re-exported from
 - Returns a per-cell and overall coverage report; cells whose coverage drifts
   from the target band beyond `deviationThreshold` are flagged.

-### `calibrate` — deprecated
-
-`calibrate` was a never-implemented placeholder. It now throws an error naming
-its replacement, `calibrateCoverage` (above). Do not use it.
-
 ### Use cases and extension ports

 The package is structured around named application-layer use cases — each one with declared ports you can inject (a custom estimate source, a custom usage source, an alternate pricing or quota model). The estimation core stays key-free and tracker-agnostic; only the ports talk to Linear, your filesystem, or your pricing service. The full per-use-case contract lives in [`docs/use-cases.md`](docs/use-cases.md).
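
The hold-out check that the `calibrate` paragraph in this patch describes (split a cell with a seed, forecast a P80 from the remainder, count held-out actuals at or below it) can be sketched in a few lines. This is an illustration under stated assumptions, not the package's implementation: `mulberry32`, `percentile`, and `holdoutCoverage` are hypothetical helpers, the nearest-rank percentile is only one of several reasonable definitions, and the 0.15 tolerance is a made-up stand-in for `--threshold`.

```js
// Hypothetical helper: small seeded PRNG so the split is repeatable
// (plays the role the README gives to `--seed`).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: nearest-rank percentile, i.e. the smallest sample
// value with at least a fraction q of the sample at or below it.
function percentile(values, q) {
  const sorted = [...values].sort((x, y) => x - y);
  const rank = Math.max(1, Math.ceil(q * sorted.length));
  return sorted[rank - 1];
}

// Hold out a fraction of one cell's costs, forecast P80 from the rest,
// and report how often the held-out actuals land at or below that P80.
// Assumes the cell has at least two records so the training slice is non-empty.
function holdoutCoverage(costs, { seed = 1, holdout = 0.2 } = {}) {
  const rand = mulberry32(seed);
  const shuffled = [...costs];
  for (let i = shuffled.length - 1; i > 0; i--) { // Fisher-Yates shuffle
    const j = Math.floor(rand() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const nHeld = Math.max(1, Math.round(holdout * shuffled.length));
  const held = shuffled.slice(0, nHeld);
  const train = shuffled.slice(nHeld);
  const p80 = percentile(train, 0.8);
  const hits = held.filter((c) => c <= p80).length;
  return { p80, nTrain: train.length, nHeld, coverage: hits / nHeld };
}

// Synthetic token costs for a single { size, model } cell (not real data).
const costs = Array.from({ length: 100 }, (_, i) => (i + 1) * 1000);
const report = holdoutCoverage(costs, { seed: 1, holdout: 0.2 });
const flagged = Math.abs(report.coverage - 0.8) > 0.15; // assumed tolerance
console.log(report, flagged ? 'flagged' : 'ok');
```

With only `nHeld` held-out issues the coverage figure moves in steps of 1/`nHeld`, which is one way to see why the README text advises treating per-cell flags as directional on small datasets.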