Phoebe: add resource_id to rated_usage rollup grain (billing customer attribution, E2) by hhuuggoo · Pull Request #22 · saturncloud/phoebe

hhuuggoo · 2026-06-16T01:53:43Z

The rater currently DROPS resource_id — but billing needs it to identify the customer org (E2: bill the deployment's org via resource_id→org_id). resource_id is already on billing_event (plumbed from the proxy through the drainer); this change carries it through the rater into the rated_usage grain. No prod data exists (rater has no CronJob, nothing deployed), so the existing 0002 migration revision is edited in place, not stacked.

Contracts

rated_usage grain / unique key — changes from (auth_id, model_id, window_start) to (auth_id, resource_id, model_id, window_start). The constraint is renamed rated_usage_auth_resource_model_window_uq. model_id is KEPT (not redundant with resource_id): price resolution is per-model, one deployment can serve multiple models/adapters at different rates, and collapsing models would sum traffic priced at different rates. Two deployments of the same model by the same auth in one hour → two rows (correct — they may bill to different orgs). The ON CONFLICT target, the ORDER BY (deadlock-free lock order), the deterministic md5 surrogate-key natural key (length-prefixed, fixed field order auth_id|resource_id|model_id|epoch), and the reconcile deleted CTE anti-join all move to the new key in lockstep.
New index — rated_usage_resource_id_window_start_ix on (resource_id, window_start). E2 reads rated_usage by deployment over a time window (resolve the org, then sum that deployment's cost); a resource_id-leading index makes that a tight slice rather than a scan over the auth-leading index where resource_id only trails.
NULL-resource_id → unattributable (fail closed) — resource_id is NULLABLE on billing_event but the new key column is NON-NULL. A row that can't name its deployment/org CANNOT be billed. The grouped/priced filter requires resource_id IS NOT NULL; the unattributable partition counts resource_id IS NULL (alongside auth_id/model_id); the unpriced count requires full attribution so a NULL-resource_id unpriced row is counted ONLY as unattributable. Net: a NULL-resource_id row is COUNTED as unattributable (surfaced, exits nonzero), never silently $0-billed or billed to a NULL org. The partition invariant still holds exactly: events_rated + unpriced + unattributable + ambiguous == total in-window events.

Tests

Go oracle (oracleStore) grain + md5 mirror the new key; SQL-shape fragments updated.
TestRater_DistinctDeploymentsBillSeparately — two deployments, same auth+model+hour → two rows (Go oracle); live-PG twin TestIntegration_ResourceIDGrainAndFailClosed.
TestRater_NullResourceIdIsUnattributable + extended TestRater_UnattributableCountedNotSilent — NULL resource_id counted, never billed; partition holds.
Integration conformance asserts every written rollup carries the seeded resource_id; e2e asserts the rollup carries X-Saturn-Resource-Id end-to-end.

Gate (all green)

go build ./..., go vet ./... + go vet -tags=integration ./..., go test -race ./..., golangci-lint v1.64.8 (plain + --build-tags=integration), gofmt -l . empty, and the full live-Postgres integration + e2e suite (PHOEBE_TEST_DATABASE_URL=… on postgres:16).

Note for Hugo

The grain decision (KEEP model_id, ADD resource_id) is flagged for your awareness: Ben endorsed it; there is no prod data; it is the correct grain for E2. Two same-model-same-auth deployments in one hour now bill as two rows by design.

…ibution)

The rater dropped resource_id, but billing needs it to identify the customer org (E2: bill the deployment's org via resource_id→org_id). resource_id is already on billing_event (nullable); plumb it through the rater into the rated_usage grain.

New grain / unique key: (auth_id, resource_id, model_id, window_start). model_id stays: price resolution is per-model and one deployment can serve multiple models/adapters at different rates, so collapsing models would sum traffic priced differently. Two deployments of the same model by the same auth in one hour now correctly produce two rows (they may bill to different orgs).

FAIL-CLOSED ATTRIBUTION: resource_id is NULLABLE on billing_event but the new key column is NON-NULL. A row that can't name its deployment/org CANNOT be billed. The grouped/priced filter requires resource_id IS NOT NULL, and the unattributable partition counts resource_id IS NULL, so a NULL-resource_id row is surfaced (exits nonzero), never silently $0-billed or billed to a NULL org. The partition invariant holds: events_rated + unpriced + unattributable + ambiguous == total in-window events.

Touch-points: ev/resolved/grouped/priced CTEs, the md5 surrogate-key natural key (length-prefixed, fixed order), ON CONFLICT target + ORDER BY (lock order), the reconcile `deleted` CTE anti-join, both migrations (0002_rating.sql + alembic, edited in place: unapplied, no prod data), a new (resource_id, window_start) index for E2 per-deployment reads, doc comments, and the oracle + SQL-shape + integration + e2e tests.

New negative tests: DistinctDeploymentsBillSeparately and NullResourceIdIsUnattributable (Go oracle + live-PG), plus a resource_id assertion in the e2e rollup read.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
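A length-prefixed, fixed-order md5 key of the kind named in the touch-points might look like the following in Go. This is a sketch, not the repository's encoding: the `surrogateKey` name and the exact byte layout (`<len>:<value>|` per field) are assumptions. Only the field order auth_id, resource_id, model_id, epoch comes from the PR description.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// surrogateKey hashes the natural key in a fixed field order.
// Each field is prefixed with its byte length so that a separator
// character inside a value cannot shift a field boundary.
// The "<len>:<value>|" layout is an assumed encoding for illustration.
func surrogateKey(authID, resourceID, modelID string, epoch int64) string {
	fields := []string{authID, resourceID, modelID, strconv.FormatInt(epoch, 10)}
	var b strings.Builder
	for _, f := range fields {
		b.WriteString(strconv.Itoa(len(f)))
		b.WriteByte(':')
		b.WriteString(f)
		b.WriteByte('|')
	}
	sum := md5.Sum([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	// A plain "|" join would encode both of these as "a|b|c|m|0".
	k1 := surrogateKey("a|b", "c", "m", 0)
	k2 := surrogateKey("a", "b|c", "m", 0)
	fmt.Println(k1 != k2)
}
```

The length prefix is what keeps the key injective over the four fields; the fixed order is what keeps it deterministic between the SQL side and any Go mirror of it.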

hhuuggoo · 2026-06-16T02:02:10Z

🔋 Battery — Round 1 (status: ESCALATE, but clean)

9 raw → 6 refuted → 3 confirmed (all low, all test files) + 1 persona. Zero production findings — the resource_id grain + fail-closed attribution partition are solid; the verifiers refuted the scary candidates. The two escalations are minor judgment calls, both with clear answers I'm applying (neither is a contract door):

1. Drop the speculative (resource_id, window_start) index. It was added "for E2 reads by deployment" — but no reader of rated_usage by resource_id exists in this repo (that's the Atlas/Stripe consumer, not built). Ben's own "forward intent without speculative build" rule says: don't pay write-amplification on every rated_usage upsert for a reader that may land later. Fix: remove the index here; it lands in the PR that adds the E2 reader, with proven usage. (The auth_id and window_start indexes stay — they have in-repo consumers.)

2. Document the empty-string-vs-NULL invariant at the oracle (comment only). The oracle uses Go "" to model SQL NULL; prod SQL filters resource_id IS NULL. The verifier proved '' is unreachable in prod: the proxy billing gate fails closed on empty ResourceID before metering, and the drainer's nullStr maps ''→NULL on write. So it's a test-fidelity symmetry note, not a bug. Fix: a comment at the oracle noting "" models NULL and prod guarantees ''→NULL at the drain. (Same pre-existing convention auth_id/model_id already rely on.)
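The ''→NULL mapping described in item 2 can be sketched with the standard library's `sql.NullString`. The function below is a hypothetical stand-in named `nullStrSketch`; the drainer's actual `nullStr` signature is not shown in this PR, so the parameter and return types here are assumptions.

```go
package main

import (
	"database/sql"
	"fmt"
)

// nullStrSketch maps the empty string to SQL NULL and passes any other
// value through. Hypothetical stand-in for the drainer's nullStr helper.
func nullStrSketch(s string) sql.NullString {
	return sql.NullString{String: s, Valid: s != ""}
}

func main() {
	for _, in := range []string{"", "dep-a"} {
		// Value() is what a database driver receives: nil means NULL.
		v, err := nullStrSketch(in).Value()
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q -> %v\n", in, v)
	}
}
```

With a write-side mapping like this, an empty resource_id never reaches the table as '', so a filter on resource_id IS NULL sees every unattributed row, and a Go oracle that models NULL as "" stays consistent with it.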

Plus two low-sev test cleanups (store_test.go assertion tidies).

The core change is correct: the partition invariant rated + unpriced + unattributable + ambiguous == total holds, NULL-resource_id is fail-loud unattributable (live-PG tested). Applying the two fixes + cleanups, then a confirming dry round.

Battery wf_bc105ec5-a2f. The grain change itself needs no rework — only the speculative index + a doc comment.

hhuuggoo · 2026-06-16T02:15:08Z

Merged to main via the squashed resource_id grain change (e854bba).

hhuuggoo mentioned this pull request Jun 16, 2026

Phoebe: drop speculative resource_id index + document oracle NULL-modeling (battery cleanup) #23

Closed

hhuuggoo closed this Jun 16, 2026

hhuuggoo wants to merge 1 commit into main from rated-usage-resource-id
