Skip to content

fix(data): make phase2 enrichment idempotent (#312)#320

Merged
w7-mgfcode merged 2 commits into
devfrom
fix/seeder-phase2-enrichment-idempotent
May 26, 2026
Merged

fix(data): make phase2 enrichment idempotent (#312)#320
w7-mgfcode merged 2 commits into
devfrom
fix/seeder-phase2-enrichment-idempotent

Conversation

@w7-mgfcode
Copy link
Copy Markdown
Owner

Summary

Resolves #312. A second POST /seeder/phase2-enrichment against an already-enriched scope no longer raises IntegrityError on uq_exogenous_signal_per_store (which previously surfaced as HTTP 500 and blocked PRP-41's manual showcase_rich dogfood).

  • exogenous_signalpg_insert(...).on_conflict_do_nothing() (target-free, covers both partial unique indexes: uq_exogenous_signal_global + uq_exogenous_signal_per_store).
  • replenishment_event / sales_returns — no natural-key unique constraint exists, so each section guards itself with an existence check inside the seeded date window; if any row already exists in scope, the whole insert is skipped.
  • product lifecycle UPDATE — already idempotent under a fixed seed; left in place.
  • Phase2EnrichmentResponse — additive records_skipped: dict[str, int] (default empty dict; existing unit tests still construct the model without it).
  • Defensive net — residual IntegrityError (should not fire after the guards above) is mapped to ConflictError(409) RFC 7807 instead of bubbling as a raw 500.

No new migration: existing constraints are sufficient; the bug was purely in how the inserts handled re-runs.

Test plan

  • uv run ruff check . && uv run ruff format --check .
  • uv run mypy app/ — no new errors (pre-existing xgboost stub gap on dev unchanged)
  • uv run pyright app/features/seeder/ — clean
  • uv run pytest -v -m "not integration" — 1626 passed, 12 skipped
  • uv run pytest -v -m integration app/features/seeder/tests/ — new test_phase2_idempotency.py passes against the live Postgres
  • git diff --check — only the project's pre-existing CRLF noise on touched files (per memory anchor [[repo-line-endings-crlf]]); the new test file is clean LF

The new integration test is non-destructive — it reuses whatever the dev DB already has (skipping cleanly on an empty DB) and only proves the second call doesn't error and reports records_skipped > 0 / records_created == 0 for the three insert tables.

🤖 Manual fix per HANDOFF execution prompt for issue #312

A second POST /seeder/phase2-enrichment against an already-enriched scope
no longer raises IntegrityError (previously: uq_exogenous_signal_per_store
surfaced as HTTP 500, blocking PRP-41's manual showcase_rich dogfood).

- exogenous_signal: pg_insert(...).on_conflict_do_nothing() — target-free
  ON CONFLICT covers both partial unique indexes (global + per-store)
- replenishment_event / sales_returns (no natural-key unique constraint):
  section-level existence check inside the seeded date window; skip the
  whole insert when rows already exist
- product lifecycle UPDATE: already idempotent under a fixed seed; left
  in place
- Phase2EnrichmentResponse gains additive records_skipped: dict[str, int]
- Defensive IntegrityError -> ConflictError(409) wrap as the belt-and-braces
  net (the idempotency guards above should make it unreachable)

Adds a non-destructive integration test (app/features/seeder/tests/
test_phase2_idempotency.py) that calls the endpoint twice against the
live Postgres and asserts records_skipped > 0 + records_created == 0 on
the second pass for all three insert tables.
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @w7-mgfcode, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1115a7e2-ee46-4326-b5ce-f1cecf2cd883

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/seeder-phase2-enrichment-idempotent

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

CI's freshly-migrated DB has only stray data from earlier tests'
fixtures — too sparse for the ReplenishmentGenerator or ReturnsGenerator
to produce any records. The original assertion required
records_skipped > 0 for all three insert tables, which fails when
nothing was generated.

Relaxed: assert created == 0 on the second call for all three insert
tables (the canonical idempotency proof — no new rows). Sanity guard
soft-skips when none of the three tables exercised the idempotency
path (otherwise the test would be meaningless). The exogenous_signal
ON CONFLICT DO NOTHING path — the original IntegrityError surface — is
exercised by any DB with at least one Store + one date.
@w7-mgfcode w7-mgfcode merged commit b3ba1f4 into dev May 26, 2026
8 checks passed
@w7-mgfcode w7-mgfcode deleted the fix/seeder-phase2-enrichment-idempotent branch May 26, 2026 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant