
fix: drop and recreate tables in schema to prevent stale data conflicts #37

Merged
jakebromberg merged 1 commit into main from fix/schema-drop-recreate on Mar 11, 2026

@jakebromberg (Member) commented on Mar 11, 2026

Closes #36

Summary

  • Change create_database.sql from CREATE TABLE IF NOT EXISTS to DROP TABLE IF EXISTS ... CASCADE followed by CREATE TABLE
  • Ensures a clean state on every fresh pipeline run, preventing UniqueViolation on release_pkey during re-imports
  • --resume mode skips schema creation entirely, so in-progress work is unaffected
  • Add regression test test_schema_clears_stale_data_on_rerun that inserts data, re-runs schema, and verifies tables are empty
  • Update CI to run all tests with --cov-fail-under=80 coverage enforcement

Test plan

  • Schema is idempotent: running twice produces no errors
  • Regression test verifies stale data is cleared on schema re-run
  • All 594 tests pass (unit + integration + E2E)
  • Coverage at 84%, threshold of 80% enforced in CI
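
The regression check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual test: it uses Python's built-in sqlite3 as a stand-in for the real database, the `SCHEMA` string is a hypothetical one-table substitute for create_database.sql, and `apply_schema` and `count_rows` are assumed helper names. SQLite does not accept CASCADE on DROP TABLE, so it is omitted here.

```python
import sqlite3

# Hypothetical stand-in for create_database.sql (the real schema is not shown
# in this PR). SQLite has no CASCADE on DROP TABLE, so it is left out.
SCHEMA = """
DROP TABLE IF EXISTS release;
CREATE TABLE release (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
"""


def apply_schema(conn: sqlite3.Connection) -> None:
    """Run the schema script; dropping first makes repeated calls safe."""
    conn.executescript(SCHEMA)


def count_rows(conn: sqlite3.Connection) -> int:
    """Return the number of rows currently in the release table."""
    return conn.execute("SELECT COUNT(*) FROM release").fetchone()[0]


def test_schema_clears_stale_data_on_rerun() -> None:
    conn = sqlite3.connect(":memory:")
    try:
        apply_schema(conn)
        # Simulate data left behind by a previous pipeline run.
        conn.execute("INSERT INTO release (id, title) VALUES (1, 'stale row')")
        conn.commit()
        assert count_rows(conn) == 1

        # Re-running the schema must not raise and must leave the table empty.
        apply_schema(conn)
        assert count_rows(conn) == 0
    finally:
        conn.close()
```

Run under pytest, this covers both test-plan items at once: the second `apply_schema` call shows the script is idempotent, and the final assertion shows stale rows are gone.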

The schema used CREATE TABLE IF NOT EXISTS, which preserved data from previous pipeline runs. On re-runs, import_csv would hit UniqueViolation on release_pkey because old rows conflicted with new COPY data. The in-memory dedup only catches duplicates within the CSV, not against pre-existing table rows.

Change to DROP TABLE IF EXISTS ... CASCADE followed by CREATE TABLE, ensuring a clean state for every fresh pipeline run. The --resume mode skips schema creation entirely, so in-progress work is unaffected.
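
The failure mode and the fix can be reproduced in miniature. The sketch below is an assumption-laden illustration: sqlite3 stands in for the real database (so the error is `sqlite3.IntegrityError` rather than UniqueViolation, and CASCADE is omitted), and `dedup_rows`, `run_pipeline`, and `second_run_succeeds` are hypothetical names, not functions from this repository.

```python
import sqlite3

# Hypothetical one-table schemas standing in for the old and new
# create_database.sql. SQLite has no CASCADE on DROP TABLE.
OLD_SCHEMA = "CREATE TABLE IF NOT EXISTS release (id INTEGER PRIMARY KEY, title TEXT);"
NEW_SCHEMA = (
    "DROP TABLE IF EXISTS release;"
    "CREATE TABLE release (id INTEGER PRIMARY KEY, title TEXT);"
)


def dedup_rows(rows):
    """Drop duplicate ids within one batch; it cannot see rows already in the table."""
    seen = set()
    unique = []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            unique.append(row)
    return unique


def run_pipeline(conn, schema, rows):
    """Apply the schema, then bulk-insert the deduplicated batch."""
    conn.executescript(schema)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO release (id, title) VALUES (?, ?)", dedup_rows(rows)
        )


def second_run_succeeds(schema) -> bool:
    """Run the pipeline twice against one database; report whether the re-run works."""
    conn = sqlite3.connect(":memory:")
    rows = [(1, "A"), (1, "A"), (2, "B")]  # in-batch duplicate is handled by dedup
    try:
        run_pipeline(conn, schema, rows)
        run_pipeline(conn, schema, rows)  # re-import of the same data
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
```

With these definitions, `second_run_succeeds(OLD_SCHEMA)` returns `False` because the rows from the first run collide on the primary key, while `second_run_succeeds(NEW_SCHEMA)` returns `True` because the table is recreated empty before the import.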

@jakebromberg merged commit 0f0831e into main on Mar 11, 2026
3 checks passed
@jakebromberg deleted the fix/schema-drop-recreate branch on March 11, 2026 05:27

Development

Successfully merging this pull request may close these issues.

Schema uses CREATE TABLE IF NOT EXISTS, causing stale data conflicts on re-runs