fix: dedup release IDs during CSV import #33

Merged
jakebromberg merged 1 commit into main from fix/dedup-release-ids
Mar 10, 2026

@jakebromberg

Summary

  • Add "unique_key": ["id"] to the release table config in import_csv.py, matching the dedup pattern used by release_artist, release_label, and release_track_artist
  • Duplicate release IDs in the CSV are now skipped (first occurrence wins) instead of causing a UniqueViolation on release_pkey
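
As a rough illustration of the first-occurrence-wins behavior described above, here is a minimal sketch. It is not the code in import_csv.py: the `dedup_rows` helper, the `"table"` config field, and the sample CSV are all hypothetical, and only the `"unique_key": ["id"]` entry comes from this PR.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def dedup_rows(rows: Iterable[dict], unique_key: Sequence[str]) -> Iterator[dict]:
    """Yield only rows whose unique_key values have not been seen yet."""
    seen = set()
    for row in rows:
        key = tuple(row[col] for col in unique_key)
        if key in seen:
            continue  # a later duplicate is skipped; the first occurrence wins
        seen.add(key)
        yield row


# Hypothetical config entry; only "unique_key": ["id"] is taken from the PR.
release_config = {"table": "release", "unique_key": ["id"]}

# Made-up sample data with a duplicate release ID.
sample_csv = "id,title\n1,First\n2,Second\n1,Duplicate of first\n"

reader = csv.DictReader(io.StringIO(sample_csv))
rows = list(dedup_rows(reader, release_config["unique_key"]))
```

Here `rows` holds two entries, and the one with ID `1` keeps the title from its first occurrence. Building the key as a tuple means the same helper would also cover composite keys, which is presumably what the other tables need.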

Closes #32

Cross-ref: WXYC/discogs-xml-converter#19 fixes the root cause in the converter.

Test plan

  • pytest tests/unit/test_import_csv.py — new test test_release_table_has_unique_key_on_id and updated test_tables_with_unique_constraints_have_unique_key
  • pytest -m postgres tests/integration/test_import.py::TestDuplicateReleaseIds — imports CSV with duplicate release IDs, verifies only first is kept
  • All existing tests pass (319 unit tests, plus the integration suite)
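
For readers without a PostgreSQL instance, the failure mode and the fix that the integration test checks can be approximated with an in-memory SQLite database. This is an analogy only: SQLite raises `sqlite3.IntegrityError` where PostgreSQL reports a UniqueViolation on release_pkey, plain INSERTs stand in for COPY, and the table schema and sample rows are made up for the sketch.

```python
import sqlite3

# Made-up rows; release ID 1 appears twice.
rows = [(1, "First"), (2, "Second"), (1, "Duplicate of first")]


def load(conn: sqlite3.Connection, data) -> None:
    """Create a release table with a primary key on id and insert the rows."""
    conn.execute("CREATE TABLE release (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO release (id, title) VALUES (?, ?)", data)


# Without dedup: the second row with id=1 violates the primary key.
raw_conn = sqlite3.connect(":memory:")
try:
    load(raw_conn, rows)
    failed = False
except sqlite3.IntegrityError:
    failed = True

# With dedup: keep only the first row seen for each id, then load.
first_by_id = {}
for release_id, title in rows:
    first_by_id.setdefault(release_id, (release_id, title))

dedup_conn = sqlite3.connect(":memory:")
load(dedup_conn, list(first_by_id.values()))
kept = dedup_conn.execute("SELECT id, title FROM release ORDER BY id").fetchall()
```

The raw load fails on the duplicate key, while the deduplicated load succeeds and keeps the first title for ID 1.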

The release table config lacked a unique_key, so duplicate release IDs in the CSV would cause a UniqueViolation on the release_pkey constraint during COPY. Add unique_key: ["id"] to skip duplicates (keeping the first occurrence), matching the pattern already used by release_artist, release_label, and release_track_artist.
@jakebromberg merged commit 66320a4 into main on Mar 10, 2026
3 checks passed
@jakebromberg deleted the fix/dedup-release-ids branch on March 10, 2026, 06:24
Development

Successfully merging this pull request may close these issues.

Duplicate release IDs in CSV cause UniqueViolation on import