fix: handle audit_log pkey unique violation in log_changes (Ecto.ConstraintError) (OPS-4622)#80
Closed
palantir-valiot[bot] wants to merge 1 commit into
Closed
Conversation
…traintError)
- Move changelog_changeset to EctoTrail.Changelog.changeset/1
- Add unique_constraint(:id, name: <table>_pkey) derived from :table_name config
- Existing log error paths already swallow the {:error, changeset} and do not rollback caller tx
- Add :db-tagged test that forces pkey collision and asserts user operation succeeds
- Version 1.0.4 + CHANGELOG entry
- Guard test_helper DB setup so suite loads without Postgres (tests still run :db tag in CI)
Closes OPS-4622
Member
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Add
unique_constraint/2on the audit log primary key (name derived from the configured table, defaultaudit_log_pkey) insideEctoTrail.Changelog.changeset/1. The internallog_changes/5andlog_changes_alone/6(called byupdate_and_log/4,insert_and_log/4, etc.) perform a plainrepo.insert/1on the changelog row. When a duplicate pkey occurs (race, retry, or concurrent update_and_log for the same resource in the same instant), Postgres raises a unique violation onaudit_logs_pkey. Without the constraint declaration this surfaces as an unhandledEcto.ConstraintErrorthat kills the caller'sRepo.transaction(visible in the stack asElixir.ValiotApp.Repo.transaction/2→EctoTrail.update_and_log/4→log_changes/5).The call sites already treat
{:error, reason}from the log insert as a soft failure: theyLogger.errorand return{:ok, reason}(or equivalent inlog_bulk), deliberately avoidingtx_repo.rollbackso the user's data mutation succeeds. By turning the pkey collision into a normal changeset error viaunique_constraint/2, the existing resilience paths now cover this case as well.No behavior change for the happy path. A new integration test (tagged
:db) reproduces the collision by pre-occupying a pkey value and asserts the userupdate_and_logstill returns{:ok, updated_struct}with noConstraintError.Why
Linear: OPS-4622 — production
Ecto.ConstraintErroronaudit_logs_pkeyduringEctoTrail.update_and_log(first-party package). Triage: NOTIFY+FIX, severity high, category code_bug. The root cause is missingunique_constrainthandling for the pkey in the log_changes path.Test plan
mix format --check-formattedmix credo --strict(0 issues after adding@spec)mix compile(warnings pre-existing and unrelated to this diff)mix test --exclude db(all non-DB tests excluded cleanly; DB tests require Postgres which is not present in this agent pod — they are the ones exercising the fix and the new pkey-collision scenario)test/unit/ecto_trail_test.exsthat forces anaudit_logpkey collision insideupdate_and_logand asserts the user update succeeds without raisingEcto.ConstraintErrormix.exsCHANGELOG.mdpalantir/OPS-4622-handle-audit-log-pkey-collisionFiles changed (from
git diff --stat)Assumptions
id(standard Ecto primary key) and the constraint name in Postgres is<table_name>_pkey(the default Postgres names for acreate table(:foo)withoutprimary_key: false+ explicitadd :id, ... primary_key: true). This matches the migration in the package and the error message in the incident (audit_logs_pkey).:table_namewill get the matching<table_name>_pkeyconstraint name automatically.log_changes*is intentional and stable (it was added for the Ecto.Multi case in 1.0.3).Closes OPS-4622