We have a complete implementation of breaking change detection in src/db/diff/breaking.rs. This module analyzes schema diffs and classifies changes as either safe (can deploy directly) or breaking (requires a mitigation strategy).
- `MitigationStrategy` enum with four strategies: `DualWrite`, `Backfill`, `Ratchet`, `Destructive`
- `BreakingChangeKind` enum with 17 specific change types, each mapped to a mitigation strategy
- `BreakingChange` struct with kind, mitigation strategy, and human-readable description
- `BreakingChangeAnalysis` aggregator with query methods (`is_safe()`, `by_mitigation()`, `count_by_mitigation()`)
- `analyze_breaking_changes()` function that walks a `NamespaceDiff`
- Type change classification (safe widening vs breaking narrowing)
- 94 passing tests (unit + integration)
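The summary above names the core types; a rough sketch of what the public surface likely looks like follows (field shapes, the `as_str` strings, and the public `changes` field are assumptions, and the 17-variant `BreakingChangeKind` is omitted for brevity):

```rust
/// What kind of process is required to execute a change safely.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MitigationStrategy {
    DualWrite,
    Backfill,
    Ratchet,
    Destructive,
}

impl MitigationStrategy {
    /// Short label used when printing a change (exact strings are assumed).
    pub fn as_str(&self) -> &'static str {
        match self {
            MitigationStrategy::DualWrite => "dual-write",
            MitigationStrategy::Backfill => "backfill",
            MitigationStrategy::Ratchet => "ratchet",
            MitigationStrategy::Destructive => "destructive",
        }
    }
}

/// A single detected breaking change.
/// (The `kind: BreakingChangeKind` field from the summary is omitted here.)
pub struct BreakingChange {
    pub mitigation: MitigationStrategy,
    pub description: String,
}

/// Aggregated results of analyzing a `NamespaceDiff`.
pub struct BreakingChangeAnalysis {
    pub changes: Vec<BreakingChange>,
}

impl BreakingChangeAnalysis {
    /// No breaking changes found: the diff can be applied directly.
    pub fn is_safe(&self) -> bool {
        self.changes.is_empty()
    }

    pub fn len(&self) -> usize {
        self.changes.len()
    }

    pub fn iter(&self) -> std::slice::Iter<'_, BreakingChange> {
        self.changes.iter()
    }

    /// Count changes that require a given mitigation strategy.
    pub fn count_by_mitigation(&self, strategy: MitigationStrategy) -> usize {
        self.changes.iter().filter(|c| c.mitigation == strategy).count()
    }
}
```

The aggregator keeps the per-change detail while letting callers ask the one question that matters first: `is_safe()`.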
```rust
use tern::db::diff::{diff_namespaces, NamespaceDiff};
use tern::db::diff::breaking::{analyze_breaking_changes, MitigationStrategy};

let diff = diff_namespaces(&source, &target);
let analysis = analyze_breaking_changes(&diff);

if analysis.is_safe() {
    println!("Migration is safe to apply directly");
} else {
    println!("Found {} breaking changes:", analysis.len());
    for change in analysis.iter() {
        println!("  [{}] {}", change.mitigation.as_str(), change.description);
    }

    // Query by mitigation strategy
    let ratchet_count = analysis.count_by_mitigation(MitigationStrategy::Ratchet);
    println!("Changes requiring NOT VALID pattern: {}", ratchet_count);
}
```

The original implementation used a `ChangeSeverity` enum with three levels:
- `NonBreaking` - Safe changes
- `Warning` - Might fail depending on data
- `Breaking` - Definitely problematic
The "Warning" category was flawed. If a migration might fail, it IS breaking. You cannot deploy it with confidence.
Rather than classifying by severity, we classify by what kind of process is required to safely execute the change:
| Strategy | Description | Examples |
|---|---|---|
| `DualWrite` | Requires parallel structures with synchronized writes | Rename column/table, change column type |
| `Backfill` | Requires populating data before completion | Add NOT NULL to existing column |
| `Ratchet` | Requires NOT VALID + backfill + VALIDATE pattern | Add UNIQUE/CHECK/FK/PK constraint |
| `Destructive` | Intentionally removes data/structure (irreversible) | Drop table/column, remove enum value |
- Binary safety: A change is either Safe or it requires mitigation - no ambiguous middle ground
- Actionable: Each strategy implies a specific decomposition pattern
- Pattern-based: Maps directly to known PostgreSQL migration patterns
- Time-aware: Acknowledges that some changes fundamentally cannot be atomic
PostgreSQL provides a mechanism to safely add constraints:
```sql
-- Step 1: Add the constraint without validating existing data (instant, non-blocking).
-- PostgreSQL accepts NOT VALID on CHECK and FOREIGN KEY constraints.
ALTER TABLE users
  ADD CONSTRAINT users_email_present CHECK (email IS NOT NULL) NOT VALID;

-- Step 2: New inserts/updates are now validated (the "ratchet" is engaged).
-- Meanwhile, fix any existing violations through backfill/cleanup.

-- Step 3: Once all data complies, validate the constraint.
ALTER TABLE users VALIDATE CONSTRAINT users_email_present;
```

This pattern creates a ratchet: once engaged, it prevents new violations while giving you time to fix existing ones.
To rename users.email → users.email_address:
1. Add new column `email_address` (Safe)
2. Deploy application that writes to BOTH columns
3. Backfill: `UPDATE users SET email_address = email WHERE email_address IS NULL`
4. Deploy application that reads from the new column
5. Deploy application that writes ONLY to the new column
6. Drop old column `email` (Destructive, but now safe because nothing uses it)
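The schema-side steps of this decomposition can be sketched in SQL (column names from the example above; the application deploys happen between these statements):

```sql
-- Step 1: add the new column; no DEFAULT, so no table rewrite
ALTER TABLE users ADD COLUMN email_address text;

-- (deploy application writing to BOTH columns)

-- Step 3: backfill; batch this on large tables to limit lock/WAL pressure
UPDATE users SET email_address = email WHERE email_address IS NULL;

-- (deploy application reading from, then writing only to, the new column)

-- Step 6: drop the old column once nothing references it
ALTER TABLE users DROP COLUMN email;
```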
To add NOT NULL to users.email:
1. Add CHECK constraint with NOT VALID: `CHECK (email IS NOT NULL) NOT VALID` (Ratchet)
2. Backfill any NULL values
3. Validate: `VALIDATE CONSTRAINT ...`
4. Add the actual NOT NULL: `ALTER COLUMN email SET NOT NULL`
5. Drop the CHECK constraint (now redundant)

On PostgreSQL 12+, step 4 is cheap: the validated CHECK constraint already proves no NULLs exist, so `SET NOT NULL` skips the full-table scan.
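These five steps, sketched as SQL (constraint name and the backfill value are placeholders):

```sql
-- Step 1: engage the ratchet; new rows must satisfy the check immediately
ALTER TABLE users
  ADD CONSTRAINT users_email_not_null CHECK (email IS NOT NULL) NOT VALID;

-- Step 2: backfill NULLs (the real value source is application-specific)
UPDATE users SET email = 'unknown@example.invalid' WHERE email IS NULL;

-- Step 3: validate; scans the table but does not block reads or writes
ALTER TABLE users VALIDATE CONSTRAINT users_email_not_null;

-- Step 4: the real NOT NULL; the validated CHECK lets it skip the scan (PG 12+)
ALTER TABLE users ALTER COLUMN email SET NOT NULL;

-- Step 5: the CHECK is now redundant
ALTER TABLE users DROP CONSTRAINT users_email_not_null;
```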
To add UNIQUE(email): PostgreSQL does not accept `NOT VALID` on UNIQUE constraints, so here the ratchet role is played by a concurrently built index:

1. Fix any existing duplicates (application-specific logic)
2. Build the index without blocking writes: `CREATE UNIQUE INDEX CONCURRENTLY ...` (if a duplicate sneaks in, the build fails and leaves an invalid index to drop and retry)
3. Promote the index to a constraint: `ADD CONSTRAINT ... UNIQUE USING INDEX ...`
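A sketch of adding the unique constraint without long blocking locks (PostgreSQL does not support `NOT VALID` on UNIQUE constraints, so the concurrently built index substitutes for it; index/constraint names are assumed):

```sql
-- Build without blocking writes. Fails, leaving an INVALID index to drop
-- and retry, if duplicates exist at any point during the build.
CREATE UNIQUE INDEX CONCURRENTLY users_email_unique ON users (email);

-- Promote the index to a constraint; instant, since the index already exists.
ALTER TABLE users
  ADD CONSTRAINT users_email_unique UNIQUE USING INDEX users_email_unique;
```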
Drop operations are fundamentally different:
- They're often intentional (cleaning up unused structures)
- They can't be "decomposed" - they're the end state
- But they must be verified safe (nothing references the dropped object)
For drops, the decomposition is temporal:
- Stop using the object in application code
- Wait for all old application instances to drain
- Perform the drop
1. How do we handle changes that combine multiple categories?
   - Example: Rename column AND change type simultaneously
   - Likely answer: Decompose into separate changes, each with its own category
2. Should "Destructive" be further subdivided?
   - "Intentional removal" vs "Data loss risk"
   - Dropping an unused table is different from dropping a table with data
3. How do we represent the decomposed migration steps?
   - This module detects breaking changes
   - A separate module would generate the decomposition
   - What's the interface between them?
4. What about lock-related concerns?
   - Some operations require `ACCESS EXCLUSIVE` locks
   - Adding an index without `CONCURRENTLY` blocks writes
   - Is this a separate axis of classification?
- PostgreSQL: Adding NOT VALID constraints
- Strong Migrations gem (Ruby) - similar concept in Rails ecosystem
- Expand-Contract Pattern - Martin Fowler on parallel change