
fix(planner): emit executable DDL for NOT NULL columns on non-empty tables #241

Merged
jkomyno merged 53 commits into main from fix/ddl-not-null
Mar 30, 2026

Conversation

@jkomyno
Contributor

@jkomyno jkomyno commented Mar 16, 2026

closes TML-2067

Intent

Fix Postgres migration planning and execution for ADD COLUMN ... NOT NULL on existing tables during db update.

Before this branch, two things were wrong:

  • the planner could only safely add a NOT NULL column without an explicit default by requiring the table to be empty
  • the runner could skip db update work when the marker already matched the destination, even if the live schema had drifted

This PR fixes that bug, extracts the first reusable multi-step planner recipe, and adds an extension-aware seam for temporary-default resolution. It does not try to ship a public TypeScript migration API yet.

What Changed

Planner

  • buildAddColumnOperation(...) now decides when the temporary-default strategy applies.
  • The actual 2-step recipe is extracted into buildAddNotNullColumnWithTemporaryDefaultOperation(...).
  • Temporary-default lookup now flows through resolveTemporaryDefaultLiteral(...):
    1. ask a codec hook
    2. fall back to built-in Postgres defaults
    3. if neither can provide a safe value, keep the empty-table precheck
  • CodecControlHooks now exposes resolveTemporaryDefaultLiteral(...) so extensions and parameterized codecs can participate in this decision.
  • pgvector implements that hook and returns a dimension-aware zero vector literal.
  • The built-in Postgres fallback now covers more safe cases, including arrays ('{}'), tsvector (''::tsvector), bytea (''::bytea), alias forms like varchar / bpchar / varbit, and length-aware zero literals for bit(n).
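The three-stage resolution order above can be sketched as follows. This is an illustrative sketch only: `resolveTemporaryDefaultLiteral` is the real hook name from the PR, but the simplified signature and the helper `builtinZeroDefaultLiteral` (covering a subset of the fallback cases listed above) are assumptions for illustration.

```typescript
// A hook returns a literal, null (explicitly decline), or undefined (defer).
type TemporaryDefaultHook = (
  nativeType: string,
  typeParams: readonly number[],
) => string | null | undefined;

// Stage 2: built-in Postgres fallback (illustrative subset of the cases above).
function builtinZeroDefaultLiteral(
  nativeType: string,
  typeParams: readonly number[],
): string | null {
  if (nativeType.endsWith('[]')) return "'{}'"; // arrays: empty-array literal
  switch (nativeType) {
    case 'text':
    case 'varchar':
    case 'bpchar':
      return "''";
    case 'int4':
      return '0';
    case 'bool':
      return 'false';
    case 'tsvector':
      return "''::tsvector";
    case 'bytea':
      return "''::bytea";
    case 'bit': {
      // Length-aware zero literal for bit(n).
      const n = typeParams[0];
      return typeof n === 'number' && Number.isInteger(n) && n > 0
        ? `B'${'0'.repeat(n)}'`
        : null;
    }
    default:
      return null; // no safe value known — caller keeps the empty-table precheck
  }
}

// Stage 1 → 2 → 3: codec hook first, built-in fallback second, null otherwise.
function resolveTemporaryDefaultLiteral(
  nativeType: string,
  typeParams: readonly number[],
  hook?: TemporaryDefaultHook,
): string | null {
  const fromHook = hook?.(nativeType, typeParams);
  if (fromHook !== undefined) return fromHook; // value provided, or explicit decline
  return builtinZeroDefaultLiteral(nativeType, typeParams);
}
```

Note that an explicit `null` from the hook short-circuits the built-in fallback, which matches the "codec explicitly declines" test case described below.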

Runner

  • db update plans (origin: null) now always apply operations.
  • migration apply plans (origin set) still skip when the marker already matches the destination.

That preserves migration replay idempotency while allowing db update to recover from live drift.
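As a minimal sketch of that decision (the shape of the plan input and the function name are assumptions, not the real runner API):

```typescript
interface PlanInput {
  origin: string | null;            // null for `db update`, set for `migration apply`
  markerMatchesDestination: boolean; // stored marker already claims the destination
}

function shouldApplyOperations(plan: PlanInput): boolean {
  // db update plans always apply, so live drift is repaired even when the
  // marker already claims the destination state.
  if (plan.origin === null) return true;
  // migration apply plans keep skip-on-marker-match for replay idempotency.
  return !plan.markerMatchesDestination;
}
```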

Tests and Evidence

  • planner.behavior.test.ts now verifies:
    • the extracted 2-step recipe for built-in types
    • extension-aware temporary defaults via pgvector
    • fallback-to-empty-table behavior when a codec explicitly declines a temporary default
    • expanded built-in fallback coverage for arrays, tsvector, bytea, and bit(n)
  • drift-schema.e2e.test.ts now exercises the built-in recovery path:
    • initialize the schema
    • insert a row
    • manually drop a NOT NULL column
    • run db update
    • verify the row was backfilled and that the temporary default was removed
  • psl.pgvector-dbinit.test.ts now exercises the extension path against a live database:
    • initialize a schema with a vector(3) NOT NULL column
    • insert a row
    • manually drop the vector column
    • run db update
    • verify the row was backfilled to [0,0,0] and that no default remains on the column
  • runner.idempotency.integration.test.ts now explicitly models migration apply semantics by setting origin, so it still verifies skip-on-marker-match in the right path.

Supporting Changes

  • Migration subsystem docs now describe the temporary-default strategy as:
    • codec hook first
    • built-in fallback second
    • empty-table precheck only when no safe default is known

Behavior Change

For a NOT NULL column without an explicit contract default:

  • if the planner can resolve a safe temporary default, it emits:
    1. ADD COLUMN ... DEFAULT <temporary> NOT NULL
    2. ALTER COLUMN ... DROP DEFAULT
  • existing rows keep the backfilled value
  • future inserts must still provide an explicit value
  • if no safe temporary default is known, the operation still requires an empty table
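The emitted two-step DDL can be sketched as a small builder. This is a hypothetical helper for illustration; the real recipe lives in buildAddNotNullColumnWithTemporaryDefaultOperation(...) and also attaches prechecks and postchecks.

```typescript
function buildTwoStepAddNotNull(
  table: string,
  column: string,
  columnType: string,
  temporaryDefault: string,
): string[] {
  return [
    // Step 1: Postgres backfills existing rows with the temporary default.
    `ALTER TABLE ${table} ADD COLUMN ${column} ${columnType} DEFAULT ${temporaryDefault} NOT NULL`,
    // Step 2: drop the default so future inserts must provide an explicit value.
    `ALTER TABLE ${table} ALTER COLUMN ${column} DROP DEFAULT`,
  ];
}

// e.g. buildTwoStepAddNotNull('"users"', '"age"', 'int4', '0')
```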

Scope and Deferrals

This PR intentionally stops at the internal planner boundary.

Deferred:

  • public composable migration utilities for user-authored TypeScript migrations
  • a strategy registry or broader cross-target abstraction layer
  • broader work on user-authored TS contracts / TS migrations

Those are follow-up design and implementation steps. This PR is the bug fix plus the smallest useful extensibility seam.

Summary by CodeRabbit

  • New Features

    • db update live reconciliation for recovering schema drift on non-empty tables.
    • Planner now supports a safe temporary-default strategy to add NOT NULL columns without requiring empty tables.
    • Extensions (e.g., vector) can provide temporary-defaults for reconciliation.
  • Bug Fixes

    • Improved idempotency: migration-apply plans are skipped when already at destination.
  • Documentation

    • Added architecture docs for the db update reconciliation flow and planner behavior.
  • Tests

    • Expanded tests for NOT NULL migrations, pgvector recovery, idempotency, and CLI error reporting.

jkomyno added 30 commits March 12, 2026 14:28
Both commands previously returned raw `VerifyDatabaseSchemaResult` on
schema verification failure instead of the standard `CliErrorEnvelope`.
This made agents parsing `--json` output handle two different shapes.

Now both commands map schema verification failures through a new
`errorSchemaVerificationFailed` factory (PN-RTM-3004), which wraps
the full verification tree in `meta.verificationResult`. Human TTY
output still renders the tree before the error summary.
Add 13 journey-based e2e test files covering 30 CLI scenarios (A-Z + P2-P5)
across greenfield setup, schema evolution, brownfield adoption, drift detection,
migration edge cases, and error scenarios. Each journey is a single it() block
with descriptive assertion labels for step-level failure identification.

Infrastructure:
- journey-test-helpers.ts: CommandResult-based runners for all CLI commands,
  contract swap helper, JSON parsing, SQL helper
- 5 contract fixtures (base, additive, destructive, add-table, v3)
- Config templates with {{DB_URL}} placeholder replacement

Known limitations (marked as TODO in tests):
- db schema-verify and db update hit Vite SSR module resolution error
  after DDL changes (PN-CLI-4999) — affects drift-schema journeys M/N
- Migration chain breakage recovery (P3) requires manual intervention
- Target mismatch (U) needs better simulation approach
The "Vite SSR error" was actually a stale build of @prisma-next/core-control-plane
missing the errorSchemaVerificationFailed export in dist. After rebuild, db
schema-verify and db sign properly return exit code 1 for verification failures.

Journey M: Phantom drift now fully tests verify false-positive → schema-verify
catches drift. db update recovery fails as expected (planner can't re-add
dropped NOT NULL columns — documented as known limitation).

Journey N: Extra column drift now tests verify → tolerant/strict schema-verify →
expand contract → db update (with --no-interactive fallback to -y for
confirmation handling) → tolerant verify passes.

Also restores exact exit code 1 assertions for schema-verify and db sign
failures in brownfield-adoption and drift-marker journeys.
…rrel

The function was defined in errors.ts but missing from the exports barrel,
causing runtime undefined when imported via the package's public API.
This was the root cause of the PN-CLI-4999 "Vite SSR error" in e2e tests —
the function was available in source but not in built dist.
…name)

The test was deleting the wrong migration directory — 'initial' sorted last
alphabetically (i > a) but was the chain root (∅→base), not the intended
target 'add-posts' (additive→v3). After deleting the root, the planner
correctly couldn't reconstruct the graph.

Fix: find the migration dir by name suffix ('_add_posts') instead of
relying on alphabetical sort. Now P3.03–P3.04 fully test the recovery
flow: re-plan the missing edge → apply succeeds.
Instead of tampering with the marker row (which only affects stored JSON,
not the hash comparison), edit contract.json on disk to change the target
field from "postgres" to "mysql". db verify compares contractIR.target
against config.target.targetId, so this triggers PN-RTM-3003.
Remove unused parseJsonOutput import and add commander as devDependency
so the Command type import in journey-test-helpers resolves.
Tolerant vs strict schema-verify is already tested in Journey N
(drift-schema.e2e.test.ts) with extra columns. Journey H tested
the same code path with extra tables, adding no unique coverage.
…test.ts

P4 (partial apply/resume) and P5 (no migration path) were explicitly
noted in the plan as already covered by the command-specific test.
P5's recovery pattern is also identical to P3 (chain breakage).
Q (apply noop), R (plan noop), and X (show variants) are now tail
steps of Journey B. They reuse B's already-applied migration chain
instead of spinning up 3 separate PGlite instances for 4 assertions.
Journey I's JSON assertions are already covered per-command in A.09,
A.10, B.10, and isolated command tests. Journey J (help output)
has near-zero regression prevention value. Journey Y (global flags)
is retained as a lightweight no-DB test.
- 30 → 22 journeys, 13 → 10 test files
- Removed: H (redundant with N), I (JSON covered inline), J (help),
  P4 and P5 (covered by command-specific tests)
- Merged: Q, R, X as tail steps of Journey B
- Updated cross-reference matrix, file structure, acceptance criteria,
  implementation phases, and parallelism estimates
- `pnpm test:journeys` from root runs only journey e2e tests
- Dedicated vitest config with forks pool (4 workers) and
  spinUpPpgDev timeouts for PGlite-backed tests
- Wired through turbo.json for proper build dependency resolution
Add a self-contained README to cli-journeys/ summarizing all 11 test
files by what they cover. Rewrite top-level JSDoc comments in each test
file to describe scenarios in plain words instead of opaque codes.
Also gitignore the vite-plugin output/ fixture directory.
Reverts the errorSchemaVerificationFailed wrapping introduced in 0f57e7f.
The db sign e2e test expects the raw VerifyDatabaseSchemaResult shape
(with `schema` at top level) in JSON failure output, not a CliErrorEnvelope.
- Initialize all closeDb declarations with no-op default to prevent
  TypeError if createDevDatabase() fails in beforeAll
- Use callback form for DB_URL replacement to prevent special character
  corruption in connection strings (e.g., $& in passwords)
- Re-throw unexpected errors in runCommand/runCommandRaw when no CLI
  exit code was captured, preventing silent masking of real regressions
- Run V.02 (dry-run) before V.01 (mutating db init) so the dry-run
  validates the pre-mutation state
- Merge redundant B.05/B.06 steps — both called runMigrationStatus
  with the same DB-connected context; consolidated into one step
- Fix misleading comments in Journey N (age drift stays unresolved;
  test intentionally validates tolerant mode)
- Clarify README: "each journey (describe block)" not just "describe"
- Replace `node:path` with `pathe` for consistent cross-platform paths
- Remove try/catch in getMigrationDirs that masked broken test setup
…nFailed

Replace `Record<string, unknown>` with `VerifyDatabaseSchemaResult` and
inline `{ kind?: string; message?: string }` with `SchemaIssue[]`.
Both types are co-located in the same package — no layering concern.
Extract the shared try/catch/finally logic (mock setup, process.chdir,
exit code handling, cleanup) into a single runCommandCore function.
runCommand and runCommandRaw are now thin wrappers that differ only in
argument construction.

Also documents that pool: 'forks' is a hard requirement for process.chdir safety.
…pDir })

setupJourneyNoDb was functionally identical to setupJourney when called
without a connectionString. Callers now use setupJourney directly.
…tests

Address PR review feedback from saevarb:

- Add useDevDatabase() helper that encapsulates the beforeAll/afterAll
  database lifecycle, eliminating the repeated closeDb boilerplate
  across all 9 journey test files with databases.

- Replace manually maintained ContractVariant union with a contractFixtures
  object that derives the type automatically via keyof typeof.

- Re-export timeouts from journey-test-helpers so tests import from
  a single module.
…ables

When adding a NOT NULL column without an explicit DEFAULT to a table that
may contain rows, the planner now emits a 2-step DDL:

  1. ALTER TABLE … ADD COLUMN … DEFAULT <zero> NOT NULL
  2. ALTER TABLE … ALTER COLUMN … DROP DEFAULT

This uses a type-appropriate zero value (e.g. '' for text, 0 for int4,
false for bool) as a temporary default so PostgreSQL can backfill existing
rows, then immediately drops it so future inserts must provide a value.

The runner is also updated to always apply planned operations when the
plan origin is null (db update mode), even if the marker hash already
matches the destination. This fixes schema drift recovery where manual
DDL changed the database without changing the contract.

Closes TML-2067
jkomyno added 2 commits March 19, 2026 14:24
# Conflicts:
#	test/integration/test/cli-journeys/drift-schema.e2e.test.ts
#	test/integration/test/cli.migration-apply.e2e.test.ts
Contributor

@wmadden wmadden left a comment


This fix is great and I'm pleased to see the migration planner solving more sophisticated problems than it has before, but the way the logic is structured couples things together in ways that will cause problems.

Up until now, the planner has only needed single-step operations — one DDL statement per goal. This is our first multi-step strategy: a coordinated sequence of operations (add column with temporary default, then drop the default) that collectively solve a problem neither step could handle alone. We're going to need more of these — adding indexes concurrently, renaming columns via copy-and-swap, changing column types through shadow columns — so this PR sets the precedent for how we structure them.

The temporary-default logic is Postgres-specific and hardcoded; it doesn't work with our own pgvector extension, and the strategy is buried inside the planner where nobody else can use it. If we keep adding strategies in this form, the coupling gets worse with each one. This is a small PR (~300 lines), so now is the cheapest time to course-correct.

The opportunity

Right now, the strategy logic is inline in buildAddColumnOperation. The method decides when a temporary default is needed, figures out what the temporary default value should be, and builds the entire multi-step operation — all in one place. That works fine for one strategy, but as we add more, a couple of things get harder:

Strategies become hard to reuse. The recipe for "add a NOT NULL column using a temporary default" is a self-contained, useful thing. But because it's woven into the planner's decision logic, nobody else can call it. When we support user-authored migrations (e.g. TypeScript files that produce ops.json), users will want access to these same recipes. Think of how Rails gems like strong_migrations and safe-pg-migrations work — they provide reusable migration strategies that users invoke directly. We should be building toward the same model.

Zero-value resolution is closed to extensions. buildTypeZeroDefaultLiteral is a hardcoded switch over native type strings. It doesn't consult codec hooks and doesn't receive typeParams. This means extension types can't participate. Consider pgvector: a vector(1536) column has a perfectly sensible zero value (a zero vector of the right dimension), but the function only sees the string "vector", can't determine the dimensionality, and returns null. The planner already has CodecControlHooks for exactly this kind of type-specific concern — expandNativeType and planTypeOperations flow through it. Zero-value resolution should too.

The strategy concept is generic, but the implementation is target-specific. "Add a NOT NULL column by using a temporary default, then remove it" applies to any SQL database. The only target-specific parts are the SQL syntax and the zero-value lookup. A future MySQL or SQLite target would need to reimplement the entire recipe from scratch rather than reusing it with different SQL builders.

What to change in this PR

There are two different jobs in buildAddColumnOperation right now: deciding what to do (planner logic) and doing it (strategy recipe). Separating those would make both simpler and make the strategy reusable.

Pull the multi-step recipe out into its own callable unit. The unit takes resolved parameters (table, column, temporary default value) and returns the complete operation (prechecks, execute steps, postchecks). The planner still decides when to call it and what temporary default value to pass.

In pseudocode, the planner goes from this:

// everything inline in buildAddColumnOperation
const temporaryDefault = buildTypeZeroDefaultLiteral(column.nativeType);
const precheck = [ ... column-missing check, maybe empty-table check ... ];
const execute = [ ... ADD COLUMN with default, maybe DROP DEFAULT ... ];
const postcheck = [ ... column-exists, maybe NOT NULL, maybe no-default ... ];

To this:

// planner decides when and with what value
const temporaryDefault = resolveTemporaryDefault(column, codecHooks);

// strategy unit handles the how
if (temporaryDefault !== null) {
  return addNotNullColumnWithTemporaryDefault({ schema, table, column, temporaryDefault, codecHooks });
} else {
  return addColumnWithEmptyTablePrecheck({ schema, table, column, codecHooks });
}

The planner's procedural decision logic stays exactly as it is. The only change is that the operation-building part moves into a function with explicit inputs and outputs.

Not every operation in the planner needs this treatment. Adding a plain nullable column is a single DDL statement — there's no multi-step recipe, no coordination, nothing to extract. The bar for this PR is: does the operation require multiple coordinated steps to achieve a single goal? If yes, extract it. If it's one step, leave it inline.

Optionally (this PR or the next): extend zero-value resolution to check codec hooks before the hardcoded switch. Resolution order: codec hook (has access to typeParams) → built-in fallback → null (empty-table precheck). This follows the same pattern as expandNativeType and unblocks extension types.

Future direction

These are things we do not need to solve now, but they inform why the extraction above matters. Knowing where this is heading helps us make sure the pattern we establish now doesn't paint us into a corner.

A recipe library for user-authored migrations. When users write their own migrations, they'll want a standard library of operation recipes — not just multi-step strategies, but also single-step conveniences like addColumn, renameColumn, addIndex. These are the Active Record add_column / rename_column equivalents: well-formed operations with the right prechecks and postchecks, so users never need to hand-craft operation JSON. Multi-step strategies and single-step recipes are the same thing from the user's perspective — "give me the operations for this goal." The multi-step ones just happen to return more than one step. Extracting the strategy now means the first entry in that library already exists when we get there.

Extensible strategy selection. Eventually users or extension packs may want to say "under these conditions, use this strategy instead of the default." That's a harder problem (registry, priority, conflict resolution) and should wait until we have enough concrete use cases to design the right interface.

Cross-target strategy reuse. Factoring strategies into a target-agnostic recipe structure parameterized by target-specific SQL builders. Worth doing when we have a second database target to triangulate from.

@jkomyno
Contributor Author

jkomyno commented Mar 25, 2026

@wmadden

  • Extracted the multi-step ADD COLUMN ... NOT NULL temporary-default recipe out of the planner decision path.
  • Added resolveTemporaryDefaultLiteral to codec hooks and wired pgvector into it, with built-in fallback only after hooks.
  • Aligned Journey M with the real recovery path and added live-db pgvector recovery coverage.
  • Fix: fefba880


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
packages/3-targets/3-targets/postgres/test/migrations/planner.behavior.test.ts (1)

128-338: Split the temporary-default cases into a dedicated test file.

This new suite/helper block is a standalone concern, and together they push planner.behavior.test.ts past the 500-line cap. Moving them into something like planner.temporary-defaults.test.ts would keep the existing behavior file easier to scan and maintain.

As per coding guidelines, Keep test files under 500 lines to maintain readability and navigability. If a test file exceeds this limit, it should be split into multiple files.

Also applies to: 404-484

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/3-targets/3-targets/postgres/test/migrations/planner.behavior.test.ts`
around lines 128 - 338, Move the entire temporary-default test suite out of
planner.behavior.test.ts into a new file (e.g.
planner.temporary-defaults.test.ts): specifically extract the describe block
starting "NOT NULL column without default uses temporary default" (which
references planAddColumn, createPlannerControlHookComponent, pgvectorDescriptor,
and qualifiedUserTable) and the following "buildTypeZeroDefaultLiteral (built-in
fallback)" tests (which reference buildTypeZeroDefaultLiteral). In the new file
copy any necessary test helpers/imports used by those tests (planAddColumn,
createPlannerControlHookComponent, pgvectorDescriptor,
buildTypeZeroDefaultLiteral, and any test setup), delete the moved blocks from
the original file, and run tests to ensure imports/exports and test-scoped
fixtures remain correct so both files stay under the 500-line guideline.
packages/3-targets/3-targets/postgres/src/core/migrations/planner.ts (1)

339-410: Use column-scoped target details here.

buildAddColumnOperationIdentity() currently stores { objectType: 'table', name: tableName }, so both add-column paths lose the actual column identity in target.details. I’d keep that metadata lossless with a column-scoped target.

Suggested change
     target: {
       id: 'postgres',
-      details: buildTargetDetails('table', tableName, schema),
+      details: buildTargetDetails('column', columnName, schema, tableName),
     },

Also applies to: 705-721

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/3-targets/3-targets/postgres/src/core/migrations/planner.ts` around
lines 339 - 410, buildAddColumnOperation is currently calling
buildAddColumnOperationIdentity which only records { objectType: 'table', name:
tableName } so target.details loses the column identity; update the planner to
use column-scoped target details (e.g. objectType: 'column', name: columnName,
plus parent table/schema info) when building both the normal add-column and the
add-not-null-with-temporary-default path (the callsite in
buildAddNotNullColumnWithTemporaryDefaultOperation and the other add-column
occurrence mentioned). Either extend buildAddColumnOperationIdentity to accept
columnName (or add a new helper like buildAddColumnTargetDetails) and replace
the identity used in buildAddColumnOperation and the other add-column code path
so target.details contains full column identity.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 6a791384-b59c-478d-a447-c527653b3e12

📥 Commits

Reviewing files that changed from the base of the PR and between 3a57f7b and fefba88.

📒 Files selected for processing (8)
  • docs/architecture docs/subsystems/7. Migration System.md
  • packages/2-sql/3-tooling/family/src/core/migrations/types.ts
  • packages/2-sql/3-tooling/family/src/exports/control.ts
  • packages/3-extensions/pgvector/src/exports/control.ts
  • packages/3-targets/3-targets/postgres/src/core/migrations/planner.ts
  • packages/3-targets/3-targets/postgres/test/migrations/planner.behavior.test.ts
  • test/integration/test/authoring/psl.pgvector-dbinit.test.ts
  • test/integration/test/cli-journeys/drift-schema.e2e.test.ts
✅ Files skipped from review due to trivial changes (2)
  • packages/2-sql/3-tooling/family/src/exports/control.ts
  • docs/architecture docs/subsystems/7. Migration System.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/integration/test/cli-journeys/drift-schema.e2e.test.ts


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/3-targets/3-targets/postgres/src/core/migrations/planner.ts (1)

1209-1211: Minor: Type cast inconsistency for json column type.

Both json and jsonb cases return '{}'::jsonb. While PostgreSQL will implicitly cast jsonb to json when inserting into a json column, it would be more precise to match the target type.

♻️ Suggested refinement
     // JSON types — use empty object rather than JSON null to avoid
     // ambiguity with JS null when drivers deserialize the value.
     case 'json':
+      return "'{}'::json";
     case 'jsonb':
       return "'{}'::jsonb";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/3-targets/3-targets/postgres/src/core/migrations/planner.ts` around
lines 1209 - 1211, The switch currently returns the JSON literal cast to jsonb
for both 'json' and 'jsonb'; update the 'json' branch so it returns the empty
JSON literal cast to json (i.e., "'{}'::json") while keeping the 'jsonb' branch
as "'{}'::jsonb" so the returned default precisely matches the declared column
type in the planner switch handling these cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/3-targets/3-targets/postgres/src/core/migrations/planner.ts`:
- Around line 370-385: The check for a temporary default treats empty string as
absent downstream (buildAddColumnSql uses a truthy check), so update the code to
treat "" as a valid temporary default: keep the current creation logic
(temporaryDefault produced by resolveTemporaryDefaultLiteral) but change any
truthy checks to explicit null checks. Specifically, ensure uses of
temporaryDefault (e.g., in buildAddColumnSql and any callers of
temporaryDefault) test temporaryDefault !== null (or temporaryDefault != null)
instead of relying on truthiness so an empty-string literal returned by
resolveTemporaryDefaultLiteral is handled as a real default and not dropped.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: b4bb5ed3-e927-4b27-9acc-a70182192482

📥 Commits

Reviewing files that changed from the base of the PR and between fefba88 and fd12d87.

📒 Files selected for processing (2)
  • packages/3-targets/3-targets/postgres/src/core/migrations/planner.ts
  • packages/3-targets/3-targets/postgres/test/migrations/planner.behavior.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/3-targets/3-targets/postgres/test/migrations/planner.behavior.test.ts

@jkomyno jkomyno requested a review from wmadden March 26, 2026 09:06
Contributor

@wmadden wmadden left a comment


Code Review (Round 2)

Ticket: TML-2067

Summary

This revision addresses the correctness and test-coverage issues from the first review. The multi-step recipe is extracted, codec hooks enable extension-aware temporary defaults, pgvector implements the hook end-to-end, and tests cover constraint-aware fallbacks, live data recovery, and the pgvector extension path. One blocking concern remains: the planner file has grown to 1396 lines with ~350 lines added by this PR, conflating strategy selection, recipes, SQL builders, and zero-value resolution under a single overloaded naming convention. Decomposition into separate modules should happen as part of this PR.

What Looks Solid

  • Clean separation of concerns: buildAddColumnOperation (decides when/what) → resolveTemporaryDefaultLiteral (resolves the value) → canUseSharedTemporaryDefaultStrategy (checks constraint safety) → buildAddNotNullColumnWithTemporaryDefaultOperation (builds the recipe). Each has a single, well-defined responsibility.
  • Three-way codec hook return semantics: string | null | undefined gives extensions full control — provide a value, explicitly decline, or defer to built-in logic.
  • Constraint-aware fallback: canUseSharedTemporaryDefaultStrategy catches the subtle case where a new column will become a PK/unique/FK later in the same plan, preventing shared zero values from violating constraints.
  • Expanded type coverage: The built-in fallback now handles arrays, tsvector, bytea, bit(n) with length, timetz with UTC offset, and alias forms like varchar/bpchar/varbit.
  • pgvector integration: The extension implements resolveTemporaryDefaultLiteral with dimension-aware zero vectors, validating that the codec hook interface works for parameterized types.
  • Test quality: The test suite is thorough — unit tests cover each branch in the planner decision tree, integration tests validate against real Postgres, and e2e tests exercise the full db update recovery path with data.
  • Architecture docs updated to reflect the new resolution chain.
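
The resolution chain above can be sketched as follows. Names mirror the PR (`resolveTemporaryDefaultLiteral`, the codec hook, the built-in fallback); the signatures and the fallback table are assumptions for illustration:

```typescript
// Hook return semantics: string = use this literal, null = explicitly decline
// (keep the empty-table precheck), undefined = defer to built-in logic.
type CodecHook = (typeName: string, length?: number) => string | null | undefined;

// A few of the built-in Postgres fallbacks mentioned in the review (illustrative subset).
const builtinDefaults: Record<string, string> = {
  text: "''",
  varchar: "''",
  tsvector: "''::tsvector",
  bytea: "''::bytea",
};

function resolveTemporaryDefaultLiteral(
  typeName: string,
  hook?: CodecHook,
  length?: number,
): string | null {
  if (hook) {
    const fromHook = hook(typeName, length);
    if (fromHook !== undefined) return fromHook; // hook answered: value or explicit decline
  }
  if (typeName.endsWith('[]')) return "'{}'"; // arrays: empty-array literal
  return builtinDefaults[typeName] ?? null; // null → planner keeps the precheck
}

// A pgvector-style hook returning a dimension-aware zero vector.
const pgvectorHook: CodecHook = (typeName, length) => {
  if (typeName !== 'vector') return undefined; // not ours: defer to built-ins
  if (typeof length !== 'number' || !Number.isInteger(length) || length <= 0) return null;
  return `'[${Array(length).fill(0).join(',')}]'`;
};
```

Under this shape, `resolveTemporaryDefaultLiteral('vector', pgvectorHook, 3)` yields `'[0,0,0]'`, while an unknown type with no hook falls through to `null` and the empty-table precheck.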

Blocking Issues

See inline comment F01 on planner.ts.

Non-Blocking Concerns

See inline comments F02, F03, F04, F05.

Nits

  • F06: buildBitZeroDefaultLiteral returns null for length <= 0, but bit(0) is itself invalid in Postgres. Returning null is fine; the edge case is academic.
  • F07: buildZeroVectorTemporaryDefaultLiteral (pgvector) and buildBitZeroDefaultLiteral share the same typeof length !== 'number' || !Number.isInteger(length) || length <= 0 validation. Minor shared-utility opportunity, not worth extracting at two call sites.
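
For concreteness, the shared validation F07 refers to could be factored as below. Helper names match the review; the bodies and the exact literal forms are assumptions:

```typescript
// Shared guard for the duplicated length validation noted in F07.
function isValidLength(length: unknown): length is number {
  return typeof length === 'number' && Number.isInteger(length) && length > 0;
}

// Length-aware zero literal for bit(n), e.g. bit(3) → '000'.
function buildBitZeroDefaultLiteral(length: unknown): string | null {
  if (!isValidLength(length)) return null; // per F06: null for length <= 0
  return `'${'0'.repeat(length)}'`;
}

// Dimension-aware zero vector for pgvector, e.g. vector(2) → '[0,0]'.
function buildZeroVectorTemporaryDefaultLiteral(length: unknown): string | null {
  if (!isValidLength(length)) return null;
  return `'[${Array(length).fill(0).join(',')}]'`;
}
```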

Acceptance-Criteria Traceability

| AC | Status | Evidence |
| --- | --- | --- |
| AC1: Planner emits 2-step operation for NOT NULL columns without defaults | Met | planner.behavior.test.ts L128–L414, planner.integration.test.ts L109–L149 |
| AC2: Error envelope includes fix field | Not met | See F03. Should be tracked as follow-up. |
| AC3: Journey test with data rows + missing NOT NULL column | Met | drift-schema.e2e.test.ts L37–L101, psl.pgvector-dbinit.test.ts L125–L189 |

@wmadden

wmadden commented Mar 26, 2026

F03 — Linear AC2 (error fix field) not addressed (non-blocking)

The Linear ticket's acceptance criterion AC2 asks for the error envelope to include a fix field with a concrete SQL snippet the user can run for unrecoverable cases (e.g., when no zero value is available and the table is non-empty). This is not implemented in this PR.

This is fine to defer if it's tracked as a follow-up. The planner currently falls back to the empty-table precheck, which gives the user a descriptive failure. Adding a fix field with the SQL the user could run manually would improve the DX but is not a correctness concern.
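
To make the deferred AC2 concrete, one possible shape for the envelope — entirely hypothetical, since the PR does not implement it:

```typescript
// Hypothetical error envelope for the unrecoverable case (no safe zero value,
// non-empty table). The field names are a sketch, not the project's actual API.
interface MigrationErrorEnvelope {
  code: string;
  message: string;
  /** Concrete SQL the user can run manually to unblock the migration. */
  fix?: string;
}

function buildNotNullFailure(table: string, column: string, type: string): MigrationErrorEnvelope {
  return {
    code: 'NOT_NULL_ON_NON_EMPTY_TABLE',
    message: `Cannot add NOT NULL column "${column}" to non-empty table "${table}" without a safe default.`,
    fix: [
      `ALTER TABLE "${table}" ADD COLUMN "${column}" ${type};`,
      `UPDATE "${table}" SET "${column}" = /* your backfill value */ '...';`,
      `ALTER TABLE "${table}" ALTER COLUMN "${column}" SET NOT NULL;`,
    ].join('\n'),
  };
}
```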

The values used to backfill NOT NULL columns are identity values in the
algebraic (monoid) sense — the neutral element for each type. Rename to
make this explicit and drop the misleading "temporary" qualifier from
the value-producing layer.

@wmadden wmadden left a comment


Eagerly approving - address or defer my comments above and then merge, don't wait for a re-review :)

@jkomyno jkomyno merged commit d773f53 into main Mar 30, 2026
15 checks passed
@jkomyno jkomyno deleted the fix/ddl-not-null branch March 30, 2026 07:55
wmadden added a commit that referenced this pull request Apr 2, 2026
The rebase incorrectly replaced Alberto's descriptive
planner-target-details.ts (from PR #241) with our generic
planner-types.ts. The original name is better: it describes what the
module contains (target detail types and builder), whereas
"planner-types" is vague and misleading given it also exports a
function.