# Switch database schema to versioned migration files (#3)
Merged
## Summary

Replace the monolithic `schema.sql` with ordered migration files under `src/db/migrations/`, executed by `node-pg-migrate` behind the existing `migrate()` / `migrationStatus()` API. Existing v1.0.x databases upgrade without data loss and without operator intervention across three distinct install scenarios.

## Changes

- Add `src/db/migrations/0001_baseline.sql` — frozen baseline that captures the full Phase 1 schema.
- Add `node-pg-migrate` as a runtime dependency; wire it into `migration-api.ts` behind the unchanged `migrate()` / `MigrateOptions` / `MigrateResult` surface.
- Add `migrationStatus()` diagnostics: `appliedMigrationCount`, `latestMigrationName`, `migrationHistoryStatus`, and `embeddingDimension` sourced from `pgmigrations` and the pgvector catalog.
- Add advisory-lock wrapper (`migration-lock.ts`) and baseline validator (`migration-baseline-validator.ts`) that implement the three upgrade scenarios (fresh install, pre-Phase-2 install with data, Phase 1 install with data).
- Move provenance-only SQL changelog files from `src/db/migrations/` to `docs/db/changelog/`; add `docs/db/migrations.md` operator reference.
- Add `DELETE /v1/admin/scope` endpoint (mounted only when `CORE_ADMIN_API_KEY` and `CORE_TEST_SCOPE_ALLOW_PATTERN` are both set) for disposable smoke/eval scope cleanup; document `adminBearerAuth` security scheme in OpenAPI.
- Add `session_id` field to ingest, search, and list endpoints for symmetric per-thread scoping.
- Add `meta-fact-filter.ts` service and extraction integration to suppress internal bookkeeping facts from search results.
- Add `scripts/generate-schema-hash.ts` for a deterministic `dist/db/schema-sha256.json` manifest; add `scripts/cleanup-meta-facts.ts` one-off.
- Expose `atomicmemory-core` CLI binary via the `package.json` `bin` field.

## Why

A single `schema.sql` file cannot express incremental, auditable schema evolution.
Operators running `migrate()` across rolling replicas have no per-file audit trail, no deterministic ordering guarantee, and no safe path to apply additive changes without re-running the entire baseline against existing tables. `node-pg-migrate` provides per-file tracking via `pgmigrations`, advisory-lock serialization, and a `fake-apply` path that lets the runner stamp the baseline as already-applied on a pre-migration database without touching live rows. The `schema_version` table is preserved alongside `pgmigrations` so operators can answer both "which migration files ran" and "which `@atomicmemory/core` semver this DB corresponds to".

## Validation

- **`baseline-schema-equivalence.test.ts`** — CI gate. Builds a fresh Phase 2 schema on one database and an upgrade-path schema (legacy schema fixture applied, then `migrate()` invoked) on another; asserts the structural snapshot is identical modulo framework bookkeeping tables.
- **Phase 1 data-preservation suite** (`migration-data-preservation.test.ts`, `migration-backcompat.test.ts`) — carried forward unchanged. Seeds representative legacy rows, snapshots them, runs `migrate()`, and asserts every row, primary key, foreign-key relationship, JSON metadata field, timestamp, and representative vector survives.
- **`migration-lock.test.ts`** — verifies concurrent replica boots serialize correctly under the advisory lock.
- **`dag-sanity.test.ts`** — asserts migration filenames are strictly monotonic and no shipped file has been edited after its first appearance.
- **`admin.test.ts`** — covers `DELETE /v1/admin/scope` auth, pattern enforcement, and scope-scoped deletion.
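For reviewers unfamiliar with the advisory-lock approach: Postgres advisory locks take a 64-bit key, so a wrapper like `migration-lock.ts` typically derives a stable key from a lock name. The sketch below is illustrative only — the actual key-derivation scheme in `migration-lock.ts` is not shown in this PR, and the lock name used here is an assumption.

```typescript
import { createHash } from "node:crypto";

// Derive a deterministic signed 64-bit key for pg_advisory_lock(bigint)
// from a human-readable lock name. Because the derivation is pure,
// every replica computes the same key and serializes on the same lock.
function advisoryLockKey(name: string): bigint {
  const digest = createHash("sha256").update(name).digest();
  // First 8 bytes of the digest, read as a signed big-endian 64-bit int.
  return digest.readBigInt64BE(0);
}

// Hypothetical lock name; the real wrapper may use a different one.
const key = advisoryLockKey("atomicmemory-core:migrate");
// A replica would then take the lock before running migrations, e.g.:
//   await client.query("SELECT pg_advisory_lock($1)", [key.toString()]);
console.log(key);
```

`migration-lock.test.ts` exercises the real wrapper under concurrent replica boots; the point of the sketch is only that the key must be derived deterministically, never randomly per process.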
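The ordering rule that `dag-sanity.test.ts` enforces can be stated as a small predicate. This is a hedged sketch of that invariant, not the test's actual implementation: filenames carry a zero-padded numeric prefix (as in the shipped `0001_baseline.sql`) and the prefixes must be strictly increasing; the second example filename is hypothetical.

```typescript
// Assert that migration filenames follow NNNN_name.sql and that the
// numeric prefixes are strictly monotonic, so apply order is
// deterministic on every replica.
function assertMonotonicMigrations(filenames: string[]): void {
  let previous = -1;
  for (const name of filenames) {
    const match = /^(\d{4})_[a-z0-9_]+\.sql$/.exec(name);
    if (!match) throw new Error(`bad migration filename: ${name}`);
    const ordinal = Number(match[1]);
    if (ordinal <= previous) {
      throw new Error(`non-monotonic migration ordinal: ${name}`);
    }
    previous = ordinal;
  }
}

// "0002_add_session_id.sql" is an invented example for illustration.
assertMonotonicMigrations(["0001_baseline.sql", "0002_add_session_id.sql"]);
```

The companion invariant — that no shipped file is edited after first appearance — is what the `dist/db/schema-sha256.json` manifest makes checkable.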
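On the deterministic manifest: one way to make a schema hash reproducible across machines is to hash file contents in sorted filename order with unambiguous separators. The following is a minimal sketch of that idea; the real `scripts/generate-schema-hash.ts` and the shape of `schema-sha256.json` are not shown in this PR and may differ.

```typescript
import { createHash } from "node:crypto";

// Hash a set of schema files deterministically: iterate filenames in
// sorted order and separate name/content with NUL bytes so that
// ("a", "bc") and ("ab", "c") cannot produce the same digest.
function schemaSha256(files: Map<string, string>): string {
  const hash = createHash("sha256");
  for (const name of [...files.keys()].sort()) {
    hash.update(name).update("\0").update(files.get(name)!).update("\0");
  }
  return hash.digest("hex");
}

// Illustrative input; real input would be the migration files on disk.
const digest = schemaSha256(
  new Map([["0001_baseline.sql", "CREATE TABLE facts (id bigint);"]])
);
console.log(digest);
```

Because the digest depends only on names and contents, CI can recompute it and compare against the committed manifest to detect post-hoc edits to shipped migrations.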