Switch database schema to versioned migration files #3

Merged: ethanj merged 1 commit into main from sync/core-87e89a0 on May 17, 2026
@ethanj ethanj commented May 17, 2026

Summary

Replace the monolithic schema.sql with ordered migration files under src/db/migrations/, executed by node-pg-migrate behind the existing migrate() / migrationStatus() API. Existing v1.0.x databases upgrade without data loss and without operator intervention across three distinct install scenarios.

Changes

  • Add src/db/migrations/0001_baseline.sql — frozen baseline that captures the full Phase 1 schema.
  • Add node-pg-migrate as a runtime dependency; wire it into migration-api.ts behind the unchanged migrate() / MigrateOptions / MigrateResult surface.
  • Add migrationStatus() diagnostics: appliedMigrationCount, latestMigrationName, migrationHistoryStatus, and embeddingDimension sourced from pgmigrations and the pgvector catalog.
  • Add advisory-lock wrapper (migration-lock.ts) and baseline validator (migration-baseline-validator.ts) that implement the three upgrade scenarios (fresh install, pre-Phase-2 install with data, Phase 1 install with data).
  • Move provenance-only SQL changelog files from src/db/migrations/ to docs/db/changelog/; add docs/db/migrations.md operator reference.
  • Add DELETE /v1/admin/scope endpoint (mounted only when CORE_ADMIN_API_KEY and CORE_TEST_SCOPE_ALLOW_PATTERN are both set) for disposable smoke/eval scope cleanup; document adminBearerAuth security scheme in OpenAPI.
  • Add session_id field to ingest, search, and list endpoints for symmetric per-thread scoping.
  • Add meta-fact-filter.ts service and extraction integration to suppress internal bookkeeping facts from search results.
  • Add scripts/generate-schema-hash.ts for deterministic dist/db/schema-sha256.json manifest; add scripts/cleanup-meta-facts.ts one-off.
  • Expose atomicmemory-core CLI binary via package.json bin field.
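As a rough illustration of what a "deterministic" schema manifest can mean, the sketch below hashes migration files in sorted name order with normalized line endings. It is not taken from scripts/generate-schema-hash.ts; the `MigrationFile` shape, the `schemaSha256` name, and the length-prefixed framing are assumptions made for this example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape: a migration file name plus its full SQL text.
interface MigrationFile {
  name: string;
  sql: string;
}

// Returns a hex SHA-256 digest over all files.
// Files are hashed in sorted name order so the result does not depend on
// the order in which the directory was listed.
function schemaSha256(files: MigrationFile[]): string {
  const hash = createHash("sha256");
  const sorted = [...files].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
  );
  for (const file of sorted) {
    // Normalize line endings so a CRLF checkout hashes the same as LF.
    const sql = file.sql.replace(/\r\n/g, "\n");
    // Length prefixes keep the boundary between name and content unambiguous.
    hash.update(`${file.name.length}:${file.name}\n${sql.length}:${sql}\n`);
  }
  return hash.digest("hex");
}
```

Sorting and newline normalization are the two choices that make the digest reproducible across machines; whether the real script does either is not stated in this PR.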

Why

A single schema.sql file cannot express incremental, auditable schema evolution. Operators running migrate() across rolling replicas have no per-file audit trail, no deterministic ordering guarantee, and no safe path to apply additive changes without re-running the entire baseline against existing tables. node-pg-migrate provides per-file tracking via pgmigrations, advisory-lock serialization, and a fake-apply path that lets the runner stamp the baseline as already-applied on a pre-migration database without touching live rows.
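The advisory-lock serialization mentioned above can be pictured with a minimal sketch. It is not the contents of migration-lock.ts: the `QueryClient` interface, the `withMigrationLock` name, and the lock key are hypothetical. The only facts relied on are PostgreSQL's built-in `pg_advisory_lock(bigint)` and `pg_advisory_unlock(bigint)` functions.

```typescript
// Minimal structural stand-in for a database client (hypothetical);
// the intent is that a single dedicated connection is passed in.
interface QueryClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Hypothetical lock key; every replica must agree on the same value.
const MIGRATION_LOCK_KEY = 4242;

// Runs `run` while holding a session-level advisory lock, and releases
// the lock even when `run` throws.
async function withMigrationLock<T>(
  client: QueryClient,
  run: () => Promise<T>,
): Promise<T> {
  // pg_advisory_lock blocks until no other session holds the key,
  // so concurrent replica boots wait here one after another.
  await client.query("SELECT pg_advisory_lock($1)", [MIGRATION_LOCK_KEY]);
  try {
    return await run();
  } finally {
    await client.query("SELECT pg_advisory_unlock($1)", [MIGRATION_LOCK_KEY]);
  }
}
```

Session-level advisory locks belong to the connection that took them, so the lock and unlock calls need to go through the same connection rather than through a pool that may hand out a different one for each query.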

The schema_version table is preserved alongside pgmigrations so operators can answer both "which migration files ran" and "which @atomicmemory/core semver this DB corresponds to".

Validation

  • baseline-schema-equivalence.test.ts — CI gate. Builds a fresh Phase 2 schema on one database and an upgrade-path schema (legacy schema fixture applied, then migrate() invoked) on another; asserts the structural snapshot is identical modulo framework bookkeeping tables.
  • Phase 1 data-preservation suite (migration-data-preservation.test.ts, migration-backcompat.test.ts) — carried forward unchanged. Seeds representative legacy rows, snapshots them, runs migrate(), and asserts every row, primary key, foreign-key relationship, JSON metadata field, timestamp, and representative vector survives.
  • migration-lock.test.ts — verifies concurrent replica boots serialize correctly under the advisory lock.
  • dag-sanity.test.ts — asserts migration filenames are strictly monotonic and no shipped file has been edited after its first appearance.
  • admin.test.ts — covers DELETE /v1/admin/scope auth, pattern enforcement, and scope-scoped deletion.
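The filename-ordering half of the dag-sanity check could look roughly like the sketch below. It assumes a zero-padded `NNNN_name.sql` convention inferred from `0001_baseline.sql`; the function names are hypothetical and the real test may differ. The "not edited after first appearance" half needs version-control history and is not shown.

```typescript
// Extracts the numeric prefix from a migration filename.
// Assumes a hypothetical NNNN_name.sql convention, inferred from 0001_baseline.sql.
function migrationSequence(filename: string): number {
  const match = /^(\d+)_[A-Za-z0-9_-]+\.sql$/.exec(filename);
  if (!match) {
    throw new Error(`Unexpected migration filename: ${filename}`);
  }
  return Number(match[1]);
}

// Throws if the sequence numbers, taken in the given order, are not
// strictly increasing (duplicates and out-of-order files both fail).
function assertStrictlyMonotonic(filenames: string[]): void {
  let previous = -Infinity;
  for (const name of filenames) {
    const sequence = migrationSequence(name);
    if (sequence <= previous) {
      throw new Error(`Migration ${name} does not increase the sequence`);
    }
    previous = sequence;
  }
}
```

A test would typically pass the directory listing sorted by name, so a duplicated or reused prefix fails fast instead of leaving the apply order ambiguous.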

@ethanj ethanj merged commit 496b78b into main May 17, 2026
2 checks passed
@ethanj ethanj deleted the sync/core-87e89a0 branch May 17, 2026 05:56