Merged
13 changes: 13 additions & 0 deletions .env.example
@@ -3,6 +3,12 @@
# the bundled single-container Postgres/pgvector instance. Source checkouts and
# production deployments should set a real Postgres connection string.
DATABASE_URL=postgresql://atomicmemory:atomicmemory@localhost:5433/atomicmemory
# Docker entrypoint only: set false when migrations run from a separate
# pre-deploy job. Defaults to true when omitted.
# ATOMICMEMORY_RUN_MIGRATIONS_ON_STARTUP=true
# Docker entrypoint only: forwarded to `migrate({ lockTimeoutMs })`.
# Increase for rolling deploys where another replica may hold the migration lock.
# MIGRATION_LOCK_TIMEOUT_MS=120000

# --- Provider credentials ---
# Required when either EMBEDDING_PROVIDER=openai or LLM_PROVIDER=openai.
@@ -19,6 +25,13 @@ OPENAI_API_KEY=<openai-api-key>
# Docker image local mode defaults this to `local-dev-key` when omitted.
CORE_API_KEY=replace-with-a-strong-random-secret

# Optional admin-only cleanup endpoint for disposable smoke/eval scopes.
# When both values are set, DELETE /v1/admin/scope accepts a JSON body
# `{ "user_id": "..." }` and deletes only matching test scopes.
# Use a different secret from CORE_API_KEY. Do not enable for general clients.
# CORE_ADMIN_API_KEY=<admin-cleanup-secret>
# CORE_TEST_SCOPE_ALLOW_PATTERN=^(smoke-|docker-|test-).+

# Hex-encoded HMAC secret used to derive PII-safe storage-key
# prefixes. Must be at least 64 hex chars (32 bytes of entropy).
# Rotating this invalidates existing managed-blob storage paths;
3 changes: 3 additions & 0 deletions .env.test.example
@@ -5,6 +5,9 @@
DATABASE_URL=postgresql://atomicmemory:atomicmemory@localhost:5433/atomicmemory
OPENAI_API_KEY=test-placeholder
CORE_API_KEY=test-core-api-key
# Optional admin cleanup endpoint used by external smoke harnesses.
# CORE_ADMIN_API_KEY=test-admin-api-key
# CORE_TEST_SCOPE_ALLOW_PATTERN=^(smoke-|docker-|test-).+
STORAGE_KEY_HMAC_SECRET=000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
EMBEDDING_DIMENSIONS=1024
PORT=3051
8 changes: 7 additions & 1 deletion .fallowrc.json
@@ -2,7 +2,13 @@
"$schema": "https://fallow.tools/schema.json",
"ignorePatterns": [
"**/one-offs/**",
- "scripts/smoke-openapi-export.mjs"
+ "scripts/**"
],
"ignoreDependencies": [
"@helia/unixfs",
"blockstore-core",
"pino",
"yaml"
],
"rules": {
"unused-types": "off",
3 changes: 3 additions & 0 deletions .gitignore
@@ -72,3 +72,6 @@ scripts/one-offs/
# Platform-specific deploy configs (canonical copies live in deploy/)
railway.toml
*.tgz

# Superpowers skill plugin output — agent-generated specs/plans, internal-only.
docs/superpowers/
56 changes: 56 additions & 0 deletions CHANGELOG.md
@@ -6,8 +6,64 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [Unreleased]

### Added
- Phase 1 migration hardening now packages a deterministic
`dist/db/schema-sha256.json` manifest for the shipped DB schema bytes.
- Phase 2 versioned migrations. Schema is now expressed as ordered files
under `src/db/migrations/` (shipped as `dist/db/migrations/`), executed
by `node-pg-migrate` and tracked per-file in the `pgmigrations` table.
The Phase 1 `schema_version` table is preserved alongside `pgmigrations`
so operators can answer both "which migration files ran" and "which
`@atomicmemory/core` semver this DB corresponds to".
- `migrationStatus()` surfaces two new read-only fields,
`appliedMigrationCount` and `latestMigrationName`, sourced from
`pgmigrations`. The existing `status` enum (`up_to_date` / `older_db` /
`newer_db` / `unstamped` / `no_schema`) is unchanged.

### Changed
- **BREAKING**: All API endpoints are now mounted under `/v1/` (e.g. `POST /v1/memories/ingest`, `PUT /v1/agents/trust`). Update clients to prefix requests with `/v1`. The unversioned `/health` liveness probe is unchanged.
- Phase 2 removes `src/db/schema.sql`; the migrations folder is now the
single source of truth. The build-time `dist/db/schema-sha256.json`
manifest is preserved but now describes the ordered migration directory.
Library and CLI surfaces are unchanged: `migrate()` and
`migrationStatus()` keep their Phase 1 signatures, and
`MigrateResult.ranSchemaSql` is preserved as "this call executed the
migration runner path" (the Phase 1 semantics for the no-op-loser path
still hold).
- Moved provenance-only SQL changelog files from `src/db/migrations/` to
`docs/db/changelog/`. Phase 2 reclaims `src/db/migrations/` as the
runtime migration folder.

### Migration (Phase 2 — lossless guarantee)

Three install paths reach the same end state without data loss, without
unexpected DDL, and without operator intervention. All three run inside
the same advisory-lock wrapper used by Phase 1 (`MIGRATION_LOCK_ID`
unchanged), so concurrent replica boots remain safe.

- **Scenario A — fresh install on Phase 2.** `migrate()` creates
`pgmigrations`, runs `0001_baseline.sql` against the empty database,
runs any later migration files, runs the embedding-dimension reconciler
(against now-empty tables; no-op or a single `ALTER COLUMN`), and stamps
`schema_version`.
- **Scenario B — v1.0.x with data → Phase 2.** `migrate()` detects that
core tables exist but `pgmigrations` does not. It creates `pgmigrations`,
**stamps `0001_baseline` as applied without executing it**, runs any
post-baseline migrations against the live schema, runs the reconciler
on the live (possibly populated) tables with the same Phase 1
semantics, and inserts the first `schema_version` row. Baseline DDL
does not touch existing tables.
- **Scenario C — Phase 1 → Phase 2.** Same as B except `schema_version`
already exists; the upgrade appends a new row instead of creating the
table.
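
The three scenarios above reduce to a small decision over which tables already exist. The sketch below is illustrative only: `InstallProbe`, `classifyInstall`, and `baselineIsExecuted` are hypothetical names, not part of the `@atomicmemory/core` API, and the `already_versioned` branch (for a database that already has `pgmigrations`) is an assumption added so the function is total.

```ts
// Hypothetical probe result: which tables were found in the target database.
interface InstallProbe {
  hasCoreTables: boolean;    // any pre-existing core tables with data or schema
  hasPgmigrations: boolean;  // node-pg-migrate bookkeeping table
  hasSchemaVersion: boolean; // Phase 1 semver stamp table
}

type InstallPath = 'A_fresh' | 'B_legacy_v1' | 'C_phase1' | 'already_versioned';

// Maps a probe to the scenario described in the changelog entry.
function classifyInstall(probe: InstallProbe): InstallPath {
  if (probe.hasPgmigrations) return 'already_versioned'; // assumption: normal Phase 2 run
  if (!probe.hasCoreTables) return 'A_fresh';
  return probe.hasSchemaVersion ? 'C_phase1' : 'B_legacy_v1';
}

// Only a fresh install executes the baseline DDL; in B and C the baseline is
// stamped as applied without being run, so existing tables are not touched.
function baselineIsExecuted(path: InstallPath): boolean {
  return path === 'A_fresh';
}

const legacy = classifyInstall({
  hasCoreTables: true,
  hasPgmigrations: false,
  hasSchemaVersion: false,
});
console.log(legacy, baselineIsExecuted(legacy));
```

Keeping the classification separate from the side effects makes the "stamp, do not execute" rule for Scenarios B and C easy to check in isolation.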

Enforcement: `baseline-schema-equivalence.test.ts` (the CI gate) builds
both end states on fresh databases and asserts the schema-only structural
snapshot is identical modulo the framework-bookkeeping tables. The Phase 1
data-preservation suite is carried forward unchanged and continues to
seed legacy rows, snapshot them, run `migrate()`, and assert the rows,
primary keys, foreign keys, JSON metadata, timestamps, and representative
vector fields survive the Phase 1 → Phase 2 cutover.

## [1.0.0] - 2026-04-15

72 changes: 71 additions & 1 deletion README.md
@@ -125,6 +125,76 @@ npm run dev

Health check: `curl http://localhost:3050/v1/memories/health`

### Migrations

Core uses versioned migration files as the single source of truth for
PostgreSQL schema. The files live under `src/db/migrations/` and ship to the
package as `dist/db/migrations/`. There is no `schema.sql` — the migrations
folder is the schema, in order. To regenerate the equivalent full-schema dump
locally, replay the migrations against an empty DB and run
`pg_dump --schema-only`.

Run migrations once before deploy or during a single startup step before
serving traffic:

```bash
npm run migrate
```

Docker image users can keep the default startup migration step for local or
single-replica deployments. For rolling production deploys, run migrations as
a pre-deploy job and start app containers with
`ATOMICMEMORY_RUN_MIGRATIONS_ON_STARTUP=false`. If startup migrations remain
enabled, `MIGRATION_LOCK_TIMEOUT_MS` raises the advisory-lock wait window.
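
As a rough illustration of how a startup wrapper could interpret these two variables, here is a minimal sketch. `readStartupMigrationConfig` and `StartupMigrationConfig` are hypothetical names; the shipped Docker entrypoint may parse the values differently, and treating unparseable timeouts as "unset" is an assumption.

```ts
interface StartupMigrationConfig {
  runOnStartup: boolean;
  lockTimeoutMs: number | undefined; // undefined = let migrate() use its own default
}

function readStartupMigrationConfig(
  env: Record<string, string | undefined>,
): StartupMigrationConfig {
  // Documented default: migrations run on startup unless the flag is "false".
  const flag = env['ATOMICMEMORY_RUN_MIGRATIONS_ON_STARTUP'];
  const runOnStartup = flag?.trim().toLowerCase() !== 'false';

  // Assumption: non-numeric or non-positive values are ignored rather than fatal.
  const rawTimeout = env['MIGRATION_LOCK_TIMEOUT_MS'];
  const parsed = rawTimeout === undefined ? Number.NaN : Number.parseInt(rawTimeout, 10);
  const lockTimeoutMs = Number.isFinite(parsed) && parsed > 0 ? parsed : undefined;

  return { runOnStartup, lockTimeoutMs };
}

// Rolling deploy: the app container skips migrations entirely.
console.log(readStartupMigrationConfig({ ATOMICMEMORY_RUN_MIGRATIONS_ON_STARTUP: 'false' }));
// Startup migrations kept, with a longer wait for the advisory lock.
console.log(readStartupMigrationConfig({ MIGRATION_LOCK_TIMEOUT_MS: '120000' }));
```

When `runOnStartup` is true, the resulting `lockTimeoutMs` would be the value passed as `migrate({ lockTimeoutMs })`.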

Applications that embed Core can call the programmatic API directly instead
of shelling out:

```ts
import { migrate, migrationStatus } from '@atomicmemory/core';

const status = await migrationStatus({ pool });
if (status.status !== 'up_to_date') {
await migrate({ pool });
}
```

The `migrate()` and `migrationStatus()` signatures are unchanged from Phase 1;
only their internals were rewritten on top of `node-pg-migrate`. `MigrateResult`
fields are populated the same way — `ranSchemaSql` now means "this call
executed the migration runner path" rather than "this call executed the legacy
`schema.sql` file".
`MigrationStatus` adds read-only diagnostics sourced from the framework and
pgvector catalogs: `appliedMigrationCount`, `latestMigrationName`,
`migrationHistoryStatus`, and `embeddingDimension`.
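
An embedding application may want to branch on the `status` value rather than call `migrate()` for every non-current state. The sketch below shows one possible policy. `planStartup` and `StartupAction` are hypothetical, and refusing to start on `newer_db` is an assumption about what that status implies (a database stamped by a newer release than the running code), not documented behavior.

```ts
// The five status values listed for migrationStatus().
type MigrationStatusName =
  | 'up_to_date'
  | 'older_db'
  | 'newer_db'
  | 'unstamped'
  | 'no_schema';

type StartupAction = 'serve' | 'migrate' | 'refuse';

function planStartup(status: MigrationStatusName): StartupAction {
  switch (status) {
    case 'up_to_date':
      return 'serve'; // nothing to do
    case 'newer_db':
      return 'refuse'; // assumption: do not run older code against a newer schema
    case 'older_db':
    case 'unstamped':
    case 'no_schema':
      return 'migrate'; // migrate() brings the database forward
  }
}

console.log(planStartup('older_db'));
```

The exhaustive `switch` means the compiler flags any status value added later that the policy does not yet handle.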

To inspect a running database, two tables answer different questions:

| Table | Question it answers |
|------------------|----------------------------------------------------------|
| `pgmigrations` | Which migration files have been applied, and in what order |
| `schema_version` | Which `@atomicmemory/core` semver this DB corresponds to |

Both are kept on purpose. `pgmigrations` is the framework's audit trail;
`schema_version` is the operator-friendly "what code matches this DB" stamp.
Querying either is safe from any client.

```sql
SELECT id, name, run_on FROM pgmigrations ORDER BY id;
SELECT sdk_version, schema_sha256, applied_at FROM schema_version
ORDER BY applied_at DESC LIMIT 1;
```
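
The same two queries can be wrapped in a small helper for health dashboards or smoke checks. This is a sketch under assumptions: `Queryable`, `inspectMigrations`, and `fakeDb` are hypothetical names, the interface only mimics a pg-style `query` method, and the sample rows (including the `1.1.0` version string) are made up so the block runs without Postgres.

```ts
type Rows = { rows: Array<Record<string, unknown>> };

// Minimal structural type: any client exposing a pg-style query method.
interface Queryable {
  query(sql: string): Promise<Rows>;
}

interface MigrationSnapshot {
  appliedFiles: string[];          // pgmigrations names, in applied order
  latestSdkVersion: string | null; // newest schema_version stamp, if any
}

async function inspectMigrations(db: Queryable): Promise<MigrationSnapshot> {
  const applied = await db.query('SELECT id, name, run_on FROM pgmigrations ORDER BY id');
  const stamp = await db.query(
    'SELECT sdk_version, schema_sha256, applied_at FROM schema_version ' +
      'ORDER BY applied_at DESC LIMIT 1',
  );
  const latest = stamp.rows[0];
  return {
    appliedFiles: applied.rows.map((row) => String(row['name'])),
    latestSdkVersion: latest ? String(latest['sdk_version']) : null,
  };
}

// In-memory stand-in with invented rows, used only to exercise the helper.
const fakeDb: Queryable = {
  async query(sql: string): Promise<Rows> {
    if (sql.includes('FROM pgmigrations')) {
      return { rows: [{ id: 1, name: '0001_baseline', run_on: new Date(0) }] };
    }
    return { rows: [{ sdk_version: '1.1.0', schema_sha256: 'abc', applied_at: new Date(0) }] };
  },
};

inspectMigrations(fakeDb).then((snapshot) => console.log(snapshot));
```

Both statements are read-only, so a helper like this can run against a live database without taking the migration lock.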

Upgrades are lossless. A v1.0.x or Phase-1 database with existing rows takes
the same `migrate()` call as a fresh install — `migrate()` detects the
pre-migration install state, stamps the baseline migration as already-applied
without re-executing it, and runs only the migrations after the baseline.
See [`docs/db/migrations.md`](docs/db/migrations.md) for the scenario-by-scenario
guarantees and inspection runbook.

The provenance SQL files under `docs/db/changelog/` are references only;
runtime schema execution is owned entirely by the `src/db/migrations/` folder.

### npm CLI

The npm package also ships a thin CLI for environments where you already have
@@ -279,7 +349,7 @@ The compose file includes Postgres with pgvector. The app container runs migrati
src/
routes/ # Express route handlers
services/ # Business logic (extraction, retrieval, packaging)
- db/ # Repository layer, schema, migrations
+ db/ # Repository layer and canonical schema
adapters/ # Type contracts for external integrations
config.ts # Environment-driven configuration
server.ts # Express app bootstrap
5 changes: 5 additions & 0 deletions docker-compose.smoke-isolated.yml
@@ -41,6 +41,11 @@ services:
# Schemathesis, so the values are dummy placeholders that match
# `.env.test`. NEVER reuse these strings in a real deployment.
CORE_API_KEY: test-core-api-key-do-not-leak
# Schemathesis fuzzes the whole OpenAPI surface, including the optional
# admin cleanup route. Enable it only in this disposable smoke stack so
# the documented /v1/admin/scope path is mounted during schema fuzzing.
CORE_ADMIN_API_KEY: test-core-api-key-do-not-leak
CORE_TEST_SCOPE_ALLOW_PATTERN: "^schemathesis-fuzz-user$"
STORAGE_KEY_HMAC_SECRET: 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
env_file:
- .env
16 changes: 16 additions & 0 deletions docs/db/README.md
@@ -0,0 +1,16 @@
# Database documentation

In-repo documentation for the PostgreSQL + pgvector layer that ships with
`@atomicmemory/core`. Public operator and contributor guidance lives here;
internal planning history stays in the private research workspace.

## Contents

- [`migrations.md`](./migrations.md) — operator and contributor reference
for the Phase 2 versioned migration system. Covers the folder layout,
inspection queries against `pgmigrations` and `schema_version`, the
Scenario A/B/C lossless guarantee, the `migrate()` / `migrationStatus()`
API surface, and the workflow for adding a new migration.
- [`changelog/`](./changelog) — provenance-only SQL files from the
pre-Phase-2 schema evolution. Reference material, not executed at
runtime.