
feat(mediorum): bound ops table via dormant cleanup, gap signal, and opt-in retention #277

Closed
RolfAris wants to merge 1 commit into OpenAudio:main from RolfAris:feat/mediorum-ops-retention


@RolfAris
Contributor

The mediorum CRUD ops table grows monotonically and is the largest
relation on a stock-spec validator (docs.openaudio.org documents 200 GB
disk; observed ~84 GB ops at val001). This change makes the table
sustainable by addressing three concerns in one PR.

1. One-time dormant-table cleanup (default on)

At startup, drop ops rows for CRUD-registered tables whose newest op
is older than OPENAUDIO_MEDIORUM_DORMANT_OPS_THRESHOLD (default 90d,
clamped up to a 24h safety floor). The motivating case is
qm_audio_analyses: producer code paths for that table no longer
write at the historical cadence; the last op observed on val001 is dated
2025-11-07 (~6 months stale, well past the default 90d threshold).

  • Only tables registered via RegisterModels are eligible. An
    unregistered orphan (a table whose Go model was removed in a prior
    PR) is left alone.
  • Deletes are batched (10k rows per statement) and bounded by
    ulid < cutoffULID. The cutoff bound is load-bearing: if a producer
    writes a new op between dormancy classification and a later batch,
    that op sits above the cutoff and is never touched.
  • Idempotent: re-running on an already-cleaned table is a no-op.
  • Opt out with OPENAUDIO_MEDIORUM_KEEP_DORMANT_OPS=true.
  • Future-maintainer note in code: a new CRUD table added with a
    low write cadence (e.g. quarterly metrics) belongs outside CRUDR
    or behind a non-default threshold.

2. Sweep-handler retention-gap signal (default on, wire-compatible)

This is a correctness fix independent of the retention work itself.
Today's serve_crud.go + client.go pair has a latent silent skip:
when a peer's cursor (after=) is below the server's smallest available
ULID, the server returns ops starting just above its floor, and the
client sets its cursor to ops[len(ops)-1].ULID, jumping past the gap
with no signal. On any long-running fleet the cursor table already
holds stalled cursors for peers that were offline for months or years,
and this path silently feeds them incomplete history.

  • Server: when after != "" && after < min(ulid), set
    X-Mediorum-Retention-Gap: true and X-Mediorum-Available-Min-Ulid
    response headers. Response body is unchanged ops array, so older
    binaries see no protocol difference and continue with the existing
    silent-skip behavior — only upgraded clients benefit.
  • Client: detect the header, log an explicit operator-visible event,
    persist the cursor advance to the advertised floor before applying
    any ops (so a crash mid-apply doesn't leave us stuck below the gap),
    then continue applying the response.
  • Client: validate the advertised ulid. A candidate that fails to parse
    or decodes to a time more than 30 minutes ahead of the local wall
    clock is rejected; otherwise a hostile or misconfigured peer could
    silence one of our sweep streams with a forged far-future ulid.
  • crudr.Stats().SweepGapAdvances increments only on successful
    cursor persist, so the counter matches durable state.

3. Opt-in ongoing retention sweep

Gated entirely by OPENAUDIO_MEDIORUM_OPS_RETENTION_DAYS. Unset (the
default) means archive mode: the goroutine starts, sees the unset
config, blocks on ctx.Done(), and never deletes. Operators who want
bounded ops storage set the env to a positive integer.

The cutoff respects the cursor-floor invariant: no op whose ulid is
greater than the slowest reachable peer's cursor (minus a safety
margin) may be deleted. Empty cursors block all deletion (a peer that
has never advanced is the most conservative cursor). The self-cursor
row (if any) is skipped.

Configuration (defaults shown):

| Env | Default | Purpose |
|---|---|---|
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_DAYS` | unset | enable sweep at N days |
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_SWEEP_INTERVAL` | `1h` | per-tick cadence |
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_BATCH_LIMIT` | `10000` | rows per batch |
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_CURSOR_MARGIN` | `1h` | safety floor below slowest cursor |
| `OPENAUDIO_MEDIORUM_DORMANT_OPS_THRESHOLD` | `90d` | dormancy window (clamped >= 24h) |
| `OPENAUDIO_MEDIORUM_KEEP_DORMANT_OPS` | `false` | opt out of one-time cleanup |

Each tick loops up to 10 batches so a backlogged node makes
measurable progress without monopolizing the DB (upper bound 100k
rows/tick at defaults, 2.4M rows/day at 1h cadence — exceeds the
observed ~1.1M ops/day write rate).

Crudr.DryRunRetention returns the same plan without executing any
DELETE; operators can call it from a debug endpoint or audius-ctl
subcommand before flipping retention on.

Compatibility

  • Default behavior is unchanged. A node that doesn't set
    OPENAUDIO_MEDIORUM_OPS_RETENTION_DAYS keeps every op it has today
    except dormant-table sediment.
  • The gap signal is wire-compatible: response body is unchanged, only
    two new optional headers. Older binaries ignore them and behave as
    before.

Tests

go test -race -count=10 ./pkg/mediorum/crudr/... is green (~56s
wall-clock, no flakes):

  • Dormant cleanup: opt-out, idempotency, active-table protection,
    unregistered-table protection, context cancellation, race-guard
    (new op above cutoff during cleanup), batched large-table deletion,
    threshold-floor clamping.
  • Gap signal: server-side classification at/above/below min,
    empty-after-is-not-a-gap, empty-ops-table, stub round-trip, real
    httptest.Server end-to-end, double-tick suppression, hostile
    far-future ulid rejected, malformed ulid rejected, IsValidGapULID
    table tests.
  • Retention sweep: disabled-is-no-op, cursor floor, slow peer pins
    deletion to its cursor minus margin, empty cursor blocks all
    deletion, safety margin honored, batch limit, multi-table sweep,
    no-peers age-cutoff-only, self-cursor skipped, ancient cursor pins
    everything, per-tick max-batches cap, concurrent sweep + delete.
  • DryRunRetention: dormant-only preview, retention-skip-on-empty-cursor.

@RolfAris
Contributor Author

Superseded by #304, which is the same scope as a single squashed commit with the cursor-invariant correctness fixes folded in.

@RolfAris RolfAris closed this May 21, 2026