feat: hook versioning, per-row provenance, and M2M deploy #145

@rorybyrne

Problem

Two problems, one root cause. Hooks are embedded inline and immutably inside a convention (Convention.hooks: list[HookDefinition], each carrying OCI image + digest + feature-table column spec — osa/domain/shared/model/hook.py), and conventions are immutable.

  1. No way to ship a hook bug-fix except creating a whole new convention. No history, no rollback, no machine-to-machine deploy path. This blocks automated deploy tooling that builds hook images out of band and must register the built image with a running instance.
  2. No usable provenance. Records carry a convention reference, but nothing records which exact hook image computed which feature row. ingest_run stores no per-hook digest.

The provenance requirement (why the model is shaped this way)

A downstream consumer that builds artifacts from exported features must later demonstrate exactly what produced them:

  1. What exactly was used? → the precise set of feature rows.
  2. What computed them, and was it correct? → if a hook had a bug in v2 fixed in v3, which rows came from v2?
  3. Recall on a bad input → if a source is later withdrawn, which rows (hence which downstream artifacts) touched it? A recall query, run on provenance.
  4. Reproducibility → same records + same hook code + same config → same features.

These force the spine: per-feature-row provenance, stored on the row, anchored to an immutable record of what ran (so it can't drift and survives reconciliation), with build source and config captured.

Data model

  • hooks (name PK, feature_spec, live_release_id FK, created_at) — identity + contract + live pointer. Owns the feature table. The feature spec is fixed across releases (releases are image-only).
  • hook_releases (id PK, hook_name FK, version int, image, digest, config, source_ref, built_by, built_at): immutable. version is monotonic per hook (integers, not SemVer; a hook has no compat contract beyond its fixed feature spec). source_ref = git SHA / build id from the build step (reproducibility anchor). UNIQUE(hook_name, version); optional UNIQUE(hook_name, digest) for idempotent re-push.
  • hook_runs (id PK, release_id FK, ingest_run_id|deposition_id, batch_index, status, started_at, finished_at, duration_s, oom_retries, log_ref): append-only. Execution record; logs/timing/status attach here. (Eng review: extend/replace the transient HookResult + existing validation_runs, don't duplicate.)
  • Feature tables (features.*): add run_id FK→hook_runs. Per-row provenance: feature row → run → release → (version, digest, config, source_ref).
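
A minimal sketch of the row → run → release walk, using an in-memory SQLite database so it runs anywhere. The column names follow the data model above; the column types, the features_pocket_detect table name, the score column, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical, trimmed-down schema: only the columns needed for the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hook_releases (
    id INTEGER PRIMARY KEY,
    hook_name TEXT NOT NULL,
    version INTEGER NOT NULL,
    image TEXT NOT NULL,
    digest TEXT NOT NULL,
    config TEXT NOT NULL,
    source_ref TEXT NOT NULL,
    UNIQUE (hook_name, version)
);
CREATE TABLE hook_runs (
    id INTEGER PRIMARY KEY,
    release_id INTEGER NOT NULL REFERENCES hook_releases(id),
    status TEXT NOT NULL
);
CREATE TABLE features_pocket_detect (
    id INTEGER PRIMARY KEY,
    record_id TEXT NOT NULL,
    score REAL,
    run_id INTEGER NOT NULL REFERENCES hook_runs(id)
);
""")

# Sample data: release v2 (buggy) and v3 (fixed), one run and one row each.
conn.executemany("INSERT INTO hook_releases VALUES (?,?,?,?,?,?,?)", [
    (1, "pocket_detect", 2, "img:v2", "sha256:aaa", "{}", "commit-v2"),
    (2, "pocket_detect", 3, "img:v3", "sha256:bbb", "{}", "commit-v3"),
])
conn.executemany("INSERT INTO hook_runs VALUES (?,?,?)",
                 [(10, 1, "ok"), (11, 2, "ok")])
conn.executemany("INSERT INTO features_pocket_detect VALUES (?,?,?,?)",
                 [(100, "rec-a", 0.4, 10), (101, "rec-b", 0.9, 11)])


def provenance(feature_row_id):
    """Return (version, digest, config, source_ref) for one feature row."""
    return conn.execute("""
        SELECT r.version, r.digest, r.config, r.source_ref
        FROM features_pocket_detect f
        JOIN hook_runs hr ON hr.id = f.run_id
        JOIN hook_releases r ON r.id = hr.release_id
        WHERE f.id = ?
    """, (feature_row_id,)).fetchone()


def rows_from_version(hook_name, version):
    """Return ids of feature rows computed by a given release version."""
    rows = conn.execute("""
        SELECT f.id
        FROM features_pocket_detect f
        JOIN hook_runs hr ON hr.id = f.run_id
        JOIN hook_releases r ON r.id = hr.release_id
        WHERE r.hook_name = ? AND r.version = ?
        ORDER BY f.id
    """, (hook_name, version)).fetchall()
    return [row[0] for row in rows]
```

rows_from_version("pocket_detect", 2) is the "which rows came from v2?" question from the provenance requirement; a recall query would follow the same join shape, filtered on the source instead of the version.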

Versioning & resolution

  • Conventions reference hooks by name. A live pointer on the hooks table selects the active release. Deploy = create release + advance pointer; rollback = repoint. The convention is not touched on a hook version change.
  • The live release is resolved once at ingest-run start and snapshotted for that run, so a mid-run deploy can't split one run across two versions.
  • Resolution cost is one indexed lookup per ingest run (single WHERE hook_name IN (...)), amortized over thousands of records — negligible. The live pointer is also the concept reconciliation (future) will reuse.
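
The resolve-once-then-snapshot behavior can be sketched with an in-memory stand-in for the hooks and hook_releases tables. HookRegistry, Release, and the method names below are hypothetical and exist only to show the semantics; a real implementation would do the batched lookup in SQL.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Release:
    hook_name: str
    version: int
    digest: str


class HookRegistry:
    """In-memory stand-in for the hooks + hook_releases tables (hypothetical)."""

    def __init__(self):
        self._releases = {}  # hook_name -> list of Release, in version order
        self._live = {}      # hook_name -> currently live Release

    def deploy(self, hook_name, digest):
        # Deploy = create release vN+1 and advance the live pointer.
        history = self._releases.setdefault(hook_name, [])
        release = Release(hook_name, len(history) + 1, digest)
        history.append(release)
        self._live[hook_name] = release
        return release

    def repoint(self, hook_name, version):
        # Rollback = move the live pointer to an existing release.
        for release in self._releases[hook_name]:
            if release.version == version:
                self._live[hook_name] = release
                return release
        raise KeyError(f"{hook_name} has no release {version}")

    def resolve_live(self, hook_names):
        # One batched lookup at ingest-run start; the result is a read-only
        # copy, so later deploys cannot change what this run sees.
        return MappingProxyType({name: self._live[name] for name in hook_names})
```

An ingest run would call resolve_live once with the convention's hook names and use the returned snapshot for every record in the run; a deploy that lands mid-run only affects the next run's snapshot.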

(Version-pinning in the convention was considered and rejected: the run/release ledger makes provenance independent of how the convention points, removing pinning's only advantage, while per-row provenance removes the live pointer's only cost. Live pointer wins on deploy ergonomics, rollback, stable convention versions, and reconciliation-fit.)

Identity

  • Convention gets an SDK-supplied human-readable slug as its id: ConventionId = "<slug>@<version>" (mirrors the existing SchemaId pattern). No server-generated opaque SRN as the primary handle.
  • Schema keeps its SDK-supplied slug; SchemaId = "<slug>@<version>".
  • Records reference conventions by internal ConventionId, not SRN. SRN is reserved for federation-edge surfaces. (record/model/aggregate.py: convention_srn → convention_id.)
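
For illustration, a "<slug>@<version>" id could be parsed and rendered like this. The class shape is an assumption and does not reflect the existing SchemaId implementation; no slug or version validation rules are applied beyond requiring both parts to be non-empty, since the issue does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConventionId:
    """Hypothetical value object for "<slug>@<version>" identifiers."""

    slug: str
    version: str

    @classmethod
    def parse(cls, value: str) -> "ConventionId":
        # Split on the last "@" so the version is always the final segment.
        slug, sep, version = value.rpartition("@")
        if not sep or not slug or not version:
            raise ValueError(f"expected '<slug>@<version>', got {value!r}")
        return cls(slug, version)

    def __str__(self) -> str:
        return f"{self.slug}@{self.version}"
```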

API surface

Principle: a deploy is a single POST /conventions whose body is a composition of the same sub-structures the standalone endpoints accept (schema, each hook's release). One call for instrumentation; the server fans it out into the schema registry, hook registry, and convention in one transaction. Standalone endpoints exist only for incremental updates, rollback, and reads.

POST /api/v1/conventions                 # bundled deploy (schema + hooks + convention)   [conventions:write | ADMIN]
GET  /api/v1/conventions/{id}            # detail
POST /api/v1/schemas                      # create/version a schema (== deploy "schema" block) [ADMIN]
POST /api/v1/hooks/{name}/releases        # create release vN+1; advances live pointer        [hooks:write | ADMIN]
PUT  /api/v1/hooks/{name}/live            # repoint live (rollback / pin)                      [hooks:write | ADMIN]
GET  /api/v1/hooks                         # catalog: hooks + live release
GET  /api/v1/hooks/{name}/releases         # release history
GET  /api/v1/hooks/{name}/releases/{v}     # inspect a release

Deploy body

The schema block and each hook's release block are byte-identical to the standalone POST /schemas / POST /hooks/{name}/releases bodies.

{
  "id": "proteins",                  // convention slug (caller-supplied) → ConventionId = "proteins@1.0.0"
  "version": "1.0.0",
  "title": "Protein structures",
  "file_requirements": { /* ... */ },

  "schema": {                        // fully inline, nested; == POST /schemas body
    "id": "protein_fields",          // schema slug (caller-supplied)
    "version": "1.0.0",
    "fields": [ /* field definitions */ ]
  },

  "hooks": [
    {
      "name": "pocket_detect",       // hook identity (<=40 chars, [a-z][a-z0-9_]*)
      "feature": { /* TableFeatureSpec: columns, cardinality — set once, fixed forever */ },
      "release": {                   // == POST /hooks/{name}/releases body
        "image": "registry/.../pocket_detect:abc",
        "digest": "sha256:...",
        "config": {},
        "limits": {},
        "source_ref": "git-sha-or-build-id"   // REQUIRED — reproducibility anchor
      }
    }
  ],

  "ingester": null
}

Server fan-out: upsert schema (id@version); for each hook upsert identity (name + feature, creating the feature table on first sight) + create the release (advances live pointer); create the convention referencing hooks by name.

  • Idempotent on (hook_name, digest): re-sending an unchanged release is a no-op; a changed digest creates a new version + advances live.
  • Rejected: a different feature for an existing hook (the column contract is fixed).
  • Re-deploy: same convention id + new version = a new version of the same convention; same id+version = conflict unless byte-identical.
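
The per-hook part of the fan-out (idempotency on digest, fixed feature contract) can be sketched as follows. HookStore, FeatureSpecConflict, and deploy_hook are hypothetical names, and storage is in memory. One assumption worth flagging: the sketch treats any previously seen digest for the hook as a no-op, including a digest that belongs to an older, non-live release; whether that case should instead repoint the live pointer is not settled above.

```python
class FeatureSpecConflict(Exception):
    """Raised when a deploy carries a different feature spec for a known hook."""


class HookStore:
    """In-memory stand-in for the hook registry (hypothetical)."""

    def __init__(self):
        self.feature_specs = {}  # hook name -> feature spec fixed at first sight
        self.releases = {}       # hook name -> list of release dicts
        self.live = {}           # hook name -> live version number

    def deploy_hook(self, name, feature, release):
        """Upsert hook identity and release; return (version, created)."""
        existing = self.feature_specs.get(name)
        if existing is None:
            # First sight: the feature table would be created here.
            self.feature_specs[name] = feature
        elif existing != feature:
            # The column contract is fixed, so a different spec is rejected.
            raise FeatureSpecConflict(name)

        history = self.releases.setdefault(name, [])
        for prior in history:
            if prior["digest"] == release["digest"]:
                # Known digest: no new version, live pointer untouched.
                return prior["version"], False

        # Changed digest: new integer version, live pointer advances.
        version = len(history) + 1
        history.append({**release, "version": version})
        self.live[name] = version
        return version, True
```

In the real server this would run inside the same transaction as the schema upsert and convention insert, so a FeatureSpecConflict on any hook aborts the whole deploy.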

Incremental hook release (convention untouched)

POST /api/v1/hooks/{name}/releases
{ "image": "...", "digest": "sha256:...", "config": {}, "limits": {}, "source_ref": "git-sha" }

No feature — columns are fixed at the hook's first release. Creates release vN+1, advances the live pointer; future ingest runs pick it up.

Auth: additional M2M token issuer (ships in this issue)

Deploys are driven by external automation, not interactive sessions. Today validation supports only a single symmetric secret (HS256, config.auth.jwt.secret).

  • New config: an optional additional trusted issuer (asymmetric — RS256/EdDSA — via static public key for v1; JWKS only if out-of-band rotation is actually needed).
  • validate_access_token routes on the iss claim: extra-issuer tokens verify against its key; existing user tokens are unaffected.
  • Extra-issuer tokens resolve to a scope-limited principal: Principal gains scopes (today only roles). The bundled deploy (POST /conventions) requires conventions:write; release/live endpoints require hooks:write; both also accept ADMIN. Keeps long-lived broad credentials out of automation.
  • Existing single-secret behavior unchanged when no additional issuer is configured.
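
A rough sketch of the two decisions described above: which verification material to use for a given iss claim, and whether a principal may call an endpoint. Signature verification itself is deliberately left out. Principal here is a simplified stand-in for the existing class, and Verifier, select_verifier, and authorize are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verifier:
    """Key material plus the algorithms allowed for one issuer."""

    algorithms: tuple
    key: str


@dataclass(frozen=True)
class Principal:
    """Simplified principal: roles exist today, scopes are the new field."""

    subject: str
    roles: frozenset = frozenset()
    scopes: frozenset = frozenset()


def select_verifier(iss: Optional[str], default: Verifier,
                    extra_iss: Optional[str] = None,
                    extra: Optional[Verifier] = None) -> Verifier:
    # Route on the token's iss claim. With no extra issuer configured,
    # every token takes the existing single-secret path.
    if extra_iss is not None and extra is not None and iss == extra_iss:
        return extra
    return default


def authorize(principal: Principal, required_scope: str) -> bool:
    # ADMIN is accepted everywhere; otherwise the exact scope is required.
    return "ADMIN" in principal.roles or required_scope in principal.scopes
```

Pinning the algorithm list per issuer, rather than trusting the token header, keeps an extra-issuer token from being checked against the symmetric secret and vice versa.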

Deploy flow this enables

External deploy tooling holds the convention metadata + hook source, builds images out of band, then makes a single POST /api/v1/conventions carrying the inline schema + each hook's identity and release block (real digests + source commit). The server fans it out. A later hook bugfix is one small POST /hooks/{name}/releases: convention untouched, live pointer moves, future runs pick it up, provenance records exactly which version ran. Instrumenting an instance is one call; maintaining a hook is one call.

Acceptance criteria

  • Each feature row carries a run_id; row → run → release yields version, digest, config, and build source for that row.
  • A single POST /conventions deploy creates the schema, hooks (+ first releases + feature tables), and the convention in one transaction; sub-structures match the standalone endpoints.
  • POST /hooks/{name}/releases creates an immutable, integer-versioned release and advances the live pointer; the convention is untouched.
  • Rollback repoints the live pointer to a prior release; release history is listable per hook.
  • Within an ingest run, all rows for a hook share one resolved release (resolve-at-run-start snapshot; no mid-run split).
  • Conventions reference hooks by name; ingest resolves the live release; concurrent deploys advance the pointer atomically (row lock).
  • Convention and schema use SDK-supplied human-readable slugs; records reference conventions by internal ConventionId, not SRN.
  • Deploys are idempotent on (hook_name, digest); re-sending an unchanged release is a no-op; a different feature for an existing hook is rejected.
  • Optional second issuer configurable; iss-routed; scope-limited principal enforced (conventions:write for deploy, hooks:write for release/live); existing single-secret auth unchanged when not configured.
  • Migration backfills existing inline hooks as release 1, reuses existing feature tables, and hard-fails with a report on any hook-name collision across conventions with divergent specs.

Scope notes

  • The issuer support and the registry ship together (this issue): the registry is unusable for automated deploys without a scoped M2M credential.
  • Build source commit must be passed in the release payload by the build step — it is the reproducibility anchor and only the build step has it.
  • Releases are image-only; a hook's feature spec (columns) is fixed at first release. Changing columns would need a feature-table migration — out of scope.

Deferred (designed-for, not built here)

  • Reconciliation — running a newer hook version against existing records to backfill/update features. The run/release/live-pointer model fits it directly (a reconciliation pass is just another run).
  • Convention hook-list mutability — adding a new hook to an existing convention post-creation; depends on reconciliation.
  • Reproducible export/snapshot identity — a stable handle for "the exact set of rows that were exported" (changefeed cursor / snapshot); completes the downstream audit story.
  • Tamper-evidence / signing — signing provenance manifests via the Node Document keys so the chain is verifiable, not just present.

Migration

  • Backfill each convention's inline hooks → hooks row (with feature spec) + hook_releases v1 (embedded image/digest) + set live; rewrite convention to name refs; switch records convention_srn → convention_id.
  • Feature tables already exist (old ConventionRegistered path) — point new rows at them; do not re-fire feature-table DDL for backfilled v1s.
  • Audit for hook-name collisions across conventions before migrating; hard-fail with a report if any colliding name has divergent specs.
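
The pre-migration audit could look roughly like this. The input shape (convention id mapped to a list of hooks with name and feature keys) and the function name are assumptions; the real migration would read the existing Convention.hooks data. Feature specs are compared by their canonical JSON form, so key order does not cause false positives.

```python
import json
from collections import defaultdict


def find_hook_name_collisions(conventions):
    """Return {hook_name: [convention ids]} for names with divergent specs.

    `conventions` maps a convention id to a list of hook dicts, each with a
    "name" and a JSON-serializable "feature" spec (assumed shape).
    """
    # hook name -> canonical spec string -> convention ids using that spec
    seen = defaultdict(dict)
    for convention_id, hooks in conventions.items():
        for hook in hooks:
            canonical = json.dumps(hook["feature"], sort_keys=True)
            seen[hook["name"]].setdefault(canonical, []).append(convention_id)

    report = {}
    for name, specs in seen.items():
        if len(specs) > 1:  # same name, more than one distinct spec
            report[name] = sorted(cid for ids in specs.values() for cid in ids)
    return report
```

An empty report means the backfill can proceed; anything else should abort the migration and print the report, since each hook name can own only one feature table.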
