Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 26 additions & 20 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,32 +73,38 @@ Full diagram and concurrency model: [docs/dev/architecture.md](docs/dev/architec
| **Lance docs index — fetch upstream Lance docs by problem domain** | **[docs/dev/lance.md](docs/dev/lance.md)** |
| **Test coverage map — what's covered, what helpers to reuse, before-every-task checklist** | **[docs/dev/testing.md](docs/dev/testing.md)** |
| Architecture, L1/L2 framing, concurrency model | [docs/dev/architecture.md](docs/dev/architecture.md) |
| Storage layout, `__manifest` schema, URI schemes, S3 env vars | [docs/user/storage.md](docs/user/concepts/storage.md) |
| `.pg` schema language, types, constraints, annotations, migration planning | [docs/user/schema-language.md](docs/user/schema/index.md) |
| Schema-lint codes (`OG-XXX-NNN`), families, severity, suppression | [docs/user/schema-lint.md](docs/user/schema/lint.md) |
| `.gq` query language, MATCH/RETURN/ORDER, search funcs, mutations, IR ops, lint codes | [docs/user/query-language.md](docs/user/queries/index.md) |
| Indexes (BTREE / inverted / vector / graph topology) | [docs/user/indexes.md](docs/user/search/indexes.md) |
| Embeddings (compiler + engine clients, env vars, `@embed`) | [docs/user/embeddings.md](docs/user/search/embeddings.md) |
| Branches, commit graph, snapshots, system branches | [docs/user/branches-commits.md](docs/user/branching/index.md) |
| Transactions and atomicity (per-query atomic; branches as multi-query transactions) | [docs/user/transactions.md](docs/user/branching/transactions.md) |
| Storage layout, `__manifest` schema, URI schemes, S3 env vars | [docs/user/concepts/storage.md](docs/user/concepts/storage.md) |
| `.pg` schema language, types, constraints, annotations, migration planning | [docs/user/schema/index.md](docs/user/schema/index.md) |
| Schema-lint codes (`OG-XXX-NNN`), families, severity, suppression | [docs/user/schema/lint.md](docs/user/schema/lint.md) |
| `.gq` query language, MATCH/RETURN/ORDER, IR ops, lint codes | [docs/user/queries/index.md](docs/user/queries/index.md) |
| Mutations — insert/update/delete, D2, atomicity | [docs/user/mutations/index.md](docs/user/mutations/index.md) |
| Search funcs (`nearest`/`bm25`/`rrf`), hybrid ranking | [docs/user/search/index.md](docs/user/search/index.md) |
| Indexes (BTREE / inverted / vector / graph topology) | [docs/user/search/indexes.md](docs/user/search/indexes.md) |
| Embeddings (compiler + engine clients, env vars, `@embed`) | [docs/user/search/embeddings.md](docs/user/search/embeddings.md) |
| Concepts — what OmniGraph is, L1/L2 framing | [docs/user/concepts/index.md](docs/user/concepts/index.md) |
| Quickstart — init → load → query → branch | [docs/user/quickstart.md](docs/user/quickstart.md) |
| Branches, commit graph, system branches | [docs/user/branching/index.md](docs/user/branching/index.md) |
| Snapshots & time travel | [docs/user/branching/time-travel.md](docs/user/branching/time-travel.md) |
| Three-way merge and conflict kinds (user-facing) | [docs/user/branching/merge.md](docs/user/branching/merge.md) |
| Transactions and atomicity (per-query atomic; branches as multi-query transactions) | [docs/user/branching/transactions.md](docs/user/branching/transactions.md) |
| Direct-publish write path (staging, D2, recovery sidecars; the former Run state machine) | [docs/dev/writes.md](docs/dev/writes.md) |
| Three-way merge and conflict kinds | [docs/dev/merge.md](docs/dev/merge.md) |
| Diff / change feed (`diff_between`, `diff_commits`) | [docs/user/changes.md](docs/user/branching/changes.md) |
| Diff / change feed (`diff_between`, `diff_commits`) | [docs/user/branching/changes.md](docs/user/branching/changes.md) |
| Query execution, mutation execution, bulk loader, `load` vs `ingest` | [docs/dev/execution.md](docs/dev/execution.md) |
| `optimize` (compaction) and `cleanup` (version GC) | [docs/user/maintenance.md](docs/user/operations/maintenance.md) |
| Cluster operator guide (deploy/manage clusters, approvals, recovery, serving) | [docs/user/cluster.md](docs/user/clusters/index.md) |
| Cedar policy actions, scopes, CLI | [docs/user/policy.md](docs/user/operations/policy.md) |
| HTTP server endpoints, auth, error model, body limits | [docs/user/server.md](docs/user/operations/server.md) |
| CLI quick-start | [docs/user/cli.md](docs/user/cli/index.md) |
| CLI command surface and config schemas (`~/.omnigraph/config.yaml`, legacy `omnigraph.yaml`) | [docs/user/cli-reference.md](docs/user/cli/reference.md) |
| Audit / actor tracking | [docs/user/audit.md](docs/user/operations/audit.md) |
| Error taxonomy and result serialization | [docs/user/errors.md](docs/user/operations/errors.md) |
| `optimize` (compaction) and `cleanup` (version GC) | [docs/user/operations/maintenance.md](docs/user/operations/maintenance.md) |
| Cluster operator guide (deploy/manage clusters, approvals, recovery, serving) | [docs/user/clusters/index.md](docs/user/clusters/index.md) |
| Cedar policy actions, scopes, CLI | [docs/user/operations/policy.md](docs/user/operations/policy.md) |
| HTTP server endpoints, auth, error model, body limits | [docs/user/operations/server.md](docs/user/operations/server.md) |
| CLI quick-start | [docs/user/cli/index.md](docs/user/cli/index.md) |
| CLI command surface and config schemas (`~/.omnigraph/config.yaml`, legacy `omnigraph.yaml`) | [docs/user/cli/reference.md](docs/user/cli/reference.md) |
| Audit / actor tracking | [docs/user/operations/audit.md](docs/user/operations/audit.md) |
| Error taxonomy and result serialization | [docs/user/operations/errors.md](docs/user/operations/errors.md) |
| Install (binary / Homebrew / source / channels) | [docs/user/install.md](docs/user/install.md) |
| Deployment (binary / container / RustFS bootstrap / auth / build variants) | [docs/user/deployment.md](docs/user/deployment.md) |
| CI / release workflows | [docs/dev/ci.md](docs/dev/ci.md) |
| Code ownership (CODEOWNERS source of truth, roles, regeneration) | [docs/dev/codeowners.md](docs/dev/codeowners.md) |
| Branch protection policy (declarative, applied via `scripts/apply-branch-protection.sh`) | [docs/dev/branch-protection.md](docs/dev/branch-protection.md) |
| Constants & tunables cheat sheet | [docs/user/constants.md](docs/user/reference/constants.md) |
| Constants & tunables cheat sheet | [docs/user/reference/constants.md](docs/user/reference/constants.md) |
| Per-version release notes | [docs/releases/](docs/releases/) |

---
Expand Down Expand Up @@ -257,7 +263,7 @@ omnigraph policy explain --actor act-alice --action change --branch main
| Per-query atomic writes | — | In-memory `MutationStaging.pending` accumulator + `stage_*` / `commit_staged` per touched table at end-of-query + publisher CAS via `commit_with_expected` (single manifest commit per `mutate_as` / `load`); D₂ parse-time rule keeps inserts/updates and deletes from mixing |
| Three-way row-level merge | — | `OrderedTableCursor` + `StagedTableWriter`, structured `MergeConflictKind` |
| Change feeds | — | `diff_between` / `diff_commits` with manifest fast path + ID streaming |
| Cedar policy | — | Per-graph actions plus server-scoped actions (see [docs/user/policy.md](docs/user/operations/policy.md) for the current list), branch / target_branch / protected scopes, validate/test/explain CLI. **Engine-wide enforcement** (MR-722): every `_as` writer (`apply_schema_as`, `mutate_as`, `load_as` — the deprecated `ingest_as` shims route through it — `branch_create_as` / `branch_create_from_as`, `branch_delete_as`, `branch_merge_as`) calls `Omnigraph::enforce(action, scope, actor)` — HTTP, CLI, embedded SDK all hit the same gate. |
| Cedar policy | — | Per-graph actions plus server-scoped actions (see [docs/user/operations/policy.md](docs/user/operations/policy.md) for the current list), branch / target_branch / protected scopes, validate/test/explain CLI. **Engine-wide enforcement** (MR-722): every `_as` writer (`apply_schema_as`, `mutate_as`, `load_as` — the deprecated `ingest_as` shims route through it — `branch_create_as` / `branch_create_from_as`, `branch_delete_as`, `branch_merge_as`) calls `Omnigraph::enforce(action, scope, actor)` — HTTP, CLI, embedded SDK all hit the same gate. |
| HTTP server | — | Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), `authorize_request` at the HTTP boundary (resolves bearer→actor, applies admission control), NDJSON streaming export, **multi-graph mode (v0.6.0+) with cluster routes + read-only `GET /graphs` enumeration + per-graph + server-level Cedar policies. Multi-graph boots from a cluster directory (`--cluster`) or the legacy `omnigraph.yaml`; add/remove graphs via `cluster apply` (or by editing the legacy file) and restarting.** |
| CLI with config | — | two-surface config (team `cluster.yaml` dir + per-operator `~/.omnigraph/config.yaml`; legacy `omnigraph.yaml` deprecated per RFC-008), aliases, multi-format output (json/jsonl/csv/kv/table) |
| Audit / actor tracking | — | `_as` write APIs + actor map in commit graph |
Expand All @@ -282,7 +288,7 @@ Rules:
7. **Re-verify before recommending.** If you cite a flag, env var, endpoint, or constant to the user or in code, grep for it in source first. Memory and docs go stale; the code is authoritative.
8. **Keep AGENTS.md short.** This file is always loaded into agent context, so every added line has a recurring context-window cost. Prefer pointers and terse invariants here; put detail in `docs/`.
9. **Keep AGENTS.md a map, not an encyclopedia.** New deep content goes into `docs/`. Add an entry to "Where to find each topic" instead of pasting prose into this file. The "Always-on rules" section is the exception — it's for invariants that should always be in scope.
10. **Re-read on schema/query/IR changes.** Edits to `schema.pest`, `query.pest`, `ir/lower.rs`, `query/typecheck.rs`, or `query/lint.rs` should trigger a re-read of [docs/user/schema-language.md](docs/user/schema/index.md), [docs/user/query-language.md](docs/user/queries/index.md), and [docs/dev/execution.md](docs/dev/execution.md) to confirm they still describe reality.
10. **Re-read on schema/query/IR changes.** Edits to `schema.pest`, `query.pest`, `ir/lower.rs`, `query/typecheck.rs`, or `query/lint.rs` should trigger a re-read of [docs/user/schema/index.md](docs/user/schema/index.md), [docs/user/queries/index.md](docs/user/queries/index.md), and [docs/dev/execution.md](docs/dev/execution.md) to confirm they still describe reality.
11. **Always make smaller commits.** Each commit does one thing, compiles, and passes tests; mechanical refactors land separately from the behavior changes they enable.
12. **Test-first for bug fixes.** When fixing an identified bug, write a regression test that reproduces the failure first. Confirm it fails against the current code with the predicted symptom (not an unrelated error). Then land the fix in a separate commit and confirm the test turns green. The test commit lands just before the fix commit so the red → green pair is visible in `git log` and a reviewer can check out the test commit alone and reproduce the failure.
13. **Correct by design over symptomatic patches.** When a bug surfaces, identify the root cause and make the fix correct by construction. Don't patch the symptom. If the design admits the bug class, the fix is to close the class, not to add a guard around the latest instance. A symptomatic patch is acceptable only as a stop-gap, with an explicit note in the commit message and a follow-up issue tracking the design fix.
Expand Down
8 changes: 3 additions & 5 deletions docs/user/branching/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,11 +43,9 @@ Notes:

## L2 — Snapshots & time travel

- `snapshot()` — current snapshot for the bound branch; cached.
- `snapshot_of(target)` — snapshot at a `ReadTarget` (branch | snapshot id).
- `snapshot_at_version(v: u64)` — historical snapshot from any manifest version.
- `entity_at(table_key, id, version)` — single-entity time travel without building a full snapshot.
- A `Snapshot` is a `(version, HashMap<table_key, SubTableEntry>)` — cheap to build, snapshot-isolated cross-table reads.
Reading a branch at a past version, or a single entity at a past version, is
covered on the [time travel](time-travel.md) page. Merging branches and the
conflict kinds are on the [merge](merge.md) page.

## L2 — Internal system branches

Expand Down
47 changes: 47 additions & 0 deletions docs/user/branching/merge.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# Merging Branches

Merging integrates the changes on one branch into another. OmniGraph merges are
**three-way and row-level**: it compares both branches against their common
ancestor and merges each node/edge table row by row, then publishes the result as
**one atomic commit** across the whole graph.

```bash
omnigraph branch merge review/2026-04-25 --into main s3://bucket/graph.omni

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Same URI-placement issue as in quickstart.md: BranchCommand::Merge uses uri as a --uri long flag, not a positional. Passing the S3 path as a trailing positional will be rejected by clap.

Suggested change
omnigraph branch merge review/2026-04-25 --into main s3://bucket/graph.omni
omnigraph branch merge review/2026-04-25 --into main --uri s3://bucket/graph.omni

Fix in Claude Code

```

`branch merge <source> [--into <target>]` merges `<source>` into `<target>`
(default `main`).

## Outcomes

A merge resolves to one of three outcomes:

- **Already up to date** — the target already contains every change on the source;
nothing to do.
- **Fast-forward** — the target has no changes the source lacks, so the target
simply advances to the source.
- **Merged** — both sides diverged; a new merge commit is created with two parents.

## Conflicts

When both branches changed the same data incompatibly, the merge fails with a
structured list of conflicts (the HTTP server returns `409` with a
`merge_conflicts[]` array). No partial result is published — the merge is
all-or-nothing. The conflict kinds are:

| Kind | Meaning |
|---|---|
| `DivergentInsert` | The same id was inserted on both branches. |
| `DivergentUpdate` | The same row was updated differently on both branches. |
| `DeleteVsUpdate` | One side deleted a row the other side updated. |
| `OrphanEdge` | An edge references a node the other side deleted. |
| `UniqueViolation` | The merged result would violate a unique constraint. |
| `CardinalityViolation` | The merged result would violate an edge cardinality constraint. |
| `ValueConstraintViolation` | The merged result would violate a value constraint (enum/range). |

Each conflict carries the table, the row id (when applicable), the kind, and a
message. Resolve conflicts by reconciling the two branches — typically by making
the conflicting change on one side and re-merging.

See [branches & commits](index.md) for the branch and commit-DAG model, and
[changes](changes.md) for diffing two branches before you merge.
31 changes: 31 additions & 0 deletions docs/user/branching/time-travel.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# Snapshots & Time Travel

Every read in OmniGraph happens against a **snapshot** — a consistent, cross-table
view of the graph at one manifest version. A query holds one snapshot for its whole
lifetime, so it never sees a partial write from a concurrent commit (see
[transactions](transactions.md)).

## Reading the past

- **Current head** — by default a read targets the current head of the bound branch.
- **By snapshot id** — read a branch or a specific snapshot id (`--snapshot` on
`omnigraph read`).
- **By version** — reconstruct a historical snapshot from any past manifest version.
- **Single entity** — look up one entity at a past version without building a full
snapshot (cheaper when you only need one node or edge).

Snapshots are cheap to build: a snapshot is just the set of visible sub-table
versions at a manifest version, so cross-table reads stay snapshot-isolated.

## CLI

```bash
# Read a query against a past snapshot
omnigraph read --query ./q.gq --name find --snapshot <snapshot-id> s3://bucket/graph.omni

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 omnigraph read prints warning: \omnigraph read` is deprecated; use `omnigraph query` instead(seecrates/omnigraph-cli/src/helpers.rs`). This time-travel page is new and should use the canonical command from the start.

Suggested change
omnigraph read --query ./q.gq --name find --snapshot <snapshot-id> s3://bucket/graph.omni
omnigraph query --query ./q.gq --name find --snapshot <snapshot-id> s3://bucket/graph.omni

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Claude Code

```

Time travel composes with branches: every branch has its own version history, and
you can read any branch at any of its past versions. Commits and the commit DAG
that these versions correspond to are described in
[branches & commits](index.md); diffing two versions is on the
[changes](changes.md) page.
49 changes: 49 additions & 0 deletions docs/user/concepts/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# Concepts

OmniGraph is a typed property-graph engine built as a coordination layer over the
[Lance](https://lance.org) columnar storage format. It gives you a schema-checked
graph with vector, full-text, and graph queries in one runtime, plus Git-style
branches and commits across the whole graph.

## The data model

- A graph has **node types** and **edge types**, declared in a
[schema](../schema/index.md).
- Each node type and each edge type is stored as its **own Lance dataset** —
columnar, versioned, on local disk or object storage.
- A single `__manifest` table coordinates all of those datasets, so the graph has
one coherent version even though it spans many datasets.

This split is what lets a graph commit be **atomic across every type at once**: a
publish flips every relevant dataset's version together in one manifest write, so
readers never see a half-applied change. See [storage](storage.md) for the layout.

## Two layers: inherited vs. added

Throughout the docs, capabilities are framed as **L1** (inherited from Lance) or
**L2** (added by OmniGraph):

| | L1 — from Lance | L2 — added by OmniGraph |
|---|---|---|
| Storage | Columnar Arrow datasets on object storage | Per-type datasets coordinated as one graph |
| Versioning | Per-dataset versions + time travel | [Snapshots](../branching/time-travel.md) across all types at once |
| Branches | Per-dataset branches | [Graph-level branches](../branching/index.md), atomic across types |
| Commits | Per-dataset commits | [Commit DAG](../branching/index.md) for the whole graph; three-way [merge](../branching/merge.md) |
| Indexes | Scalar / vector / full-text indexes | Built per relevant column; graph topology index for traversal |
| Search | Vector + full-text primitives | [`nearest` / `bm25` / `rrf`](../search/index.md) in one query, plus graph traversal |
| Querying | — | The [`.gq` query language](../queries/index.md) and [`.pg` schema language](../schema/index.md) |

## How the pieces fit

- The **schema** (`.pg`) and **query** (`.gq`) languages are compiled to a typed
intermediate representation.
- The **engine** runs queries and mutations against Lance, coordinates the manifest,
maintains the commit graph, and builds indexes.
- The **CLI** ([`omnigraph`](../cli/index.md)) and the
**HTTP server** ([`operations/server.md`](../operations/server.md)) are two front
ends over the same engine, so embedded and remote behavior match.
- [Cedar policy](../operations/policy.md) enforcement is engine-wide — every writer
goes through the same authorization gate regardless of front end.

For deployment-scale topics — multi-graph servers, control-plane operations,
recovery — see [clusters](../clusters/index.md).
Loading
Loading