Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,7 @@ Full diagram and concurrency model: [docs/dev/architecture.md](docs/dev/architec
| Error taxonomy and result serialization | [docs/user/operations/errors.md](docs/user/operations/errors.md) |
| Install (binary / Homebrew / source / channels) | [docs/user/install.md](docs/user/install.md) |
| Deployment (binary / container / RustFS bootstrap / auth / build variants) | [docs/user/deployment.md](docs/user/deployment.md) |
| Task guides (hybrid search, cluster on S3, review workflow) | [docs/user/guides/index.md](docs/user/guides/index.md) |
| CI / release workflows | [docs/dev/ci.md](docs/dev/ci.md) |
| Code ownership (CODEOWNERS source of truth, roles, regeneration) | [docs/dev/codeowners.md](docs/dev/codeowners.md) |
| Branch protection policy (declarative, applied via `scripts/apply-branch-protection.sh`) | [docs/dev/branch-protection.md](docs/dev/branch-protection.md) |
Expand Down
98 changes: 98 additions & 0 deletions docs/user/guides/cluster-on-s3.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
# Run a Cluster on S3

This guide takes a cluster from a local config directory to a server that boots
**config-free from an object-storage bucket** — the bucket is the whole
deployment artifact. For the full control-plane reference, see
[operating a cluster](../clusters/index.md) and
[cluster config](../clusters/config.md).

## 1. Declare the cluster

Lay out a config directory. The one S3-specific line is `storage:` — it puts the
state ledger, catalog, and graph data on the bucket instead of in the folder:

```
company-brain/
├── cluster.yaml
├── people.pg
├── queries/
│ └── people.gq
└── base.policy.yaml
```

```yaml
# cluster.yaml
version: 1
storage: s3://my-bucket/clusters/company-brain # the deployment lives here
metadata:
name: company-brain
graphs:
knowledge:
schema: people.pg
queries: queries/
policies:
base:
file: base.policy.yaml
applies_to: [knowledge]
```

Set the S3 credentials in the environment (for a non-AWS S3-compatible store such
as MinIO or RustFS, also set `AWS_ENDPOINT_URL_S3`):

```bash
export AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=... AWS_REGION=us-east-1
# export AWS_ENDPOINT_URL_S3=https://... # non-AWS S3-compatible stores
```

## 2. Validate, plan, apply

`apply` is the only command that changes the world; `plan` previews it:

```bash
omnigraph cluster validate --config company-brain # parse + typecheck
omnigraph cluster import --config company-brain # create the state ledger
omnigraph cluster plan --config company-brain # preview the diff
omnigraph cluster apply --config company-brain # converge onto the bucket
```

`apply` creates the graph at the derived root
(`s3://my-bucket/clusters/company-brain/graphs/knowledge.omni`), applies its
schema, and publishes the query and policy into the content-addressed catalog.
`converged: true` means there is nothing left to do — re-running `apply` is always
safe.

## 3. Load data

The control plane manages *definitions*; rows go through the normal data plane.
Address the graph by its storage URI (the derived `graphs/<id>.omni` root):

```bash
omnigraph load --data seed.jsonl --mode overwrite \
s3://my-bucket/clusters/company-brain/graphs/knowledge.omni
```

## 4. Serve config-free from the bucket

A serving host needs only the storage-root URI and credentials — no checkout of
the config repo:

```bash
OMNIGRAPH_SERVER_BEARER_TOKENS_JSON='{"act-reader":"s3cret"}' \
omnigraph-server --cluster s3://my-bucket/clusters/company-brain --bind 0.0.0.0:8080
```

The server boots from the **applied revision** recorded in the ledger — never from
config that was merely written. Roll out a change by `apply`-ing again, then
restarting replicas.

## 5. Maintain it

Storage maintenance runs out-of-band, addressed by cluster + graph name (it
resolves the graph's storage URI from the served state):

```bash
omnigraph optimize --cluster company-brain --cluster-graph knowledge
omnigraph cleanup --cluster company-brain --cluster-graph knowledge --keep 10 --confirm
```

See [maintenance](../operations/maintenance.md) for what each command does.
99 changes: 99 additions & 0 deletions docs/user/guides/hybrid-search.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
# Hybrid Search End to End

This guide builds a small document graph and runs a **hybrid** query that fuses
full-text (BM25) and vector (k-NN) rankings with Reciprocal Rank Fusion. You do
not build indexes by hand — the engine maintains them; a freshly loaded row is
searchable immediately.

See [search](../search/index.md) for the function reference and
[embeddings](../search/embeddings.md) for the full provider/env matrix.

## 1. Schema

A document with a text body for full-text search and a vector for similarity.
`@embed("body")` tells the engine to embed the `body` text into `embedding` at
load time:

```
node Document {
title: String,
body: String,
embedding: Vector(768) @embed("body"),
}
```

```bash
omnigraph init --schema schema.pg docs.omni
```

## 2. Configure embeddings

Ingest-time embedding uses the engine's embedding client. Point it at your
provider (see [embeddings](../search/embeddings.md) for every variable):

```bash
export GEMINI_API_KEY=... # ingest-time document embeddings
# For local experimentation without a provider, deterministic mock vectors:
# export OMNIGRAPH_EMBEDDINGS_MOCK=1 NANOGRAPH_EMBEDDINGS_MOCK=1
Comment on lines +35 to +37

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The real-credentials path should surface both API keys. The mock comment already reveals that two clients are active — OMNIGRAPH_EMBEDDINGS_MOCK for the engine (ingest-time @embed) and NANOGRAPH_EMBEDDINGS_MOCK for the compiler (query-time text-to-vector for nearest()). Without an OpenAI-compatible key, any query that passes a string to nearest() will fail at runtime even if load succeeded.

Suggested change
export GEMINI_API_KEY=... # ingest-time document embeddings
# For local experimentation without a provider, deterministic mock vectors:
# export OMNIGRAPH_EMBEDDINGS_MOCK=1 NANOGRAPH_EMBEDDINGS_MOCK=1
export GEMINI_API_KEY=... # engine-side: ingest-time @embed (RETRIEVAL_DOCUMENT)
export OPENAI_API_KEY=... # compiler-side: query-time nearest() normalization (RETRIEVAL_QUERY)
# For a self-hosted OpenAI-compatible endpoint: OPENAI_BASE_URL=... NANOGRAPH_EMBED_MODEL=...
# For local experimentation without a provider, deterministic mock vectors:
# export OMNIGRAPH_EMBEDDINGS_MOCK=1 NANOGRAPH_EMBEDDINGS_MOCK=1

Fix in Claude Code

```
Comment on lines +33 to +38

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Non-mock path is missing the compiler-side API key required for nearest()

The guide configures only GEMINI_API_KEY (engine-side ingest client), but the mock comment itself discloses that two clients are active: OMNIGRAPH_EMBEDDINGS_MOCK=1 covers the engine-side and NANOGRAPH_EMBEDDINGS_MOCK=1 covers the compiler-side (query-time normalization via the NanoGraph/OpenAI client). When running against a real provider the corresponding real credentials are also needed on both sides. A user who copies just export GEMINI_API_KEY=... will succeed at load but then fail at runtime when vector_search or hybrid tries to embed $q. The fix is to add OPENAI_API_KEY=... (or OPENAI_BASE_URL / NANOGRAPH_EMBED_MODEL for a self-hosted OpenAI-compatible endpoint) to the real-credentials example, or at minimum call out the two-client requirement inline rather than deferring entirely to the embeddings reference link.

Fix in Claude Code


If you would rather supply vectors yourself, drop `@embed` and include the
`embedding` array in each input record instead.

## 3. Load

```bash
omnigraph load --data docs.jsonl --mode overwrite docs.omni
```

Each row's `body` is embedded into `embedding` as it loads. The BM25 (full-text)
and vector indexes are maintained by the engine — there is no separate build step.

## 4. Query — full-text, vector, then hybrid

Full-text only:

```gq
query text_search($q: String) {
match { $d: Document { } }
return { $d.title, bm25($d.body, $q) as score }
order { score desc }
limit 10
}
```

Vector only (the query text is embedded at query time; `nearest` requires a
`limit`):

```gq
query vector_search($q: String) {
match { $d: Document { } }
return { $d.title, nearest($d.embedding, $q) as score }
order { score desc }
limit 10
}
```

Hybrid — fuse both rankings with `rrf`:

```gq
query hybrid($q: String) {
match { $d: Document { } }
return {
$d.title,
rrf( nearest($d.embedding, $q), bm25($d.body, $q) ) as score
}
order { score desc }
limit 10
}
```

Run it:

```bash
omnigraph read --query queries.gq --name hybrid \
--params '{"q":"trends in AI safety"}' --format table docs.omni
Comment on lines +94 to +95

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Use the canonical omnigraph query command; omnigraph read is a deprecated alias since v0.6.0 and prints a stderr warning on every invocation.

Suggested change
omnigraph read --query queries.gq --name hybrid \
--params '{"q":"trends in AI safety"}' --format table docs.omni
omnigraph query --query queries.gq --name hybrid \
--params '{"q":"trends in AI safety"}' --format table docs.omni

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Claude Code

```

`rrf` combines the two rankings without needing their score scales to match, so
you get a single fused ordering from a lexical signal and a semantic one.
Comment on lines +98 to +99

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 omnigraph read is a deprecated alias — guides should use omnigraph query

The CLI renamed read to query in v0.6.0 (docs/releases/v0.6.0.md). The read spelling remains as a visible_alias but prints a one-line deprecation warning to stderr on every invocation, and the CLI reference explicitly says "New integrations should target the canonical names." A guide that is the first thing users touch should not immediately trigger a deprecation warning that suggests they're doing something wrong.

Same issue appears in review-workflow.md line 34.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Claude Code

14 changes: 14 additions & 0 deletions docs/user/guides/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# Guides

Task-oriented walkthroughs that compose the building blocks from the reference
docs into real workflows. Each one is a runnable sequence of commands.

- [Hybrid search end to end](hybrid-search.md) — combine full-text and vector
search in one query.
- [Run a cluster on S3](cluster-on-s3.md) — go from a config directory to a
config-free server booting from a bucket.
- [Branch-based review workflow](review-workflow.md) — stage data on a branch,
review it, and merge.

New to OmniGraph? Start with the [quickstart](../quickstart.md) and
[concepts](../concepts/index.md) first.
63 changes: 63 additions & 0 deletions docs/user/guides/review-workflow.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
# Branch-Based Review Workflow

Branches let you stage changes off `main`, inspect them in isolation, and merge
only once they look right — Git-style, atomic across the whole graph. This guide
walks a typical "review an incoming batch before it hits main" flow.

See [branches & commits](../branching/index.md) and [merging](../branching/merge.md)
for the underlying model.

## 1. Stage the batch on its own branch

Loading into a branch that does not exist is an error unless you pass `--from`,
which forks it from a base first. So one command both forks the branch and loads
into it:

```bash
omnigraph load --data batch.jsonl --mode merge \
--branch review/2026-04-25 --from main graph.omni
```

(Equivalently, create the branch first with
`omnigraph branch create review/2026-04-25 --from main graph.omni`, then `load`
without `--from`.)

`main` is untouched — the batch lives only on `review/2026-04-25`.

## 2. Inspect the branch in isolation

Run any read query against the branch with `--branch`:

```bash
omnigraph read --query checks.gq --name count_by_type \
--branch review/2026-04-25 --format table graph.omni
Comment on lines +32 to +33

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Use the canonical omnigraph query command; omnigraph read is a deprecated alias since v0.6.0 and prints a stderr warning on every invocation. The cookbooks and cli/reference.md both use omnigraph query.

Suggested change
omnigraph read --query checks.gq --name count_by_type \
--branch review/2026-04-25 --format table graph.omni
omnigraph query --query checks.gq --name count_by_type \
--branch review/2026-04-25 --format table graph.omni

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Claude Code

```

Compare it against `main` — list each branch's commits, or diff them:

```bash
omnigraph branch list graph.omni
omnigraph commit list --branch review/2026-04-25 graph.omni
```

## 3. Merge when it looks right

```bash
omnigraph branch merge review/2026-04-25 --into main graph.omni
```

The merge is three-way and atomic. If both `main` and the branch changed the same
data incompatibly, the merge fails with a structured list of conflicts and
publishes nothing — resolve them and re-merge. See
[merging](../branching/merge.md) for the conflict kinds.

## 4. Clean up

Once merged, delete the review branch:

```bash
omnigraph branch delete review/2026-04-25 graph.omni
```

Branch storage is reclaimed; if a transient error interrupts reclamation, the
[`cleanup`](../operations/maintenance.md) command sweeps the leftovers later.
11 changes: 11 additions & 0 deletions docs/user/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,17 @@ start with install, then follow the section that matches your task.
| Understand graph layout and URI support | [concepts/storage.md](concepts/storage.md) |
| Look up constants and tunables | [reference/constants.md](reference/constants.md) |

## Guides

Task-oriented walkthroughs that compose the building blocks above:

| Guide | Read |
|---|---|
| All guides | [guides/index.md](guides/index.md) |
| Hybrid search end to end | [guides/hybrid-search.md](guides/hybrid-search.md) |
| Run a cluster on S3 | [guides/cluster-on-s3.md](guides/cluster-on-s3.md) |
| Branch-based review workflow | [guides/review-workflow.md](guides/review-workflow.md) |

## Releases

Release notes live in [releases/](../releases/). Use them for user-visible
Expand Down
Loading