docs(user): add task guides — hybrid search, cluster on S3, review workflow (Phase 3b)#227

Open
aaltshuler wants to merge 1 commit into main from docs/guides
aaltshuler (Collaborator) commented Jun 14, 2026
Phase 3b of the docs restructure — task-oriented guides. Docs only.

Four new pages under docs/user/guides/, each a runnable, code-verified
command sequence:

  • hybrid-search.md — schema with an @embed vector + text body, load, then
    a query fusing bm25 and nearest with rrf. Notes that indexes are
    engine-maintained (no manual build step).
  • cluster-on-s3.md — cluster.yaml with a storage: s3:// root, the
    validate→import→plan→apply flow, loading via the graph's storage URI, and
    config-free serving with omnigraph-server --cluster s3://….
  • review-workflow.md — load onto a branch with --from, inspect with
    --branch reads / commit list, merge with --into, then delete + cleanup.
  • index.md — the section landing page.

Every command was checked against crates/omnigraph-cli/src/cli.rs — e.g. caught
that load has no --cluster/--cluster-graph (those are storage-plane
only) and used the positional storage URI instead.

Wired into docs/user/index.md (new Guides section) and AGENTS.md's topic table.

Verification

  • Zero broken .md links.
  • scripts/check-agents-md.sh green (61 links, 58 docs).

This completes the docs restructure (Phase 1 #223 · Phase 2 #225 · Phase 3a #226 ·
Phase 3b here). Remaining follow-up tracked separately: the website
import-docs.mjs subdir-aware update before the next site re-sync.

🤖 Generated with Claude Code

Greptile Summary

Phase 3b of the docs restructure adds four task-oriented guide pages under docs/user/guides/ (hybrid search, S3 cluster, review workflow, index) and wires them into docs/user/index.md and AGENTS.md. The command sequences were verified against the CLI source, catching real issues like the absence of --cluster on load.

  • hybrid-search.md uses deprecated omnigraph read instead of omnigraph query, and the non-mock credentials section omits OPENAI_API_KEY — the compiler-side client needs it for nearest() query-time text embedding, even though the mock path correctly shows both OMNIGRAPH_EMBEDDINGS_MOCK and NANOGRAPH_EMBEDDINGS_MOCK.
  • review-workflow.md also uses deprecated omnigraph read; all other branch/commit/load commands match the CLI struct.
  • cluster-on-s3.md and the index/navigation files look correct and consistent with the existing cluster operator guide.

Confidence Score: 3/5

Two of the four new guides ship runnable examples that will either emit deprecation warnings or fail outright at the query step; fix both before merging.

The hybrid-search.md guide has two issues that break the described workflow: using omnigraph read (deprecated, prints warning every run) and showing only GEMINI_API_KEY in the non-mock path while the mock comment itself reveals that a second client and therefore a second API key is required for nearest() to work. The review-workflow.md also uses omnigraph read. These are first-party guides intended as the primary on-ramp for new users, so shipping examples that warn or fail is a meaningful problem.

docs/user/guides/hybrid-search.md (deprecated command + incomplete credentials example) and docs/user/guides/review-workflow.md (deprecated command)

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/user/guides/hybrid-search.md | New guide for hybrid search; uses deprecated omnigraph read alias and omits the compiler-side OpenAI key needed for query-time nearest() in the non-mock path |
| docs/user/guides/review-workflow.md | New branch review workflow guide; uses deprecated omnigraph read alias instead of canonical omnigraph query; all other commands match CLI struct |
| docs/user/guides/cluster-on-s3.md | New S3 cluster guide; commands match CLI struct (positional load URI, --cluster/--cluster-graph for maintenance, --mode required on load); workflow mirrors clusters/index.md pattern |
| docs/user/guides/index.md | Section landing page listing the three new guides; links are valid |
| docs/user/index.md | Adds a Guides section table to the user index; links are correct and wiring is consistent with existing table style |
| AGENTS.md | Single row added to the topic table pointing to guides/index.md; correct placement and link |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph hybrid["hybrid-search.md"]
        H1["omnigraph init --schema schema.pg docs.omni"]
        H2["export GEMINI_API_KEY + OPENAI_API_KEY"]
        H3["omnigraph load --mode overwrite docs.omni"]
        H4["omnigraph query --name hybrid"]
        H1 --> H2 --> H3 --> H4
    end
    subgraph cluster["cluster-on-s3.md"]
        C1["cluster validate / import / plan / apply"]
        C2["omnigraph load s3://bucket/.../knowledge.omni"]
        C3["omnigraph-server --cluster s3://bucket/..."]
        C4["omnigraph optimize / cleanup --cluster X --cluster-graph Y"]
        C1 --> C2 --> C3
        C3 --> C4
    end
    subgraph review["review-workflow.md"]
        R1["omnigraph load --branch review/X --from main"]
        R2["omnigraph query --branch review/X"]
        R3["omnigraph branch merge review/X --into main"]
        R4["omnigraph branch delete review/X"]
        R1 --> R2 --> R3 --> R4
    end

Reviews (1): Last reviewed commit: "docs(user): add task guides — hybrid sea..." | Re-trigger Greptile

Greptile also left 5 inline comments on this PR.

…rkflow (Phase 3b)

Four new pages under docs/user/guides/, each a runnable, code-verified command
sequence that composes the reference docs into a real workflow:

- guides/hybrid-search.md — schema with a @embed vector + text body, load, then
  a query fusing bm25 and nearest with rrf. Notes that indexes are engine-
  maintained (no manual build step) and links embeddings.md for the provider env.
- guides/cluster-on-s3.md — cluster.yaml with a storage: s3:// root, the
  validate→import→plan→apply flow, loading via the graph's storage URI, and
  config-free serving with `omnigraph-server --cluster s3://…`.
- guides/review-workflow.md — load onto a branch with --from, inspect it with
  --branch reads / commit list, merge with --into, then delete + cleanup.
- guides/index.md — the section landing page.

Every command was checked against crates/omnigraph-cli/src/cli.rs (e.g. caught
that `load` has no --cluster/--cluster-graph — those are storage-plane only — and
used the positional storage URI instead).

Wired into docs/user/index.md (new Guides section) and AGENTS.md's topic table.

Verified: zero broken links; check-agents-md.sh green (61 links, 58 docs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
aaltshuler requested a review from ragnorc as a code owner on June 14, 2026 at 11:44
Comment on lines +98 to +99
`rrf` combines the two rankings without needing their score scales to match, so
you get a single fused ordering from a lexical signal and a semantic one.
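
For readers unfamiliar with reciprocal rank fusion, the idea can be sketched in a few lines of shell. This assumes the common formulation, in which each document scores the sum of 1/(k + rank) over the rankings it appears in, with k = 60. The helper name `rrf_fuse`, the one-id-per-line input files, and the constant are illustrative assumptions; none of this is taken from omnigraph's own `rrf` implementation.

```shell
# rrf_fuse FILE_A FILE_B [K]
# Hypothetical helper: each file lists document ids, best first, one per line.
# Prints "score id" lines, highest fused score first.
rrf_fuse() {
  LC_ALL=C awk -v k="${3:-60}" '
    FNR == 1 { rank = 0 }                        # restart ranks at each new file
    NF       { rank++; score[$1] += 1 / (k + rank) }
    END      { for (id in score) printf "%.6f %s\n", score[id], id }
  ' "$1" "$2" | LC_ALL=C sort -k1,1nr -k2,2      # ties broken by id
}
```

Only ranks enter the score, which is why the lexical and semantic score scales never need to agree. With one list ordered a, b, c and the other b, c, a, the fused order is b, a, c.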

P1 omnigraph read is a deprecated alias — guides should use omnigraph query

The CLI renamed read to query in v0.6.0 (docs/releases/v0.6.0.md). The read spelling remains as a visible_alias but prints a one-line deprecation warning to stderr on every invocation, and the CLI reference explicitly says "New integrations should target the canonical names." A guide that is the first thing users touch should not immediately trigger a deprecation warning that suggests they're doing something wrong.

Same issue appears in review-workflow.md line 34.
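
One way to catch any remaining uses of the deprecated spelling before merging is a plain text scan of the guides directory. The sketch below is a hypothetical helper, not part of the repository's scripts; it assumes a grep that supports `-r` and `--include` (GNU and BSD grep both do). It also matches prose mentions of the alias, so treat the hits as candidates to review.

```shell
# find_deprecated_read DIR
# Hypothetical helper: print file:line:text for every Markdown line under DIR
# that contains the deprecated "omnigraph read" spelling.
# Exits non-zero when nothing matches (standard grep behaviour).
find_deprecated_read() {
  grep -rnE --include='*.md' 'omnigraph[[:space:]]+read([[:space:]]|$)' "$1"
}
```

For this PR the call would be `find_deprecated_read docs/user/guides`.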

Comment on lines +33 to +38

```bash
export GEMINI_API_KEY=... # ingest-time document embeddings
# For local experimentation without a provider, deterministic mock vectors:
# export OMNIGRAPH_EMBEDDINGS_MOCK=1 NANOGRAPH_EMBEDDINGS_MOCK=1
```

P1 Non-mock path is missing the compiler-side API key required for nearest()

The guide configures only GEMINI_API_KEY (engine-side ingest client), but the mock comment itself discloses that two clients are active: OMNIGRAPH_EMBEDDINGS_MOCK=1 covers the engine-side and NANOGRAPH_EMBEDDINGS_MOCK=1 covers the compiler-side (query-time normalization via the NanoGraph/OpenAI client). When running against a real provider the corresponding real credentials are also needed on both sides. A user who copies just export GEMINI_API_KEY=... will succeed at load but then fail at runtime when vector_search or hybrid tries to embed $q. The fix is to add OPENAI_API_KEY=... (or OPENAI_BASE_URL / NANOGRAPH_EMBED_MODEL for a self-hosted OpenAI-compatible endpoint) to the real-credentials example, or at minimum call out the two-client requirement inline rather than deferring entirely to the embeddings reference link.
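
A guide could make the two-client requirement visible with a small preflight check run before `load`. The sketch below is hypothetical: the function name and messages are invented for illustration, and it uses only the variable names quoted in this comment. It assumes a real-provider setup needs both `GEMINI_API_KEY` and `OPENAI_API_KEY`; it does not cover the self-hosted `OPENAI_BASE_URL` case, whose key requirements this thread does not state.

```shell
# check_embedding_env
# Hypothetical preflight: prints "mock" or "real" when both embedding clients
# appear configured; otherwise names the missing variables on stderr and fails.
check_embedding_env() {
  # Mock mode: both flags must be set, one per client.
  if [ "${OMNIGRAPH_EMBEDDINGS_MOCK:-}" = "1" ] &&
     [ "${NANOGRAPH_EMBEDDINGS_MOCK:-}" = "1" ]; then
    echo "mock"
    return 0
  fi
  missing=""
  [ -n "${GEMINI_API_KEY:-}" ] || missing="$missing GEMINI_API_KEY"   # ingest-time @embed
  [ -n "${OPENAI_API_KEY:-}" ] || missing="$missing OPENAI_API_KEY"   # query-time nearest()
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
  echo "real"
}
```

Running `check_embedding_env || exit 1` ahead of the load step would surface the missing query-side key up front rather than at the first `nearest()` call.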

Comment on lines +94 to +95
omnigraph read --query queries.gq --name hybrid \
--params '{"q":"trends in AI safety"}' --format table docs.omni

P1 Use the canonical omnigraph query command; omnigraph read is a deprecated alias since v0.6.0 and prints a stderr warning on every invocation.

Suggested change
Before:
omnigraph read --query queries.gq --name hybrid \
--params '{"q":"trends in AI safety"}' --format table docs.omni
After:
omnigraph query --query queries.gq --name hybrid \
--params '{"q":"trends in AI safety"}' --format table docs.omni

Comment on lines +32 to +33
omnigraph read --query checks.gq --name count_by_type \
--branch review/2026-04-25 --format table graph.omni

P1 Use the canonical omnigraph query command; omnigraph read is a deprecated alias since v0.6.0 and prints a stderr warning on every invocation. The cookbooks and cli/reference.md both use omnigraph query.

Suggested change
Before:
omnigraph read --query checks.gq --name count_by_type \
--branch review/2026-04-25 --format table graph.omni
After:
omnigraph query --query checks.gq --name count_by_type \
--branch review/2026-04-25 --format table graph.omni

Comment on lines +35 to +37
export GEMINI_API_KEY=... # ingest-time document embeddings
# For local experimentation without a provider, deterministic mock vectors:
# export OMNIGRAPH_EMBEDDINGS_MOCK=1 NANOGRAPH_EMBEDDINGS_MOCK=1

P1 The real-credentials path should surface both API keys. The mock comment already reveals that two clients are active — OMNIGRAPH_EMBEDDINGS_MOCK for the engine (ingest-time @embed) and NANOGRAPH_EMBEDDINGS_MOCK for the compiler (query-time text-to-vector for nearest()). Without an OpenAI-compatible key, any query that passes a string to nearest() will fail at runtime even if load succeeded.

Suggested change
Before:
export GEMINI_API_KEY=... # ingest-time document embeddings
# For local experimentation without a provider, deterministic mock vectors:
# export OMNIGRAPH_EMBEDDINGS_MOCK=1 NANOGRAPH_EMBEDDINGS_MOCK=1
After:
export GEMINI_API_KEY=... # engine-side: ingest-time @embed (RETRIEVAL_DOCUMENT)
export OPENAI_API_KEY=... # compiler-side: query-time nearest() normalization (RETRIEVAL_QUERY)
# For a self-hosted OpenAI-compatible endpoint: OPENAI_BASE_URL=... NANOGRAPH_EMBED_MODEL=...
# For local experimentation without a provider, deterministic mock vectors:
# export OMNIGRAPH_EMBEDDINGS_MOCK=1 NANOGRAPH_EMBEDDINGS_MOCK=1
