
Commit 35fce33

docs: update CLAUDE.md and CHANGELOG for v1.0.1
CLAUDE.md: updated description, serve.rs intelligence wiring, replaced stale ModelBackend pattern with orchestrator/reranker/llama.cpp/prompt formatting docs, updated common tasks. CHANGELOG: clarified v1.0.0 note about candle→llama.cpp transition.
1 parent a721fbb commit 35fce33

2 files changed (+21, -11 lines)

CHANGELOG.md

Lines changed: 4 additions & 3 deletions
@@ -26,10 +26,12 @@
 
 ## [1.0.0] - 2026-03-25
 
+Intelligence release. Replaced ONNX with GGUF model inference, added LLM-powered search intelligence. Immediately followed by v1.0.1 which switched the inference backend from candle to llama.cpp for Metal GPU support.
+
 ### Added
-- **Candle runtime** — replaced ONNX (`ort`) with candle (pure Rust ML framework). Loads GGUF quantized models. Metal acceleration on macOS.
+- **GGUF model inference** — replaced ONNX (`ort`) with GGUF quantized models for all ML inference
 - **Research orchestrator** — LLM-based query classification (exact/conceptual/relationship/exploratory) with adaptive lane weights. Single LLM call returns intent + 2-4 query expansions.
-- **Cross-encoder reranker** — 4th RRF lane using qwen3-reranker for relevance scoring. Two-pass fusion: 3-lane retrieval → reranker scores top 30 → 4-lane RRF.
+- **Cross-encoder reranker** — 4th RRF lane using Qwen3-Reranker for relevance scoring. Two-pass fusion: 3-lane retrieval → reranker scores top 30 → 4-lane RRF.
 - **Query expansion** — each search runs multiple expanded queries through all retrieval lanes, merged via deduplication.
 - **Heuristic orchestrator** — fast-path intent classification via pattern matching (docids, ticket IDs, "who" queries) when intelligence is disabled. Zero latency.
 - **Intelligence onboarding** — opt-in prompt during `engraph init` and first `engraph index`. Downloads ~1.3GB of optional models.
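The two-pass fusion described in the reranker entry (3-lane retrieval → fuse → cross-encoder rescoring joins as a 4th lane → fuse again) can be sketched with plain reciprocal rank fusion. This is an illustrative sketch, not engraph's actual code: the function name, the toy lane rankings, and the k = 60 constant are all assumptions.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion: score(doc) = sum over lanes of weight / (k + rank).
/// Each lane is a weight plus a ranked list of doc ids, best first.
fn rrf(lanes: &[(f32, Vec<&str>)], k: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<String, f32> = HashMap::new();
    for (weight, ranking) in lanes {
        for (rank, doc) in ranking.iter().enumerate() {
            // rank is 0-based here; RRF conventionally uses 1-based ranks.
            *scores.entry(doc.to_string()).or_insert(0.0) += weight / (k + rank as f32 + 1.0);
        }
    }
    let mut fused: Vec<(String, f32)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    // Pass 1: fuse the three retrieval lanes.
    let pass1 = rrf(
        &[
            (1.0, vec!["a", "b", "c"]), // semantic lane
            (1.0, vec!["b", "a", "d"]), // FTS lane
            (0.5, vec!["c", "d"]),      // graph lane
        ],
        60.0,
    );
    let pass1_order: Vec<&str> = pass1.iter().map(|(d, _)| d.as_str()).collect();
    // Pass 2: a (pretend) cross-encoder ordering joins as the 4th lane.
    let pass2 = rrf(&[(1.0, pass1_order), (1.0, vec!["b", "c", "a"])], 60.0);
    println!("{:?}", pass2);
}
```

The k constant damps the influence of top ranks so a single lane cannot dominate; 60 is the value commonly seen in RRF implementations.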
@@ -43,7 +45,6 @@
 - Search pipeline: hardcoded 3-lane weights → adaptive per-query-intent weights
 - `--explain` output now shows query intent and 4-lane breakdown (semantic, FTS, graph, rerank)
 - `status` command shows intelligence enabled/disabled state
-- `run_search` accepts `Config` parameter (no redundant config load)
 
 ### Removed
 - `ort` (ONNX Runtime) dependency

CLAUDE.md

Lines changed: 17 additions & 8 deletions
@@ -1,6 +1,6 @@
 # engraph
 
-Local hybrid search CLI for Obsidian vaults. Rust, MIT licensed.
+Local knowledge graph + intelligence layer for Obsidian vaults. Rust CLI + MCP server. llama.cpp inference with Metal GPU. MIT licensed.
 
 ## Architecture
 
@@ -19,14 +19,14 @@ Single binary with 19 modules behind a lib crate:
 - `placement.rs` — folder placement engine. Uses folder centroids (online mean of embeddings per folder) to suggest the best folder for new notes. Falls back to inbox when confidence is low. Includes placement correction detection (`detect_correction_from_frontmatter`) and frontmatter stripping for moved files
 - `writer.rs` — write pipeline orchestrator. 5-step pipeline: resolve tags (fuzzy match + register new), discover links (exact + fuzzy), place in folder, atomic file write (temp + rename), and index update. Supports create, append, update_metadata, move_note, archive, and unarchive operations with mtime-based conflict detection and crash recovery via temp file cleanup
 - `watcher.rs` — file watcher for `engraph serve`. OS thread producer (notify-debouncer-full, 2s debounce) sends `Vec<WatchEvent>` over tokio::mpsc to async consumer task. Two-pass batch processing: mutations (index_file/remove_file/rename_file) then edge rebuild. Move detection via content hash matching. Placement correction on file moves. Centroid adjustment on file add/remove. Startup reconciliation via `run_index_shared`
-- `serve.rs` — MCP stdio server via rmcp SDK. Exposes 13 tools: 7 read (search, read, list, vault_map, who, project, context) + 6 write (create, append, update_metadata, move_note, archive, unarchive). EngraphServer struct with Arc+Mutex wrapping for async handlers. Spawns file watcher on startup
+- `serve.rs` — MCP stdio server via rmcp SDK. Exposes 13 tools: 7 read (search, read, list, vault_map, who, project, context) + 6 write (create, append, update_metadata, move_note, archive, unarchive). EngraphServer struct with Arc+Mutex wrapping for async handlers. Loads intelligence models (orchestrator + reranker) when enabled, wires into `search_with_intelligence`. Spawns file watcher on startup
 - `graph.rs` — vault graph agent. Extracts wikilink targets, expands search results by following graph connections 1-2 hops. Relevance filtering via FTS5 term check and shared tags
 - `profile.rs` — vault profile detection. Auto-detects PARA/Folders/Flat structure, vault type (Obsidian/Logseq/Plain), wikilinks, frontmatter, tags. Writes/loads `vault.toml`
 - `store.rs` — SQLite persistence. Tables: `meta`, `files` (with docid, created_by), `chunks` (with vector BLOBs), `chunks_fts` (FTS5), `edges` (vault graph), `tombstones`, `tag_registry`, `folder_centroids`, `placement_corrections`, `link_skiplist` (reserved), `llm_cache` (orchestrator result cache). `vec_chunks` virtual table (sqlite-vec) for KNN search. Dynamic embedding dimension stored in meta. `has_dimension_mismatch()` and `reset_for_reindex()` for migration
 - `indexer.rs` — orchestrates vault walking (via `ignore` crate for `.gitignore` support), diffing, chunking, embedding, writes to store + sqlite-vec + FTS5, vault graph edge building (wikilinks + people detection), and folder centroid computation. Exposes `index_file`, `remove_file`, `rename_file` as public per-file functions. `run_index_shared` accepts external store/embedder for watcher FullRescan. Dimension migration on model change.
 - `search.rs` — hybrid search orchestrator. `search_with_intelligence()` runs the full pipeline: orchestrate (intent + expansions) → 3-lane retrieval per expansion → RRF pass 1 → reranker 4th lane → RRF pass 2. `search_internal()` is a thin wrapper without intelligence models. Adaptive lane weights per query intent.
 
-`main.rs` is a thin clap CLI (async via `#[tokio::main]`). Subcommands: `index`, `search` (with `--explain`), `status`, `clear`, `init`, `configure`, `models`, `graph` (show/stats), `context` (read/list/vault-map/who/project/topic), `write` (create/append/update-metadata/move), `serve` (MCP stdio server with file watcher).
+`main.rs` is a thin clap CLI (async via `#[tokio::main]`). Subcommands: `index` (with progress bar), `search` (with `--explain`, loads intelligence models when enabled), `status` (shows intelligence state), `clear`, `init` (intelligence onboarding prompt), `configure` (`--enable-intelligence`, `--disable-intelligence`, `--model`), `models`, `graph` (show/stats), `context` (read/list/vault-map/who/project/topic), `write` (create/append/update-metadata/move), `serve` (MCP stdio server with file watcher + intelligence).
 
 ## Key patterns

@@ -42,7 +42,10 @@ Single binary with 19 modules behind a lib crate:
 - **Centroid updates:** Online mean math (`adjust_folder_centroid`). Incremented on file add, decremented on file remove. Full recompute during bulk indexing
 - **Docids:** Each file gets a deterministic 6-char hex ID. Displayed in search results
 - **Vault profiles:** `engraph init` auto-detects vault structure and writes `vault.toml`
-- **Pluggable models:** `ModelBackend` trait enables future model swapping
+- **LLM orchestrator:** Single Qwen3-0.6B call classifies query intent (exact/conceptual/relationship/exploratory), generates 2-4 query expansions, and sets adaptive lane weights. Results cached in `llm_cache` SQLite table (keyed by query SHA256). Falls back to heuristic pattern matching when intelligence is off
+- **Cross-encoder reranker:** Qwen3-Reranker-0.6B scores query-document relevance via Yes/No logit softmax. Runs as 4th RRF lane on top-30 candidates from pass 1
+- **llama.cpp backend:** Global `LlamaBackend` singleton via `OnceLock`. `LlamaModel` is `Send+Sync`, `LlamaContext` is `!Send` (created per-call). Metal GPU auto-detected on macOS. Models download from HuggingFace on first use with progress bar
+- **Prompt formatting:** `PromptFormat` auto-detects model family from GGUF filename. embeddinggemma uses asymmetric prefixes (`search_query:` / `search_document:`). Applied in `embed_one` (queries) and `embed_batch` (documents)
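The adaptive-lane-weights idea from the LLM orchestrator bullet can be sketched as an intent enum mapped to per-lane multipliers. The four variants mirror the intents named in the diff; the concrete weight values are invented for illustration and are not engraph's real tuning.

```rust
/// The four query intents the orchestrator can return.
#[derive(Debug, Clone, Copy)]
enum Intent {
    Exact,
    Conceptual,
    Relationship,
    Exploratory,
}

/// Per-lane weights: (semantic, fts, graph, rerank).
/// Values are illustrative placeholders, not the project's actual tuning.
fn lane_weights(intent: Intent) -> (f32, f32, f32, f32) {
    match intent {
        Intent::Exact => (0.5, 1.5, 0.5, 1.0),        // favor keyword/FTS hits
        Intent::Conceptual => (1.5, 0.5, 0.5, 1.0),   // favor embedding similarity
        Intent::Relationship => (1.0, 0.5, 1.5, 1.0), // favor graph expansion
        Intent::Exploratory => (1.0, 1.0, 1.0, 1.0),  // no strong prior
    }
}

fn main() {
    let (sem, fts, graph, rerank) = lane_weights(Intent::Relationship);
    println!("semantic={sem} fts={fts} graph={graph} rerank={rerank}");
}
```

A single classification call picking one of these variants is what lets the same RRF machinery behave differently for lookup-style versus open-ended queries.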

 ## Data directory

@@ -75,20 +78,26 @@ Single vault only. Re-indexing a different vault path triggers a confirmation pr
 
 - CI: `cargo fmt --check` + `cargo clippy -- -D warnings` + `cargo test --lib` on macOS + Ubuntu. Ubuntu step installs CMake.
 - Release: native builds on macOS arm64 (macos-14) + Linux x86_64 (ubuntu-latest). Triggered by `v*` tags
-- Homebrew: `devwhodevs/homebrew-tap` — formula builds from source tarball
+- Homebrew: `devwhodevs/homebrew-tap` — formula builds from source tarball. Depends on `cmake` + `rust`.
 
 ## Common tasks
 
 ```bash
-# Run tests
+# Run tests (requires CMake)
 cargo test --lib
 
-# Run integration tests (downloads model)
+# Run integration tests (downloads GGUF model)
 cargo test --test integration -- --ignored
 
 # Build release
 cargo build --release
 
 # Release: tag and push
-git tag v0.x.y && git push origin v0.x.y
+git tag v1.x.y && git push origin v1.x.y
+
+# Enable intelligence (downloads ~1.3GB)
+engraph configure --enable-intelligence
+
+# Re-record demo GIF
+vhs assets/demo.tape
 ```
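The `OnceLock` singleton pattern named in the llama.cpp backend bullet looks roughly like this. `Backend` is a stand-in struct, not the real `LlamaBackend`, and the Metal "detection" here is a pretend placeholder:

```rust
use std::sync::OnceLock;

/// Stand-in for a process-wide llama.cpp backend handle.
struct Backend {
    metal: bool,
}

// Global slot: initialized at most once, then shared by every caller.
static BACKEND: OnceLock<Backend> = OnceLock::new();

/// First caller pays the init cost; everyone else gets the same &'static handle.
fn backend() -> &'static Backend {
    BACKEND.get_or_init(|| Backend {
        // Pretend GPU auto-detection: macOS implies Metal in this sketch.
        metal: cfg!(target_os = "macos"),
    })
}

fn main() {
    let a = backend();
    let b = backend();
    // Both calls return a reference to the same instance.
    assert!(std::ptr::eq(a, b));
    println!("metal = {}", a.metal);
}
```

`get_or_init` runs its closure at most once even under concurrent first calls, which is why the pattern suits a backend whose underlying C library must only be initialized once per process.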
