diff --git a/CHANGELOG.md b/CHANGELOG.md index 2b4110a..c55e604 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,29 @@ # Changelog +## [1.1.0] - 2026-03-26 — Complete Vault Gateway + +### Added +- **Section parser** (`markdown.rs`) — heading detection, section extraction, frontmatter splitting +- **Obsidian CLI wrapper** (`obsidian.rs`) — process detection, circuit breaker (Closed/Degraded/Open), async CLI delegation +- **Vault health** (`health.rs`) — orphan detection, broken link detection, stale notes, tag hygiene +- **Section-level editing** — `edit_note()` with replace/prepend/append modes targeting specific headings +- **Note rewriting** — `rewrite_note()` with frontmatter preservation +- **Frontmatter mutations** — `edit_frontmatter()` with granular set/remove/add_tag/remove_tag/add_alias/remove_alias ops +- **Hard delete** — `delete_note()` with soft (archive) and hard (permanent) modes +- **Section reading** — `read_section()` in context engine for targeted note section access +- **Enhanced file resolution** — fuzzy Levenshtein matching as final fallback in `resolve_file()` +- **6 new MCP tools** — `read_section`, `health`, `edit`, `rewrite`, `edit_frontmatter`, `delete` +- **CLI events table** — audit log for CLI operations +- **Watcher coordination** — `recent_writes` map prevents double re-indexing of MCP-written files +- **Content-based role detection** — detect people/daily/archive folders by content patterns, not just names +- **Enhanced onboarding** — `engraph init` detects Obsidian CLI + AI agents, `engraph configure` has new flags +- **Config sections** — `[obsidian]` and `[agents]` in config.toml + +### Changed +- Module count: 19 → 22 +- MCP tools: 13 → 19 +- Test count: 270 → 318 + ## [1.0.2] - 2026-03-26 ### Fixed diff --git a/CLAUDE.md b/CLAUDE.md index b410827..19b4840 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -4,29 +4,32 @@ Local knowledge graph + intelligence layer for Obsidian vaults. 
Rust CLI + MCP server. ## Architecture -Single binary with 19 modules behind a lib crate: +Single binary with 22 modules behind a lib crate: -- `config.rs` — loads `~/.engraph/config.toml` and `vault.toml`, merges CLI args, provides `data_dir()`. Includes `intelligence: Option<bool>` and `[models]` section for model overrides. `Config::save()` writes back to disk. +- `config.rs` — loads `~/.engraph/config.toml` and `vault.toml`, merges CLI args, provides `data_dir()`. Includes `intelligence: Option<bool>`, `[models]` section for model overrides, `[obsidian]` section (CLI path, enabled flag), and `[agents]` section (registered AI agent names). `Config::save()` writes back to disk. - `chunker.rs` — smart chunking with break-point scoring algorithm. Finds optimal split points considering headings, code fences, blank lines, and thematic breaks. `split_oversized_chunks()` handles token-aware secondary splitting with overlap - `docid.rs` — deterministic 6-char hex IDs for files (SHA-256 of path, truncated). Shown in search results for quick reference - `llm.rs` — ML inference via llama.cpp (Rust bindings: `llama-cpp-2`). Three traits: `EmbedModel` (embeddings), `RerankModel` (cross-encoder scoring), `OrchestratorModel` (query intent + expansion). Three llama.cpp implementations: `LlamaEmbed` (embeddinggemma-300M GGUF on Metal GPU), `LlamaOrchestrator` (Qwen3-0.6B for query analysis + expansion), `LlamaRerank` (Qwen3-Reranker-0.6B for relevance scoring). Global `LlamaBackend` via `OnceLock`. Also: `MockLlm` for testing, `HfModelUri` for model download, `FlexTokenizer` (HuggingFace tokenizers + shimmytok GGUF fallback), `PromptFormat` for model-family prompt templates, `heuristic_orchestrate()` fast path, `LaneWeights` per query intent - `fts.rs` — FTS5 full-text search support. Re-exports `FtsResult` from store. BM25-ranked keyword search - `fusion.rs` — Reciprocal Rank Fusion (RRF) engine. Merges semantic + FTS5 + graph + reranker results.
Supports per-lane weighting, `--explain` output with intent + per-lane detail -- `context.rs` — context engine. Six functions: `read` (full note content + metadata), `list` (filtered note listing with `created_by` filter), `vault_map` (structure overview), `who` (person context bundle), `project` (project context bundle), `context_topic` (rich topic context with budget trimming). Pure functions taking `ContextParams` — no model loading except `context_topic` which reuses `search_internal` +- `markdown.rs` — section parser. Heading detection (ATX `#` headings with level tracking), section extraction by heading text, frontmatter splitting (YAML block between `---` fences). Powers section-level reading and editing +- `obsidian.rs` — Obsidian CLI wrapper. Process detection (checks if Obsidian is running), circuit breaker state machine (Closed/Degraded/Open) for resilient CLI delegation, async subprocess execution with timeout. Falls back gracefully when Obsidian is unavailable +- `health.rs` — vault health diagnostics. Orphan detection (notes with no incoming or outgoing wikilinks), broken link detection (wikilinks pointing to nonexistent notes), stale note detection (notes not modified within configurable threshold), tag hygiene (unused/rare tags). Returns structured health report +- `context.rs` — context engine. Seven functions: `read` (full note content + metadata), `read_section` (targeted section extraction by heading), `list` (filtered note listing with `created_by` filter), `vault_map` (structure overview), `who` (person context bundle), `project` (project context bundle), `context_topic` (rich topic context with budget trimming). Pure functions taking `ContextParams` — no model loading except `context_topic` which reuses `search_internal` - `vecstore.rs` — sqlite-vec virtual table integration. Manages the `vec_chunks` vec0 table for vector storage and KNN search. 
Handles insert, delete, and search operations against the virtual table - `tags.rs` — tag registry module. Maintains a `tag_registry` table tracking known tags with source attribution. Supports fuzzy matching for tag suggestions during note creation - `links.rs` — link discovery module. Three match types: exact basename, fuzzy (sliding window Levenshtein, 0.92 threshold), and first-name (People folder, suggestion-only at 650bp). Overlap resolution via type priority (exact > alias > fuzzy > first-name) - `placement.rs` — folder placement engine. Uses folder centroids (online mean of embeddings per folder) to suggest the best folder for new notes. Falls back to inbox when confidence is low. Includes placement correction detection (`detect_correction_from_frontmatter`) and frontmatter stripping for moved files -- `writer.rs` — write pipeline orchestrator. 5-step pipeline: resolve tags (fuzzy match + register new), discover links (exact + fuzzy), place in folder, atomic file write (temp + rename), and index update. Supports create, append, update_metadata, move_note, archive, and unarchive operations with mtime-based conflict detection and crash recovery via temp file cleanup -- `watcher.rs` — file watcher for `engraph serve`. OS thread producer (notify-debouncer-full, 2s debounce) sends `Vec<DebouncedEvent>` over tokio::mpsc to async consumer task. Two-pass batch processing: mutations (index_file/remove_file/rename_file) then edge rebuild. Move detection via content hash matching. Placement correction on file moves. Centroid adjustment on file add/remove. Startup reconciliation via `run_index_shared` -- `serve.rs` — MCP stdio server via rmcp SDK. Exposes 13 tools: 7 read (search, read, list, vault_map, who, project, context) + 6 write (create, append, update_metadata, move_note, archive, unarchive). EngraphServer struct with Arc+Mutex wrapping for async handlers. Loads intelligence models (orchestrator + reranker) when enabled, wires into `search_with_intelligence`. Spawns file watcher on startup +- `writer.rs` — write pipeline orchestrator. 5-step pipeline: resolve tags (fuzzy match + register new), discover links (exact + fuzzy), place in folder, atomic file write (temp + rename), and index update. Supports create, append, update_metadata, move_note, archive, unarchive, edit (section-level replace/prepend/append), rewrite (full content with frontmatter preservation), edit_frontmatter (granular set/remove/add_tag/remove_tag/add_alias/remove_alias ops), and delete (soft archive or hard permanent) operations with mtime-based conflict detection and crash recovery via temp file cleanup +- `watcher.rs` — file watcher for `engraph serve`. OS thread producer (notify-debouncer-full, 2s debounce) sends `Vec<DebouncedEvent>` over tokio::mpsc to async consumer task. Two-pass batch processing: mutations (index_file/remove_file/rename_file) then edge rebuild. Move detection via content hash matching. Placement correction on file moves. Centroid adjustment on file add/remove. Startup reconciliation via `run_index_shared`. `recent_writes` map coordination with MCP server to prevent double re-indexing of files written through the write pipeline +- `serve.rs` — MCP stdio server via rmcp SDK. Exposes 19 tools: 8 read (search, read, read_section, list, vault_map, who, project, context) + 10 write (create, append, update_metadata, move_note, archive, unarchive, edit, rewrite, edit_frontmatter, delete) + 1 diagnostic (health). `edit_frontmatter` replaces `update_metadata` for granular frontmatter mutations. EngraphServer struct with Arc+Mutex wrapping for async handlers. Loads intelligence models (orchestrator + reranker) when enabled, wires into `search_with_intelligence`. Spawns file watcher on startup. CLI events table provides audit log for write operations. `recent_writes` map prevents double re-indexing of MCP-written files - `graph.rs` — vault graph agent. Extracts wikilink targets, expands search results by following graph connections 1-2 hops.
Relevance filtering via FTS5 term check and shared tags -- `profile.rs` — vault profile detection. Auto-detects PARA/Folders/Flat structure, vault type (Obsidian/Logseq/Plain), wikilinks, frontmatter, tags. Writes/loads `vault.toml` -- `store.rs` — SQLite persistence. Tables: `meta`, `files` (with docid, created_by), `chunks` (with vector BLOBs), `chunks_fts` (FTS5), `edges` (vault graph), `tombstones`, `tag_registry`, `folder_centroids`, `placement_corrections`, `link_skiplist` (reserved), `llm_cache` (orchestrator result cache). `vec_chunks` virtual table (sqlite-vec) for KNN search. Dynamic embedding dimension stored in meta. `has_dimension_mismatch()` and `reset_for_reindex()` for migration +- `profile.rs` — vault profile detection. Auto-detects PARA/Folders/Flat structure, vault type (Obsidian/Logseq/Plain), wikilinks, frontmatter, tags. Content-based role detection for people/daily/archive folders by content patterns (not just names). Writes/loads `vault.toml` +- `store.rs` — SQLite persistence. Tables: `meta`, `files` (with docid, created_by), `chunks` (with vector BLOBs), `chunks_fts` (FTS5), `edges` (vault graph), `tombstones`, `tag_registry`, `folder_centroids`, `placement_corrections`, `link_skiplist` (reserved), `llm_cache` (orchestrator result cache), `cli_events` (audit log for CLI operations). `vec_chunks` virtual table (sqlite-vec) for KNN search. Dynamic embedding dimension stored in meta. `has_dimension_mismatch()` and `reset_for_reindex()` for migration. Enhanced `resolve_file()` with fuzzy Levenshtein matching as final fallback - `indexer.rs` — orchestrates vault walking (via `ignore` crate for `.gitignore` support), diffing, chunking, embedding, writes to store + sqlite-vec + FTS5, vault graph edge building (wikilinks + people detection), and folder centroid computation. Exposes `index_file`, `remove_file`, `rename_file` as public per-file functions. `run_index_shared` accepts external store/embedder for watcher FullRescan. 
Dimension migration on model change. - `search.rs` — hybrid search orchestrator. `search_with_intelligence()` runs the full pipeline: orchestrate (intent + expansions) → 3-lane retrieval per expansion → RRF pass 1 → reranker 4th lane → RRF pass 2. `search_internal()` is a thin wrapper without intelligence models. Adaptive lane weights per query intent. -`main.rs` is a thin clap CLI (async via `#[tokio::main]`). Subcommands: `index` (with progress bar), `search` (with `--explain`, loads intelligence models when enabled), `status` (shows intelligence state), `clear`, `init` (intelligence onboarding prompt), `configure` (`--enable-intelligence`, `--disable-intelligence`, `--model`), `models`, `graph` (show/stats), `context` (read/list/vault-map/who/project/topic), `write` (create/append/update-metadata/move), `serve` (MCP stdio server with file watcher + intelligence). +`main.rs` is a thin clap CLI (async via `#[tokio::main]`). Subcommands: `index` (with progress bar), `search` (with `--explain`, loads intelligence models when enabled), `status` (shows intelligence state), `clear`, `init` (intelligence onboarding prompt, detects Obsidian CLI + AI agents), `configure` (`--enable-intelligence`, `--disable-intelligence`, `--model`, `--obsidian-cli`, `--no-obsidian-cli`, `--agent`), `models`, `graph` (show/stats), `context` (read/list/vault-map/who/project/topic), `write` (create/append/update-metadata/move/edit/rewrite/edit-frontmatter/delete), `serve` (MCP stdio server with file watcher + intelligence). ## Key patterns @@ -70,7 +73,7 @@ Single vault only. 
Re-indexing a different vault path triggers a confirmation pr ## Testing -- Unit tests in each module (`cargo test --lib`) — 270 tests, no network required +- Unit tests in each module (`cargo test --lib`) — 318 tests, no network required - Integration tests (`cargo test --test integration -- --ignored`) — require GGUF model download - Build requires CMake (for llama.cpp C++ compilation) diff --git a/Cargo.lock b/Cargo.lock index 2cc55a9..9be9607 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -632,6 +632,7 @@ dependencies = [ "rusqlite", "serde", "serde_json", + "serde_yaml", "sha2", "shimmytok", "sqlite-vec", @@ -2019,6 +2020,19 @@ dependencies = [ "serde", ] +[[package]] +name = "serde_yaml" +version = "0.9.34+deprecated" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6a8b1a1a2ebf674015cc02edccce75287f1a0130d394307b36743c2f5d504b47" +dependencies = [ + "indexmap", + "itoa", + "ryu", + "serde", + "unsafe-libyaml", +] + [[package]] name = "sha2" version = "0.10.9" @@ -2057,6 +2071,16 @@ version = "1.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0fda2ff0d084019ba4d7c6f371c95d8fd75ce3524c3cb8fb653a3023f6323e64" +[[package]] +name = "signal-hook-registry" +version = "1.4.8" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c4db69cba1110affc0e9f7bcd48bbf87b3f4fc7c61fc9155afd4c469eb3d6c1b" +dependencies = [ + "errno", + "libc", +] + [[package]] name = "simd-adler32" version = "0.3.8" @@ -2273,8 +2297,12 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "27ad5e34374e03cfffefc301becb44e9dc3c17584f414349ebe29ed26661822d" dependencies = [ "bytes", + "libc", + "mio", "pin-project-lite", + "signal-hook-registry", "tokio-macros", + "windows-sys 0.61.2", ] [[package]] @@ -2448,6 +2476,12 @@ version = "0.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "39ec24b3121d976906ece63c9daad25b85969647682eee313cb5779fdd69e14e" 
+[[package]] +name = "unsafe-libyaml" +version = "0.2.11" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "673aac59facbab8a9007c7f6108d11f63b603f7cabff99fabf650fea5c32b861" + [[package]] name = "untrusted" version = "0.9.0" diff --git a/Cargo.toml b/Cargo.toml index ebf5433..2cfc549 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -14,6 +14,7 @@ categories = ["command-line-utilities", "database", "text-processing"] clap = { version = "4", features = ["derive"] } serde = { version = "1", features = ["derive"] } serde_json = "1" +serde_yaml = "0.9" toml = "0.8" dirs = "5" anyhow = "1" @@ -31,7 +32,7 @@ time = "0.3" strsim = "0.11" ignore = "0.4" rmcp = { version = "1.2", features = ["transport-io"] } -tokio = { version = "1", features = ["macros", "rt-multi-thread"] } +tokio = { version = "1", features = ["macros", "rt-multi-thread", "process", "time"] } notify = "7.0" notify-debouncer-full = "0.4" llama-cpp-2 = "0.1" diff --git a/README.md b/README.md index 83e5a17..646132e 100644 --- a/README.md +++ b/README.md @@ -17,9 +17,12 @@ engraph turns your markdown vault into a searchable knowledge graph that AI agen Plain vector search treats your notes as isolated documents. But knowledge isn't flat — your notes link to each other, share tags, reference the same people and projects. engraph understands these connections. - **4-lane hybrid search** — semantic embeddings + BM25 full-text + graph expansion + cross-encoder reranking, fused via [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf). An LLM orchestrator classifies queries and adapts lane weights per intent. -- **MCP server for AI agents** — `engraph serve` exposes 13 tools (search, read, context bundles, note creation) that Claude, Cursor, or any MCP client can call directly. 
+- **MCP server for AI agents** — `engraph serve` exposes 19 tools (search, read, section-level editing, frontmatter mutations, vault health, context bundles, note creation) that Claude, Cursor, or any MCP client can call directly. +- **Section-level editing** — AI agents can read, replace, prepend, or append to specific sections by heading. Full note rewriting with frontmatter preservation. Granular frontmatter mutations (set/remove fields, add/remove tags and aliases). +- **Vault health diagnostics** — detect orphan notes, broken wikilinks, stale content, and tag hygiene issues. Available as MCP tool and CLI command. +- **Obsidian CLI integration** — auto-detects running Obsidian and delegates compatible operations. Circuit breaker (Closed/Degraded/Open) ensures graceful fallback. - **Real-time sync** — file watcher keeps the index fresh as you edit in Obsidian. No manual re-indexing needed. -- **Smart write pipeline** — AI agents can create notes with automatic tag resolution, wikilink discovery, and folder placement based on semantic similarity. +- **Smart write pipeline** — AI agents can create, edit, rewrite, and delete notes with automatic tag resolution, wikilink discovery, and folder placement based on semantic similarity. - **Fully local** — [llama.cpp](https://github.com/ggml-org/llama.cpp) inference with GGUF models (~300MB mandatory, ~1.3GB optional for intelligence). Metal GPU-accelerated on macOS (88 files indexed in 70s). No API keys, no cloud. ## What problem it solves @@ -52,8 +55,9 @@ Your vault (markdown files) │ Search: Orchestrator → 4-lane retrieval │ │ → Reranker → Two-pass RRF fusion │ │ │ -│ 13 tools: search, read, list, context, │ -│ who, project, create, append, move... │ +│ 19 tools: search, read, read_section, │ +│ edit, rewrite, edit_frontmatter, delete, │ +│ health, context, who, project, create... 
│ └─────────────────────────────────────────────┘ │ ▼ @@ -191,6 +195,45 @@ engraph write create --content "# Meeting Notes\n\nDiscussed auth timeline with engraph resolves tags against the registry (fuzzy matching), discovers potential wikilinks (`[[Sarah Chen]]`), suggests the best folder based on semantic similarity to existing notes, and writes atomically. +**Edit a specific section:** + +```bash +engraph write edit --file "Meeting Notes" --heading "Action Items" --mode append --content "- [ ] Follow up with Sarah" +``` + +Targets the "Action Items" section by heading, appends content without touching the rest of the note. + +**Rewrite a note (preserves frontmatter):** + +```bash +engraph write rewrite --file "Meeting Notes" --content "# Meeting Notes\n\nRevised content here." +``` + +Replaces the entire body while keeping existing frontmatter (tags, dates, metadata) intact. + +**Edit frontmatter:** + +```bash +engraph write edit-frontmatter --file "Meeting Notes" --op add_tag --value "actionable" +``` + +Granular frontmatter mutations: `set`, `remove`, `add_tag`, `remove_tag`, `add_alias`, `remove_alias`. + +**Delete a note:** + +```bash +engraph write delete --file "Old Draft" --mode soft # moves to archive +engraph write delete --file "Old Draft" --mode hard # permanent removal +``` + +**Check vault health:** + +```bash +engraph context health +``` + +Returns orphan notes (no links in or out), broken wikilinks, stale notes, and tag hygiene issues. + ## Use cases **AI-assisted knowledge work** — Give Claude or Cursor deep access to your personal knowledge base. Instead of copy-pasting context, the agent searches, reads, and cross-references your notes directly. 
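The section-targeted edits shown above hinge on locating a heading and the body that belongs to it. Below is a rough, self-contained sketch of that kind of ATX-heading extraction — the function name and exact semantics are illustrative assumptions, not engraph's actual `markdown.rs` parser (which is not shown in this diff):

```rust
/// Illustrative sketch: return the body of the section whose ATX heading text
/// matches `heading`. The section runs from the line after the matched heading
/// to the next heading of the same or shallower level (or end of input).
fn find_section_sketch(content: &str, heading: &str) -> Option<String> {
    let lines: Vec<&str> = content.lines().collect();
    let mut start = None; // index of first body line once the heading matches
    let mut level = 0; // heading level (#-count) of the matched heading

    for (i, line) in lines.iter().enumerate() {
        let hashes = line.chars().take_while(|&c| c == '#').count();
        // ATX headings are 1+ '#' followed by a space; skip everything else.
        if hashes == 0 || !line[hashes..].starts_with(' ') {
            continue;
        }
        let text = line[hashes..].trim();
        match start {
            None if text.eq_ignore_ascii_case(heading) => {
                start = Some(i + 1);
                level = hashes;
            }
            // A heading at the same or shallower level closes the section.
            Some(s) if hashes <= level => return Some(lines[s..i].join("\n")),
            _ => {}
        }
    }
    start.map(|s| lines[s..].join("\n"))
}
```

Deeper headings (e.g. `###` under a matched `##`) stay inside the extracted body, which is what makes replace/prepend/append on a whole section well-defined.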
@@ -208,8 +251,9 @@ engraph resolves tags against the registry (fuzzy matching), discovers potential | Search method | 4-lane RRF (semantic + BM25 + graph + reranker) | Vector similarity only | Keyword only | | Query understanding | LLM orchestrator classifies intent, adapts weights | None | None | | Understands note links | Yes (wikilink graph traversal) | No | Limited (backlinks panel) | -| AI agent access | MCP server (13 tools) | Custom API needed | No | -| Write capability | Create/append/move with smart filing | No | Manual | +| AI agent access | MCP server (19 tools) | Custom API needed | No | +| Write capability | Create/edit/rewrite/delete with smart filing | No | Manual | +| Vault health | Orphans, broken links, stale notes, tag hygiene | No | Limited | | Real-time sync | File watcher, 2s debounce | Manual re-index | N/A | | Runs locally | Yes, llama.cpp + Metal GPU | Depends | Yes | | Setup | One binary, one command | Framework + code | Built-in | @@ -222,24 +266,33 @@ engraph is not a replacement for Obsidian — it's the intelligence layer that s - LLM research orchestrator: query intent classification + query expansion + adaptive lane weights - llama.cpp inference via Rust bindings (GGUF models, Metal GPU on macOS, CUDA on Linux) - Intelligence opt-in: heuristic fallback when disabled, LLM-powered when enabled -- MCP server with 13 tools (7 read, 6 write) via stdio -- Real-time file watching with 2s debounce and startup reconciliation +- MCP server with 19 tools (8 read, 10 write, 1 diagnostic) via stdio +- Section-level reading and editing: target specific headings with replace/prepend/append modes +- Full note rewriting with automatic frontmatter preservation +- Granular frontmatter mutations: set/remove fields, add/remove tags and aliases +- Soft delete (archive) and hard delete (permanent) with audit logging +- Vault health diagnostics: orphan notes, broken wikilinks, stale content, tag hygiene +- Obsidian CLI integration with circuit breaker 
(Closed/Degraded/Open) for resilient delegation +- Real-time file watching with 2s debounce, startup reconciliation, and watcher coordination to prevent double re-indexing - Write pipeline: tag resolution, fuzzy link discovery, semantic folder placement - Context engine: topic bundles, person bundles, project bundles with token budgets - Vault graph: bidirectional wikilink + mention edges with multi-hop expansion - Placement correction learning from user file moves +- Enhanced file resolution with fuzzy Levenshtein matching fallback +- Content-based folder role detection (people, daily, archive) by content patterns - Configurable model overrides for multilingual support -- 270 unit tests, CI on macOS + Ubuntu +- 318 unit tests, CI on macOS + Ubuntu ## Roadmap - [x] ~~Research orchestrator — query classification and adaptive lane weighting~~ (v1.0) - [x] ~~LLM reranker — optional local model for result quality~~ (v1.0) -- [ ] MCP edit/rewrite tools — full note editing for AI agents (v1.1) +- [x] ~~MCP edit/rewrite tools — full note editing for AI agents~~ (v1.1) +- [x] ~~Vault health monitor — orphan notes, broken links, stale content, tag hygiene~~ (v1.1) +- [x] ~~Obsidian CLI integration — auto-detect and delegate with circuit breaker~~ (v1.1) - [ ] Temporal search — find notes by time period, detect trends (v1.2) - [ ] HTTP/REST API — complement MCP with a standard web API (v1.3) - [ ] Multi-vault — search across multiple vaults (v1.4) -- [ ] Vault health monitor — surface orphan notes, broken links, stale content ## Configuration @@ -257,6 +310,15 @@ intelligence = true [models] # embed = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf" # rerank = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf" + +# Obsidian CLI integration (auto-detected during init) +[obsidian] +# enabled = true +# cli_path = "/usr/local/bin/obsidian" + +# Registered AI agents +[agents] +# claude_code = true ``` All data stored in
`~/.engraph/` — single SQLite database (~10MB typical), GGUF models, and vault profile. ## Development ```bash -cargo test --lib # 270 unit tests, no network (requires CMake for llama.cpp) +cargo test --lib # 318 unit tests, no network (requires CMake for llama.cpp) cargo clippy -- -D warnings cargo fmt --check @@ -276,7 +338,7 @@ cargo test --test integration -- --ignored Contributions welcome. Please open an issue first to discuss what you'd like to change. -The codebase is 19 Rust modules behind a lib crate. `CLAUDE.md` in the repo root has detailed architecture documentation for AI-assisted development. +The codebase is 22 Rust modules behind a lib crate. `CLAUDE.md` in the repo root has detailed architecture documentation for AI-assisted development. ## License diff --git a/src/config.rs b/src/config.rs index 676cecf..80b9f01 100644 --- a/src/config.rs +++ b/src/config.rs @@ -14,6 +14,26 @@ pub struct ModelConfig { pub expand: Option<String>, } +/// Obsidian integration configuration. +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct ObsidianConfig { + #[serde(default)] + pub enabled: bool, + pub vault_name: Option<String>, + pub cli_path: Option<String>, +} + +/// Agent integration configuration. +#[derive(Debug, Clone, Serialize, Deserialize, Default)] +pub struct AgentsConfig { + #[serde(default)] + pub claude_code: bool, + #[serde(default)] + pub cursor: bool, + #[serde(default)] + pub windsurf: bool, +} + /// Application configuration, loaded from `~/.engraph/config.toml` with CLI overrides. #[derive(Debug, Clone, Serialize, Deserialize)] #[serde(default)] pub struct Config { @@ -30,6 +50,12 @@ pub struct Config { pub intelligence: Option<bool>, /// Model override URIs. pub models: ModelConfig, + /// Obsidian integration settings.
+ #[serde(default)] + pub agents: AgentsConfig, } impl Default for Config { @@ -41,6 +67,8 @@ batch_size: 64, intelligence: None, models: ModelConfig::default(), + obsidian: ObsidianConfig::default(), + agents: AgentsConfig::default(), } } } @@ -216,6 +244,29 @@ rerank = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf" assert!(!cfg.intelligence_enabled()); } + #[test] + fn test_config_backward_compat() { + // Old format: intelligence = true at top level + let toml = r#"intelligence = true"#; + let config: Config = toml::from_str(toml).unwrap(); + assert_eq!(config.intelligence, Some(true)); + // New fields default to None/false + assert!(!config.obsidian.enabled); + } + + #[test] + fn test_config_with_obsidian() { + let toml = r#" +intelligence = true +[obsidian] +enabled = true +vault_name = "Personal" +"#; + let config: Config = toml::from_str(toml).unwrap(); + assert!(config.obsidian.enabled); + assert_eq!(config.obsidian.vault_name.as_deref(), Some("Personal")); + } + #[test] fn test_config_roundtrip_with_intelligence() { let dir = tempfile::tempdir().unwrap(); diff --git a/src/context.rs b/src/context.rs index 6fd2a59..c7c2b34 100644 --- a/src/context.rs +++ b/src/context.rs @@ -408,6 +408,41 @@ fn get_mention_snippet(params: &ContextParams, file_id: i64, name: &str) -> String String::new() } +// --------------------------------------------------------------------------- +// Section reading +// --------------------------------------------------------------------------- + +#[derive(Debug, Serialize)] +pub struct SectionResult { + pub path: String, + pub heading: String, + pub content: String, + pub line_start: usize, + pub line_end: usize, +} + +pub fn read_section( + store: &Store, + vault_root: &Path, + file: &str, + heading: &str, +) -> Result<SectionResult> { + let record = store + .resolve_file(file)?
+ .ok_or_else(|| anyhow::anyhow!("Not found: {file}"))?; + let path = vault_root.join(&record.path); + let content = std::fs::read_to_string(&path)?; + let section = crate::markdown::find_section(&content, heading) + .ok_or_else(|| anyhow::anyhow!("Section not found: {heading}"))?; + Ok(SectionResult { + path: record.path, + heading: section.heading.text, + content: section.content, + line_start: section.body_start, + line_end: section.body_end, + }) +} + /// Build a project context bundle: note, child notes, tasks, team, recent mentions. pub fn context_project(params: &ContextParams, name: &str) -> Result { let (note, project_id, project_folder) = if let Some(pf) = resolve_file(params, name)? { @@ -1112,4 +1147,20 @@ mod tests { assert!(s.is_char_boundary(snap)); assert!(snap <= 6); } + + #[test] + fn test_read_section() { + let tmp = TempDir::new().unwrap(); + let root = tmp.path().to_path_buf(); + let store = Store::open_memory().unwrap(); + let content = "# Person\n\n## Role\n\nEngineer\n\n## Interactions\n\nMet on 2026-03-26\n"; + std::fs::write(root.join("person.md"), content).unwrap(); + store + .insert_file("person.md", "hash", 100, &[], "per123", None) + .unwrap(); + + let result = read_section(&store, &root, "person.md", "Interactions").unwrap(); + assert!(result.content.contains("Met on 2026-03-26")); + assert!(!result.content.contains("Engineer")); + } } diff --git a/src/health.rs b/src/health.rs new file mode 100644 index 0000000..fc2e980 --- /dev/null +++ b/src/health.rs @@ -0,0 +1,240 @@ +use anyhow::Result; + +use crate::store::Store; + +/// Full vault health report. +#[derive(Debug, Clone, serde::Serialize)] +pub struct HealthReport { + pub orphans: Vec<String>, + pub broken_links: Vec<BrokenLink>, + pub stale_notes: Vec<String>, + pub inbox_pending: Vec<String>, + pub tag_issues: Vec<TagIssue>, + pub index_age_seconds: u64, + pub total_files: usize, +} + +/// A wikilink that could not be resolved to any indexed file.
+#[derive(Debug, Clone, serde::Serialize)] +pub struct BrokenLink { + pub source: String, + pub target: String, +} + +/// A tag-related problem in a file. +#[derive(Debug, Clone, serde::Serialize)] +pub struct TagIssue { + pub file: String, + pub issue: String, +} + +/// Configuration controlling which folders are excluded from health checks. +pub struct HealthConfig { + pub daily_folder: Option<String>, + pub inbox_folder: Option<String>, +} + +/// Find files with no edges (neither incoming nor outgoing). +/// +/// Excludes files whose path starts with the configured daily or inbox folder +/// prefixes — those are expected to be unlinked. +pub fn find_orphans(store: &Store, config: &HealthConfig) -> Result<Vec<String>> { + let mut exclude = Vec::new(); + if let Some(ref daily) = config.daily_folder { + exclude.push(daily.as_str()); + } + if let Some(ref inbox) = config.inbox_folder { + exclude.push(inbox.as_str()); + } + let isolated = store.find_isolated_files(&exclude)?; + Ok(isolated.into_iter().map(|f| f.path).collect()) +} + +/// Find wikilink references that could not be resolved to any indexed file. +/// +/// These are recorded in the `unresolved_links` table during indexing. +pub fn find_broken_links(store: &Store) -> Result<Vec<BrokenLink>> { + let unresolved = store.get_unresolved_links()?; + Ok(unresolved + .into_iter() + .map(|(source, target)| BrokenLink { source, target }) + .collect()) } + +/// Find notes that haven't been updated in the given number of days. +/// +/// Stub — returns an empty vec for now. A full implementation would check +/// `mtime` or a `reviewed_at` frontmatter field. +pub fn find_stale_notes(_store: &Store, _days: u32) -> Result<Vec<String>> { + Ok(Vec::new()) +} + +/// Generate a combined health report for the vault.
+pub fn generate_health_report(store: &Store, config: &HealthConfig) -> Result<HealthReport> {
+    let orphans = find_orphans(store, config)?;
+    let broken_links = find_broken_links(store)?;
+    let stale_notes = find_stale_notes(store, 90)?;
+
+    // Inbox pending: files in the inbox folder.
+    let inbox_pending = if let Some(ref inbox) = config.inbox_folder {
+        store
+            .find_files_by_prefix(&format!("{}%", inbox))?
+            .into_iter()
+            .map(|f| f.path)
+            .collect()
+    } else {
+        Vec::new()
+    };
+
+    let all_files = store.get_all_files()?;
+    let total_files = all_files.len();
+
+    // Tag issues: find work notes missing required tags.
+    let tag_issues = all_files
+        .iter()
+        .filter(|f| f.path.contains("Work/") || f.path.contains("01-Projects/Work/"))
+        .filter(|f| !f.tags.iter().any(|t| t == "work"))
+        .map(|f| TagIssue {
+            file: f.path.clone(),
+            issue: "work note missing 'work' tag".to_string(),
+        })
+        .collect();
+
+    // Index age: seconds since the most recent indexed_at timestamp.
+    let index_age_seconds = {
+        let last = all_files
+            .iter()
+            .filter_map(|f| f.indexed_at.parse::<u64>().ok())
+            .max()
+            .unwrap_or(0);
+        if last == 0 {
+            0
+        } else {
+            use std::time::SystemTime;
+            let now = SystemTime::now()
+                .duration_since(SystemTime::UNIX_EPOCH)
+                .unwrap_or_default()
+                .as_secs();
+            now.saturating_sub(last)
+        }
+    };
+
+    Ok(HealthReport {
+        orphans,
+        broken_links,
+        stale_notes,
+        inbox_pending,
+        tag_issues,
+        index_age_seconds,
+        total_files,
+    })
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::store::Store;
+
+    fn setup_health_store() -> Store {
+        let store = Store::open_memory().unwrap();
+        // Insert files with edges to test orphan detection.
+ let linked_id = store + .insert_file("linked.md", "aaa111", 100, &[], "aaa111", None) + .unwrap(); + let orphan_id = store + .insert_file("orphan.md", "bbb222", 100, &[], "bbb222", None) + .unwrap(); + let _daily_id = store + .insert_file("daily/2026-03-26.md", "ccc333", 100, &[], "ccc333", None) + .unwrap(); + // Add edge: linked.md → orphan.md (both files are "connected") + store.insert_edge(linked_id, orphan_id, "wikilink").unwrap(); + store + } + + #[test] + fn test_find_orphans_excludes_daily() { + let store = setup_health_store(); + let config = HealthConfig { + daily_folder: Some("daily/".to_string()), + inbox_folder: None, + }; + let orphans = find_orphans(&store, &config).unwrap(); + // linked.md has outgoing edge, orphan.md has incoming edge — both connected. + // daily note is excluded by prefix. Result should be empty. + assert!(orphans.is_empty()); + } + + #[test] + fn test_find_orphans_detects_isolated() { + let store = Store::open_memory().unwrap(); + store + .insert_file("connected.md", "h1", 100, &[], "d1", None) + .unwrap(); + let iso_id = store + .insert_file("island.md", "h2", 100, &[], "d2", None) + .unwrap(); + let other_id = store + .insert_file("other.md", "h3", 100, &[], "d3", None) + .unwrap(); + store.insert_edge(iso_id, other_id, "wikilink").unwrap(); + + let config = HealthConfig { + daily_folder: None, + inbox_folder: None, + }; + let orphans = find_orphans(&store, &config).unwrap(); + // connected.md has no edges at all — it's the orphan. + assert_eq!(orphans.len(), 1); + assert_eq!(orphans[0], "connected.md"); + } + + #[test] + fn test_find_broken_links() { + let store = setup_health_store(); + // Record an unresolved link (wikilink target that doesn't exist). 
+ store + .insert_unresolved_link("linked.md", "nonexistent.md") + .unwrap(); + let broken = find_broken_links(&store).unwrap(); + assert_eq!(broken.len(), 1); + assert_eq!(broken[0].source, "linked.md"); + assert_eq!(broken[0].target, "nonexistent.md"); + } + + #[test] + fn test_find_broken_links_empty_when_none() { + let store = setup_health_store(); + let broken = find_broken_links(&store).unwrap(); + assert!(broken.is_empty()); + } + + #[test] + fn test_generate_health_report() { + let store = Store::open_memory().unwrap(); + store + .insert_file("note.md", "h1", 100, &[], "d1", None) + .unwrap(); + store + .insert_file("00-Inbox/unsorted.md", "h2", 100, &[], "d2", None) + .unwrap(); + store + .insert_unresolved_link("note.md", "missing.md") + .unwrap(); + + let config = HealthConfig { + daily_folder: Some("daily/".to_string()), + inbox_folder: Some("00-Inbox/".to_string()), + }; + let report = generate_health_report(&store, &config).unwrap(); + assert_eq!(report.total_files, 2); + // note.md has no edges and is not in daily/ or inbox/ — it's an orphan. + assert_eq!(report.orphans.len(), 1); + assert_eq!(report.orphans[0], "note.md"); + // One broken link recorded. + assert_eq!(report.broken_links.len(), 1); + // One file in inbox. 
+        assert_eq!(report.inbox_pending.len(), 1);
+        assert_eq!(report.inbox_pending[0], "00-Inbox/unsorted.md");
+    }
+}
diff --git a/src/lib.rs b/src/lib.rs
index 5ab1ab8..3939ba2 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -5,9 +5,12 @@ pub mod docid;
 pub mod fts;
 pub mod fusion;
 pub mod graph;
+pub mod health;
 pub mod indexer;
 pub mod links;
 pub mod llm;
+pub mod markdown;
+pub mod obsidian;
 pub mod placement;
 pub mod profile;
 pub mod search;
diff --git a/src/main.rs b/src/main.rs
index 10a20f3..af436b1 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -85,6 +85,18 @@ enum Command {
         /// Override a model: --model embed|rerank|expand
         #[arg(long, num_args = 2, value_names = &["TYPE", "URI"])]
         model: Option<Vec<String>>,
+
+        /// Enable Obsidian CLI integration.
+        #[arg(long, conflicts_with = "disable_obsidian_cli")]
+        enable_obsidian_cli: bool,
+
+        /// Disable Obsidian CLI integration.
+        #[arg(long, conflicts_with = "enable_obsidian_cli")]
+        disable_obsidian_cli: bool,
+
+        /// Register with an AI agent: "claude-code", "cursor", or "windsurf".
+        #[arg(long)]
+        register: Option<String>,
     },
 
     /// Manage embedding models.
@@ -208,6 +220,50 @@ enum WriteAction {
         /// Archived note path (e.g., "04-Archive/01-Projects/note.md").
         file: String,
     },
+    /// Edit a specific section of a note.
+    Edit {
+        /// Target note (path, basename, or #docid).
+        #[arg(long)]
+        file: String,
+        /// Section heading to edit (case-insensitive).
+        #[arg(long)]
+        heading: String,
+        /// Content to add/replace in the section.
+        #[arg(long)]
+        content: String,
+        /// Edit mode: "replace", "prepend", or "append" (default: "append").
+        #[arg(long, default_value = "append")]
+        mode: String,
+    },
+    /// Rewrite a note's body content (preserves frontmatter by default).
+    Rewrite {
+        /// Target note (path, basename, or #docid).
+        #[arg(long)]
+        file: String,
+        /// New body content.
+        #[arg(long)]
+        content: String,
+        /// Preserve existing frontmatter (default: true).
+ #[arg(long, default_value_t = true)] + preserve_frontmatter: bool, + }, + /// Edit a note's frontmatter properties. + EditFrontmatter { + /// Target note (path, basename, or #docid). + #[arg(long)] + file: String, + /// Operations as JSON string: [{"op":"add_tag","value":"rust"},{"op":"set","key":"status","value":"done"}] + #[arg(long)] + operations: String, + }, + /// Delete a note. + Delete { + /// Target note (path, basename, or #docid). + file: String, + /// Delete mode: "soft" (archive, default) or "hard" (permanent). + #[arg(long, default_value = "soft")] + mode: String, + }, } #[derive(Subcommand, Debug)] @@ -452,7 +508,7 @@ async fn main() -> Result<()> { println!(" Max folder depth: {}", stats.folder_depth); let vault_profile = profile::VaultProfile { - vault_path, + vault_path: vault_path.clone(), vault_type, structure, stats, @@ -471,12 +527,97 @@ async fn main() -> Result<()> { cfg.intelligence = Some(enable); cfg.save()?; } + + // Obsidian CLI detection + let obsidian_running = std::process::Command::new("pgrep") + .args(["-x", "Obsidian"]) + .stdout(std::process::Stdio::null()) + .stderr(std::process::Stdio::null()) + .status() + .map(|s| s.success()) + .unwrap_or(false); + + let obsidian_in_path = std::process::Command::new("which") + .arg("obsidian") + .stdout(std::process::Stdio::null()) + .stderr(std::process::Stdio::null()) + .status() + .map(|s| s.success()) + .unwrap_or(false); + + if obsidian_running && obsidian_in_path { + eprint!("\nObsidian CLI detected. Enable integration? 
[Y/n] "); + io::stderr().flush()?; + let mut answer = String::new(); + io::stdin().lock().read_line(&mut answer)?; + let answer = answer.trim(); + let enable = answer.is_empty() || answer.eq_ignore_ascii_case("y"); + if enable { + let vault_name = vault_path + .file_name() + .and_then(|n| n.to_str()) + .unwrap_or("Personal") + .to_string(); + cfg.obsidian.enabled = true; + cfg.obsidian.vault_name = Some(vault_name.clone()); + cfg.save()?; + println!("Obsidian CLI enabled (vault: {vault_name})."); + } else { + println!( + "Obsidian CLI disabled. Enable later with: engraph configure --enable-obsidian-cli" + ); + } + } + + // AI agent detection + let home = dirs::home_dir().unwrap_or_default(); + let agent_configs: &[(&str, &str, &str)] = &[ + ("Claude Code", "claude-code", ".claude/settings.json"), + ("Cursor", "cursor", ".cursor/mcp.json"), + ("Windsurf", "windsurf", ".codeium/windsurf/mcp_config.json"), + ]; + + let mut detected: Vec<(&str, &str, String)> = Vec::new(); + for (name, key, rel_path) in agent_configs { + let full = home.join(rel_path); + if full.exists() { + detected.push((name, key, format!("~/{rel_path}"))); + } + } + + if !detected.is_empty() { + println!("\nAI agents detected:"); + for (name, _key, path) in &detected { + println!(" \u{2713} {name} ({path})"); + } + println!( + "\nTo register engraph as MCP server, add to your agent's config:\n \ + \"engraph\": {{\n \ + \"command\": \"engraph\",\n \ + \"args\": [\"serve\"]\n \ + }}" + ); + + // Record detected agents in config + for (_name, key, _path) in &detected { + match *key { + "claude-code" => cfg.agents.claude_code = true, + "cursor" => cfg.agents.cursor = true, + "windsurf" => cfg.agents.windsurf = true, + _ => {} + } + } + cfg.save()?; + } } Command::Configure { enable_intelligence, disable_intelligence, model, + enable_obsidian_cli, + disable_obsidian_cli, + register, } => { let mut cfg = Config::load()?; @@ -528,6 +669,54 @@ async fn main() -> Result<()> { } } + if enable_obsidian_cli { 
+ cfg.obsidian.enabled = true; + println!("Obsidian CLI integration enabled."); + } else if disable_obsidian_cli { + cfg.obsidian.enabled = false; + println!("Obsidian CLI integration disabled."); + } + + if let Some(agent) = register { + match agent.as_str() { + "claude-code" => { + cfg.agents.claude_code = true; + println!( + "Registered Claude Code. Add to ~/.claude/settings.json:\n \ + \"engraph\": {{\n \ + \"command\": \"engraph\",\n \ + \"args\": [\"serve\"]\n \ + }}" + ); + } + "cursor" => { + cfg.agents.cursor = true; + println!( + "Registered Cursor. Add to ~/.cursor/mcp.json:\n \ + \"engraph\": {{\n \ + \"command\": \"engraph\",\n \ + \"args\": [\"serve\"]\n \ + }}" + ); + } + "windsurf" => { + cfg.agents.windsurf = true; + println!( + "Registered Windsurf. Add to ~/.codeium/windsurf/mcp_config.json:\n \ + \"engraph\": {{\n \ + \"command\": \"engraph\",\n \ + \"args\": [\"serve\"]\n \ + }}" + ); + } + other => { + anyhow::bail!( + "Unknown agent: {other}. Use: claude-code, cursor, or windsurf." 
+                        );
+                    }
+                }
+            }
+
             cfg.save()?;
             println!(
                 "Configuration saved to {}",
@@ -975,6 +1164,165 @@ async fn main() -> Result<()> {
                     println!("Restored: {} → {}", file, result.path);
                 }
             }
+            WriteAction::Edit {
+                file,
+                heading,
+                content,
+                mode,
+            } => {
+                let edit_mode = match mode.as_str() {
+                    "replace" => engraph::writer::EditMode::Replace,
+                    "prepend" => engraph::writer::EditMode::Prepend,
+                    _ => engraph::writer::EditMode::Append,
+                };
+                let input = engraph::writer::EditInput {
+                    file,
+                    heading,
+                    content,
+                    mode: edit_mode,
+                    modified_by: "cli".into(),
+                };
+                let result = engraph::writer::edit_note(&store, &vault_path, &input, None)?;
+                if cli.json {
+                    println!("{}", serde_json::to_string_pretty(&result)?);
+                } else {
+                    println!(
+                        "Edited: {} section \"{}\" ({})",
+                        result.path, result.heading, result.mode
+                    );
+                }
+            }
+            WriteAction::Rewrite {
+                file,
+                content,
+                preserve_frontmatter,
+            } => {
+                let input = engraph::writer::RewriteInput {
+                    file,
+                    content,
+                    preserve_frontmatter,
+                    modified_by: "cli".into(),
+                };
+                let result = engraph::writer::rewrite_note(&store, &vault_path, &input)?;
+                if cli.json {
+                    println!("{}", serde_json::to_string_pretty(&result)?);
+                } else {
+                    println!(
+                        "Rewrote: {} (frontmatter {})",
+                        result.path,
+                        if preserve_frontmatter {
+                            "preserved"
+                        } else {
+                            "replaced"
+                        }
+                    );
+                }
+            }
+            WriteAction::EditFrontmatter { file, operations } => {
+                let raw_ops: Vec<serde_json::Value> = serde_json::from_str(&operations)
+                    .map_err(|e| anyhow::anyhow!("invalid JSON operations: {}", e))?;
+                let mut ops = Vec::new();
+                for raw in &raw_ops {
+                    let op = raw.get("op").and_then(|v| v.as_str()).unwrap_or("");
+                    match op {
+                        "set" => {
+                            let key = raw
+                                .get("key")
+                                .and_then(|v| v.as_str())
+                                .unwrap_or("")
+                                .to_string();
+                            let value = raw
+                                .get("value")
+                                .and_then(|v| v.as_str())
+                                .unwrap_or("")
+                                .to_string();
+                            ops.push(engraph::writer::FrontmatterOp::Set(key, value));
+                        }
+                        "remove" => {
+                            let key = raw
+                                .get("key")
+                                .and_then(|v| v.as_str())
+
.unwrap_or("") + .to_string(); + ops.push(engraph::writer::FrontmatterOp::Remove(key)); + } + "add_tag" => { + let value = raw + .get("value") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + ops.push(engraph::writer::FrontmatterOp::AddTag(value)); + } + "remove_tag" => { + let value = raw + .get("value") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + ops.push(engraph::writer::FrontmatterOp::RemoveTag(value)); + } + "add_alias" => { + let value = raw + .get("value") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + ops.push(engraph::writer::FrontmatterOp::AddAlias(value)); + } + "remove_alias" => { + let value = raw + .get("value") + .and_then(|v| v.as_str()) + .unwrap_or("") + .to_string(); + ops.push(engraph::writer::FrontmatterOp::RemoveAlias(value)); + } + _ => { + return Err(anyhow::anyhow!("unknown frontmatter op: {:?}", op)); + } + } + } + let input = engraph::writer::EditFrontmatterInput { + file, + operations: ops, + modified_by: "cli".into(), + }; + let result = engraph::writer::edit_frontmatter(&store, &vault_path, &input)?; + if cli.json { + println!("{}", serde_json::to_string_pretty(&result)?); + } else { + println!("Frontmatter updated: {}", result.path); + } + } + WriteAction::Delete { file, mode } => { + let delete_mode = match mode.as_str() { + "hard" => engraph::writer::DeleteMode::Hard, + _ => engraph::writer::DeleteMode::Soft, + }; + let archive_folder = profile + .as_ref() + .and_then(|p| p.structure.folders.archive.as_deref()) + .unwrap_or("04-Archive"); + engraph::writer::delete_note( + &store, + &vault_path, + &file, + delete_mode, + archive_folder, + )?; + if cli.json { + println!( + "{}", + serde_json::to_string_pretty(&serde_json::json!({ + "deleted": file, + "mode": mode + }))? 
+                    );
+                } else {
+                    println!("Deleted: {} ({})", file, mode);
+                }
+            }
+        }
+    }
diff --git a/src/markdown.rs b/src/markdown.rs
new file mode 100644
index 0000000..4d99bfd
--- /dev/null
+++ b/src/markdown.rs
@@ -0,0 +1,175 @@
+#[derive(Debug, Clone)]
+pub struct HeadingInfo {
+    pub line: usize,
+    pub level: u8,
+    pub text: String,
+}
+
+pub fn parse_headings(content: &str) -> Vec<HeadingInfo> {
+    let mut headings = Vec::new();
+    let mut in_code_block = false;
+    for (i, line) in content.lines().enumerate() {
+        let trimmed = line.trim();
+        if trimmed.starts_with("```") || trimmed.starts_with("~~~") {
+            in_code_block = !in_code_block;
+            continue;
+        }
+        if in_code_block {
+            continue;
+        }
+        if let Some(rest) = trimmed.strip_prefix('#') {
+            let hashes = rest.chars().take_while(|&c| c == '#').count();
+            let level = 1 + hashes as u8;
+            let after_hashes = &rest[hashes..];
+            if level <= 6 && (after_hashes.is_empty() || after_hashes.starts_with(' ')) {
+                let text = after_hashes.trim().trim_end_matches('#').trim();
+                headings.push(HeadingInfo {
+                    line: i,
+                    level,
+                    text: text.to_string(),
+                });
+            }
+        }
+    }
+    headings
+}
+
+#[derive(Debug, Clone)]
+pub struct Section {
+    pub heading: HeadingInfo,
+    pub body_start: usize,
+    pub body_end: usize,
+    pub content: String,
+}
+
+pub fn find_section(content: &str, heading_text: &str) -> Option<Section>
{
+    let headings = parse_headings(content);
+    let target = heading_text.trim().to_lowercase();
+    let lines: Vec<&str> = content.lines().collect();
+
+    let idx = headings
+        .iter()
+        .position(|h| h.text.to_lowercase() == target)?;
+    let h = &headings[idx];
+    let body_start = h.line + 1;
+    let body_end = headings[idx + 1..]
+        .iter()
+        .find(|next| next.level <= h.level)
+        .map(|next| next.line)
+        .unwrap_or(lines.len());
+
+    let content_str = lines[body_start..body_end].join("\n");
+    Some(Section {
+        heading: HeadingInfo {
+            line: h.line,
+            level: h.level,
+            text: h.text.clone(),
+        },
+        body_start,
+        body_end,
+        content: content_str,
+    })
+}
+
+pub fn split_frontmatter(content: &str) -> (Option<String>, String) {
+    let lines: Vec<&str> = content.lines().collect();
+    if lines.first().map(|l| l.trim()) != Some("---") {
+        return (None, content.to_string());
+    }
+    for (i, line) in lines.iter().enumerate().skip(1) {
+        if line.trim() == "---" {
+            let fm = lines[1..i].join("\n");
+            let body = lines[i + 1..].join("\n");
+            return (Some(fm), body);
+        }
+    }
+    (None, content.to_string())
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_parse_headings_basic() {
+        let content = "# Title\n\nSome text\n\n## Section A\n\nContent\n\n## Section B\n";
+        let headings = parse_headings(content);
+        assert_eq!(headings.len(), 3);
+        assert_eq!(headings[0].level, 1);
+        assert_eq!(headings[0].text, "Title");
+        assert_eq!(headings[1].level, 2);
+        assert_eq!(headings[1].text, "Section A");
+    }
+
+    #[test]
+    fn test_parse_headings_ignores_code_blocks() {
+        let content = "# Real\n\n```\n# Not a heading\n```\n\n## Also Real\n";
+        let headings = parse_headings(content);
+        assert_eq!(headings.len(), 2);
+        assert_eq!(headings[0].text, "Real");
+        assert_eq!(headings[1].text, "Also Real");
+    }
+
+    #[test]
+    fn test_parse_headings_strips_trailing_hashes() {
+        let content = "## Heading ##\n";
+        let headings = parse_headings(content);
+        assert_eq!(headings[0].text, "Heading");
+    }
+
+    #[test]
+ fn test_find_section_basic() { + let content = "# Title\n\n## Interactions\n\nEntry 1\nEntry 2\n\n## Links\n\nSome links\n"; + let section = find_section(content, "Interactions").unwrap(); + assert_eq!(section.heading.text, "Interactions"); + assert!(section.content.contains("Entry 1")); + assert!(!section.content.contains("Some links")); + } + + #[test] + fn test_find_section_case_insensitive() { + let content = "## My Section\n\nContent\n"; + assert!(find_section(content, "my section").is_some()); + } + + #[test] + fn test_find_section_with_subsections() { + let content = "# Title\n\n## Interactions\n\nEntry\n\n### Sub-detail\n\nMore\n\n## Links\n\nSome links\n"; + let section = find_section(content, "Interactions").unwrap(); + assert!(section.content.contains("Entry")); + assert!(section.content.contains("Sub-detail")); + assert!(!section.content.contains("Some links")); + } + + #[test] + fn test_find_section_not_found() { + let content = "## Existing\n\nContent\n"; + assert!(find_section(content, "Missing").is_none()); + } + + #[test] + fn test_split_frontmatter_valid() { + let content = "---\ntitle: Test\ntags:\n - foo\n---\n\n# Body\n"; + let (fm, body) = split_frontmatter(content); + assert!(fm.is_some()); + assert!(fm.unwrap().contains("title: Test")); + assert!(body.contains("# Body")); + } + + #[test] + fn test_split_frontmatter_none() { + let content = "# No frontmatter\n\nJust content\n"; + let (fm, body) = split_frontmatter(content); + assert!(fm.is_none()); + assert!(body.contains("No frontmatter")); + } + + #[test] + fn test_parse_headings_ignores_inline_tags() { + let content = "# Title\n\nSome text with #tag and #another-tag\n\n## Real Section\n"; + let headings = parse_headings(content); + assert_eq!(headings.len(), 2); + assert_eq!(headings[0].text, "Title"); + assert_eq!(headings[1].text, "Real Section"); + } +} diff --git a/src/obsidian.rs b/src/obsidian.rs new file mode 100644 index 0000000..c3c6812 --- /dev/null +++ b/src/obsidian.rs @@ 
-0,0 +1,192 @@
+use std::process::Command;
+use std::time::{Duration, Instant};
+
+use anyhow::{Result, bail};
+
+#[derive(Debug)]
+pub enum CircuitState {
+    Closed,
+    Degraded,
+    Open,
+}
+
+const COOLDOWN: Duration = Duration::from_secs(60);
+const CHECK_TTL: Duration = Duration::from_secs(5);
+const CMD_TIMEOUT: Duration = Duration::from_secs(3);
+
+pub struct ObsidianCli {
+    pub vault_name: String,
+    pub state: CircuitState,
+    failures: u32,
+    last_check: Instant,
+    last_available: bool,
+    open_until: Option<Instant>,
+}
+
+impl ObsidianCli {
+    pub fn new(vault_name: String) -> Self {
+        Self {
+            vault_name,
+            state: CircuitState::Closed,
+            failures: 0,
+            last_check: Instant::now() - CHECK_TTL, // force first check
+            last_available: false,
+            open_until: None,
+        }
+    }
+
+    /// Record a successful CLI operation. Resets circuit to Closed.
+    pub fn record_success(&mut self) {
+        self.failures = 0;
+        self.state = CircuitState::Closed;
+        self.open_until = None;
+    }
+
+    /// Record a CLI failure. Transitions Closed→Degraded→Open.
+    pub fn record_failure(&mut self) {
+        self.failures += 1;
+        match self.failures {
+            1 => self.state = CircuitState::Degraded,
+            _ => {
+                self.state = CircuitState::Open;
+                self.open_until = Some(Instant::now() + COOLDOWN);
+            }
+        }
+    }
+
+    /// Check if we should delegate operations to Obsidian CLI.
+    ///
+    /// Returns false when the circuit is open (and cooldown hasn't expired),
+    /// or when the Obsidian process isn't running.
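Review note: the Closed→Degraded→Open transitions above reduce to a small state machine. This standalone sketch mirrors `record_failure`/`record_success` (the `Breaker` struct is a simplification, not the crate's `ObsidianCli`):

```rust
// Simplified model of the circuit breaker: the first failure degrades,
// any further failure opens the circuit, and a success resets to Closed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Closed,
    Degraded,
    Open,
}

struct Breaker {
    state: State,
    failures: u32,
}

impl Breaker {
    fn new() -> Self {
        Breaker { state: State::Closed, failures: 0 }
    }
    fn record_failure(&mut self) {
        self.failures += 1;
        self.state = if self.failures == 1 { State::Degraded } else { State::Open };
    }
    fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }
}

fn main() {
    let mut b = Breaker::new();
    b.record_failure();
    assert_eq!(b.state, State::Degraded);
    b.record_failure();
    assert_eq!(b.state, State::Open);
    b.record_success();
    assert_eq!(b.state, State::Closed);
    println!("ok");
}
```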
+    pub fn should_delegate(&mut self) -> bool {
+        // If Open, check cooldown
+        if matches!(self.state, CircuitState::Open)
+            && let Some(until) = self.open_until
+        {
+            if Instant::now() < until {
+                return false;
+            }
+            // Cooldown expired — transition to Degraded for a retry
+            self.state = CircuitState::Degraded;
+            self.failures = 1;
+            self.open_until = None;
+        }
+
+        // Check if Obsidian process is running (cached for CHECK_TTL)
+        let running = self.check_process();
+
+        running && !matches!(self.state, CircuitState::Open)
+    }
+
+    /// Check whether the Obsidian process is running.
+    /// Result is cached for `CHECK_TTL` to avoid spawning pgrep on every call.
+    fn check_process(&mut self) -> bool {
+        if self.last_check.elapsed() < CHECK_TTL {
+            return self.last_available;
+        }
+
+        let available = Command::new("pgrep")
+            .arg("-x")
+            .arg("Obsidian")
+            .status()
+            .map(|s| s.success())
+            .unwrap_or(false);
+
+        self.last_check = Instant::now();
+        self.last_available = available;
+        available
+    }
+
+    /// Set a property on a vault note via Obsidian CLI.
+    pub async fn property_set(&mut self, file: &str, name: &str, value: &str) -> Result<String> {
+        self.run_cli(&[
+            "property:set",
+            &format!("name={name}"),
+            &format!("value={value}"),
+            &format!("file={file}"),
+        ])
+        .await
+    }
+
+    /// Append content to today's daily note via Obsidian CLI.
+    pub async fn daily_append(&mut self, content: &str) -> Result<String> {
+        self.run_cli(&["daily:append", &format!("content={content}")])
+            .await
+    }
+
+    /// Execute an Obsidian CLI command with a 3-second timeout.
+    async fn run_cli(&mut self, args: &[&str]) -> Result<String> {
+        let vault_arg = format!("vault={}", self.vault_name);
+        let mut cmd = tokio::process::Command::new("obsidian");
+        cmd.arg(&vault_arg);
+        for arg in args {
+            cmd.arg(arg);
+        }
+
+        let result = tokio::time::timeout(CMD_TIMEOUT, cmd.output()).await;
+
+        match result {
+            Ok(Ok(output)) if output.status.success() => {
+                self.record_success();
+                Ok(String::from_utf8_lossy(&output.stdout).into_owned())
+            }
+            Ok(Ok(output)) => {
+                self.record_failure();
+                let stderr = String::from_utf8_lossy(&output.stderr);
+                bail!("obsidian CLI failed (exit {}): {stderr}", output.status)
+            }
+            Ok(Err(e)) => {
+                self.record_failure();
+                bail!("obsidian CLI spawn error: {e}")
+            }
+            Err(_) => {
+                self.record_failure();
+                bail!("obsidian CLI timed out after {CMD_TIMEOUT:?}")
+            }
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_circuit_starts_closed() {
+        let cli = ObsidianCli::new("TestVault".into());
+        assert!(matches!(cli.state, CircuitState::Closed));
+    }
+
+    #[test]
+    fn test_single_failure_degrades() {
+        let mut cli = ObsidianCli::new("TestVault".into());
+        cli.record_failure();
+        assert!(matches!(cli.state, CircuitState::Degraded));
+    }
+
+    #[test]
+    fn test_two_failures_opens() {
+        let mut cli = ObsidianCli::new("TestVault".into());
+        cli.record_failure();
+        cli.record_failure();
+        assert!(matches!(cli.state, CircuitState::Open));
+    }
+
+    #[test]
+    fn test_success_resets_to_closed() {
+        let mut cli = ObsidianCli::new("TestVault".into());
+        cli.record_failure();
+        assert!(matches!(cli.state, CircuitState::Degraded));
+        cli.record_success();
+        assert!(matches!(cli.state, CircuitState::Closed));
+    }
+
+    #[test]
+    fn test_is_available_when_open_returns_false() {
+        let mut cli = ObsidianCli::new("TestVault".into());
+        cli.record_failure();
+        cli.record_failure();
+        // Open state — should not be available regardless of process
+        assert!(!cli.should_delegate());
+    }
+}
diff --git a/src/profile.rs
b/src/profile.rs index 07dbc89..b8e9526 100644 --- a/src/profile.rs +++ b/src/profile.rs @@ -61,6 +61,229 @@ pub struct VaultStats { pub folder_count: usize, } +// --------------------------------------------------------------------------- +// Content-based role detection +// --------------------------------------------------------------------------- + +/// Check whether a markdown file's frontmatter looks like a person note. +/// Returns true if it has a tag containing "person" or "people", OR has a "role" key. +fn is_person_like(text: &str) -> bool { + // Find frontmatter block. + let fm = if text.starts_with("---\n") { + text.get(4..) + .and_then(|rest| rest.find("\n---").map(|end| &rest[..end])) + } else if text.starts_with("---\r\n") { + text.get(5..) + .and_then(|rest| rest.find("\n---").map(|end| &rest[..end])) + } else { + None + }; + + let Some(fm) = fm else { + return false; + }; + + let mut has_person_tag = false; + let mut in_tags_block = false; + + for line in fm.lines() { + let trimmed = line.trim(); + + if trimmed.starts_with("role:") { + return true; + } + + if trimmed.starts_with("tags:") { + let after = trimmed.strip_prefix("tags:").unwrap().trim(); + if after.is_empty() { + in_tags_block = true; + continue; + } + // Inline list: tags: [person, ...] or tags: person, ... 
+                    let after = after.trim_start_matches('[').trim_end_matches(']');
+                    for tag in after.split(',') {
+                        let t = tag
+                            .trim()
+                            .trim_matches('"')
+                            .trim_matches('\'')
+                            .trim_matches('#')
+                            .to_ascii_lowercase();
+                        if t == "person" || t == "people" {
+                            has_person_tag = true;
+                        }
+                    }
+                    if has_person_tag {
+                        return true;
+                    }
+                    in_tags_block = false;
+                    continue;
+                }
+
+                if in_tags_block {
+                    if trimmed.starts_with("- ") {
+                        let t = trimmed
+                            .strip_prefix("- ")
+                            .unwrap()
+                            .trim()
+                            .trim_matches('"')
+                            .trim_matches('\'')
+                            .trim_matches('#')
+                            .to_ascii_lowercase();
+                        if t == "person" || t == "people" {
+                            return true;
+                        }
+                    } else if !trimmed.is_empty() {
+                        in_tags_block = false;
+                    }
+                }
+            }
+
+            false
+        }
+
+/// Check whether a filename looks like a date note (YYYY-MM-DD.md).
+fn is_date_filename(name: &str) -> bool {
+    // Must match exactly: YYYY-MM-DD.md (13 bytes: 4+1+2+1+2+3)
+    let bytes = name.as_bytes();
+    if bytes.len() != 13 {
+        return false;
+    }
+    // Compare bytes, not `&name[..]` string slices, so a non-ASCII filename
+    // that happens to be 13 bytes long cannot panic on a char boundary.
+    if bytes[4] != b'-' || bytes[7] != b'-' || &bytes[10..] != b".md" {
+        return false;
+    }
+    bytes[..4].iter().all(|b| b.is_ascii_digit())
+        && bytes[5..7].iter().all(|b| b.is_ascii_digit())
+        && bytes[8..10].iter().all(|b| b.is_ascii_digit())
+}
+
+/// Scan top-level subdirectories and return the one (with trailing slash) where
+/// 60%+ of the `.md` files have person-like frontmatter. Returns `None` if no
+/// folder qualifies.
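Review note: the 60% majority threshold used by these folder scans is plain integer arithmetic. A minimal standalone check (the `qualifies` name is ours, not the crate's):

```rust
// Integer form of the 60%-majority check used by the detect_* folder scans:
// matching * 100 / total >= 60, with empty folders never qualifying.
fn qualifies(matching: usize, total: usize) -> bool {
    total > 0 && matching * 100 / total >= 60
}

fn main() {
    assert!(qualifies(3, 4)); // 75% >= 60%
    assert!(qualifies(3, 5)); // exactly 60% passes
    assert!(!qualifies(1, 2)); // 50% fails
    assert!(!qualifies(0, 0)); // empty folders never qualify
    println!("ok");
}
```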
+pub fn detect_people_folder(root: &Path) -> Result<Option<String>> {
+    let entries = std::fs::read_dir(root)
+        .with_context(|| format!("cannot read directory {}", root.display()))?;
+
+    for entry in entries {
+        let entry = entry?;
+        if !entry.file_type()?.is_dir() {
+            continue;
+        }
+        let name = entry.file_name();
+        let name_str = name.to_string_lossy();
+        if name_str.starts_with('.') {
+            continue;
+        }
+
+        let dir = entry.path();
+        let mut total = 0usize;
+        let mut person_like = 0usize;
+
+        let inner = std::fs::read_dir(&dir)
+            .with_context(|| format!("cannot read directory {}", dir.display()))?;
+        for inner_entry in inner {
+            let inner_entry = inner_entry?;
+            if !inner_entry.file_type()?.is_file() {
+                continue;
+            }
+            let fname = inner_entry.file_name();
+            let fname_str = fname.to_string_lossy();
+            if !fname_str.ends_with(".md") {
+                continue;
+            }
+            total += 1;
+            let text = std::fs::read_to_string(inner_entry.path()).unwrap_or_default();
+            if is_person_like(&text) {
+                person_like += 1;
+            }
+        }
+
+        if total > 0 && person_like * 100 / total >= 60 {
+            return Ok(Some(format!("{}/", name_str)));
+        }
+    }
+
+    Ok(None)
+}
+
+/// Scan top-level subdirectories and return the one (with trailing slash) where
+/// 60%+ of the `.md` filenames match the YYYY-MM-DD pattern. Returns `None` if
+/// no folder qualifies.
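Review note: the daily-folder scan leans on the YYYY-MM-DD filename test. A standalone byte-wise version mirroring the `is_date_filename` helper above (indexing `as_bytes()` keeps it safe for non-ASCII names of the same byte length):

```rust
// Byte-wise YYYY-MM-DD.md check: 13 bytes total (4+1+2+1+2+3), digits in
// the date positions, literal '-' separators, ".md" suffix.
fn is_date_filename(name: &str) -> bool {
    let b = name.as_bytes();
    b.len() == 13
        && b[4] == b'-'
        && b[7] == b'-'
        && &b[10..] == b".md"
        && b[..4].iter().all(u8::is_ascii_digit)
        && b[5..7].iter().all(u8::is_ascii_digit)
        && b[8..10].iter().all(u8::is_ascii_digit)
}

fn main() {
    assert!(is_date_filename("2026-03-26.md"));
    assert!(!is_date_filename("readme.md"));
    assert!(!is_date_filename("2026-3-26.md")); // wrong width, 12 bytes
    println!("ok");
}
```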
+pub fn detect_daily_folder(root: &Path) -> Result<Option<String>> {
+    let entries = std::fs::read_dir(root)
+        .with_context(|| format!("cannot read directory {}", root.display()))?;
+
+    for entry in entries {
+        let entry = entry?;
+        if !entry.file_type()?.is_dir() {
+            continue;
+        }
+        let name = entry.file_name();
+        let name_str = name.to_string_lossy();
+        if name_str.starts_with('.') {
+            continue;
+        }
+
+        let dir = entry.path();
+        let mut total = 0usize;
+        let mut date_like = 0usize;
+
+        let inner = std::fs::read_dir(&dir)
+            .with_context(|| format!("cannot read directory {}", dir.display()))?;
+        for inner_entry in inner {
+            let inner_entry = inner_entry?;
+            if !inner_entry.file_type()?.is_file() {
+                continue;
+            }
+            let fname = inner_entry.file_name();
+            let fname_str = fname.to_string_lossy();
+            if !fname_str.ends_with(".md") {
+                continue;
+            }
+            total += 1;
+            if is_date_filename(&fname_str) {
+                date_like += 1;
+            }
+        }
+
+        if total > 0 && date_like * 100 / total >= 60 {
+            return Ok(Some(format!("{}/", name_str)));
+        }
+    }
+
+    Ok(None)
+}
+
+/// Find the archive folder by looking for well-known names (case-insensitive):
+/// "archive", "_archive", ".archive", or folders matching PARA-style patterns
+/// like "04-Archive".
+pub fn detect_archive_folder(root: &Path) -> Result<Option<String>> {
+    let archive_names: &[&str] = &["archive", "_archive", ".archive"];
+
+    let entries = std::fs::read_dir(root)
+        .with_context(|| format!("cannot read directory {}", root.display()))?;
+
+    for entry in entries {
+        let entry = entry?;
+        if !entry.file_type()?.is_dir() {
+            continue;
+        }
+        let name = entry.file_name();
+        let name_str = name.to_string_lossy();
+
+        // Strip leading digits and separators for PARA-style matching.
+        let stripped = name_str
+            .trim_start_matches(|c: char| c.is_ascii_digit())
+            .trim_start_matches(['-', '_', ' ']);
+
+        let lower = stripped.to_ascii_lowercase();
+        if archive_names.contains(&lower.as_str()) {
+            return Ok(Some(format!("{}/", name_str)));
+        }
+    }
+
+    Ok(None)
+}
+
 // ---------------------------------------------------------------------------
 // Detection helpers
 // ---------------------------------------------------------------------------
@@ -159,6 +382,28 @@ pub fn detect_structure(path: &Path) -> Result<StructureDetection> {
         StructureMethod::Flat
     };
 
+    // For non-PARA vaults, try content-based detection for roles not yet filled.
+    if method != StructureMethod::Para {
+        if folders.people.is_none() {
+            folders.people = detect_people_folder(path)
+                .ok()
+                .flatten()
+                .map(|s| s.trim_end_matches('/').to_string());
+        }
+        if folders.daily.is_none() {
+            folders.daily = detect_daily_folder(path)
+                .ok()
+                .flatten()
+                .map(|s| s.trim_end_matches('/').to_string());
+        }
+        if folders.archive.is_none() {
+            folders.archive = detect_archive_folder(path)
+                .ok()
+                .flatten()
+                .map(|s| s.trim_end_matches('/').to_string());
+        }
+    }
+
     Ok(StructureDetection { method, folders })
 }
 
@@ -632,4 +877,35 @@ mod tests {
         assert_eq!(count, 4); // a, a/b, a/b/c, d
         assert_eq!(depth, 3); // a/b/c is depth 3
     }
+
+    #[test]
+    fn test_detect_people_folder_from_content() {
+        let tmp = tempfile::TempDir::new().unwrap();
+        let root = tmp.path();
+        std::fs::create_dir_all(root.join("contacts")).unwrap();
+        // 3 out of 4 files have person-like frontmatter
+        for name in &["alice.md", "bob.md", "charlie.md"] {
+            std::fs::write(
+                root.join("contacts").join(name),
+                "---\ntags:\n - person\nrole: Engineer\n---\n",
+            )
+            .unwrap();
+        }
+        std::fs::write(root.join("contacts/readme.md"), "# Contacts\n").unwrap();
+
+        let detected = detect_people_folder(root).unwrap();
+        assert_eq!(detected.as_deref(), Some("contacts/"));
+    }
+
+    #[test]
+    fn test_detect_daily_folder_from_filenames() {
+        let tmp =
tempfile::TempDir::new().unwrap();
+        let root = tmp.path();
+        std::fs::create_dir_all(root.join("journal")).unwrap();
+        for date in &["2026-03-24.md", "2026-03-25.md", "2026-03-26.md"] {
+            std::fs::write(root.join("journal").join(date), "# Daily\n").unwrap();
+        }
+        let detected = detect_daily_folder(root).unwrap();
+        assert_eq!(detected.as_deref(), Some("journal/"));
+    }
 }
diff --git a/src/serve.rs b/src/serve.rs
index 6a91346..9259275 100644
--- a/src/serve.rs
+++ b/src/serve.rs
@@ -1,5 +1,7 @@
+use std::collections::HashMap;
 use std::path::{Path, PathBuf};
 use std::sync::Arc;
+use std::time::SystemTime;
 
 use anyhow::Result;
 use rmcp::handler::server::tool::ToolRouter;
@@ -17,6 +19,7 @@ use crate::llm::{EmbedModel, OrchestratorModel, RerankModel};
 use crate::profile::VaultProfile;
 use crate::search;
 use crate::store::Store;
+use crate::writer::FrontmatterOp;
 
 // ---------------------------------------------------------------------------
 // Parameter structs
@@ -120,10 +123,63 @@ pub struct UnarchiveParams {
     pub file: String,
 }
 
+#[derive(Debug, Deserialize, JsonSchema)]
+pub struct ReadSectionParams {
+    /// Target note: file path, basename, or #docid.
+    pub file: String,
+    /// Section heading to read (case-insensitive).
+    pub heading: String,
+}
+
+#[derive(Debug, Deserialize, JsonSchema)]
+pub struct HealthParams {}
+
+#[derive(Debug, Deserialize, JsonSchema)]
+pub struct EditParams {
+    /// Target note: file path, basename, or #docid.
+    pub file: String,
+    /// Section heading to edit (case-insensitive).
+    pub heading: String,
+    /// Content to add/replace in the section.
+    pub content: String,
+    /// Edit mode: "replace", "prepend", or "append" (default: "append").
+    pub mode: Option<String>,
+}
+
+#[derive(Debug, Deserialize, JsonSchema)]
+pub struct RewriteParams {
+    /// Target note: file path, basename, or #docid.
+    pub file: String,
+    /// New body content (replaces everything below frontmatter).
+    pub content: String,
+    /// Whether to preserve existing frontmatter (default: true).
+    pub preserve_frontmatter: Option<bool>,
+}
+
+#[derive(Debug, Deserialize, JsonSchema)]
+pub struct EditFrontmatterParams {
+    /// Target note: file path, basename, or #docid.
+    pub file: String,
+    /// Operations to apply. Array of objects like {"op": "add_tag", "value": "rust"} or {"op": "set", "key": "status", "value": "done"} or {"op": "remove", "key": "status"} or {"op": "remove_tag", "value": "old"}.
+    pub operations: Vec<serde_json::Value>,
+}
+
+#[derive(Debug, Deserialize, JsonSchema)]
+pub struct DeleteParams {
+    /// Target note: file path, basename, or #docid.
+    pub file: String,
+    /// Delete mode: "soft" (archive, default) or "hard" (permanent).
+    pub mode: Option<String>,
+}
+
 // ---------------------------------------------------------------------------
 // Server
 // ---------------------------------------------------------------------------
 
+/// Map of recently-written file paths to their mtime.
+/// Used to tell the watcher "I just wrote this file, skip re-indexing it."
+pub type RecentWrites = Arc<Mutex<HashMap<PathBuf, SystemTime>>>;
+
 #[derive(Clone)]
 pub struct EngraphServer {
     store: Arc<Mutex<Store>>,
@@ -135,6 +191,8 @@ pub struct EngraphServer {
     orchestrator: Option<Arc<Mutex<Box<dyn OrchestratorModel>>>>,
     /// Result reranker (None when intelligence is disabled or failed to load).
     reranker: Option<Arc<Mutex<Box<dyn RerankModel>>>>,
+    /// Tracks files recently written by MCP tools so the watcher can skip re-indexing them.
+    recent_writes: RecentWrites,
 }
 
 fn mcp_err(e: &anyhow::Error) -> McpError {
@@ -156,6 +214,121 @@ fn to_json_result<T: serde::Serialize>(value: &T) -> Result<CallToolResult, McpError> {
+fn parse_frontmatter_ops(operations: &[serde_json::Value]) -> Result<Vec<FrontmatterOp>, McpError> {
+    let mut ops = Vec::with_capacity(operations.len());
+    for op_val in operations {
+        let op_str = op_val.get("op").and_then(|v| v.as_str()).ok_or_else(|| {
+            McpError::new(
+                rmcp::model::ErrorCode::INVALID_PARAMS,
+                "each operation must have an \"op\" string field",
+                None::<serde_json::Value>,
+            )
+        })?;
+        match op_str {
+            "set" => {
+                let key = op_val.get("key").and_then(|v| v.as_str()).ok_or_else(|| {
+                    McpError::new(
+                        rmcp::model::ErrorCode::INVALID_PARAMS,
+                        "\"set\" operation requires a \"key\" field",
+                        None::<serde_json::Value>,
+                    )
+                })?;
+                let value = op_val
+                    .get("value")
+                    .and_then(|v| v.as_str())
+                    .ok_or_else(|| {
+                        McpError::new(
+                            rmcp::model::ErrorCode::INVALID_PARAMS,
+                            "\"set\" operation requires a \"value\" field",
+                            None::<serde_json::Value>,
+                        )
+                    })?;
+                ops.push(FrontmatterOp::Set(key.to_string(), value.to_string()));
+            }
+            "remove" => {
+                let key = op_val.get("key").and_then(|v| v.as_str()).ok_or_else(|| {
+                    McpError::new(
+                        rmcp::model::ErrorCode::INVALID_PARAMS,
+                        "\"remove\" operation requires a \"key\" field",
+                        None::<serde_json::Value>,
+                    )
+                })?;
+                ops.push(FrontmatterOp::Remove(key.to_string()));
+            }
+            "add_tag" => {
+                let value = op_val
+                    .get("value")
+                    .and_then(|v| v.as_str())
+                    .ok_or_else(|| {
+                        McpError::new(
+                            rmcp::model::ErrorCode::INVALID_PARAMS,
+                            "\"add_tag\" operation requires a \"value\" field",
+                            None::<serde_json::Value>,
+                        )
+                    })?;
+                ops.push(FrontmatterOp::AddTag(value.to_string()));
+            }
+            "remove_tag" => {
+                let value = op_val
+                    .get("value")
+                    .and_then(|v| v.as_str())
+                    .ok_or_else(|| {
+                        McpError::new(
+                            rmcp::model::ErrorCode::INVALID_PARAMS,
+                            "\"remove_tag\" operation requires a \"value\" field",
+                            None::<serde_json::Value>,
+                        )
+                    })?;
+                ops.push(FrontmatterOp::RemoveTag(value.to_string()));
+            }
+            "add_alias" => {
+                let value = op_val
+                    .get("value")
+                    .and_then(|v| v.as_str())
+                    .ok_or_else(|| {
+                        McpError::new(
+                            rmcp::model::ErrorCode::INVALID_PARAMS,
+                            "\"add_alias\" operation requires a \"value\" field",
+                            None::<serde_json::Value>,
+                        )
+                    })?;
+
ops.push(FrontmatterOp::AddAlias(value.to_string()));
+            }
+            "remove_alias" => {
+                let value = op_val
+                    .get("value")
+                    .and_then(|v| v.as_str())
+                    .ok_or_else(|| {
+                        McpError::new(
+                            rmcp::model::ErrorCode::INVALID_PARAMS,
+                            "\"remove_alias\" operation requires a \"value\" field",
+                            None::<serde_json::Value>,
+                        )
+                    })?;
+                ops.push(FrontmatterOp::RemoveAlias(value.to_string()));
+            }
+            unknown => {
+                return Err(McpError::new(
+                    rmcp::model::ErrorCode::INVALID_PARAMS,
+                    format!("unknown frontmatter operation: \"{unknown}\""),
+                    None::<serde_json::Value>,
+                ));
+            }
+        }
+    }
+    Ok(ops)
+}
+
 #[tool_router]
 impl EngraphServer {
     #[tool(
@@ -413,6 +586,135 @@ impl EngraphServer {
             .map_err(|e| mcp_err(&e))?;
         to_json_result(&result)
     }
+
+    #[tool(
+        name = "read_section",
+        description = "Read a specific heading section from a note. Returns content from that heading to the next same-level heading."
+    )]
+    async fn read_section(
+        &self,
+        params: Parameters<ReadSectionParams>,
+    ) -> Result<CallToolResult, McpError> {
+        let store = self.store.lock().await;
+        let result =
+            context::read_section(&store, &self.vault_path, &params.0.file, &params.0.heading)
+                .map_err(|e| mcp_err(&e))?;
+        to_json_result(&result)
+    }
+
+    #[tool(
+        name = "health",
+        description = "Vault health report: orphans, broken links, stale notes, tag hygiene, index freshness."
+    )]
+    async fn health(&self, _params: Parameters<HealthParams>) -> Result<CallToolResult, McpError> {
+        let store = self.store.lock().await;
+        let profile_ref = self.profile.as_ref().as_ref();
+        let config = crate::health::HealthConfig {
+            daily_folder: profile_ref.and_then(|p| p.structure.folders.daily.clone()),
+            inbox_folder: profile_ref.and_then(|p| p.structure.folders.inbox.clone()),
+        };
+        let report =
+            crate::health::generate_health_report(&store, &config).map_err(|e| mcp_err(&e))?;
+        to_json_result(&report)
+    }
+
+    #[tool(
+        name = "edit",
+        description = "Edit a specific section of a note. Supports replace, prepend, or append modes. Targets sections by heading name."
+    )]
+    async fn edit(&self, params: Parameters<EditParams>) -> Result<CallToolResult, McpError> {
+        let store = self.store.lock().await;
+        let mode = match params.0.mode.as_deref().unwrap_or("append") {
+            "replace" => crate::writer::EditMode::Replace,
+            "prepend" => crate::writer::EditMode::Prepend,
+            _ => crate::writer::EditMode::Append,
+        };
+        let input = crate::writer::EditInput {
+            file: params.0.file,
+            heading: params.0.heading,
+            content: params.0.content,
+            mode,
+            modified_by: "claude-code".into(),
+        };
+        let result = crate::writer::edit_note(&store, &self.vault_path, &input, None)
+            .map_err(|e| mcp_err(&e))?;
+        // Record write so the watcher skips re-indexing
+        let full_path = self.vault_path.join(&result.path);
+        record_write(&self.recent_writes, &full_path).await;
+        to_json_result(&result)
+    }
+
+    #[tool(
+        name = "rewrite",
+        description = "Replace the entire body of a note. Optionally preserves existing frontmatter. Use for major content overhauls."
+    )]
+    async fn rewrite(&self, params: Parameters<RewriteParams>) -> Result<CallToolResult, McpError> {
+        let store = self.store.lock().await;
+        let input = crate::writer::RewriteInput {
+            file: params.0.file,
+            content: params.0.content,
+            preserve_frontmatter: params.0.preserve_frontmatter.unwrap_or(true),
+            modified_by: "claude-code".into(),
+        };
+        let result = crate::writer::rewrite_note(&store, &self.vault_path, &input)
+            .map_err(|e| mcp_err(&e))?;
+        let full_path = self.vault_path.join(&result.path);
+        record_write(&self.recent_writes, &full_path).await;
+        to_json_result(&result)
+    }
+
+    #[tool(
+        name = "edit_frontmatter",
+        description = "Edit frontmatter fields with granular operations: set/remove properties, add/remove tags, add/remove aliases."
+    )]
+    async fn edit_frontmatter(
+        &self,
+        params: Parameters<EditFrontmatterParams>,
+    ) -> Result<CallToolResult, McpError> {
+        let ops = parse_frontmatter_ops(&params.0.operations)?;
+        let store = self.store.lock().await;
+        let input = crate::writer::EditFrontmatterInput {
+            file: params.0.file,
+            operations: ops,
+            modified_by: "claude-code".into(),
+        };
+        let result = crate::writer::edit_frontmatter(&store, &self.vault_path, &input)
+            .map_err(|e| mcp_err(&e))?;
+        let full_path = self.vault_path.join(&result.path);
+        record_write(&self.recent_writes, &full_path).await;
+        to_json_result(&result)
+    }
+
+    #[tool(
+        name = "delete",
+        description = "Delete a note. Soft mode (default) moves it to the archive folder. Hard mode permanently removes it from disk and index."
+    )]
+    async fn delete(&self, params: Parameters<DeleteParams>) -> Result<CallToolResult, McpError> {
+        let store = self.store.lock().await;
+        let mode = match params.0.mode.as_deref().unwrap_or("soft") {
+            "hard" => crate::writer::DeleteMode::Hard,
+            _ => crate::writer::DeleteMode::Soft,
+        };
+        let archive_folder = self
+            .profile
+            .as_ref()
+            .as_ref()
+            .and_then(|p| p.structure.folders.archive.as_deref())
+            .unwrap_or("04-Archive");
+        crate::writer::delete_note(
+            &store,
+            &self.vault_path,
+            &params.0.file,
+            mode,
+            archive_folder,
+        )
+        .map_err(|e| mcp_err(&e))?;
+        let result = serde_json::json!({
+            "deleted": params.0.file,
+            "mode": params.0.mode.as_deref().unwrap_or("soft"),
+        });
+        to_json_result(&result)
+    }
 }
 
 #[tool_handler]
@@ -420,9 +722,10 @@ impl rmcp::handler::server::ServerHandler for EngraphServer {
     fn get_info(&self) -> ServerInfo {
         ServerInfo::new(ServerCapabilities::builder().enable_tools().build()).with_instructions(
             "engraph: vault intelligence for Obsidian. \
-            Read: vault_map to orient, search to find, read for content, who/project for context. \
-            Write: create for new notes, append to add content, update_metadata for tags/aliases, move_note to relocate.
\
-            Lifecycle: archive to soft-delete (moves to archive, removes from index), unarchive to restore.",
+            Read: vault_map to orient, search to find, read/read_section for content, who/project for context bundles, health for vault diagnostics. \
+            Write: create for new notes, append to add content, edit to modify a section, rewrite to replace body, \
+            edit_frontmatter for tags/properties, update_metadata for bulk tag/alias replacement. \
+            Lifecycle: move_note to relocate, archive to soft-delete, unarchive to restore, delete for permanent removal.",
         )
     }
 }
@@ -495,6 +798,7 @@ pub async fn run_serve(data_dir: &Path) -> Result<()> {
         Arc::new(Mutex::new(Box::new(embedder) as Box<dyn EmbedModel>));
     let vault_path_arc = Arc::new(vault_path);
     let profile_arc = Arc::new(profile);
+    let recent_writes: RecentWrites = Arc::new(Mutex::new(HashMap::new()));
 
     // Start file watcher for real-time index updates
     let mut exclude = config.exclude.clone();
@@ -513,6 +817,7 @@ pub async fn run_serve(data_dir: &Path) -> Result<()> {
         profile_arc.clone(),
         config,
         exclude,
+        recent_writes.clone(),
     )?;
 
     let server = EngraphServer {
@@ -523,6 +828,7 @@ pub async fn run_serve(data_dir: &Path) -> Result<()> {
         tool_router: EngraphServer::tool_router(),
         orchestrator,
         reranker,
+        recent_writes,
     };
 
     eprintln!("engraph MCP server starting...");
diff --git a/src/store.rs b/src/store.rs
index 46e4ca3..b33aa02 100644
--- a/src/store.rs
+++ b/src/store.rs
@@ -46,6 +46,16 @@ pub struct EdgeStats {
     pub isolated_file_count: usize,
 }
 
+/// A record representing a CLI event (for observability/analytics).
+#[derive(Debug, Clone)]
+pub struct CliEvent {
+    pub id: i64,
+    pub timestamp: String,
+    pub operation: String,
+    pub outcome: String,
+    pub detail: Option<String>,
+}
+
 /// A record of a placement correction (user moved a note from suggested folder).
#[derive(Debug, Clone)]
 pub struct PlacementCorrection {
@@ -281,6 +291,31 @@ impl Store {
             );",
         )?;
 
+        // CLI events table (observability/analytics)
+        self.conn.execute_batch(
+            "CREATE TABLE IF NOT EXISTS cli_events (
+                id INTEGER PRIMARY KEY,
+                timestamp TEXT NOT NULL DEFAULT (datetime('now')),
+                operation TEXT NOT NULL,
+                outcome TEXT NOT NULL,
+                detail TEXT
+            );
+            CREATE INDEX IF NOT EXISTS idx_cli_events_ts ON cli_events(timestamp);",
+        )?;
+
+        // Unresolved links table — tracks wikilink targets that couldn't be
+        // resolved to a file during indexing. Used by health analysis.
+        self.conn.execute_batch(
+            "CREATE TABLE IF NOT EXISTS unresolved_links (
+                id INTEGER PRIMARY KEY,
+                source_file TEXT NOT NULL,
+                target TEXT NOT NULL,
+                created_at TEXT NOT NULL DEFAULT (datetime('now')),
+                UNIQUE(source_file, target)
+            );
+            CREATE INDEX IF NOT EXISTS idx_unresolved_source ON unresolved_links(source_file);",
+        )?;
+
         Ok(())
     }
 
@@ -1465,6 +1500,15 @@ impl Store {
     }
 
     /// Resolve a file reference (path, basename, or #docid) to a FileRecord.
+    ///
+    /// Resolution order:
+    /// 1. `#docid` — 6-char hex prefixed with `#`
+    /// 2. Exact path match
+    /// 3. Basename match (case-insensitive, with separator normalization)
+    /// 4. Fuzzy match — Levenshtein distance ≤ 2 on basenames (stripped of `.md`)
+    ///    - If exactly one candidate: return it
+    ///    - If multiple equidistant candidates: error with candidate list
+    ///    - If none within threshold: return None
     pub fn resolve_file(&self, file_or_docid: &str) -> Result<Option<FileRecord>> {
         if file_or_docid.starts_with('#') && file_or_docid.len() == 7 {
             return self.get_file_by_docid(&file_or_docid[1..]);
@@ -1472,7 +1516,63 @@ impl Store {
         if let Some(f) = self.get_file(file_or_docid)? {
             return Ok(Some(f));
         }
-        self.find_file_by_basename(file_or_docid)
+        if let Some(f) = self.find_file_by_basename(file_or_docid)?
{
+            return Ok(Some(f));
+        }
+        self.find_file_by_fuzzy(file_or_docid)
+    }
+
+    /// Fuzzy-match a query against all stored file basenames using Levenshtein distance.
+    /// Returns the unique closest match within distance ≤ 2, or an error if ambiguous.
+    fn find_file_by_fuzzy(&self, query: &str) -> Result<Option<FileRecord>> {
+        use strsim::levenshtein;
+
+        // Normalize query: strip .md, lowercase.
+        let query_stem = query.strip_suffix(".md").unwrap_or(query).to_lowercase();
+
+        // Collect all (path, basename_stem) pairs from the store.
+        let mut stmt = self.conn.prepare("SELECT path FROM files")?;
+        let paths: Vec<String> = stmt
+            .query_map([], |row| row.get(0))?
+            .filter_map(|r| r.ok())
+            .collect();
+
+        let mut best_distance = usize::MAX;
+        let mut best_paths: Vec<String> = Vec::new();
+
+        for path in &paths {
+            // Extract basename and strip .md extension for comparison.
+            let basename = std::path::Path::new(path)
+                .file_name()
+                .and_then(|f| f.to_str())
+                .unwrap_or(path);
+            let stem = basename
+                .strip_suffix(".md")
+                .unwrap_or(basename)
+                .to_lowercase();
+
+            let dist = levenshtein(&query_stem, &stem);
+            if dist > 2 {
+                continue;
+            }
+            if dist < best_distance {
+                best_distance = dist;
+                best_paths.clear();
+                best_paths.push(path.clone());
+            } else if dist == best_distance {
+                best_paths.push(path.clone());
+            }
+        }
+
+        match best_paths.len() {
+            0 => Ok(None),
+            1 => self.get_file(&best_paths[0]),
+            _ => Err(anyhow::anyhow!(
+                "ambiguous fuzzy match for '{}': [{}]",
+                query,
+                best_paths.join(", ")
+            )),
+        }
     }
 
     pub fn resolve_tag(&self, proposed: &str) -> Result {
@@ -1486,6 +1586,155 @@ pub fn register_tag(&self, name: &str, created_by: &str) -> Result<()> {
         crate::tags::register_tag(&self.conn, name, created_by)
     }
+
+    // ── CLI Events ──────────────────────────────────────────────
+
+    /// Log a CLI event for observability/analytics.
+    pub fn log_cli_event(
+        &self,
+        operation: &str,
+        outcome: &str,
+        detail: Option<&str>,
+    ) -> Result<()> {
+        self.conn.execute(
+            "INSERT INTO cli_events (timestamp, operation, outcome, detail)
+             VALUES (datetime('now'), ?1, ?2, ?3)",
+            params![operation, outcome, detail],
+        )?;
+        Ok(())
+    }
+
+    /// Get CLI events since a given ISO-8601 date string (e.g., "2020-01-01").
+    pub fn get_cli_events_since(&self, since: &str) -> Result<Vec<CliEvent>> {
+        let mut stmt = self.conn.prepare(
+            "SELECT id, timestamp, operation, outcome, detail
+             FROM cli_events WHERE timestamp >= ?1 ORDER BY timestamp DESC",
+        )?;
+        let rows = stmt.query_map(params![since], |row| {
+            Ok(CliEvent {
+                id: row.get(0)?,
+                timestamp: row.get(1)?,
+                operation: row.get(2)?,
+                outcome: row.get(3)?,
+                detail: row.get(4)?,
+            })
+        })?;
+        let mut results = Vec::new();
+        for row in rows {
+            results.push(row?);
+        }
+        Ok(results)
+    }
+
+    /// Prune CLI events older than the given number of days.
+    pub fn prune_cli_events(&self, days: u32) -> Result<usize> {
+        let deleted = self.conn.execute(
+            "DELETE FROM cli_events WHERE julianday('now') - julianday(timestamp) > ?1",
+            params![days],
+        )?;
+        Ok(deleted)
+    }
+
+    // ── Hard delete ──────────────────────────────────────────────
+
+    /// Completely remove a file and all associated data from the store.
+    ///
+    /// Deletion order:
+    /// 1. Collect chunk vector_ids for the file
+    /// 2. Delete from `chunks_vec` (virtual table, no CASCADE)
+    /// 3. Delete from `chunks_fts` (virtual table, no CASCADE)
+    /// 4. Delete from `edges` where from_file or to_file matches
+    /// 5. Delete from `files` (CASCADE handles chunks table)
+    pub fn delete_file_hard(&self, path: &str) -> Result<()> {
+        let file = self
+            .get_file(path)?
+            .ok_or_else(|| anyhow::anyhow!("file not found: {}", path))?;
+        let file_id = file.id;
+
+        // 1. Collect chunk vector_ids
+        let vector_ids = self.get_vector_ids_for_file(file_id)?;
+
+        // 2.
Delete from chunks_vec (virtual table — no CASCADE)
+        for vid in &vector_ids {
+            self.delete_vec(*vid)?;
+        }
+
+        // 3. Delete from chunks_fts (virtual table — no CASCADE)
+        self.delete_fts_chunks_for_file(file_id)?;
+
+        // 4. Delete from edges (both directions)
+        self.delete_edges_for_file(file_id)?;
+
+        // 5. Delete from files (CASCADE handles chunks table)
+        self.delete_file(file_id)?;
+
+        Ok(())
+    }
+
+    // ── Unresolved Links ─────────────────────────────────────────
+
+    /// Record a wikilink target that could not be resolved during indexing.
+    pub fn insert_unresolved_link(&self, source_file: &str, target: &str) -> Result<()> {
+        self.conn.execute(
+            "INSERT OR IGNORE INTO unresolved_links (source_file, target) VALUES (?1, ?2)",
+            params![source_file, target],
+        )?;
+        Ok(())
+    }
+
+    /// Remove all unresolved links originating from the given source file.
+    pub fn clear_unresolved_links_for_file(&self, source_file: &str) -> Result<()> {
+        self.conn.execute(
+            "DELETE FROM unresolved_links WHERE source_file = ?1",
+            params![source_file],
+        )?;
+        Ok(())
+    }
+
+    /// Return all unresolved links (source_file, target) pairs.
+    pub fn get_unresolved_links(&self) -> Result<Vec<(String, String)>> {
+        let mut stmt = self
+            .conn
+            .prepare("SELECT source_file, target FROM unresolved_links ORDER BY source_file")?;
+        let rows = stmt.query_map([], |row| {
+            Ok((row.get::<_, String>(0)?, row.get::<_, String>(1)?))
+        })?;
+        let mut results = Vec::new();
+        for row in rows {
+            results.push(row?);
+        }
+        Ok(results)
+    }
+
+    // ── Health Queries ───────────────────────────────────────────
+
+    /// Find files that have no edges (neither incoming nor outgoing).
+    /// Optionally exclude files whose path starts with any of the given prefixes.
+    pub fn find_isolated_files(&self, exclude_prefixes: &[&str]) -> Result<Vec<FileRecord>> {
+        let all_files = self.get_all_files()?;
+        let connected: HashSet<i64> = {
+            let mut stmt = self.conn.prepare(
+                "SELECT DISTINCT id FROM files WHERE id IN \
+                 (SELECT from_file FROM edges UNION SELECT to_file FROM edges)",
+            )?;
+            let rows = stmt.query_map([], |row| row.get::<_, i64>(0))?;
+            let mut set = HashSet::new();
+            for row in rows {
+                set.insert(row?);
+            }
+            set
+        };
+        let isolated = all_files
+            .into_iter()
+            .filter(|f| !connected.contains(&f.id))
+            .filter(|f| {
+                !exclude_prefixes
+                    .iter()
+                    .any(|prefix| f.path.starts_with(prefix))
+            })
+            .collect();
+        Ok(isolated)
+    }
 }
 
 fn parse_tags(json: &str) -> Vec<String> {
@@ -2676,4 +2925,144 @@ mod tests {
         let store = Store::open_memory().unwrap();
         assert!(!store.has_dimension_mismatch(256).unwrap());
     }
+
+    // ── Fuzzy resolve tests ─────────────────────────────────────
+
+    #[test]
+    fn test_resolve_file_fuzzy_match() {
+        let store = Store::open_memory().unwrap();
+        store
+            .insert_file("Steve Barbera.md", "hash1", 100, &[], "ab1234", None)
+            .unwrap();
+        // "Steve Barbara" is within Levenshtein 2 of "Steve Barbera"
+        let result = store.resolve_file("Steve Barbara").unwrap();
+        assert!(result.is_some());
+        assert_eq!(result.unwrap().path, "Steve Barbera.md");
+    }
+
+    #[test]
+    fn test_resolve_file_fuzzy_ambiguous() {
+        let store = Store::open_memory().unwrap();
+        store
+            .insert_file("test-a.md", "h1", 100, &[], "aaa111", None)
+            .unwrap();
+        store
+            .insert_file("test-b.md", "h2", 100, &[], "bbb222", None)
+            .unwrap();
+        // "test-c" is equidistant from both — should error, not pick arbitrarily
+        let result = store.resolve_file("test-c");
+        assert!(result.is_err());
+    }
+
+    #[test]
+    fn test_resolve_file_existing_docid() {
+        let store = Store::open_memory().unwrap();
+        store
+            .insert_file("note.md", "hash", 100, &[], "abc123", None)
+            .unwrap();
+        let result = store.resolve_file("#abc123").unwrap();
+        assert!(result.is_some());
+    }
+
+    // ── CLI
events tests ────────────────────────────────────────
+
+    #[test]
+    fn test_cli_events_insert_and_query() {
+        let store = Store::open_memory().unwrap();
+        store.log_cli_event("edit", "success", None).unwrap();
+        store
+            .log_cli_event("edit", "fallback", Some("timeout"))
+            .unwrap();
+        let events = store.get_cli_events_since("2020-01-01").unwrap();
+        assert_eq!(events.len(), 2);
+        assert_eq!(events[0].operation, "edit");
+        assert_eq!(events[1].operation, "edit");
+        // Most recent first
+        assert_eq!(events[0].outcome, "fallback");
+        assert_eq!(events[0].detail.as_deref(), Some("timeout"));
+        assert_eq!(events[1].outcome, "success");
+        assert!(events[1].detail.is_none());
+    }
+
+    #[test]
+    fn test_cli_events_prune() {
+        let store = Store::open_memory().unwrap();
+        store.log_cli_event("search", "success", None).unwrap();
+        // Events inserted just now should NOT be pruned with days=1 (julianday diff ~0)
+        let pruned = store.prune_cli_events(1).unwrap();
+        assert_eq!(pruned, 0);
+        let events = store.get_cli_events_since("2020-01-01").unwrap();
+        assert_eq!(events.len(), 1);
+    }
+
+    #[test]
+    fn test_cli_events_table_exists() {
+        let store = Store::open_memory().unwrap();
+        let tables: Vec<String> = {
+            let mut stmt = store
+                .conn
+                .prepare("SELECT name FROM sqlite_master WHERE type='table' AND name='cli_events'")
+                .unwrap();
+            let rows = stmt.query_map([], |row| row.get(0)).unwrap();
+            rows.filter_map(|r| r.ok()).collect()
+        };
+        assert!(tables.contains(&"cli_events".to_string()));
+    }
+
+    // ── delete_file_hard tests ──────────────────────────────────
+
+    #[test]
+    fn test_delete_file_hard() {
+        let store = Store::open_memory().unwrap();
+        let tags = vec!["tag".to_string()];
+        let file_id = store
+            .insert_file("delete-me.md", "hash", 100, &tags, "del123", None)
+            .unwrap();
+
+        // Insert a chunk + FTS entry + vec entry for the file
+        let vid = store.next_vector_id().unwrap();
+        store
+            .insert_chunk(file_id, "## Heading", "chunk text", vid, 10)
+            .unwrap();
+
store.insert_fts_chunk(file_id, 0, "chunk text").unwrap(); + + // Insert an embedding vector into chunks_vec + let embedding = vec![0.1_f32; 256]; + store.insert_vec(vid, &embedding).unwrap(); + + // Insert an edge from this file to itself (just to test edge cleanup) + let file_id2 = store + .insert_file("other.md", "hash2", 100, &[], "oth123", None) + .unwrap(); + store.insert_edge(file_id, file_id2, "wikilink").unwrap(); + store.insert_edge(file_id2, file_id, "wikilink").unwrap(); + + // Verify data exists + assert!(store.get_file("delete-me.md").unwrap().is_some()); + assert_eq!(store.get_chunks_by_file(file_id).unwrap().len(), 1); + + // Hard delete + store.delete_file_hard("delete-me.md").unwrap(); + + // File is gone + assert!(store.get_file("delete-me.md").unwrap().is_none()); + // Chunks are gone (CASCADE) + assert_eq!(store.get_chunks_by_file(file_id).unwrap().len(), 0); + // FTS entries are gone + let fts_results = store.fts_search("chunk text", 10).unwrap(); + assert!(fts_results.is_empty()); + // Edges are gone + assert_eq!(store.edge_count_for_file(file_id).unwrap(), 0); + // Only the edge from file_id2 to file_id was deleted, not file_id2's other edges + // (file_id2 has no remaining edges since both directions involved file_id) + assert_eq!(store.edge_count_for_file(file_id2).unwrap(), 0); + } + + #[test] + fn test_delete_file_hard_not_found() { + let store = Store::open_memory().unwrap(); + let result = store.delete_file_hard("nonexistent.md"); + assert!(result.is_err()); + assert!(result.unwrap_err().to_string().contains("file not found")); + } } diff --git a/src/watcher.rs b/src/watcher.rs index 298cc91..3318f11 100644 --- a/src/watcher.rs +++ b/src/watcher.rs @@ -14,6 +14,7 @@ use crate::indexer; use crate::llm::EmbedModel; use crate::placement; use crate::profile::VaultProfile; +use crate::serve::RecentWrites; use crate::store::Store; /// Start the file watcher and consumer. 
Returns a thread handle for the producer
@@ -27,6 +28,7 @@ pub fn start_watcher(
     profile: Arc<Option<VaultProfile>>,
     config: Config,
     exclude: Vec<String>,
+    recent_writes: RecentWrites,
 ) -> anyhow::Result<(std::thread::JoinHandle<()>, oneshot::Sender<()>)> {
     let (tx, rx) = mpsc::channel::<Vec<WatchEvent>>(64);
     let (shutdown_tx, shutdown_rx) = oneshot::channel::<()>();
@@ -64,6 +66,7 @@ pub fn start_watcher(
             vault_clone,
             profile_clone,
             config_clone,
+            recent_writes,
         )
         .await;
     });
@@ -270,6 +273,25 @@ fn detect_moves(events: &mut Vec<WatchEvent>, store: &Store, vault_path: &Path)
     }
 }
 
+/// Check if a file was recently written by an MCP tool (so the watcher should skip it).
+/// Returns true if the file's current mtime matches the recorded write mtime.
+async fn is_recent_write(recent_writes: &RecentWrites, path: &Path) -> bool {
+    let mut map = recent_writes.lock().await;
+    // Copy the recorded mtime out so the immutable borrow ends before `map.remove`.
+    if let Some(recorded_mtime) = map.get(path).copied() {
+        if let Ok(meta) = std::fs::metadata(path)
+            && let Ok(current_mtime) = meta.modified()
+            && current_mtime == recorded_mtime
+        {
+            // Match — this file was written by us; remove entry and skip
+            map.remove(path);
+            return true;
+        }
+        // mtime doesn't match (file was modified again externally) — remove stale entry
+        map.remove(path);
+    }
+    false
+}
+
 /// Consumer async task that processes batches of watch events.
///
 /// Two-pass processing:
@@ -282,6 +304,7 @@ pub async fn run_consumer(
     vault_path: Arc<PathBuf>,
     _profile: Arc<Option<VaultProfile>>,
     config: Config,
+    recent_writes: RecentWrites,
 ) {
     tracing::info!("Watcher consumer started");
 
@@ -301,6 +324,12 @@ pub async fn run_consumer(
         for event in &events {
             match event {
                 WatchEvent::Changed(path) => {
+                    // Skip files recently written by MCP tools to avoid redundant re-indexing
+                    if is_recent_write(&recent_writes, path).await {
+                        tracing::debug!(path = %path.display(), "skipping re-index for MCP-written file");
+                        continue;
+                    }
+
                     let rel = path
                         .strip_prefix(vault_path.as_ref())
                         .unwrap_or(path)
diff --git a/src/writer.rs b/src/writer.rs
index 556df4c..4484369 100644
--- a/src/writer.rs
+++ b/src/writer.rs
@@ -41,6 +41,54 @@ pub struct UpdateMetadataInput {
     pub modified_by: String,
 }
 
+#[derive(Debug, Clone)]
+pub enum EditMode {
+    Replace,
+    Prepend,
+    Append,
+}
+
+#[derive(Debug, Clone)]
+pub struct EditInput {
+    pub file: String,
+    pub heading: String,
+    pub content: String,
+    pub mode: EditMode,
+    pub modified_by: String,
+}
+
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct EditResult {
+    pub path: String,
+    pub heading: String,
+    pub mode: String,
+}
+
+#[derive(Debug, Clone)]
+pub struct RewriteInput {
+    pub file: String,
+    pub content: String,
+    pub preserve_frontmatter: bool,
+    pub modified_by: String,
+}
+
+#[derive(Debug, Clone)]
+pub enum FrontmatterOp {
+    Set(String, String),
+    Remove(String),
+    AddTag(String),
+    RemoveTag(String),
+    AddAlias(String),
+    RemoveAlias(String),
+}
+
+#[derive(Debug, Clone)]
+pub struct EditFrontmatterInput {
+    pub file: String,
+    pub operations: Vec<FrontmatterOp>,
+    pub modified_by: String,
+}
+
 #[derive(Debug, Clone, serde::Serialize)]
 pub struct WriteResult {
     pub path: String,
@@ -694,6 +742,289 @@ pub fn update_metadata(
     })
 }
 
+/// Edit a specific section within an existing note.
+///
+/// Finds the target section by heading name, then applies the edit based on mode:
+/// - Replace: replace the entire section body with new content
+/// - Append: add new content at the end of the section body
+/// - Prepend: add new content at the start of the section body
+///
+/// Does NOT re-index chunks — that's for the MCP layer.
+pub fn edit_note(
+    store: &Store,
+    vault_path: &Path,
+    input: &EditInput,
+    _obsidian: Option<&mut crate::obsidian::ObsidianCli>,
+) -> Result<EditResult> {
+    // Step 1: Resolve file via store
+    let file_record = store
+        .resolve_file(&input.file)?
+        .ok_or_else(|| anyhow::anyhow!("file not found: {}", input.file))?;
+
+    let full_path = vault_path.join(&file_record.path);
+
+    // Step 2: Read current content from disk
+    let content = std::fs::read_to_string(&full_path)?;
+
+    // Step 3: Find the target section
+    let section = crate::markdown::find_section(&content, &input.heading).ok_or_else(|| {
+        anyhow::anyhow!("section '{}' not found in {}", input.heading, input.file)
+    })?;
+
+    // Step 4: Apply the edit based on mode
+    let lines: Vec<&str> = content.lines().collect();
+    let before = &lines[..section.body_start];
+    let body = &lines[section.body_start..section.body_end];
+    let after = &lines[section.body_end..];
+
+    let mode_name;
+    let new_body = match input.mode {
+        EditMode::Replace => {
+            mode_name = "Replace";
+            format!("\n{}\n", input.content.trim_end())
+        }
+        EditMode::Append => {
+            mode_name = "Append";
+            let existing = body.join("\n");
+            let trimmed_existing = existing.trim_end();
+            if trimmed_existing.is_empty() {
+                format!("\n{}\n", input.content.trim_end())
+            } else {
+                format!("{}\n{}\n", trimmed_existing, input.content.trim_end())
+            }
+        }
+        EditMode::Prepend => {
+            mode_name = "Prepend";
+            let existing = body.join("\n");
+            let trimmed_existing = existing.trim_start();
+            if trimmed_existing.is_empty() {
+                format!("\n{}\n", input.content.trim_end())
+            } else {
+                format!("\n{}\n{}", input.content.trim_end(),
trimmed_existing)
+            }
+        }
+    };
+
+    // Step 5: Reconstruct the file
+    let mut result_parts: Vec<String> = Vec::new();
+    if !before.is_empty() {
+        result_parts.push(before.join("\n"));
+    }
+    result_parts.push(new_body);
+    if !after.is_empty() {
+        result_parts.push(after.join("\n"));
+    }
+    // Join with newlines, ensuring we don't double up
+    let new_content = result_parts.join("\n");
+
+    // Step 6: Write atomically (overwrite = true)
+    atomic_write(&full_path, &new_content, true)?;
+
+    // Step 7: Return EditResult
+    Ok(EditResult {
+        path: file_record.path,
+        heading: input.heading.clone(),
+        mode: mode_name.to_string(),
+    })
+}
+
+/// Rewrite the body of an existing note, optionally preserving existing frontmatter.
+///
+/// If `preserve_frontmatter` is true and the note has frontmatter, the existing
+/// YAML block is kept intact and only the body is replaced with `input.content`.
+/// If false, the file is replaced entirely with `input.content`.
+///
+/// Does NOT re-index — the MCP layer handles that.
+pub fn rewrite_note(store: &Store, vault_path: &Path, input: &RewriteInput) -> Result<EditResult> {
+    // Step 1: Resolve file via store
+    let file_record = store
+        .resolve_file(&input.file)?
+        .ok_or_else(|| anyhow::anyhow!("file not found: {}", input.file))?;
+
+    let full_path = vault_path.join(&file_record.path);
+
+    // Step 2: Read current content from disk
+    let existing_content = std::fs::read_to_string(&full_path)?;
+
+    // Step 3: Split frontmatter using crate::markdown::split_frontmatter
+    let (maybe_frontmatter, _old_body) = crate::markdown::split_frontmatter(&existing_content);
+
+    // Step 4: Reconstruct content
+    let new_content = if input.preserve_frontmatter {
+        if let Some(frontmatter) = maybe_frontmatter {
+            format!("---\n{}\n---\n\n{}", frontmatter, input.content)
+        } else {
+            // No existing frontmatter — just use new content as-is
+            input.content.clone()
+        }
+    } else {
+        input.content.clone()
+    };
+
+    // Step 5: Write atomically (overwrite = true)
+    atomic_write(&full_path, &new_content, true)?;
+
+    // Step 6: Return EditResult (reusing existing result type)
+    Ok(EditResult {
+        path: file_record.path,
+        heading: String::new(),
+        mode: "Rewrite".to_string(),
+    })
+}
+
+/// Edit frontmatter fields with granular operations (add/remove tags, set/remove properties).
+///
+/// Uses `crate::markdown::split_frontmatter()` to extract raw YAML, then applies
+/// operations sequentially using `serde_yaml`. Does NOT re-index chunks.
+pub fn edit_frontmatter(
+    store: &Store,
+    vault_path: &Path,
+    input: &EditFrontmatterInput,
+) -> Result<EditResult> {
+    // Step 1: Resolve file via store
+    let file_record = store
+        .resolve_file(&input.file)?
+ .ok_or_else(|| anyhow::anyhow!("file not found: {}", input.file))?; + + let full_path = vault_path.join(&file_record.path); + + // Step 2: Read content from disk + let content = std::fs::read_to_string(&full_path)?; + + // Step 3: Split frontmatter using crate::markdown::split_frontmatter (returns raw YAML without delimiters) + let (maybe_fm, body) = crate::markdown::split_frontmatter(&content); + + // Step 4: Parse YAML into a Mapping (create empty mapping if no frontmatter) + let mut mapping: serde_yaml::Mapping = if let Some(ref fm) = maybe_fm { + let val: serde_yaml::Value = serde_yaml::from_str(fm) + .unwrap_or(serde_yaml::Value::Mapping(serde_yaml::Mapping::new())); + match val { + serde_yaml::Value::Mapping(m) => m, + _ => serde_yaml::Mapping::new(), + } + } else { + serde_yaml::Mapping::new() + }; + + // Step 5: Apply operations sequentially + for op in &input.operations { + match op { + FrontmatterOp::Set(key, value) => { + mapping.insert( + serde_yaml::Value::String(key.clone()), + serde_yaml::Value::String(value.clone()), + ); + } + FrontmatterOp::Remove(key) => { + mapping.remove(serde_yaml::Value::String(key.clone())); + } + FrontmatterOp::AddTag(tag) => { + apply_add_to_sequence(&mut mapping, "tags", tag); + } + FrontmatterOp::RemoveTag(tag) => { + apply_remove_from_sequence(&mut mapping, "tags", tag); + } + FrontmatterOp::AddAlias(alias) => { + apply_add_to_sequence(&mut mapping, "aliases", alias); + } + FrontmatterOp::RemoveAlias(alias) => { + apply_remove_from_sequence(&mut mapping, "aliases", alias); + } + } + } + + // Step 6: Serialize back to YAML + let yaml_str = serde_yaml::to_string(&serde_yaml::Value::Mapping(mapping))?; + + // Step 7: Reassemble: ---\n{yaml}---\n\n{body} + // serde_yaml::to_string adds a trailing newline, so we don't need an extra one before --- + let new_content = format!("---\n{}---\n\n{}", yaml_str, body); + + // Step 8: Write atomically + atomic_write(&full_path, &new_content, true)?; + + // Update store with new 
content hash and mtime
+    let content_hash = compute_content_hash(&new_content);
+    let mtime = file_mtime(&full_path)?;
+    let docid = file_record
+        .docid
+        .clone()
+        .unwrap_or_else(|| generate_docid(&file_record.path));
+
+    // Extract updated tags from the written content for store update
+    let (updated_fm, _) = crate::markdown::split_frontmatter(&new_content);
+    let updated_tags: Vec<String> = if let Some(ref fm) = updated_fm {
+        extract_yaml_sequence(fm, "tags")
+    } else {
+        vec![]
+    };
+
+    store.insert_file(
+        &file_record.path,
+        &content_hash,
+        mtime,
+        &updated_tags,
+        &docid,
+        file_record.created_by.as_deref(),
+    )?;
+
+    Ok(EditResult {
+        path: file_record.path,
+        heading: String::new(),
+        mode: "EditFrontmatter".to_string(),
+    })
+}
+
+/// Helper: add a value to a YAML sequence field (create if missing, skip duplicates).
+fn apply_add_to_sequence(mapping: &mut serde_yaml::Mapping, key: &str, value: &str) {
+    let key_val = serde_yaml::Value::String(key.to_string());
+    let new_item = serde_yaml::Value::String(value.to_string());
+
+    let seq = mapping
+        .entry(key_val)
+        .or_insert_with(|| serde_yaml::Value::Sequence(vec![]));
+
+    if let serde_yaml::Value::Sequence(items) = seq
+        && !items.contains(&new_item)
+    {
+        items.push(new_item);
+    }
+}
+
+/// Helper: remove a value from a YAML sequence field.
+fn apply_remove_from_sequence(mapping: &mut serde_yaml::Mapping, key: &str, value: &str) {
+    let key_val = serde_yaml::Value::String(key.to_string());
+    let remove_item = serde_yaml::Value::String(value.to_string());
+
+    if let Some(serde_yaml::Value::Sequence(items)) = mapping.get_mut(&key_val) {
+        items.retain(|item| item != &remove_item);
+    }
+}
+
+/// Helper: extract string values from a YAML sequence field.
+fn extract_yaml_sequence(yaml_str: &str, key: &str) -> Vec<String> {
+    let val: serde_yaml::Value = match serde_yaml::from_str(yaml_str) {
+        Ok(v) => v,
+        Err(_) => return vec![],
+    };
+    if let serde_yaml::Value::Mapping(ref m) = val
+        && let Some(serde_yaml::Value::Sequence(items)) =
+            m.get(serde_yaml::Value::String(key.to_string()))
+    {
+        return items
+            .iter()
+            .filter_map(|v| {
+                if let serde_yaml::Value::String(s) = v {
+                    Some(s.clone())
+                } else {
+                    None
+                }
+            })
+            .collect();
+    }
+    vec![]
+}
+
 /// Move a note to a new folder.
 pub fn move_note(
     file: &str,
@@ -780,6 +1111,89 @@ pub fn move_note(
     })
 }
 
+// ── Delete ──────────────────────────────────────────────────────
+
+#[derive(Debug, Clone)]
+pub enum DeleteMode {
+    /// Move the file to the archive folder, update the store path.
+    Soft,
+    /// Remove the file from disk and purge all store data.
+    Hard,
+}
+
+/// Delete a note from the vault.
+///
+/// - `Soft`: move the file to `archive_folder` and update the store record (path only).
+///   The note remains on disk but is relocated. No index rebuild — it stays searchable
+///   under its new path.
+/// - `Hard`: remove the file from disk and call `store.delete_file_hard()` to purge all
+///   associated chunks, edges, FTS, and vector data.
+pub fn delete_note(
+    store: &Store,
+    vault_path: &Path,
+    file: &str,
+    mode: DeleteMode,
+    archive_folder: &str,
+) -> Result<()> {
+    let file_record = store
+        .resolve_file(file)?
+ .ok_or_else(|| anyhow::anyhow!("file not found: {}", file))?; + + let old_path = vault_path.join(&file_record.path); + + match mode { + DeleteMode::Soft => { + // Build destination path inside archive_folder + let basename = std::path::Path::new(&file_record.path) + .file_name() + .ok_or_else(|| { + anyhow::anyhow!("cannot determine filename for: {}", file_record.path) + })?; + let new_rel_path = format!( + "{}/{}", + archive_folder.trim_end_matches('/'), + basename.to_string_lossy() + ); + let new_full_path = vault_path.join(&new_rel_path); + + // Ensure target directory exists + if let Some(parent) = new_full_path.parent() { + std::fs::create_dir_all(parent)?; + } + + // Move file on disk + std::fs::rename(&old_path, &new_full_path)?; + + // Update store: remove old record, insert under new path + let tags = file_record.tags.clone(); + let docid = file_record.docid.as_deref().unwrap_or("").to_string(); + let created_by = file_record.created_by.clone(); + let mtime = file_record.mtime; + + let content = std::fs::read_to_string(&new_full_path)?; + let content_hash = compute_content_hash(&content); + + store.delete_file(file_record.id)?; + store.insert_file( + &new_rel_path, + &content_hash, + mtime, + &tags, + &docid, + created_by.as_deref(), + )?; + + Ok(()) + } + DeleteMode::Hard => { + // Delete disk file first, then purge store + std::fs::remove_file(&old_path)?; + store.delete_file_hard(&file_record.path)?; + Ok(()) + } + } +} + // ── Archive / Unarchive ───────────────────────────────────────── /// Archive a note: move to archive folder, add archived frontmatter, remove from index. 
@@ -1187,4 +1601,390 @@ mod tests { assert_ne!(h1, h3); assert_eq!(h1.len(), 64); // SHA-256 hex } + + fn setup_vault() -> (tempfile::TempDir, Store, std::path::PathBuf) { + let tmp = tempfile::tempdir().unwrap(); + let store = Store::open_memory().unwrap(); + let root = tmp.path().to_path_buf(); + (tmp, store, root) + } + + #[test] + fn test_edit_note_append_to_section() { + let (_tmp, store, root) = setup_vault(); + let content = "# Person\n\n## Interactions\n\nOld entry\n\n## Links\n\nSome links\n"; + std::fs::write(root.join("person.md"), content).unwrap(); + store + .insert_file("person.md", "hash", 100, &[], "per123", None) + .unwrap(); + + let input = EditInput { + file: "person.md".into(), + heading: "Interactions".into(), + content: "New entry".into(), + mode: EditMode::Append, + modified_by: "test".into(), + }; + let result = edit_note(&store, &root, &input, None).unwrap(); + assert_eq!(result.heading, "Interactions"); + assert_eq!(result.mode, "Append"); + + let updated = std::fs::read_to_string(root.join("person.md")).unwrap(); + assert!(updated.contains("Old entry")); + assert!(updated.contains("New entry")); + // New entry should be before ## Links + let new_pos = updated.find("New entry").unwrap(); + let links_pos = updated.find("## Links").unwrap(); + assert!(new_pos < links_pos); + } + + #[test] + fn test_edit_note_replace_section() { + let (_tmp, store, root) = setup_vault(); + let content = "# Note\n\n## Tasks\n\n- [x] Old task\n\n## Notes\n\nText\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file("note.md", "hash", 100, &[], "not123", None) + .unwrap(); + + let input = EditInput { + file: "note.md".into(), + heading: "Tasks".into(), + content: "- [ ] New task\n".into(), + mode: EditMode::Replace, + modified_by: "test".into(), + }; + edit_note(&store, &root, &input, None).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(!updated.contains("Old task")); + 
assert!(updated.contains("New task")); + assert!(updated.contains("## Notes")); // Other sections untouched + } + + #[test] + fn test_edit_note_prepend_to_section() { + let (_tmp, store, root) = setup_vault(); + let content = "# Doc\n\n## Log\n\nExisting line\n\n## Footer\n\nEnd\n"; + std::fs::write(root.join("doc.md"), content).unwrap(); + store + .insert_file("doc.md", "hash", 100, &[], "doc123", None) + .unwrap(); + + let input = EditInput { + file: "doc.md".into(), + heading: "Log".into(), + content: "Prepended line".into(), + mode: EditMode::Prepend, + modified_by: "test".into(), + }; + edit_note(&store, &root, &input, None).unwrap(); + + let updated = std::fs::read_to_string(root.join("doc.md")).unwrap(); + assert!(updated.contains("Prepended line")); + assert!(updated.contains("Existing line")); + // Prepended should come before existing + let prepend_pos = updated.find("Prepended line").unwrap(); + let existing_pos = updated.find("Existing line").unwrap(); + assert!(prepend_pos < existing_pos); + } + + #[test] + fn test_edit_note_section_not_found() { + let (_tmp, store, root) = setup_vault(); + let content = "# Note\n\n## Existing\n\nContent\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file("note.md", "hash", 100, &[], "not123", None) + .unwrap(); + + let input = EditInput { + file: "note.md".into(), + heading: "Missing".into(), + content: "Stuff".into(), + mode: EditMode::Append, + modified_by: "test".into(), + }; + let result = edit_note(&store, &root, &input, None); + assert!(result.is_err()); + assert!( + result + .unwrap_err() + .to_string() + .contains("section 'Missing' not found") + ); + } + + #[test] + fn test_edit_note_file_not_found() { + let (_tmp, store, root) = setup_vault(); + + let input = EditInput { + file: "nonexistent.md".into(), + heading: "Section".into(), + content: "Stuff".into(), + mode: EditMode::Append, + modified_by: "test".into(), + }; + let result = edit_note(&store, &root, &input, None); + 
assert!(result.is_err()); + assert!(result.unwrap_err().to_string().contains("file not found")); + } + + #[test] + fn test_rewrite_preserves_frontmatter() { + let (tmp, store, root) = setup_vault(); + let content = "---\ntags:\n - project\nstatus: active\n---\n\n# Old Content\n\nOld body\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file( + "note.md", + "hash", + 100, + &["project".to_string()], + "rew123", + None, + ) + .unwrap(); + + let input = RewriteInput { + file: "note.md".into(), + content: "# New Content\n\nNew body\n".into(), + preserve_frontmatter: true, + modified_by: "test".into(), + }; + rewrite_note(&store, &root, &input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(updated.contains("status: active")); + assert!(updated.contains("# New Content")); + assert!(!updated.contains("Old body")); + drop(tmp); + } + + #[test] + fn test_edit_frontmatter_add_tag() { + let (_tmp, store, root) = setup_vault(); + let content = "---\ntags:\n - project\n---\n\n# Content\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file( + "note.md", + "hash", + 100, + &["project".to_string()], + "efm123", + None, + ) + .unwrap(); + + let input = EditFrontmatterInput { + file: "note.md".into(), + operations: vec![FrontmatterOp::AddTag("rust".into())], + modified_by: "test".into(), + }; + edit_frontmatter(&store, &root, &input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(updated.contains("project")); + assert!(updated.contains("rust")); + } + + #[test] + fn test_edit_frontmatter_remove_tag() { + let (_tmp, store, root) = setup_vault(); + let content = "---\ntags:\n - project\n - old\n---\n\n# Content\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file( + "note.md", + "hash", + 100, + &["project".to_string(), "old".to_string()], + "efm456", + None, + ) + .unwrap(); + + let input = 
EditFrontmatterInput { + file: "note.md".into(), + operations: vec![FrontmatterOp::RemoveTag("old".into())], + modified_by: "test".into(), + }; + edit_frontmatter(&store, &root, &input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(updated.contains("project")); + assert!(!updated.contains("old")); + } + + #[test] + fn test_edit_frontmatter_set_property() { + let (_tmp, store, root) = setup_vault(); + let content = "---\nstatus: draft\n---\n\n# Content\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file("note.md", "hash", 100, &[], "efm789", None) + .unwrap(); + + let input = EditFrontmatterInput { + file: "note.md".into(), + operations: vec![FrontmatterOp::Set("status".into(), "active".into())], + modified_by: "test".into(), + }; + edit_frontmatter(&store, &root, &input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(updated.contains("status: active")); + assert!(!updated.contains("status: draft")); + } + + #[test] + fn test_edit_frontmatter_remove_property() { + let (_tmp, store, root) = setup_vault(); + let content = "---\nstatus: draft\ntitle: Test\n---\n\n# Content\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file("note.md", "hash", 100, &[], "efmrm1", None) + .unwrap(); + + let input = EditFrontmatterInput { + file: "note.md".into(), + operations: vec![FrontmatterOp::Remove("status".into())], + modified_by: "test".into(), + }; + edit_frontmatter(&store, &root, &input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(!updated.contains("status")); + assert!(updated.contains("title: Test")); + } + + #[test] + fn test_edit_frontmatter_add_alias() { + let (_tmp, store, root) = setup_vault(); + let content = "---\ntags:\n - test\n---\n\n# Content\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file( + "note.md", + "hash", + 100, + 
&["test".to_string()], + "efmal1", + None, + ) + .unwrap(); + + let input = EditFrontmatterInput { + file: "note.md".into(), + operations: vec![FrontmatterOp::AddAlias("My Alias".into())], + modified_by: "test".into(), + }; + edit_frontmatter(&store, &root, &input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(updated.contains("aliases")); + assert!(updated.contains("My Alias")); + } + + #[test] + fn test_edit_frontmatter_no_existing_frontmatter() { + let (_tmp, store, root) = setup_vault(); + let content = "# Content\n\nJust body, no frontmatter.\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file("note.md", "hash", 100, &[], "efmnf1", None) + .unwrap(); + + let input = EditFrontmatterInput { + file: "note.md".into(), + operations: vec![ + FrontmatterOp::Set("status".into(), "active".into()), + FrontmatterOp::AddTag("new-tag".into()), + ], + modified_by: "test".into(), + }; + edit_frontmatter(&store, &root, &input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(updated.starts_with("---\n")); + assert!(updated.contains("status: active")); + assert!(updated.contains("new-tag")); + assert!(updated.contains("# Content")); + } + + #[test] + fn test_edit_frontmatter_multiple_operations() { + let (_tmp, store, root) = setup_vault(); + let content = "---\ntags:\n - old-tag\nstatus: draft\n---\n\n# Content\n"; + std::fs::write(root.join("note.md"), content).unwrap(); + store + .insert_file( + "note.md", + "hash", + 100, + &["old-tag".to_string()], + "efmmo1", + None, + ) + .unwrap(); + + let input = EditFrontmatterInput { + file: "note.md".into(), + operations: vec![ + FrontmatterOp::RemoveTag("old-tag".into()), + FrontmatterOp::AddTag("new-tag".into()), + FrontmatterOp::Set("status".into(), "active".into()), + FrontmatterOp::Set("priority".into(), "high".into()), + ], + modified_by: "test".into(), + }; + edit_frontmatter(&store, &root, 
&input).unwrap(); + + let updated = std::fs::read_to_string(root.join("note.md")).unwrap(); + assert!(!updated.contains("old-tag")); + assert!(updated.contains("new-tag")); + assert!(updated.contains("status: active")); + assert!(updated.contains("priority: high")); + assert!(!updated.contains("status: draft")); + } + + #[test] + fn test_delete_note_soft() { + let (tmp, store, root) = setup_vault(); + std::fs::create_dir_all(root.join("04-Archive")).unwrap(); + std::fs::write(root.join("deleteme.md"), "# Delete me").unwrap(); + store + .insert_file("deleteme.md", "hash", 100, &[], "del123", None) + .unwrap(); + + delete_note( + &store, + &root, + "deleteme.md", + DeleteMode::Soft, + "04-Archive/", + ) + .unwrap(); + + assert!(!root.join("deleteme.md").exists()); + assert!(root.join("04-Archive/deleteme.md").exists()); + drop(tmp); + } + + #[test] + fn test_delete_note_hard() { + let (tmp, store, root) = setup_vault(); + std::fs::write(root.join("gone.md"), "# Gone forever").unwrap(); + store + .insert_file("gone.md", "hash", 100, &[], "gon123", None) + .unwrap(); + + delete_note(&store, &root, "gone.md", DeleteMode::Hard, "").unwrap(); + + assert!(!root.join("gone.md").exists()); + assert!(store.get_file("gone.md").unwrap().is_none()); + drop(tmp); + } }
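As a self-contained illustration of the splice performed in Steps 3 through 5 of `edit_note` above, the sketch below reimplements an Append edit over plain strings. The `find_section` here is a simplified stand-in, not the crate's actual parser (an assumption): it only matches `##`-prefixed headings and ends a section body at the next heading line.

```rust
/// Line-index span of a section body: [body_start, body_end).
struct Section {
    body_start: usize,
    body_end: usize,
}

// Simplified stand-in for crate::markdown::find_section: matches only
// "##"-prefixed heading lines whose text equals the requested heading.
fn find_section(content: &str, heading: &str) -> Option<Section> {
    let lines: Vec<&str> = content.lines().collect();
    let start = lines
        .iter()
        .position(|l| l.starts_with("##") && l.trim_start_matches('#').trim() == heading)?;
    // The body runs until the next heading of any level, or end of file.
    let end = lines[start + 1..]
        .iter()
        .position(|l| l.starts_with('#'))
        .map(|i| start + 1 + i)
        .unwrap_or(lines.len());
    Some(Section { body_start: start + 1, body_end: end })
}

/// Append `new` to the body of `heading`, mirroring EditMode::Append above.
fn append_to_section(content: &str, heading: &str, new: &str) -> Option<String> {
    let section = find_section(content, heading)?;
    let lines: Vec<&str> = content.lines().collect();
    let existing = lines[section.body_start..section.body_end].join("\n");
    let new_body = if existing.trim_end().is_empty() {
        format!("\n{}\n", new.trim_end())
    } else {
        format!("{}\n{}\n", existing.trim_end(), new.trim_end())
    };
    // Reconstruct: lines before the body, the new body, lines after.
    let mut parts: Vec<String> = Vec::new();
    if section.body_start > 0 {
        parts.push(lines[..section.body_start].join("\n"));
    }
    parts.push(new_body);
    if section.body_end < lines.len() {
        parts.push(lines[section.body_end..].join("\n"));
    }
    Some(parts.join("\n"))
}

fn main() {
    let doc = "# Person\n\n## Interactions\n\nOld entry\n\n## Links\n";
    let out = append_to_section(doc, "Interactions", "New entry").unwrap();
    // The appended line lands inside the section, before "## Links".
    assert!(out.find("New entry").unwrap() < out.find("## Links").unwrap());
    println!("{out}");
}
```

Splicing whole lines and rejoining keeps every untouched section byte-identical, which is what the tests above rely on when they assert that sibling sections like `## Links` survive an edit unchanged.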