diff --git a/docs/guide/README.md b/docs/guide/README.md new file mode 100644 index 00000000..e7d050e7 --- /dev/null +++ b/docs/guide/README.md @@ -0,0 +1,44 @@ +# khive User Guide + +Documentation for using khive as a research knowledge graph runtime. + +| Guide | What it covers | +| ---------------------------------------------- | ----------------------------------------------- | +| [Getting Started](getting-started.md) | Install, connect, first session | +| [Knowledge Graph Modeling](knowledge-graph.md) | Entity kinds, edge relations, modeling patterns | +| [Prompt Cookbook](prompt-cookbook.md) | 20+ real-world verb patterns with examples | +| [Search and Retrieval](search.md) | FTS, vector, hybrid fusion, reranking | +| [Memory and Recall](memory.md) | Episodic vs semantic, salience, decay | +| [GTD Task Management](tasks.md) | Task lifecycle, priorities, dependencies | + +## How to read these guides + +Each guide is self-contained but cross-references related topics. Start with +[Getting Started](getting-started.md) if you have never used khive, then read +[Knowledge Graph Modeling](knowledge-graph.md) for the conceptual foundation. + +The [Prompt Cookbook](prompt-cookbook.md) is a reference you return to — it shows +the exact `request(ops="...")` syntax for every common operation. + +## What khive is + +khive is a structured persistence layer for AI research agents. It provides a +typed knowledge graph (8 entity kinds, 15 edge relations, 5 note kinds), hybrid +search (FTS5 + vector + RRF fusion), GQL/SPARQL queries, task management, agent +memory with decay-weighted recall, inter-agent messaging, scheduling, and a +knowledge corpus with reranking. + +All interaction goes through a single MCP tool: `request(ops="verb(args)")`. + +## What khive is not + +khive is not a general-purpose database, a vector DB, or a chat memory system. +It has opinions: closed taxonomies, a fixed edge ontology, namespace isolation. 
+If your data does not fit the schema, reconsider how you model it before +requesting schema changes. + +## Further reading + +- [AGENTS.md](../../AGENTS.md) — agent usage reference (verb tables, property conventions, edge density rules) +- [CLAUDE.md](../../CLAUDE.md) — developer guide for working on khive itself +- [ADR index](../adr/README.md) — architecture decision records (the design contract) diff --git a/docs/guide/getting-started.md b/docs/guide/getting-started.md new file mode 100644 index 00000000..f1b57ce5 --- /dev/null +++ b/docs/guide/getting-started.md @@ -0,0 +1,187 @@ +# Getting Started + +This guide walks you from zero to a productive khive session: install the +binary, connect it to your MCP client, create your first entities, search the +graph, and link concepts together. + +## What khive gives you + +khive is a research knowledge graph runtime. When you read papers, form +concepts, link ideas, record decisions, or track tasks, khive gives that work a +typed, queryable graph that persists across sessions. Everything is accessible +through 63 verbs across 7 packs, dispatched through a single MCP tool. + +## Install + +### From crates.io (Rust) + +```bash +cargo install khive-mcp +``` + +### From npm + +```bash +npm install -g khive +# or +npm install -g @khive-ai/cli +``` + +### From source + +```bash +git clone https://github.com/ohdearquant/khive +cd khive +cargo build --workspace --release +# Binary at crates/target/release/khive-mcp +``` + +## Connect to your MCP client + +### Claude Code + +Add to your MCP configuration (`.claude/settings.json` or equivalent): + +```json +{ + "mcpServers": { + "khive": { + "command": "khive-mcp", + "args": [] + } + } +} +``` + +### Claude Desktop + +Add to `claude_desktop_config.json`: + +```json +{ + "mcpServers": { + "khive": { + "command": "khive-mcp", + "args": [] + } + } +} +``` + +khive auto-spawns a background daemon on first request to keep the ANN index and +embedding model warm. 
You do not need to manage this — it starts automatically +and cleans up on exit. + +## The single-tool interface + +khive exposes one MCP tool: `request`. Every operation goes through it: + +``` +request(ops="verb(arg=value, arg=value)") +``` + +This is the only syntax you need. The `ops` string contains a verb call (or a +batch of them), and khive dispatches it to the appropriate pack handler. + +## Your first session + +### 1. Create an entity + +Entities are the nodes in your knowledge graph. khive has 8 entity kinds: +`concept`, `document`, `dataset`, `project`, `person`, `org`, `artifact`, +`service`. + +``` +request(ops="create(kind=\"entity\", entity_kind=\"concept\", name=\"FlashAttention\", description=\"IO-aware exact attention algorithm\", properties={domain: \"attention\", year: 2022})") +``` + +Response: + +```json +{ + "ok": true, + "result": { + "id": "a1b2c3d4", + "kind": "concept", + "name": "FlashAttention", + "description": "IO-aware exact attention algorithm" + } +} +``` + +### 2. Create a related entity + +``` +request(ops="create(kind=\"entity\", entity_kind=\"document\", name=\"FlashAttention: Fast and Memory-Efficient Exact Attention\", properties={authors: \"Dao et al.\", year: 2022, source: \"arxiv:2205.14135\"})") +``` + +### 3. Link them + +Edges express typed relationships. `introduced_by` means "this concept was +introduced by that document": + +``` +request(ops="link(source_id=\"\", target_id=\"\", relation=\"introduced_by\", weight=1.0)") +``` + +### 4. Search the graph + +Search uses hybrid FTS5 + vector similarity with RRF fusion: + +``` +request(ops="search(kind=\"entity\", query=\"memory efficient attention\")") +``` + +Returns a scored list of matching entities. + +### 5. Explore neighbors + +See what connects to an entity: + +``` +request(ops="neighbors(node_id=\"\", direction=\"both\")") +``` + +### 6. Create a note + +Notes are temporal observations about your work — what you noticed, concluded, +or decided. 
They can annotate entities: + +``` +request(ops="create(kind=\"note\", note_kind=\"observation\", content=\"FlashAttention reduces memory from O(N^2) to O(N) by tiling and recomputation\", annotates=[\"\"])") +``` + +### 7. Batch operations + +Run multiple independent operations in one call: + +``` +request(ops="[create(kind=\"entity\", entity_kind=\"concept\", name=\"FlashAttention-2\"), create(kind=\"entity\", entity_kind=\"concept\", name=\"FlashAttention-3\")]") +``` + +Batched ops run in parallel with no ordering guarantee. If op B depends on op +A's output, use two separate `request` calls. + +### 8. Query the graph + +For complex pattern matching, use GQL: + +``` +request(ops="query(query=\"MATCH (a:concept)-[:introduced_by]->(b:document) RETURN a.name, b.name LIMIT 10\")") +``` + +Or SPARQL: + +``` +request(ops="query(query=\"SELECT ?a ?b WHERE { ?a :introduced_by ?b . } LIMIT 10\")") +``` + +Both compile to the same SQL backend. + +## What to read next + +- [Knowledge Graph Modeling](knowledge-graph.md) — how to think about entity + kinds, edge relations, and modeling decisions +- [Prompt Cookbook](prompt-cookbook.md) — 20+ ready-to-use verb patterns +- [Search and Retrieval](search.md) — how hybrid search, reranking, and + decompose work diff --git a/docs/guide/knowledge-graph.md b/docs/guide/knowledge-graph.md new file mode 100644 index 00000000..58d118da --- /dev/null +++ b/docs/guide/knowledge-graph.md @@ -0,0 +1,314 @@ +# Knowledge Graph Modeling + +This guide covers how to think about modeling in khive — when to use each entity +kind, which edge relation fits, when something belongs as a note versus an +entity, and common modeling patterns for research work. + +## The two substrates + +khive has two kinds of records: + +- **Entities** are things in the world: algorithms, papers, people, projects, + datasets, organizations, binaries, APIs. They are graph nodes with typed edges + between them. 
+- **Notes** are your observations about the world: what you noticed, concluded, + decided, asked, or cited. They are temporal records with salience, optional + decay, and can annotate entities via `annotates` edges. + +The rule of thumb: if it has a name and exists independently of your session, +it is an entity. If it is something you thought or recorded during a session, +it is a note. + +## Entity kinds + +khive has 8 entity kinds. This is a closed set — you cannot add new kinds +without an ADR. + +### concept + +Algorithms, techniques, architectures, theories, models. This is the most +common kind and the default. + +``` +create(kind="entity", entity_kind="concept", name="LoRA", + description="Low-Rank Adaptation of LLMs", + properties={domain: "fine-tuning", type: "technique", year: 2021}) +``` + +Use `concept` for anything that is an idea, method, or approach. Use +`properties.type` for finer classification: `algorithm`, `technique`, +`architecture`, `model`, `theory`. + +### document + +Papers, preprints, technical reports, blog posts, books. + +``` +create(kind="entity", entity_kind="document", + name="Attention Is All You Need", + properties={authors: "Vaswani et al.", year: 2017, + source: "arxiv:1706.03762"}) +``` + +Name the entity with its short title. Put full title, authors, year, and +citation pointer in `properties`. + +### dataset + +Benchmarks, corpora, evaluation sets. + +``` +create(kind="entity", entity_kind="dataset", name="MMLU", + description="Massive Multitask Language Understanding benchmark", + properties={type: "benchmark", year: 2021}) +``` + +### project + +Codebases, libraries, tools, frameworks. + +``` +create(kind="entity", entity_kind="project", name="lattice-inference", + description="Pure-Rust transformer inference engine", + properties={status: "implemented"}) +``` + +### person + +Researchers, engineers, authors. 
+ +``` +create(kind="entity", entity_kind="person", name="Edward Hu", + properties={affiliation: "Microsoft"}) +``` + +### org + +Labs, companies, institutions. + +``` +create(kind="entity", entity_kind="org", name="Anthropic", + description="AI safety company") +``` + +### artifact + +Binaries, model checkpoints, Docker images, packages. + +``` +create(kind="entity", entity_kind="artifact", name="Llama-3-70B", + properties={type: "checkpoint", source: "meta-llama"}) +``` + +### service + +APIs, hosted endpoints, SaaS products. + +``` +create(kind="entity", entity_kind="service", name="OpenAI API", + properties={type: "api"}) +``` + +## Note kinds + +khive has 5 base note kinds (also a closed set): + +| Kind | What it records | Example | +| ------------- | ---------------------- | ------------------------------------------------------------------- | +| `observation` | An empirical capture | "FlashAttention reduces memory from O(N^2) to O(N)" | +| `insight` | A synthetic conclusion | "Tiling is the key technique across all IO-aware attention methods" | +| `question` | An open inquiry | "Does FlashAttention-3 support GQA natively?" | +| `decision` | A committed choice | "We will use FlashAttention-2 for the inference engine" | +| `reference` | An external pointer | "See arxiv:2205.14135 Section 3.2 for the tiling algorithm" | + +`observation` is the default if you omit `note_kind`. + +Packs add their own note kinds: `task` (GTD pack), `memory` (Memory pack), +`message` (Comm pack), `scheduled_event` (Schedule pack). These are created +through their respective pack verbs, not through `create(kind="note")`. + +## Edge relations + +khive has 15 edge relations. This is a closed set enforced at compile time. 
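
Because the relation set is closed, a client can reject a misspelled or ad-hoc relation name before it ever sends a `link` call. The sketch below is a hypothetical client-side helper and not part of khive: the names `EDGE_RELATIONS`, `SYMMETRIC_RELATIONS`, and `is_known_relation` are assumptions made for illustration, and the set only mirrors the fifteen relations listed in the tables of this guide. khive still performs its own validation.

```python
# Hypothetical client-side check; khive enforces the closed set itself.
# The names mirror the relation tables in this guide.
EDGE_RELATIONS = frozenset({
    "contains", "part_of", "instance_of",                    # structure
    "extends", "variant_of", "introduced_by", "supersedes",  # derivation
    "derived_from",                                          # provenance
    "precedes",                                              # temporal
    "depends_on", "enables",                                 # dependency
    "implements",                                            # implementation
    "competes_with", "composed_with",                        # lateral
    "annotates",                                             # annotation
})

# Relations this guide describes as symmetric (direction is canonicalized).
SYMMETRIC_RELATIONS = frozenset({"competes_with", "composed_with"})


def is_known_relation(name: str) -> bool:
    """Return True if `name` is one of the fifteen documented relations."""
    return name in EDGE_RELATIONS
```

A check like this only saves a round trip. It says nothing about whether the endpoint kinds are valid for the relation, which is covered by the endpoint rules in this guide.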
+ +### When to use each relation + +**Structure** — parent/child and classification: + +| Relation | Direction | When to use | +| ------------- | ------------------- | ------------------------------------------------------ | +| `contains` | parent to child | A system contains a module. An org contains a project. | +| `part_of` | child to parent | Inverse of contains. A module is part of a system. | +| `instance_of` | specific to general | GQA is an instance of multi-query attention. | + +**Derivation** — how ideas build on each other: + +| Relation | Direction | When to use | +| --------------- | ------------------- | --------------------------------------------- | +| `extends` | child to parent | FlashAttention-2 extends FlashAttention. | +| `variant_of` | variant to original | QLoRA is a variant of LoRA. | +| `introduced_by` | concept to source | LoRA was introduced by the LoRA paper. | +| `supersedes` | new to old | FlashAttention-3 supersedes FlashAttention-2. | + +**Provenance** — where things come from: + +| Relation | Direction | When to use | +| -------------- | --------------- | ------------------------------------------ | +| `derived_from` | output to input | A model checkpoint derived from a dataset. | + +**Temporal** — ordering: + +| Relation | Direction | When to use | +| ---------- | ---------------- | ------------------------------------- | +| `precedes` | earlier to later | Paper A was published before Paper B. | + +**Dependency** — runtime/build relationships: + +| Relation | Direction | When to use | +| ------------ | ----------------------- | ------------------------------------------------- | +| `depends_on` | consumer to dependency | Project A depends on Project B at runtime. | +| `enables` | prerequisite to outcome | BPE tokenization enables subword-level attention. 
| + +**Implementation** — code realizes concept: + +| Relation | Direction | When to use | +| ------------ | --------------- | -------------------------------------------- | +| `implements` | code to concept | lattice-inference implements FlashAttention. | + +**Lateral** — peer relationships: + +| Relation | Direction | When to use | +| --------------- | ---------------- | ---------------------------------------------------- | +| `competes_with` | either direction | LoRA competes with full fine-tuning. | +| `composed_with` | either direction | FlashAttention composed with GQA in a serving stack. | + +**Annotation** — notes observing entities: + +| Relation | Direction | When to use | +| ----------- | ---------------- | ----------------------------------------------------------- | +| `annotates` | note to anything | An observation about a concept, a decision about a project. | + +### Edge endpoint rules + +Not every `(source_kind, relation, target_kind)` triple is valid. The base +contract in ADR-002 defines which entity kinds can appear as source and target +for each relation. Key rules: + +- `annotates` is the only cross-substrate relation. Source must be a note; + target can be anything (entity, note, edge, event). +- `supersedes` is same-substrate only: entity to entity, or note to note. +- All other 13 relations require entity-to-entity endpoints. +- `competes_with` and `composed_with` are symmetric — the system canonicalizes + direction internally. + +Packs can add endpoint pairs through the `EDGE_RULES` mechanism (ADR-017). The +KG pack adds person-to-org and org-to-org pairs. The GTD pack allows task-to-task +`depends_on` edges. These are additive — packs cannot tighten the base contract. + +### Why a closed ontology + +A sparse, fixed set of relations keeps the graph queryable. Ad-hoc relations +like `uses`, `related_to`, or `loaded_by` fragment the graph and make traversal +meaningless. 
If your relationship does not fit one of the 15, it is probably a +property on the entity rather than an edge. + +## Modeling patterns + +### Research papers + +A paper typically produces: one `document` entity (the paper itself), one or +more `concept` entities (the ideas it introduces), and `introduced_by` edges +from concepts to the paper. + +``` +create(kind="entity", entity_kind="document", name="LoRA Paper", + properties={title: "LoRA: Low-Rank Adaptation of Large Language Models", + authors: "Hu et al.", year: 2021, source: "arxiv:2106.09685"}) + +create(kind="entity", entity_kind="concept", name="LoRA", + properties={domain: "fine-tuning", type: "technique"}) + +link(source_id="", target_id="", relation="introduced_by") +``` + +For citation chains between papers, use `precedes` (temporal ordering): + +``` +link(source_id="", target_id="", relation="precedes") +``` + +### Software projects + +Model a project with `contains` for internal structure, `implements` for the +concepts it realizes, and `depends_on` for external dependencies: + +``` +create(kind="entity", entity_kind="project", name="lattice-inference", + properties={status: "implemented"}) + +link(source_id="", target_id="", relation="implements") +link(source_id="", target_id="", relation="depends_on") +``` + +### People and organizations + +``` +create(kind="entity", entity_kind="person", name="Tri Dao") +create(kind="entity", entity_kind="org", name="Princeton") + +link(source_id="", target_id="", relation="part_of") +``` + +### Decision records + +Use `decision` notes that annotate the entities they concern: + +``` +create(kind="note", note_kind="decision", + content="We will use FlashAttention-2 over vanilla attention because memory reduction is critical for 70B inference", + annotates=["", ""]) +``` + +### Temporal chains + +For versioned artifacts or sequential papers: + +``` +link(source_id="", target_id="", relation="precedes") +link(source_id="", target_id="", relation="precedes") 
+link(source_id="", target_id="", relation="supersedes") +``` + +## Anti-patterns + +| Pattern | Problem | Fix | +| ------------------------------ | ----------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | +| Storing findings only as notes | Notes are temporal context; entities are structural. A concept worth naming deserves an entity. | Create the entity, then annotate it with notes. | +| Creating duplicate entities | Fragments the graph, splits edges. | Always `search` before `create`. If found, `link` to it. | +| Using ad-hoc relation names | `link(relation="uses")` will be rejected. | Map to the 15 closed relations. If none fit, use a property. | +| Reversed `introduced_by` | `paper → concept` is wrong. | Direction is `concept → paper` (the paper introduces the concept). | +| Over-noting | 20 observations but zero entities. | Extract the structural content into entities first. | +| Under-linking | Entities with 0-1 edges are orphans. | Target 5+ edges per entity. Below 3 means the entity needs more context. | +| Version numbers in names | "LoRA v2" instead of "QLoRA". | Version info goes in properties. Names are canonical short forms. | + +## Edge density + +Sparse graphs are useless for traversal. Target minimums: + +| Entity kind | Min edges | What to link | +| ------------------- | --------- | -------------------------------------------------------------------------------------------- | +| concept (algorithm) | 4 | `extends` or `instance_of` (parent), `introduced_by` (paper), `competes_with` (alternatives) | +| document (paper) | 2 | `introduced_by` edges from concepts it introduced | +| project | 3 | `implements` (concepts), `depends_on` (deps), `contains`/`part_of` (structure) | +| person | 1 | `introduced_by` edges from their work | + +Overall target: 5+ edges per entity average. 
Check with `stats()` — if +`total_edges / total_entities` is below 4, the graph needs polish. + +## See also + +- [Prompt Cookbook](prompt-cookbook.md) — concrete verb patterns for all the + operations described here +- [Search and Retrieval](search.md) — how to find things in the graph +- [AGENTS.md](../../AGENTS.md) — the full agent reference with GQL/SPARQL + examples diff --git a/docs/guide/memory.md b/docs/guide/memory.md new file mode 100644 index 00000000..61669a9a --- /dev/null +++ b/docs/guide/memory.md @@ -0,0 +1,206 @@ +# Memory and Recall + +This guide covers how agent memory works in khive — how to store memories with +appropriate salience, how decay affects recall ranking, and patterns for +effective cross-session memory. + +## Two memory types + +khive supports two memory types: + +| Type | What it stores | When to use | +| ---------- | ----------------------------------------------- | -------------------------------------------------- | +| `episodic` | Session events, conversations, task completions | Default. Context that happened at a specific time. | +| `semantic` | Patterns, insights, reusable knowledge | Facts and rules that are useful across sessions. | + +These are the only valid values. There is no `procedural` or `working` memory +type. + +## Storing memories + +### Basic remember + +``` +request(ops="memory.remember(content=\"khive uses RRF fusion for hybrid search scoring\", salience=0.8, memory_type=\"semantic\")") +``` + +### Parameters + +| Parameter | Type | Default | Description | +| -------------- | ------ | ---------- | ----------------------------------------------- | +| `content` | string | required | The memory content | +| `salience` | float | 0.5 | Importance weight for recall ranking (0.0-1.0) | +| `decay_factor` | float | 0.01 | Higher = faster decay. 
0.01 = ~69-day half-life | +| `memory_type` | string | "episodic" | `episodic` or `semantic` | +| `source_id` | uuid | none | Entity or note this memory annotates | + +### Salience calibration + +Salience determines how prominently a memory surfaces during recall. Use these +ranges: + +| Salience | Use for | Example | +| -------- | -------------------------------------------- | ------------------------------------------- | +| 0.85-1.0 | Critical directives, safety constraints | "Never delete the production database" | +| 0.7-0.8 | Key insights, reusable patterns, corrections | "RRF scoring requires cosine normalization" | +| 0.5-0.7 | Session summaries, routine context | "Completed attention benchmark run" | +| < 0.5 | Low-value, ephemeral, auto-generated | Routine status updates | + +A common mistake is inflating salience — if everything is 0.9+, the scoring +signal is lost and recall becomes unranked. + +### Linking memories to entities + +``` +request(ops="memory.remember(content=\"FlashAttention-3 uses asynchronous tiling on H100\", salience=0.7, source_id=\"\")") +``` + +The `source_id` creates an `annotates` edge from the memory note to the +specified entity. This makes the memory discoverable via `neighbors` on that +entity. 
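
The parameter constraints above can be checked locally before calling `memory.remember`. The following is a minimal sketch, not khive code: `half_life_days` and `check_remember_args` are hypothetical helper names, and the half-life calculation assumes the exponential form `decay_weight = e^(-decay_factor * age_in_days)` given in the Decay math section of this guide. Rejecting a non-positive `decay_factor` is a choice made by this sketch, not a documented khive rule.

```python
import math

# The two memory types this guide lists as the only valid values.
VALID_MEMORY_TYPES = ("episodic", "semantic")


def half_life_days(decay_factor: float) -> float:
    """Days until e^(-decay_factor * age_in_days) falls to 0.5."""
    if decay_factor <= 0:
        # Assumption of this sketch: a half-life is undefined here.
        raise ValueError("decay_factor must be positive")
    return math.log(2) / decay_factor


def check_remember_args(salience: float = 0.5,
                        decay_factor: float = 0.01,
                        memory_type: str = "episodic") -> float:
    """Validate arguments locally and return the implied half-life in days."""
    if not 0.0 <= salience <= 1.0:
        raise ValueError("salience must be between 0.0 and 1.0")
    if memory_type not in VALID_MEMORY_TYPES:
        raise ValueError(f"memory_type must be one of {VALID_MEMORY_TYPES}")
    return half_life_days(decay_factor)
```

With the documented defaults the helper returns roughly 69.3, which matches the "~69-day half-life" quoted in the parameter table.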
+ +## Recalling memories + +### Basic recall + +``` +request(ops="memory.recall(query=\"attention optimization\", limit=5)") +``` + +Returns a scored list of matching memories: + +```json +[ + {"id": "...", "content": "FlashAttention-3 uses async tiling...", "score": 0.72, "salience": 0.7, ...}, + {"id": "...", "content": "PagedAttention reduces KV cache...", "score": 0.58, "salience": 0.6, ...} +] +``` + +### Recall parameters + +| Parameter | Type | Default | Description | +| -------------- | ------ | -------- | ------------------------------------------ | +| `query` | string | required | Search query | +| `limit` | int | 10 | Max results | +| `min_score` | float | none | Minimum composite score threshold | +| `min_salience` | float | none | Minimum salience filter | +| `memory_type` | string | none | Filter by memory type | +| `tags` | list | none | Filter by tags | +| `tag_mode` | string | "any" | `any` (OR) or `all` (AND) for tag matching | + +### Tag-filtered recall + +``` +request(ops="memory.recall(query=\"search optimization\", tags=[\"khive\", \"retrieval\"], tag_mode=\"any\")") +``` + +## Scoring formula + +Recall ranking uses a composite score: + +``` +composite = (retrieval_score * 0.70) + (salience * decay_weight * 0.20) + (temporal_score * 0.10) +``` + +Where: + +- **retrieval_score** (70% weight): RRF fusion of FTS5 keyword match and vector + similarity +- **salience * decay_weight** (20% weight): the memory's importance, decayed + over time +- **temporal_score** (10% weight): recency bonus + +### Decay math + +Decay follows an exponential curve: + +``` +decay_weight = e^(-decay_factor * age_in_days) +``` + +With the default `decay_factor=0.01`: + +- After 1 day: 99% of original salience +- After 7 days: 93% +- After 30 days: 74% +- After 69 days: 50% (half-life) +- After 180 days: 17% + +Higher `decay_factor` means faster decay: + +- `0.001`: very slow (693-day half-life) — for permanent reference memories +- `0.01`: moderate (69-day half-life) — 
default, good for most content +- `0.05`: fast (14-day half-life) — for session-specific context +- `0.1`: very fast (7-day half-life) — for truly ephemeral context + +## Brain integration + +The Brain pack provides Bayesian profile tuning based on feedback signals. After +recalling memories, you can feed back which results were useful: + +### Auto-feedback (recommended) + +``` +request(ops="brain.auto_feedback(results=[{id: \"\", used: true}, {id: \"\", used: false}])") +``` + +Call this after `memory.recall` to automatically signal which results you +actually used. The brain profile adjusts its tuning over time. + +### Manual feedback + +``` +request(ops="brain.feedback(target_id=\"\", signal=\"useful\")") +``` + +Signals: `useful`, `not_useful`, `wrong`, `explicit_positive`, +`explicit_negative`, `correction`. + +Note: `target_id` must be a full UUID (not a short prefix). + +## Usage patterns + +### Session summary + +At the end of a work session, store key findings: + +``` +request(ops="memory.remember(content=\"SESSION: Completed FlashAttention-3 benchmark. Key finding: 2.3x speedup over FA2 on H100, but no improvement on A100 due to async tile dependency.\", salience=0.65, memory_type=\"episodic\")") +``` + +### Key insight + +When you discover something reusable: + +``` +request(ops="memory.remember(content=\"INSIGHT: knowledge.search with rerank=true gives normalized 0-1 scores vs raw RRF ~0.016. 
Always use rerank for score comparison.\", salience=0.75, memory_type=\"semantic\")") +``` + +### Session start recall + +At the beginning of a session, recall recent context: + +``` +request(ops="memory.recall(query=\"recent session work progress\", limit=5, memory_type=\"episodic\")") +``` + +Then make targeted recalls based on what you are about to work on: + +``` +request(ops="memory.recall(query=\"FlashAttention benchmark results\", limit=5)") +``` + +### Agent handoff + +When handing off work to another agent: + +``` +request(ops="memory.remember(content=\"HANDOFF: Attention benchmark suite is ready at benchmarks/attention/. Next step: run on H100 cluster. Contact: lambda:platform for GPU allocation.\", salience=0.8, memory_type=\"episodic\")") +``` + +## See also + +- [Search and Retrieval](search.md) — how hybrid search and RRF fusion work +- [Prompt Cookbook](prompt-cookbook.md) — memory verb patterns +- [GTD Task Management](tasks.md) — task lifecycle (often paired with memory + for context) diff --git a/docs/guide/prompt-cookbook.md b/docs/guide/prompt-cookbook.md new file mode 100644 index 00000000..8457d439 --- /dev/null +++ b/docs/guide/prompt-cookbook.md @@ -0,0 +1,440 @@ +# Prompt Cookbook + +Ready-to-use patterns for every common khive operation. Each pattern shows the +exact `request(ops="...")` syntax, expected response shape, and when to use it. + +All examples use the function-call DSL form. JSON form is equivalent — use it +when the DSL string would be hard to escape. + +--- + +## Create and link + +### Create an entity + +``` +request(ops="create(kind=\"entity\", entity_kind=\"concept\", name=\"FlashAttention\", description=\"IO-aware exact attention algorithm\", properties={domain: \"attention\", year: 2022})") +``` + +Response: + +```json +{"ok": true, "result": {"id": "a1b2c3d4", "kind": "concept", "name": "FlashAttention", ...}} +``` + +Use when: you encounter a new algorithm, paper, project, or any named thing +worth tracking. 
Always `search` first to avoid duplicates. + +### Create a note + +``` +request(ops="create(kind=\"note\", note_kind=\"observation\", content=\"FlashAttention reduces memory from O(N^2) to O(N) by tiling and recomputation\", salience=0.7)") +``` + +Use when: you want to record a finding, insight, or decision. Notes are +temporal; entities are structural. + +### Create an annotated note + +``` +request(ops="create(kind=\"note\", note_kind=\"insight\", content=\"Tiling is the common technique across all IO-aware attention methods\", annotates=[\"\"])") +``` + +Use when: your observation is about a specific entity. The `annotates` edge +makes it discoverable via `neighbors`. + +### Link two entities + +``` +request(ops="link(source_id=\"\", target_id=\"\", relation=\"introduced_by\", weight=1.0)") +``` + +Use when: you discover a relationship. Direction matters — check the +[edge relation guide](knowledge-graph.md) for source/target conventions. + +### Batch create + +``` +request(ops="[create(kind=\"entity\", entity_kind=\"concept\", name=\"GQA\"), create(kind=\"entity\", entity_kind=\"concept\", name=\"MQA\"), create(kind=\"entity\", entity_kind=\"concept\", name=\"MHA\")]") +``` + +Use when: you have multiple independent entities to create. Batched ops run in +parallel with no ordering guarantee. + +--- + +## Search and discover + +### Search entities + +``` +request(ops="search(kind=\"entity\", query=\"memory efficient attention\")") +``` + +### Search with filters + +``` +request(ops="search(kind=\"entity\", query=\"attention\", entity_kind=\"concept\", tags=[\"ml\"])") +``` + +### Search notes + +``` +request(ops="search(kind=\"note\", query=\"tiling recomputation\")") +``` + +Automatically excludes superseded notes. + +### Knowledge search with rerank + +``` +request(ops="knowledge.search(query=\"parameter efficient fine-tuning methods\", rerank=true)") +``` + +Use when: you want normalized [0,1] scores. 
Reranking uses a cross-encoder for +higher quality scoring. Default is on for `knowledge.search`. + +### Knowledge search with decompose + +``` +request(ops="knowledge.search(query=\"compare LoRA and QLoRA for 7B models\", decompose=true)") +``` + +Use when: your query mentions multiple distinct concepts. Decomposition splits +the query and merges results. + +--- + +## Navigate the graph + +### One-hop neighbors + +``` +request(ops="neighbors(node_id=\"\", direction=\"both\")") +``` + +Use when: you want to see everything connected to a node. Always pass +`direction="both"` unless you specifically need only outgoing or incoming edges. + +### Filtered neighbors + +``` +request(ops="neighbors(node_id=\"\", direction=\"in\", relations=[\"extends\", \"variant_of\"])") +``` + +Use when: you want only specific relationship types. + +### Multi-hop traverse + +``` +request(ops="traverse(roots=[\"\"], max_depth=3, relations=[\"extends\", \"variant_of\"], include_roots=false)") +``` + +Use when: you want to explore lineage — what extends what, multi-hop dependency +chains, reachability analysis. + +### GQL query + +``` +request(ops="query(query=\"MATCH (a:concept)-[:implements]->(b:project) RETURN a.name, b.name LIMIT 10\")") +``` + +### SPARQL query + +``` +request(ops="query(query=\"SELECT ?c WHERE { ?c :extends+ ?b . ?b :name 'LoRA' . } LIMIT 10\")") +``` + +Use GQL or SPARQL when: you need pattern matching over the graph structure — +"find all concepts that extend something introduced by a specific paper". + +--- + +## Memory + +### Store a memory + +``` +request(ops="memory.remember(content=\"khive uses RRF fusion for hybrid search scoring\", salience=0.8, memory_type=\"semantic\")") +``` + +`memory_type`: `episodic` (default) or `semantic` only. Salience: 0.0-1.0 +(higher = more important for recall ranking). 
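
Writing the `ops` string by hand gets error-prone once values contain quotes. Below is a small sketch of a formatter for a single verb call. `build_op` is a hypothetical helper, not a khive API, and it assumes the DSL accepts JSON-style literals for strings, numbers, booleans, and lists, which matches the examples in this cookbook but is not confirmed for every escape sequence. Object arguments such as `properties={...}` are deliberately left out because the examples write their keys unquoted.

```python
import json


def build_op(verb: str, **args) -> str:
    """Format one verb call, e.g. memory.recall(query="x", limit=5)."""
    rendered = []
    for key, value in args.items():
        if isinstance(value, dict):
            # Object literals in this guide use unquoted keys; not handled.
            raise TypeError("dict arguments are not supported by this sketch")
        # json.dumps renders str as "...", bool as true/false, list as [...].
        rendered.append(f"{key}={json.dumps(value, ensure_ascii=False)}")
    return f"{verb}({', '.join(rendered)})"
```

The returned string is what goes inside `request(ops="...")`. How the outer quotes are escaped depends on the MCP client, so that step is left to the caller.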
+ +### Recall memories + +``` +request(ops="memory.recall(query=\"hybrid search scoring\", limit=5)") +``` + +### Tag-filtered recall + +``` +request(ops="memory.recall(query=\"search optimization\", limit=5, tags=[\"khive\"], tag_mode=\"any\")") +``` + +### Store a memory linked to an entity + +``` +request(ops="memory.remember(content=\"FlashAttention-3 uses asynchronous tiling on H100\", salience=0.7, source_id=\"\")") +``` + +The `source_id` creates an `annotates` edge from the memory note to the entity. + +--- + +## Tasks (GTD) + +### Create a task + +``` +request(ops="gtd.assign(title=\"Implement FlashAttention-3 in lattice\", priority=\"p1\", status=\"next\")") +``` + +Defaults: `status=inbox`, `priority=p2`. + +### Create a task linked to an entity + +``` +request(ops="gtd.assign(title=\"Benchmark attention variants\", priority=\"p1\", context_entity_id=\"\")") +``` + +### Get next actions + +``` +request(ops="gtd.next(limit=5)") +``` + +Returns tasks with `status` in `[next, active]`, sorted by priority. + +### Transition a task + +``` +request(ops="gtd.transition(id=\"\", status=\"active\", note=\"started implementation\")") +``` + +Lifecycle: `inbox` -> `next` -> `active` -> `done` (or `cancelled`). Also +available: `waiting`, `someday`. + +### Complete a task + +``` +request(ops="gtd.transition(id=\"\", status=\"done\")") +``` + +### List tasks by status + +``` +request(ops="gtd.tasks(status=\"active\", limit=10)") +``` + +--- + +## Knowledge corpus + +### Learn a concept + +``` +request(ops="knowledge.learn(name=\"Speculative Decoding\", description=\"Draft-then-verify inference acceleration\", domain=\"inference\", tags=[\"decoding\", \"acceleration\"])") +``` + +Creates a concept entity in the knowledge corpus. + +### Cite a source + +``` +request(ops="knowledge.cite(concept_id=\"\", source_id=\"\")") +``` + +Both must be full UUIDs. Source must be a document or person entity. 
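
Since `knowledge.cite` needs full UUIDs, it can help to check an id locally before sending the call, especially when the id was copied from a response that shows a short form such as `a1b2c3d4`. This is a minimal sketch using Python's standard `uuid` module. `is_full_uuid` is a hypothetical helper, and treating only the canonical hyphenated spelling as valid is an assumption, since this guide does not say which spellings khive accepts.

```python
import uuid


def is_full_uuid(value: str) -> bool:
    """True only for a canonical 36-character hyphenated UUID string."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        # Short prefixes and malformed strings land here.
        return False
    # uuid.UUID also accepts braces and unhyphenated hex; reject those.
    return str(parsed) == value.lower()
```

The check is purely syntactic. It cannot tell whether the id exists or whether the source is a document or person entity.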
+ +### Import markdown as atoms + +``` +request(ops="knowledge.import(path=\"/path/to/notes.md\", chunk_strategy=\"heading\")") +``` + +### Search the corpus + +``` +request(ops="knowledge.search(query=\"transformer inference optimization\", rerank=true, decompose=true)") +``` + +--- + +## Brain (Bayesian profiles) + +### Create a profile + +``` +request(ops="brain.create_profile(name=\"research-recall\")") +``` + +### Activate a profile + +``` +request(ops="brain.activate(profile_id=\"\")") +``` + +### Give feedback on recall quality + +``` +request(ops="brain.feedback(target_id=\"\", signal=\"useful\")") +``` + +`target_id` must be a full UUID. Signals: `useful`, `not_useful`, `wrong`, +`explicit_positive`, `explicit_negative`, `correction`. + +### Auto-feedback after recall + +``` +request(ops="brain.auto_feedback(results=[{id: \"\", used: true}])") +``` + +Convenience verb: call after `memory.recall` to automatically feed back which +results you actually used. + +### Check which profile serves you + +``` +request(ops="brain.resolve(consumer_kind=\"recall\")") +``` + +--- + +## Communication + +### Send a message + +``` +request(ops="comm.send(to=\"local\", content=\"Task completed: attention benchmarks ready\")") +``` + +### Check inbox + +``` +request(ops="comm.inbox(limit=5)") +``` + +### Reply in a thread + +``` +request(ops="comm.reply(id=\"\", content=\"Acknowledged, will review\")") +``` + +### Read a full thread + +``` +request(ops="comm.thread(id=\"\")") +``` + +--- + +## Schedule + +### Set a reminder + +``` +request(ops="schedule.remind(content=\"Check benchmark results\", at=\"2026-06-01T09:00:00\")") +``` + +### Schedule a future verb dispatch + +``` +request(ops="schedule.schedule(action=\"memory.recall(query='weekly review')\", at=\"2026-06-02T10:00:00\", repeat=\"weekly\")") +``` + +The `action` parameter is a DSL verb string, not plain text. 
+ +### Check agenda + +``` +request(ops="schedule.agenda()") +``` + +### Cancel a scheduled event + +``` +request(ops="schedule.cancel(id=\"\")") +``` + +--- + +## Curation + +### Update an entity + +``` +request(ops="update(id=\"\", description=\"Updated description\", tags=[\"attention\", \"inference\"])") +``` + +### Merge duplicate entities + +``` +request(ops="merge(into_id=\"\", from_id=\"\", strategy=\"prefer_into\")") +``` + +Strategies: `prefer_into` (default), `prefer_from`, `union`. + +### Delete a record + +``` +request(ops="delete(id=\"\")") +``` + +Soft-delete by default. Pass `hard=true` for permanent deletion (cascades +edges for entities). + +### Check graph health + +``` +request(ops="stats()") +``` + +Returns entity, edge, note, and event counts. Check `total_edges / +total_entities` — below 4 means the graph needs more linking. + +--- + +## Batch and chain patterns + +### Parallel batch + +Multiple independent operations in one call: + +``` +request(ops="[search(kind=\"entity\", query=\"LoRA\"), search(kind=\"note\", query=\"LoRA\"), stats()]") +``` + +Each op runs independently. A failed op does not abort the batch — each entry +has its own `ok`/`error` field. + +### Two-step create-then-link + +When op B depends on op A's output, use two calls: + +``` +request(ops="create(kind=\"entity\", entity_kind=\"concept\", name=\"NewConcept\")") +# Read the id from the response, then: +request(ops="link(source_id=\"\", target_id=\"\", relation=\"extends\")") +``` + +### Dedup-before-create pattern + +Always search before creating to avoid duplicates: + +``` +request(ops="search(kind=\"entity\", query=\"FlashAttention\")") +# If found: link to existing. If not found: create. 
+``` + +--- + +## See also + +- [Getting Started](getting-started.md) — installation and first session +- [Knowledge Graph Modeling](knowledge-graph.md) — when to use each entity kind + and relation +- [Search and Retrieval](search.md) — how scoring, reranking, and decompose work +- [Memory and Recall](memory.md) — memory-specific patterns +- [GTD Task Management](tasks.md) — task lifecycle details diff --git a/docs/guide/search.md b/docs/guide/search.md new file mode 100644 index 00000000..4a757c9e --- /dev/null +++ b/docs/guide/search.md @@ -0,0 +1,193 @@ +# Search and Retrieval + +This guide covers how to find things in khive — from keyword search to vector +similarity to graph traversal — and when to use each approach. + +## Six ways to retrieve + +khive offers six retrieval verbs, each suited to a different question shape: + +| Verb | Question shape | Example | +| --------------------- | ----------------------------------- | ------------------------------------------- | +| `get(id)` | "I have a UUID, give me the record" | Fetch a known entity after a link operation | +| `search(kind, query)` | "Find things about X" | Discover entities matching a topic | +| `list(kind, filters)` | "Show me all Y" | Browse all concepts, all edges from a node | +| `neighbors(node_id)` | "What connects to this?" | One-hop graph exploration | +| `traverse(roots)` | "What is reachable within N hops?" | Multi-hop lineage, clusters | +| `query(gql)` | "Pattern match over the graph" | Complex structural queries | + +## Text search: `search` + +`search` combines full-text search (FTS5 trigram) with vector similarity +(embedding-based) using Reciprocal Rank Fusion (RRF).
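
As a rough illustration of the fusion step, here is a minimal Python sketch of Reciprocal Rank Fusion over two ranked lists. The function name `rrf_fuse`, the sample IDs, and the constant `k=60` (a value commonly used for RRF) are assumptions for illustration; khive's actual constant and internals are not stated in this guide.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) for every ID it contains,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the FTS and vector passes.
fts_hits = ["flash-attention", "paged-attention", "lora"]
vector_hits = ["flash-attention", "ring-attention", "paged-attention"]

fused = rrf_fuse([fts_hits, vector_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Under this assumed constant, an item ranked first in both lists scores 2/61, which is why raw fused scores stay in the low hundredths even for the best match.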
+ +### Basic search + +``` +request(ops="search(kind=\"entity\", query=\"memory efficient attention\")") +``` + +Returns a scored list: + +```json +[ + {"id": "a1b2c3d4", "name": "FlashAttention", "score": 0.82, ...}, + {"id": "e5f6g7h8", "name": "PagedAttention", "score": 0.71, ...} +] +``` + +### Search notes + +``` +request(ops="search(kind=\"note\", query=\"tiling recomputation\")") +``` + +Note search automatically excludes superseded notes (notes targeted by a +`supersedes` edge). This is a view-layer filter — the old notes still exist. + +### Filtered search + +Narrow by entity kind, type, or tags: + +``` +request(ops="search(kind=\"entity\", query=\"attention\", entity_kind=\"concept\", tags=[\"ml\"])") +``` + +### Score interpretation + +Scores from `search` are RRF fusion scores. Raw RRF values are typically small +(0.01-0.03). When `rerank` is active (via `knowledge.search`), scores are +normalized to [0,1]. + +A practical floor on the normalized [0,1] scale: results below 0.3 are usually +noise, and results above 0.7 are strong matches. Raw RRF values (0.01-0.03) sit +well below both cutoffs, so compare them by rank rather than against these +thresholds. + +## Structured browse: `list` + +`list` returns records matching structured filters, without text similarity: + +``` +request(ops="list(kind=\"entity\", entity_kind=\"concept\", limit=20)") +request(ops="list(kind=\"edge\", source_id=\"\")") +request(ops="list(kind=\"note\", note_kind=\"decision\", limit=10)") +``` + +Use `list` when you want categorical browsing, not similarity ranking. + +## Graph navigation: `neighbors` and `traverse` + +### One-hop: neighbors + +``` +request(ops="neighbors(node_id=\"\", direction=\"both\")") +``` + +Direction options: `out` (default), `in`, `both`. **Always pass +`direction="both"` unless you specifically want only outgoing edges** — the +default is outgoing-only, which misses inbound relationships.
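
To see why the outgoing-only default can hide relationships, consider this toy Python model of direction filtering over an edge list. The `neighbors` function, the edge tuples, and the node names are hypothetical stand-ins for illustration, not khive's implementation.

```python
# Toy edge list: (source, relation, target). Names are illustrative only.
EDGES = [
    ("QLoRA", "extends", "LoRA"),
    ("LoRA", "introduced_by", "LoRA paper"),
    ("DoRA", "variant_of", "LoRA"),
]


def neighbors(node, direction="out"):
    """Return (relation, other_node) pairs for the requested direction."""
    found = []
    for source, relation, target in EDGES:
        # Outgoing edge: the node is the source.
        if direction in ("out", "both") and source == node:
            found.append((relation, target))
        # Incoming edge: the node is the target.
        if direction in ("in", "both") and target == node:
            found.append((relation, source))
    return found


outgoing = neighbors("LoRA")            # default: outgoing edges only
everything = neighbors("LoRA", "both")  # outgoing and incoming edges

print(len(outgoing), "edge(s) with the default direction")
print(len(everything), "edge(s) with direction='both'")
```

In this toy graph the default call sees only the `introduced_by` edge, while `both` also surfaces the two concepts that point at the node.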
+ +Filter by relation: + +``` +request(ops="neighbors(node_id=\"\", direction=\"in\", relations=[\"extends\", \"variant_of\"])") +``` + +### Multi-hop: traverse + +``` +request(ops="traverse(roots=[\"\"], max_depth=3, relations=[\"extends\", \"variant_of\"])") +``` + +Returns paths — each path is a list of nodes from root to leaf. Use +`include_roots=false` to exclude the starting nodes from results. + +Traverse is BFS-based. It respects `direction` (default: `out`) and +`relations` filters. + +## Pattern matching: `query` + +For complex structural questions, use GQL or SPARQL: + +### GQL + +``` +request(ops="query(query=\"MATCH (a:concept)-[:extends]->(b:concept) WHERE b.name = 'LoRA' RETURN a\")") +``` + +``` +request(ops="query(query=\"MATCH (p:document)<-[:introduced_by]-(c:concept)<-[:implements]-(impl:project) RETURN c.name, impl.name\")") +``` + +### SPARQL + +``` +request(ops="query(query=\"SELECT ?a WHERE { ?a :extends+ ?b . ?b :name 'LoRA' . } LIMIT 10\")") +``` + +Both syntaxes compile to the same SQL backend. Use whichever feels natural. + +## Knowledge search: rerank and decompose + +The `knowledge.search` verb adds two capabilities on top of base search: + +### Reranking + +``` +request(ops="knowledge.search(query=\"memory efficient attention mechanisms\", rerank=true)") +``` + +Reranking uses a cross-encoder model to re-score results after the initial +retrieval pass. This produces clean [0,1] scores instead of raw RRF values. +Reranking is on by default for `knowledge.search`. + +### Query decomposition + +``` +request(ops="knowledge.search(query=\"compare LoRA and QLoRA fine-tuning approaches\", decompose=true)") +``` + +Decomposition splits multi-concept queries into sub-queries, runs them +independently, and merges the results. This avoids FTS edge cases where +compound queries miss relevant documents. + +Use `decompose=true` when your query mentions multiple distinct concepts. 
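
One way to picture the split-run-merge flow is sketched below in Python. Everything here is an assumption for illustration: the hand-written sub-queries, the `fake_search` stand-in with canned results, and the keep-the-best-score merge rule. khive's actual splitting and merging logic may differ.

```python
def merge_results(result_lists):
    """Merge per-sub-query hits, keeping the best score seen for each id."""
    best = {}
    for hits in result_lists:
        for hit in hits:
            current = best.get(hit["id"])
            if current is None or hit["score"] > current["score"]:
                best[hit["id"]] = hit
    # Highest score first.
    return sorted(best.values(), key=lambda hit: hit["score"], reverse=True)


# Hypothetical canned results standing in for real sub-query searches.
CANNED = {
    "LoRA fine-tuning": [
        {"id": "lora", "score": 0.9},
        {"id": "peft", "score": 0.5},
    ],
    "QLoRA fine-tuning": [
        {"id": "qlora", "score": 0.8},
        {"id": "peft", "score": 0.6},
    ],
}


def fake_search(sub_query):
    """Stand-in for a single-concept search call (not a khive API)."""
    return CANNED[sub_query]


# Sub-queries written by hand; the real splitter is not specified here.
sub_queries = ["LoRA fine-tuning", "QLoRA fine-tuning"]
merged = merge_results([fake_search(q) for q in sub_queries])

for hit in merged:
    print(hit["id"], hit["score"])
```

The point of the sketch is that a document matching only one concept still reaches the merged list, which is what a single compound query can miss.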
+ +## Memory recall + +`memory.recall` is a specialized search over memory notes with decay-weighted +scoring: + +``` +request(ops="memory.recall(query=\"attention optimization\", limit=5)") +``` + +See [Memory and Recall](memory.md) for the full scoring formula and usage +patterns. + +## Choosing the right retrieval + +| You want to... | Use | +| --------------------------------- | -------------------------------------------- | +| Find entities about a topic | `search(kind="entity", query="...")` | +| Find notes about a topic | `search(kind="note", query="...")` | +| Browse all entities of a kind | `list(kind="entity", entity_kind="concept")` | +| See what connects to a node | `neighbors(node_id="...", direction="both")` | +| Explore multi-hop paths | `traverse(roots=["..."], max_depth=3)` | +| Structural pattern matching | `query(query="MATCH ...")` | +| Find knowledge atoms with scoring | `knowledge.search(query="...", rerank=true)` | +| Recall agent memories | `memory.recall(query="...")` | + +## Performance notes + +- **Cold start**: the first search in a session loads the ANN index and + embedding model. The daemon keeps these warm for subsequent calls. +- **Daemon**: khive auto-spawns `khive-mcp --daemon` on first request. The + daemon keeps indexes hot across sessions. +- **Vector search without embeddings**: if running with `--no-embed`, only FTS + results are returned (no vector similarity). + +## See also + +- [Prompt Cookbook](prompt-cookbook.md) — search patterns with full syntax +- [Memory and Recall](memory.md) — memory-specific recall with decay +- [AGENTS.md](../../AGENTS.md) — GQL and SPARQL examples diff --git a/docs/guide/tasks.md b/docs/guide/tasks.md new file mode 100644 index 00000000..ab059ce9 --- /dev/null +++ b/docs/guide/tasks.md @@ -0,0 +1,197 @@ +# GTD Task Management + +This guide covers task management in khive — the GTD lifecycle, priority levels, +task dependencies, and common workflow patterns. 
+ +## What tasks are + +Tasks in khive are notes with `kind=task`, managed by the GTD pack. They have a +status lifecycle, priority level, optional assignee, and can be linked to +entities in the knowledge graph. + +Tasks are created with `gtd.assign`, not `create(kind="note")`. The GTD pack +handles lifecycle validation and status transitions. + +## Task lifecycle + +``` +inbox ──> next ──> active ──> done + │ │ │ + │ │ └──> cancelled + │ │ + └──> someday waiting ◄──┐ + │ │ + └────────────────────┘ +``` + +### Status meanings + +| Status | Meaning | When to use | +| ----------- | ----------------------------- | ----------------------------------------------------- | +| `inbox` | Captured but not committed | Default. Something that needs triage. | +| `next` | Committed, ready to work on | After triage — this is actionable and prioritized. | +| `active` | Currently in progress | When you start working on it. | +| `done` | Completed | Finished successfully. | +| `cancelled` | Abandoned | No longer relevant. | +| `waiting` | Blocked on something external | Waiting for a response, a dependency, or a condition. | +| `someday` | Deferred indefinitely | Not urgent, not committed, but worth remembering. | + +### Valid transitions + +Not all transitions are valid. The GTD pack validates them: + +- `inbox` can go to: `next`, `someday`, `cancelled` +- `next` can go to: `active`, `waiting`, `someday`, `cancelled` +- `active` can go to: `done`, `waiting`, `cancelled` +- `waiting` can go to: `next`, `active`, `cancelled` +- `someday` can go to: `next`, `cancelled` + +Idempotent transitions (same status to same status) are accepted silently. + +## Creating tasks + +### Basic task + +``` +request(ops="gtd.assign(title=\"Implement FlashAttention-3 in lattice\")") +``` + +Defaults: `status=inbox`, `priority=p2`. 
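
The rules listed under "Valid transitions" can be encoded as a lookup table. The Python sketch below is a hypothetical model, not the GTD pack's code; treating `done` and `cancelled` as terminal is an assumption based on the list showing no outgoing transitions for them.

```python
# Allowed target statuses for each current status, copied from the
# "Valid transitions" list in this guide.
ALLOWED = {
    "inbox": {"next", "someday", "cancelled"},
    "next": {"active", "waiting", "someday", "cancelled"},
    "active": {"done", "waiting", "cancelled"},
    "waiting": {"next", "active", "cancelled"},
    "someday": {"next", "cancelled"},
    "done": set(),       # assumed terminal
    "cancelled": set(),  # assumed terminal
}


def can_transition(current, target):
    """Return True if moving from `current` to `target` is permitted."""
    if current == target:
        return True  # idempotent transitions are accepted
    return target in ALLOWED.get(current, set())


print(can_transition("inbox", "next"))    # a listed transition
print(can_transition("inbox", "active"))  # not listed for inbox
print(can_transition("active", "active")) # same status to same status
```

A table like this makes it easy to check a planned status change locally before sending `gtd.transition`.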
+ +### Task with priority and status + +``` +request(ops="gtd.assign(title=\"Benchmark attention variants\", priority=\"p0\", status=\"next\")") +``` + +### Task linked to an entity + +``` +request(ops="gtd.assign(title=\"Review FlashAttention paper\", context_entity_id=\"\")") +``` + +The `context_entity_id` links the task to a KG entity, making it discoverable +via graph traversal. + +### Task with assignee + +``` +request(ops="gtd.assign(title=\"Write attention tests\", assignee=\"lambda:platform\")") +``` + +### Task with tags + +``` +request(ops="gtd.assign(title=\"Profile memory usage\", tags=[\"perf\", \"attention\"])") +``` + +## Priority levels + +| Priority | Meaning | +| -------- | ---------------------------------------- | +| `p0` | Critical — do now, everything else waits | +| `p1` | High — do today | +| `p2` | Normal — do this cycle (default) | +| `p3` | Low — do when convenient | + +## Working with tasks + +### Get next actions + +``` +request(ops="gtd.next(limit=5)") +``` + +Returns tasks with `status` in `[next, active]`, sorted by priority (p0 first). + +### List tasks by status + +``` +request(ops="gtd.tasks(status=\"active\")") +request(ops="gtd.tasks(status=\"waiting\", assignee=\"lambda:khive\")") +``` + +### Transition a task + +``` +request(ops="gtd.transition(id=\"\", status=\"active\", note=\"started implementation\")") +``` + +The `note` parameter records why the transition happened. + +### Complete a task + +``` +request(ops="gtd.transition(id=\"\", status=\"done\")") +``` + +You can also use `gtd.complete`: + +``` +request(ops="gtd.complete(id=\"\", result=\"all benchmarks pass\")") +``` + +Note: `gtd.complete` requires the task to be in `active` status. Use +`gtd.transition(status="done")` if the task is in another status. + +## Task dependencies + +Tasks can depend on other tasks using `depends_on` edges. 
The GTD pack extends +the base edge contract to allow task-to-task `depends_on` relationships: + +``` +request(ops="gtd.assign(title=\"Write tests\", status=\"next\")") +# Get the task id from the response + +request(ops="gtd.assign(title=\"Run CI\", depends_on=[\"\"])") +``` + +Or link them explicitly: + +``` +request(ops="link(source_id=\"\", target_id=\"\", relation=\"depends_on\")") +``` + +## Workflow patterns + +### Daily review + +``` +request(ops="gtd.next(limit=10)") +``` + +Review actionable tasks, reprioritize, transition stale items to `waiting` or +`someday`. + +### Triage inbox + +``` +request(ops="gtd.tasks(status=\"inbox\")") +``` + +For each inbox item, decide: promote to `next`, defer to `someday`, or +`cancel`. + +``` +request(ops="gtd.transition(id=\"\", status=\"next\", note=\"promoted after review\")") +``` + +### Context switching + +When picking up a new area of work, filter by assignee or tags: + +``` +request(ops="gtd.tasks(assignee=\"lambda:khive\", status=\"next\")") +``` + +### Batch status update + +``` +request(ops="[gtd.transition(id=\"\", status=\"done\"), gtd.transition(id=\"\", status=\"done\")]") +``` + +## See also + +- [Prompt Cookbook](prompt-cookbook.md) — task verb patterns +- [Knowledge Graph Modeling](knowledge-graph.md) — linking tasks to entities +- [Memory and Recall](memory.md) — storing context alongside tasks