From 6824cd772622915f74e9631ab68ec2b6000fe29a Mon Sep 17 00:00:00 2001
From: "mintlify[bot]" <109931778+mintlify[bot]@users.noreply.github.com>
Date: Mon, 1 Jun 2026 09:55:39 +0000
Subject: [PATCH 1/6] docs: document multi-worker DataLoader support for remote
 tables

---
 docs/training/torch.mdx | 77 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)
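
Note on the picklability requirement in "Providing a custom connection factory": the rule can be checked with the standard library alone. The sketch below is only an illustration and contains no LanceDB code. `open_table_stub` and `is_picklable` are hypothetical names, and the stub returns a string instead of opening a table.

```python
import functools
import pickle


def open_table_stub(name: str, region: str = "us-east-1") -> str:
    """Hypothetical stand-in for a connection factory; it opens nothing."""
    return f"{region}/{name}"


def is_picklable(obj) -> bool:
    """Return True if pickle can serialize obj, as a spawned worker requires."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A top-level function pickles by reference to its module and name.
print(is_picklable(open_table_stub))
# A functools.partial pickles when the wrapped callable does.
print(is_picklable(functools.partial(int, base=2)))
# A lambda has no importable name, so pickling it fails.
print(is_picklable(lambda name: name))
```

The top-level function is expected to pass when this file is run as a script, because pickle stores only its module and name and the worker imports it again. That is also why the factory has to live somewhere the worker process can import.
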
diff --git a/docs/training/torch.mdx b/docs/training/torch.mdx
index 9a9dbe5b..e9c571f1 100644
--- a/docs/training/torch.mdx
+++ b/docs/training/torch.mdx
@@ -96,3 +96,80 @@ dataloader = torch.utils.data.DataLoader(
 for batch in dataloader:
     print(batch.schema)
 ```
+
+## Using multiple DataLoader workers
+
+PyTorch's `DataLoader` can fan out reads across worker processes by setting `num_workers > 0`. LanceDB tables and `Permutation` objects are picklable, so each worker reopens its own connection after the worker process starts.
+
+Because LanceDB is multi-threaded internally, use the `spawn` start method (not `fork`) when running with multiple workers. See [the performance guide](/performance) for more on safe multiprocessing patterns.
+
+```py Python icon=Python 
+from lancedb.permutation import Permutation
+
+permutation = Permutation.identity(table)
+dataloader = torch.utils.data.DataLoader(
+    permutation,
+    batch_size=1024,
+    shuffle=True,
+    num_workers=4,
+    multiprocessing_context="spawn",
+    persistent_workers=True,
+)
+```
+
+### Remote tables in DataLoader workers
+
+Tables opened from a remote LanceDB Enterprise connection (`db://...`) also work with multi-worker DataLoaders. The connection details needed to reopen the table — `db_url`, `api_key`, `region`, `host_override`, and the serializable parts of `client_config` — travel with the pickled table and are used to rebuild the connection in each worker.
+
+```py Python icon=Python 
+import lancedb
+from lancedb.permutation import Permutation
+
+db = lancedb.connect(
+    "db://my-database",
+    api_key="sk-...",
+    region="us-east-1",
+)
+table = db.open_table("my_table")
+
+permutation = Permutation.identity(table).select_columns(["id", "image"])
+dataloader = torch.utils.data.DataLoader(
+    permutation,
+    batch_size=512,
+    num_workers=4,
+    multiprocessing_context="spawn",
+)
+```
+
+<Note>
+This embeds the API key in the pickle sent to each worker. If you'd rather load credentials inside the worker — for example, from an environment variable or a secret manager — use the connection factory escape hatch described below. A factory is also required when your `client_config` uses a non-serializable `header_provider`.
+</Note>
+
+### Providing a custom connection factory
+
+`Permutation.with_connection_factory` lets you control how each worker reopens the base table. The factory takes the base table name and returns a LanceDB table. It must be picklable, which in practice means a top-level function, a `functools.partial` of one, or an instance of a picklable class with `__call__` — lambdas and closures over local variables will not work.
+
+```py Python icon=Python 
+import os
+import lancedb
+from lancedb.permutation import Permutation
+
+def open_table(name: str):
+    db = lancedb.connect(
+        "db://my-database",
+        api_key=os.environ["LANCEDB_API_KEY"],
+        region="us-east-1",
+    )
+    return db.open_table(name)
+
+permutation = (
+    Permutation.identity(table)
+    .with_connection_factory(open_table)
+)
+dataloader = torch.utils.data.DataLoader(
+    permutation,
+    batch_size=512,
+    num_workers=4,
+    multiprocessing_context="spawn",
+)
+```

From f449aa6b91f31098a88ddda2ed7dbae301021e56 Mon Sep 17 00:00:00 2001
From: prrao87 <35005448+prrao87@users.noreply.github.com>
Date: Tue, 9 Jun 2026 13:32:53 -0400
Subject: [PATCH 2/6] docs: tighten multi-worker torch guidance

---
 docs/training/torch.mdx | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)
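
Note on the `spawn` guidance in this patch: PyTorch's `multiprocessing_context` argument accepts either a start-method name, as in the docs examples, or a context object. A minimal sketch of building that context with the standard library follows; `pick_context` is a hypothetical helper, not part of PyTorch or LanceDB.

```python
import multiprocessing as mp


def pick_context(preferred: str = "spawn"):
    """Return a multiprocessing context for the preferred start method."""
    if preferred not in mp.get_all_start_methods():
        raise ValueError(f"start method {preferred!r} is not available here")
    return mp.get_context(preferred)


# With spawn, workers re-import the main module, so entry-point code
# belongs under this guard to avoid running again in every worker.
if __name__ == "__main__":
    ctx = pick_context()
    print(ctx.get_start_method())
```

The returned context could be passed as `multiprocessing_context=ctx` in place of the string `"spawn"`.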

diff --git a/docs/training/torch.mdx b/docs/training/torch.mdx
index e9c571f1..ba50db34 100644
--- a/docs/training/torch.mdx
+++ b/docs/training/torch.mdx
@@ -99,11 +99,12 @@ for batch in dataloader:
 
 ## Using multiple DataLoader workers
 
-PyTorch's `DataLoader` can fan out reads across worker processes by setting `num_workers > 0`. LanceDB tables and `Permutation` objects are picklable, so each worker reopens its own connection after the worker process starts.
+Set `num_workers > 0` to read from LanceDB in multiple PyTorch worker processes. LanceDB tables and `Permutation` objects are picklable, so each worker reopens the table after it starts.
 
-Because LanceDB is multi-threaded internally, use the `spawn` start method (not `fork`) when running with multiple workers. See [the performance guide](/performance) for more on safe multiprocessing patterns.
+Prefer the `spawn` start method when using multiple workers: LanceDB uses internal threads, which are not safe to carry across a `fork`. See [the performance guide](/performance) for more multiprocessing guidance.
 
 ```py Python icon=Python 
+import torch
 from lancedb.permutation import Permutation
 
 permutation = Permutation.identity(table)
@@ -119,11 +120,11 @@ dataloader = torch.utils.data.DataLoader(
 
 ### Remote tables in DataLoader workers
 
-Tables opened from a remote LanceDB Enterprise connection (`db://...`) also work with multi-worker DataLoaders. The connection details needed to reopen the table — `db_url`, `api_key`, `region`, `host_override`, and the serializable parts of `client_config` — travel with the pickled table and are used to rebuild the connection in each worker.
+Remote LanceDB Enterprise tables (`db://...`) work the same way: workers reopen the table from the pickled connection state.
 
 ```py Python icon=Python 
 import lancedb
-from lancedb.permutation import Permutation
+import torch
 
 db = lancedb.connect(
     "db://my-database",
@@ -132,9 +133,8 @@ db = lancedb.connect(
 )
 table = db.open_table("my_table")
 
-permutation = Permutation.identity(table).select_columns(["id", "image"])
 dataloader = torch.utils.data.DataLoader(
-    permutation,
+    table,
     batch_size=512,
     num_workers=4,
     multiprocessing_context="spawn",
@@ -142,16 +142,17 @@ dataloader = torch.utils.data.DataLoader(
 ```
 
 <Note>
-This embeds the API key in the pickle sent to each worker. If you'd rather load credentials inside the worker — for example, from an environment variable or a secret manager — use the connection factory escape hatch described below. A factory is also required when your `client_config` uses a non-serializable `header_provider`.
+This sends the connection state, including the API key, to each worker. Use a connection factory if credentials should be loaded inside the worker or your `client_config` contains a non-serializable `header_provider`.
 </Note>
 
 ### Providing a custom connection factory
 
-`Permutation.with_connection_factory` lets you control how each worker reopens the base table. The factory takes the base table name and returns a LanceDB table. It must be picklable, which in practice means a top-level function, a `functools.partial` of one, or an instance of a picklable class with `__call__` — lambdas and closures over local variables will not work.
+`Permutation.with_connection_factory` lets each worker reopen the base table with custom logic. The factory takes the table name, returns a LanceDB table, and must be picklable.
 
 ```py Python icon=Python 
 import os
 import lancedb
+import torch
 from lancedb.permutation import Permutation
 
 def open_table(name: str):
@@ -162,6 +163,7 @@ def open_table(name: str):
     )
     return db.open_table(name)
 
+table = open_table("my_table")
 permutation = (
     Permutation.identity(table)
     .with_connection_factory(open_table)

From 70291f29a2ff31fbae6bf31272d2eaabe5656a27 Mon Sep 17 00:00:00 2001
From: prrao87 <35005448+prrao87@users.noreply.github.com>
Date: Thu, 2 Jul 2026 14:21:24 -0400
Subject: [PATCH 3/6] docs: fix PyTorch DataLoader table collation examples

---
 docs/training/torch.mdx | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)
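
Note on the `(columns, rows)` shape mentioned in this patch: the layout can be pictured with plain Python lists. This is a sketch of the layout only, not how `tbl_to_tensor` is implemented; `columns_first` is a hypothetical name.

```python
def columns_first(columns: dict[str, list[float]]) -> list[list[float]]:
    """Stack equal-length columns so the outer index is the column."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    return [list(values) for values in columns.values()]


batch = columns_first({"a": [0, 1, 2], "b": [10, 11, 12]})
shape = (len(batch), len(batch[0]))
print(shape)     # (2, 3): 2 columns, 3 rows
print(batch[1])  # [10, 11, 12]: every value of column "b"
```

A model that expects one row per sample would need the transpose of a tensor laid out this way.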

diff --git a/docs/training/torch.mdx b/docs/training/torch.mdx
index ba50db34..1c5d1626 100644
--- a/docs/training/torch.mdx
+++ b/docs/training/torch.mdx
@@ -17,13 +17,14 @@ The `Table` class in LanceDB implements a contract for a PyTorch
 import lancedb
 import torch
 import pyarrow as pa
+from lancedb.util import tbl_to_tensor
 
 mem_db = lancedb.connect("memory://")
 table = mem_db.create_table("test_table", pa.table({"a": range(1000)}))
 
 # Any LanceDB table can be used as a PyTorch Dataset
 dataloader = torch.utils.data.DataLoader(
-    table, batch_size=1024, shuffle=True
+    table, batch_size=1024, shuffle=True, collate_fn=tbl_to_tensor
 )
 
 for batch in dataloader:
@@ -42,12 +43,17 @@ dataloader = torch.utils.data.DataLoader(permutation)
 
 ## Output Formats
 
-By default, a `Table` data loader will emit a `pyarrow.RecordBatch`.  To convert to a different format (such as a
-`pytorch.Tensor`), you will need to provide a custom collate function.
+By default, a `Table` data loader will emit Arrow data. PyTorch calls the `collate_fn` argument to turn the fetched
+items into one batch. Its default collate function only knows how to combine tensors, NumPy arrays, numbers, dicts, and
+lists, so it does not accept Arrow data directly. Direct `Table` data loaders should provide a custom collate function
+such as `lancedb.util.tbl_to_tensor`, which converts numeric Arrow columns into a column-major `torch.Tensor` with shape
+`(columns, rows)`.
 
-The `Permutation` class is more flexible.  By default, the output will be a list of dicts.  This is the default output
-format of standard data loaders and usually more convenient when you are getting started.  However, there is a
-significant performance penalty converting from Arrow, Lance's internal representation, to this default format.
+`Permutation` works differently: its default output is a list of Python dicts, which PyTorch's default collate function
+can batch into a dict of tensors. This is usually more convenient when you are getting started. However, there is a
+significant performance penalty converting from Arrow, Lance's internal representation, to this default format. Use a
+direct `Table` with `collate_fn` when you want Arrow-to-tensor conversion, or a `Permutation` when you want the default
+PyTorch dict-of-tensors behavior.
 
 To address this, the `Permutation` class provides a set of builtin transform functions that can be applied to map
 the Arrow data in different ways.  The `arrow` and `polars` formats will always avoid data copies.  However, `numpy`,
@@ -125,6 +131,7 @@ Remote LanceDB Enterprise tables (`db://...`) work the same way: workers reopen
 ```py Python icon=Python 
 import lancedb
 import torch
+from lancedb.util import tbl_to_tensor
 
 db = lancedb.connect(
     "db://my-database",
@@ -138,6 +145,7 @@ dataloader = torch.utils.data.DataLoader(
     batch_size=512,
     num_workers=4,
     multiprocessing_context="spawn",
+    collate_fn=tbl_to_tensor,
 )
 ```
 

From 55e040f6b3e145e30e6f100e59927d08c0c4b616 Mon Sep 17 00:00:00 2001
From: prrao87 <35005448+prrao87@users.noreply.github.com>
Date: Thu, 2 Jul 2026 14:23:54 -0400
Subject: [PATCH 4/6] docs: clarify PyTorch collate function usage

---
 docs/training/torch.mdx | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
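
Note on "`collate_fn` is PyTorch's batching hook": the sketch below shows, in simplified form, what a map-style `DataLoader` in recent PyTorch versions does for each batch. `fetch_batch` is a hypothetical name, and a plain list stands in for the dataset.

```python
def fetch_batch(dataset, indices, collate_fn):
    """Fetch the items for one batch, then hand them to collate_fn."""
    if hasattr(dataset, "__getitems__"):
        # Datasets with a batched getter fetch all indices in one call.
        items = dataset.__getitems__(indices)
    else:
        items = [dataset[i] for i in indices]
    return collate_fn(items)


rows = [{"a": i} for i in range(6)]
batch = fetch_batch(rows, [0, 2, 4], lambda items: [item["a"] for item in items])
print(batch)  # [0, 2, 4]
```

Whatever the dataset returns reaches `collate_fn` unchanged, which is why Arrow data needs a collate function that understands it.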

diff --git a/docs/training/torch.mdx b/docs/training/torch.mdx
index 1c5d1626..c04d2760 100644
--- a/docs/training/torch.mdx
+++ b/docs/training/torch.mdx
@@ -43,11 +43,11 @@ dataloader = torch.utils.data.DataLoader(permutation)
 
 ## Output Formats
 
-By default, a `Table` data loader will emit Arrow data. PyTorch calls the `collate_fn` argument to turn the fetched
-items into one batch. Its default collate function only knows how to combine tensors, NumPy arrays, numbers, dicts, and
-lists, so it does not accept Arrow data directly. Direct `Table` data loaders should provide a custom collate function
-such as `lancedb.util.tbl_to_tensor`, which converts numeric Arrow columns into a column-major `torch.Tensor` with shape
-`(columns, rows)`.
+By default, a `Table` data loader will emit Arrow data. `collate_fn` is PyTorch's batching hook: PyTorch calls it to
+turn the fetched items into one batch. PyTorch's default collate function only knows how to combine tensors, NumPy
+arrays, numbers, dicts, and lists, so it does not accept Arrow data directly. When using a `Table` directly, pass
+LanceDB's `lancedb.util.tbl_to_tensor` helper as PyTorch's `collate_fn`; it converts numeric Arrow columns into a
+column-major `torch.Tensor` with shape `(columns, rows)`.
 
 `Permutation` works differently: its default output is a list of Python dicts, which PyTorch's default collate function
 can batch into a dict of tensors. This is usually more convenient when you are getting started. However, there is a

From ea3815a8d7759a20e2a6bf71e241ddc63ca8bfc7 Mon Sep 17 00:00:00 2001
From: prrao87 <35005448+prrao87@users.noreply.github.com>
Date: Thu, 2 Jul 2026 15:12:26 -0400
Subject: [PATCH 5/6] docs: add Hermes Agent memory integration

Add an integration page for the LanceDB memory plugin for Hermes Agent, covering install and activation, the four memory tools, a remember -> recall -> read -> forget walkthrough, vector/hybrid retrieval modes, store inspection, and configuration. Wires the page into the AI Platforms & Frameworks nav group.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 docs/docs.json                        |   1 +
 docs/integrations/ai/hermes-agent.mdx | 321 ++++++++++++++++++++++++++
 2 files changed, 322 insertions(+)
 create mode 100644 docs/integrations/ai/hermes-agent.mdx
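
Note on the `hybrid` retrieval mode documented in this page: reciprocal rank fusion (RRF) merges the vector and BM25 result lists by rank rather than by raw score. The sketch below is a generic illustration and not the plugin's code. `rrf_fuse` is a hypothetical name, and `k = 60` is the commonly used constant, assumed here because the page does not state the plugin's value.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["m1", "m2", "m3"]  # hypothetical memory IDs, best first
bm25_hits = ["m3", "m1", "m4"]
fused = rrf_fuse([vector_hits, bm25_hits])
print([doc_id for doc_id, _ in fused])  # ['m1', 'm3', 'm2', 'm4']
```

IDs that appear in both lists rise to the top, which is the behavior that lets hybrid mode reward memories matching on both meaning and exact terms.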

diff --git a/docs/docs.json b/docs/docs.json
index 2e9f8eb4..7720741c 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -277,6 +277,7 @@
                 "group": "AI Platforms & Frameworks",
                 "pages": [
                   "integrations/ai/agno",
+                  "integrations/ai/hermes-agent",
                   "integrations/ai/huggingface",
                   "integrations/ai/langchain",
                   "integrations/ai/llamaIndex",
diff --git a/docs/integrations/ai/hermes-agent.mdx b/docs/integrations/ai/hermes-agent.mdx
new file mode 100644
index 00000000..2be3414f
--- /dev/null
+++ b/docs/integrations/ai/hermes-agent.mdx
@@ -0,0 +1,321 @@
+---
+title: "Hermes Agent"
+sidebarTitle: "Hermes Agent"
+description: "Use LanceDB as a persistent, semantic memory backend for Hermes Agent — durable recall across sessions with vector and hybrid search."
+---
+
+[Hermes Agent](https://github.com/NousResearch/hermes-agent) is a self-hosted, open-source
+personal agent from [Nous Research](https://nousresearch.com). You can talk to it from a
+terminal UI or reach the same agent from Telegram, Discord, and Slack, and it exposes a
+dedicated slot for external *memory providers* that run alongside its built-in notes.
+
+The [LanceDB memory plugin](https://github.com/lancedb/hermes-agent-memory) fills that slot.
+It gives Hermes durable, semantic recall across sessions: state a preference or a project
+convention once, and the agent can retrieve it weeks later in a brand-new session — even when
+you ask for it in completely different words. Everything runs inside Hermes' own Python
+process, storing a single LanceDB table on local disk. There's no memory server to operate.
+
+The mental model is clean: **Hermes owns the agent loop; LanceDB manages the durable
+long-term memory and offers semantic recall.**
+
+## Why LanceDB fits agent memory
+
+Out of the box, Hermes remembers with a small curated notes file frozen into the system
+prompt, plus lexical (keyword) search over past sessions. Both are useful, but keyword search
+misses paraphrases of what you originally typed — the exact thing you need when recalling a
+fact you phrased differently months ago.
+
+LanceDB is an embedded retrieval library, which makes it a natural fit here:
+
+- **No server to stand up** — it reads and writes a table on local disk, so the plugin ships
+  as a dependency rather than a service to operate.
+- **One table holds everything** — content, metadata, and embeddings live together. A memory
+  becomes a structured row with a category, tags, timestamps, and provenance, not just a text
+  blob.
+- **Query it any way you need** — vector similarity for meaning, BM25 full-text for exact
+  names and jargon, a hybrid of the two, or plain metadata filters to keep recall scoped to
+  the right workspace.
+- **It scales up** — the same table abstraction carries over to larger LanceDB deployments
+  later, so the local setup is never a dead end.
+
+## Install and activate
+
+<Tip>
+Want to try this without touching your existing Hermes setup? Run everything in an isolated
+profile: `hermes profile create demo`, then add `-p demo` to the commands below. When you're
+done, `rm -rf ~/.hermes/profiles/demo` removes every trace of it.
+</Tip>
+
+<Steps>
+<Step title="Install Hermes Agent">
+Skip this if you already have Hermes installed.
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
+```
+</Step>
+
+<Step title="Install the plugin">
+This shallow-clones the plugin into `~/.hermes/plugins/lancedb/`.
+
+```bash
+hermes plugins install lancedb/hermes-agent-memory
+```
+</Step>
+
+<Step title="Install runtime dependencies into Hermes' environment">
+Hermes loads plugins inside its own Python interpreter, so the dependencies go *there* — not
+into a separate virtualenv. (This interpreter is shared across profiles, so you only install
+once.)
+
+```bash
+uv pip install --python ~/.hermes/hermes-agent/venv/bin/python3 lancedb openai pyyaml
+```
+</Step>
+
+<Step title="Set your embeddings API key">
+The plugin turns conversations into embeddings, so it needs an embeddings key. By default that
+is OpenAI, so set `OPENAI_API_KEY` in your environment or in `~/.hermes/.env`.
+
+<Info>
+Prefer a local or non-OpenAI model? The plugin uses an OpenAI-compatible client, so you can
+point it at any compatible endpoint (OpenRouter, Ollama, vLLM, …) in your config — no code
+change needed. See [Configuration](#configuration) below.
+</Info>
+</Step>
+
+<Step title="Activate and verify">
+Switch memory on and pick this plugin:
+
+```bash
+hermes memory setup     # choose "lancedb"
+```
+
+Then confirm it's actually active before you start chatting — this is the one step worth not
+skipping, because Hermes quietly falls back to its built-in notes if the provider isn't set:
+
+```bash
+hermes memory status
+```
+
+```text
+Memory status
+────────────────────────────────────────
+  Built-in:  always active
+  Provider:  lancedb
+
+  Plugin:    installed ✓
+  Status:    available ✓
+```
+
+You want to see `Provider: lancedb` with both `installed ✓` and `available ✓`.
+</Step>
+</Steps>
+
+## The memory tools
+
+Once activated, the agent has four tools for working with long-term memory:
+
+| Tool | What it does |
+|:--|:--|
+| `lancedb_recall` | Semantic (vector, the default) or hybrid search over your workspace memory. Returns matching facts with scores and provenance. |
+| `lancedb_remember` | Stores a durable fact when you explicitly ask. Deduplicated by content hash, so remembering the same thing twice doesn't pile up rows. |
+| `lancedb_read` | Fetches a single memory by ID, optionally with the original conversation messages it was distilled from. |
+| `lancedb_forget` | Deletes safely: previews candidates first, then deletes by exact ID, so nothing disappears by accident. |
+
+Beyond these tools, the plugin also captures durable facts from your conversations
+automatically — an auxiliary model distills them before context is compressed and again when a
+session ends, so insights survive even when the raw messages are summarized away.
+
+## Walkthrough: teach it a project convention
+
+Let's make this concrete with the pain we opened on: re-explaining your setup every session.
+We'll save a convention once, then prove a brand-new session can recall it — touching all four
+tools along the way.
+
+### Remember
+
+Ask Hermes to commit a convention to long-term memory. Saying "remember in long-term memory"
+makes sure it lands in the LanceDB store, which shows up as the `⚡ lancedb_r` (`lancedb_remember`)
+line below:
+
+```text
+● Remember in long-term memory: for this project I only use uv, never pip, and I always add type hints to Python functions.
+
+  ┊ 🧠 memory    +memory: "For this project, the user only uses uv for Python package management, never pip, and always adds type hints to Python functions."
+  ┊ ⚡ lancedb_r   0.0s
+ ─  ⚕ Hermes  ────────────────────────────────────────────────────────────────
+
+     Remembered. I've stored that project convention: use uv only, never pip, and always add type hints to Python functions.
+```
+
+### Recall
+
+First, take Hermes' built-in notes out of the picture so recall can *only* come from LanceDB —
+the two layers run side by side otherwise, and either could answer:
+
+```bash
+hermes config set memory.memory_enabled false
+```
+
+Now start a fresh session (`/new`) and ask for the convention back, worded completely
+differently from how you saved it:
+
+```text
+● /new
+● Look through your long-term memory for my Python project preferences.
+
+ ─  ⚕ Hermes  ────────────────────────────────────────────────────────────────
+
+     I found one durable memory about Python project preferences:
+
+     - Project Python tooling preference is uv; pip is not used; type hints are required.
+```
+
+This is the whole point in one exchange. You asked about your *preferences* using none of the
+words you stored the fact with — no *uv*, no *pip*, no *type hints* — yet a fresh session, with
+none of the earlier conversation in context, pulls it straight back. Semantic search matches on
+meaning rather than exact wording, which is precisely where keyword search falls short.
+
+<Tip>
+Turn the built-in layer back on for everyday use with `hermes config set memory.memory_enabled true`
+— the two memory layers happily coexist. We only disabled it here to prove the recall came from
+LanceDB.
+</Tip>
+
+### Read
+
+You can also ask where a fact came from. Hermes attributes the answer to its stored memory
+rather than guessing from a file in the repo (under the hood, `lancedb_read` can also return
+the exact source messages a fact was distilled from):
+
+```text
+● Where did that uv preference come from?
+
+ ─  ⚕ Hermes  ────────────────────────────────────────────────────────────────
+
+     It came from the project's persistent memory, not from a codebase file I inspected.
+
+     Specifically, the recalled memory says:
+     - "For this project, the user only uses uv for Python package management, never pip, and always adds type hints to Python functions."
+```
+
+### Forget
+
+When a preference changes, ask Hermes to drop the old fact. The tool calls tell the whole
+story: the two `⚡ lancedb_f` (`lancedb_forget`) lines are it previewing matches and then
+deleting, and the trailing `⚡ lancedb_r` is it saving the replacement in the same breath:
+
+```text
+● I've switched from uv to pixi. Forget the uv preference.
+
+  ┊ ⚡ lancedb_f uv preference python package management never pip type hints  0.3s
+  ┊ ⚡ lancedb_f   0.0s
+  ┊ ⚡ lancedb_r   0.2s
+ ─  ⚕ Hermes  ────────────────────────────────────────────────────────────────
+
+     Done — I removed the uv preference from memory and replaced it with:
+
+     - This project uses pixi for Python package management
+     - Type hints on Python functions are still preferred
+```
+
+Remember, recall, read, forget: four small operations that between them cover the entire
+lifecycle of a durable memory.
+
+## Retrieval modes
+
+Recall ships in `vector` mode by default — pure semantic search, which is what survives the
+paraphrasing you saw above. If you also need exact name or jargon matching, switch to `hybrid`
+(vector + BM25) and choose how the two legs are fused: RRF, a vector-biased linear blend, or a
+cross-encoder reranker. Mode is set per call; fusion is a config setting.
+
+```yaml
+# ~/.hermes/config.yaml
+plugins:
+  lancedb:
+    retrieval:
+      mode: hybrid          # vector (default) | hybrid
+      reranker:
+        type: rrf           # how the vector + BM25 legs are fused
+        # Swap RRF for a reranking pass (pulls in sentence-transformers + torch):
+        # type: cross-encoder
+        # model: cross-encoder/ettin-reranker-17m-v1
+        # rerank_top_n: 50
+```
+
+The cross-encoder is the one path that pulls in a local ML stack, so it stays opt-in. It
+defaults to the compact 17M-parameter [ettin reranker](https://huggingface.co/cross-encoder/ettin-reranker-17m-v1).
+
+## Inspect the store
+
+Everything lives in one table named `memories` at `~/.hermes/lancedb/memories.lance`. Because
+it's a plain LanceDB table, you can open it directly and see exactly what the agent has stored
+— a `kind` column separates extracted `fact` rows from the raw `turn` rows they were drawn
+from:
+
+```python
+import lancedb
+
+db = lancedb.connect("~/.hermes/lancedb")
+tbl = db.open_table("memories")
+print(tbl.to_pandas()[["kind", "category", "content"]].head())
+```
+
+## Configuration
+
+The plugin runs on sensible defaults once activated — you don't have to configure anything.
+`~/.hermes/config.yaml` is purely for overrides. Two common ones:
+
+Use a cheaper model for the auxiliary fact-extraction calls:
+
+```yaml
+# ~/.hermes/config.yaml
+auxiliary:
+  lancedb_extraction:
+    provider: openrouter
+    model: google/gemini-3-flash
+```
+
+Point embeddings at a fully local endpoint (for example, Ollama) so nothing leaves your
+machine:
+
+```yaml
+# ~/.hermes/config.yaml
+plugins:
+  lancedb:
+    embedding:
+      model: nomic-embed-text
+      base_url: http://localhost:11434/v1
+      api_key_env: OLLAMA_API_KEY      # any value works for local Ollama
+```
+
+<Info>
+Changing the embedding model (or its dimension) against an existing store requires recreating
+the table — the plugin fails loudly on a dimension mismatch rather than silently returning
+nothing. Every option is documented in the plugin's [`default_config.yaml`](https://github.com/lancedb/hermes-agent-memory/blob/main/src/default_config.yaml).
+</Info>
+
+## Benchmark
+
+On [LongMemEval-S](https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned), a
+long-conversation QA benchmark, LanceDB's semantic recall clearly beat Hermes' built-in lexical
+search (0.66 vs. 0.53 answer accuracy) by finding the right messages even when the question was
+worded differently from the original conversation. For the full methodology, the
+per-question-type breakdown, and a reproducible harness, see the
+[blog post](https://www.lancedb.com/blog/semantic-memory-for-hermes-agent-with-lancedb) and the
+[benchmark harness](https://github.com/lancedb/hermes-agent-memory/tree/main/benchmarks).
+
+## Why this works well
+
+- **It's local-first and embedded.** The memory table lives on your disk with no server to run;
+  the plugin installs as a dependency of Hermes' own environment.
+- **Recall survives paraphrasing.** Semantic search matches meaning, not spelling, which is the
+  failure mode that sinks keyword-only session search.
+- **Memories are structured and traceable.** Each fact is a row with metadata and a link back
+  to the messages it came from, and `forget` always previews before it deletes.
+- **Nothing about it is a dead end.** As your needs grow, the same table abstraction carries
+  over to LanceDB [Enterprise](/enterprise) for automatic compaction, reindexing, and scale.
+
+To try it, install the plugin, enable it with `hermes memory setup`, and run the kind of
+workflow we walked through above.

From 204112d08862bc4b33d1816ade018eaef26349d4 Mon Sep 17 00:00:00 2001
From: prrao87 <35005448+prrao87@users.noreply.github.com>
Date: Thu, 2 Jul 2026 15:20:15 -0400
Subject: [PATCH 6/6] Fix nits

---
 .gitignore                            |  1 +
 docs/docs.json                        |  2 ++
 docs/integrations/ai/hermes-agent.mdx | 20 +++++++++++++-------
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/.gitignore b/.gitignore
index 557b3a08..3b9c7997 100644
--- a/.gitignore
+++ b/.gitignore
@@ -24,3 +24,4 @@ workflows/docs-audit/artifacts/**
 !workflows/docs-audit/artifacts/
 workflows/docs-audit/state/**
 !workflows/docs-audit/state/
+scratch
\ No newline at end of file
diff --git a/docs/docs.json b/docs/docs.json
index 7720741c..1beaca6e 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -264,6 +264,7 @@
               },
               {
                 "group": "Data Platforms & Frameworks",
+                "expanded": true,
                 "pages": [
                   "integrations/data/pydantic",
                   "integrations/data/duckdb",
@@ -275,6 +276,7 @@
               },
               {
                 "group": "AI Platforms & Frameworks",
+                "expanded": true,
                 "pages": [
                   "integrations/ai/agno",
                   "integrations/ai/hermes-agent",
diff --git a/docs/integrations/ai/hermes-agent.mdx b/docs/integrations/ai/hermes-agent.mdx
index 2be3414f..3e6b4c0c 100644
--- a/docs/integrations/ai/hermes-agent.mdx
+++ b/docs/integrations/ai/hermes-agent.mdx
@@ -1,7 +1,7 @@
 ---
 title: "Hermes Agent"
 sidebarTitle: "Hermes Agent"
-description: "Use LanceDB as a persistent, semantic memory backend for Hermes Agent — durable recall across sessions with vector and hybrid search."
+description: "Use LanceDB as a persistent, semantic memory backend for Hermes Agent. Get durable recall across sessions with vector and hybrid search."
 ---
 
 [Hermes Agent](https://github.com/NousResearch/hermes-agent) is a self-hosted, open-source
@@ -15,8 +15,12 @@ convention once, and the agent can retrieve it weeks later in a brand-new sessio
 you ask for it in completely different words. Everything runs inside Hermes' own Python
 process, storing a single LanceDB table on local disk. There's no memory server to operate.
 
-The mental model is clean: **Hermes owns the agent loop; LanceDB manages the durable
-long-term memory and offers semantic recall.**
+<Info>
+**The mental model is clean**
+
+- Hermes owns the agent loop.
+- LanceDB manages the durable long-term memory and offers semantic recall.
+</Info>
 
 ## Why LanceDB fits agent memory
 
@@ -127,10 +131,12 @@ Beyond these tools, the plugin also captures durable facts from your conversatio
 automatically — an auxiliary model distills them before context is compressed and again when a
 session ends, so insights survive even when the raw messages are summarized away.
 
-## Walkthrough: teach it a project convention
+## Walkthrough
+
+"_Teach it your project preferences_"
 
-Let's make this concrete with the pain we opened on: re-explaining your setup every session.
-We'll save a convention once, then prove a brand-new session can recall it — touching all four
+Let's make this concrete with a familiar pain point: re-explaining your setup to the agent every session.
+We'll save a convention once and then show that a brand-new session can recall it. This example touches all four
+tools along the way.
 
 ### Remember
@@ -308,7 +314,7 @@ per-question-type breakdown, and a reproducible harness, see the
 
 ## Why this works well
 
-- **It's local-first and embedded.** The memory table lives on your disk with no server to run;
+- **It's local-first and embedded.** The LanceDB memory table lives on your disk with no server to run;
   the plugin installs as a dependency of Hermes' own environment.
 - **Recall survives paraphrasing.** Semantic search matches meaning, not spelling, which is the
   failure mode that sinks keyword-only session search.