diff --git a/.gitignore b/.gitignore
index 557b3a08..3b9c7997 100644
--- a/.gitignore
+++ b/.gitignore
@@ -24,3 +24,4 @@ workflows/docs-audit/artifacts/**
!workflows/docs-audit/artifacts/
workflows/docs-audit/state/**
!workflows/docs-audit/state/
+scratch
\ No newline at end of file
diff --git a/docs/docs.json b/docs/docs.json
index 2e9f8eb4..1beaca6e 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -264,6 +264,7 @@
},
{
"group": "Data Platforms & Frameworks",
+ "expanded": true,
"pages": [
"integrations/data/pydantic",
"integrations/data/duckdb",
@@ -275,8 +276,10 @@
},
{
"group": "AI Platforms & Frameworks",
+ "expanded": true,
"pages": [
"integrations/ai/agno",
+ "integrations/ai/hermes-agent",
"integrations/ai/huggingface",
"integrations/ai/langchain",
"integrations/ai/llamaIndex",
diff --git a/docs/integrations/ai/hermes-agent.mdx b/docs/integrations/ai/hermes-agent.mdx
new file mode 100644
index 00000000..3e6b4c0c
--- /dev/null
+++ b/docs/integrations/ai/hermes-agent.mdx
@@ -0,0 +1,327 @@
+---
+title: "Hermes Agent"
+sidebarTitle: "Hermes Agent"
+description: "Use LanceDB as a persistent, semantic memory backend for Hermes Agent. Get durable recall across sessions with vector and hybrid search."
+---
+
+[Hermes Agent](https://github.com/NousResearch/hermes-agent) is a self-hosted, open-source
+personal agent from [Nous Research](https://nousresearch.com). You can talk to it from a
+terminal UI or reach the same agent from Telegram, Discord, and Slack, and it exposes a
+dedicated slot for external *memory providers* that run alongside its built-in notes.
+
+The [LanceDB memory plugin](https://github.com/lancedb/hermes-agent-memory) fills that slot.
+It gives Hermes durable, semantic recall across sessions: state a preference or a project
+convention once, and the agent can retrieve it weeks later in a brand-new session — even when
+you ask for it in completely different words. Everything runs inside Hermes' own Python
+process, storing a single LanceDB table on local disk. There's no memory server to operate.
+
+
+**The mental model is simple:**
+
+- Hermes owns the agent loop.
+- LanceDB manages durable long-term memory and provides semantic recall.
+
+
+## Why LanceDB fits agent memory
+
+Out of the box, Hermes remembers with a small curated notes file frozen into the system
+prompt, plus lexical (keyword) search over past sessions. Both are useful, but keyword search
+misses paraphrases of what you originally typed — the exact thing you need when recalling a
+fact you phrased differently months ago.
+
+LanceDB is an embedded retrieval library, which makes it a natural fit here:
+
+- **No server to stand up** — it reads and writes a table on local disk, so the plugin ships
+ as a dependency rather than a service to operate.
+- **One table holds everything** — content, metadata, and embeddings live together. A memory
+ becomes a structured row with a category, tags, timestamps, and provenance, not just a text
+ blob.
+- **Query it any way you need** — vector similarity for meaning, BM25 full-text for exact
+ names and jargon, a hybrid of the two, or plain metadata filters to keep recall scoped to
+ the right workspace.
+- **It scales up** — the same table abstraction carries over to larger LanceDB deployments
+ later, so the local setup is never a dead end.
+
+## Install and activate
+
+
+Want to try this without touching your existing Hermes setup? Run everything in an isolated
+profile: `hermes profile create demo`, then add `-p demo` to the commands below. When you're
+done, `rm -rf ~/.hermes/profiles/demo` removes every trace of it.
+
+
+
+
+Skip this if you already have Hermes installed.
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
+```
+
+
+
+This shallow-clones the plugin into `~/.hermes/plugins/lancedb/`.
+
+```bash
+hermes plugins install lancedb/hermes-agent-memory
+```
+
+
+
+Hermes loads plugins inside its own Python interpreter, so the dependencies go *there* — not
+into a separate virtualenv. (This interpreter is shared across profiles, so you only install
+once.)
+
+```bash
+uv pip install --python ~/.hermes/hermes-agent/venv/bin/python3 lancedb openai pyyaml
+```
+
+
+
+The plugin turns conversations into embeddings, so it needs an API key for an embedding
+provider. The default is OpenAI: set `OPENAI_API_KEY` in your environment or in `~/.hermes/.env`.
+
+
+Prefer a local or non-OpenAI model? The plugin uses an OpenAI-compatible client, so you can
+point it at any compatible endpoint (OpenRouter, Ollama, vLLM, …) in your config — no code
+change needed. See [Configuration](#configuration) below.
+
+
+
+
+Switch memory on and pick this plugin:
+
+```bash
+hermes memory setup # choose "lancedb"
+```
+
+Then confirm it's actually active before you start chatting. This is the one step you shouldn't
+skip, because Hermes quietly falls back to its built-in notes if the provider isn't set:
+
+```bash
+hermes memory status
+```
+
+```text
+Memory status
+────────────────────────────────────────
+ Built-in: always active
+ Provider: lancedb
+
+ Plugin: installed ✓
+ Status: available ✓
+```
+
+You want to see `Provider: lancedb` with both `installed ✓` and `available ✓`.
+
+
+
+## The memory tools
+
+Once activated, the agent has four tools for working with long-term memory:
+
+| Tool | What it does |
+|:--|:--|
+| `lancedb_recall` | Semantic (vector, the default) or hybrid search over your workspace memory. Returns matching facts with scores and provenance. |
+| `lancedb_remember` | Stores a durable fact when you explicitly ask. Deduplicated by content hash, so remembering the same thing twice doesn't pile up rows. |
+| `lancedb_read` | Fetches a single memory by ID, optionally with the original conversation messages it was distilled from. |
+| `lancedb_forget` | Deletes safely: previews candidates first, then deletes by exact ID, so nothing disappears by accident. |
+
+Beyond these tools, the plugin also captures durable facts from your conversations
+automatically — an auxiliary model distills them before context is compressed and again when a
+session ends, so insights survive even when the raw messages are summarized away.
+
+## Walkthrough
+
+"_Teach it your project preferences_"
+
+Let's make this concrete with the problem from the introduction: re-explaining your setup to the agent every session.
+We'll save a convention once and then show that a brand-new session can recall it. The example touches all four
+tools along the way.
+
+### Remember
+
+Ask Hermes to commit a convention to long-term memory. Saying "remember in long-term memory"
+makes sure it lands in the LanceDB store, which shows up as the `⚡ lancedb_r` (`lancedb_remember`)
+line below:
+
+```text
+● Remember in long-term memory: for this project I only use uv, never pip, and I always add type hints to Python functions.
+
+ ┊ 🧠 memory +memory: "For this project, the user only uses uv for Python package management, never pip, and always adds type hints to Python functions."
+ ┊ ⚡ lancedb_r 0.0s
+ ─ ⚕ Hermes ────────────────────────────────────────────────────────────────
+
+ Remembered. I've stored that project convention: use uv only, never pip, and always add type hints to Python functions.
+```
+
+### Recall
+
+First, take Hermes' built-in notes out of the picture so recall can *only* come from LanceDB —
+the two layers run side by side otherwise, and either could answer:
+
+```bash
+hermes config set memory.memory_enabled false
+```
+
+Now start a fresh session (`/new`) and ask for the convention back, worded completely
+differently from how you saved it:
+
+```text
+● /new
+● Look through your long-term memory for my Python project preferences.
+
+ ─ ⚕ Hermes ────────────────────────────────────────────────────────────────
+
+ I found one durable memory about Python project preferences:
+
+ - Project Python tooling preference is uv; pip is not used; type hints are required.
+```
+
+This is the whole point in one exchange. You asked about your *preferences* without naming any
+of the specifics you stored (no *uv*, no *pip*, no *type hints*), yet a fresh session, with
+none of the earlier conversation in context, pulls the fact straight back. Semantic search matches on
+meaning rather than exact wording, which is precisely where keyword search falls short.
+
+
+Turn the built-in layer back on for everyday use with `hermes config set memory.memory_enabled true`
+— the two memory layers happily coexist. We only disabled it here to prove the recall came from
+LanceDB.
+
+
+### Read
+
+You can also ask where a fact came from. Hermes attributes the answer to its stored memory
+rather than guessing from a file in the repo (under the hood, `lancedb_read` can also return
+the exact source messages a fact was distilled from):
+
+```text
+● Where did that uv preference come from?
+
+ ─ ⚕ Hermes ────────────────────────────────────────────────────────────────
+
+ It came from the project's persistent memory, not from a codebase file I inspected.
+
+ Specifically, the recalled memory says:
+ - "For this project, the user only uses uv for Python package management, never pip, and always adds type hints to Python functions."
+```
+
+### Forget
+
+When a preference changes, ask Hermes to drop the old fact. The tool calls tell the whole
+story: the two `⚡ lancedb_f` (`lancedb_forget`) lines show it previewing matches and then
+deleting, and the trailing `⚡ lancedb_r` (`lancedb_remember`) shows it saving the replacement:
+
+```text
+● I've switched from uv to pixi. Forget the uv preference.
+
+ ┊ ⚡ lancedb_f uv preference python package management never pip type hints 0.3s
+ ┊ ⚡ lancedb_f 0.0s
+ ┊ ⚡ lancedb_r 0.2s
+ ─ ⚕ Hermes ────────────────────────────────────────────────────────────────
+
+ Done — I removed the uv preference from memory and replaced it with:
+
+ - This project uses pixi for Python package management
+ - Type hints on Python functions are still preferred
+```
+
+Remember, recall, read, forget: four small operations that between them cover the entire
+lifecycle of a durable memory.
+
+## Retrieval modes
+
+Recall uses `vector` mode by default: pure semantic search, which is what survives the
+paraphrasing you saw above. If you also need exact name or jargon matching, switch to `hybrid`
+(vector + BM25) and choose how the two legs are fused: reciprocal rank fusion (RRF), a
+vector-biased linear blend, or a cross-encoder reranker. Mode is set per call; fusion is a config setting.
+
+```yaml
+# ~/.hermes/config.yaml
+plugins:
+ lancedb:
+ retrieval:
+ mode: hybrid # vector (default) | hybrid
+ reranker:
+ type: rrf # how the vector + BM25 legs are fused
+ # Swap RRF for a reranking pass (pulls in sentence-transformers + torch):
+ # type: cross-encoder
+ # model: cross-encoder/ettin-reranker-17m-v1
+ # rerank_top_n: 50
+```
+
+The cross-encoder is the one path that pulls in a local ML stack, so it stays opt-in. It
+defaults to the compact 17M-parameter [ettin reranker](https://huggingface.co/cross-encoder/ettin-reranker-17m-v1).
+
+## Inspect the store
+
+Everything lives in one table named `memories` at `~/.hermes/lancedb/memories.lance`. Because
+it's a plain LanceDB table, you can open it directly and see exactly what the agent has stored
+— a `kind` column separates extracted `fact` rows from the raw `turn` rows they were drawn
+from:
+
+```python
+import lancedb
+
+db = lancedb.connect("~/.hermes/lancedb")
+tbl = db.open_table("memories")
+print(tbl.to_pandas()[["kind", "category", "content"]].head())
+```
+
+## Configuration
+
+The plugin runs on sensible defaults once activated — you don't have to configure anything.
+`~/.hermes/config.yaml` is purely for overrides. Two common ones:
+
+Use a cheaper model for the auxiliary fact-extraction calls:
+
+```yaml
+# ~/.hermes/config.yaml
+auxiliary:
+ lancedb_extraction:
+ provider: openrouter
+ model: google/gemini-3-flash
+```
+
+Point embeddings at a fully local endpoint (for example, Ollama) so nothing leaves your
+machine:
+
+```yaml
+# ~/.hermes/config.yaml
+plugins:
+ lancedb:
+ embedding:
+ model: nomic-embed-text
+ base_url: http://localhost:11434/v1
+ api_key_env: OLLAMA_API_KEY # any value works for local Ollama
+```
+
+
+Changing the embedding model (or its dimension) against an existing store requires recreating
+the table — the plugin fails loudly on a dimension mismatch rather than silently returning
+nothing. Every option is documented in the plugin's [`default_config.yaml`](https://github.com/lancedb/hermes-agent-memory/blob/main/src/default_config.yaml).
+
+
+## Benchmark
+
+On [LongMemEval-S](https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned), a
+long-conversation QA benchmark, LanceDB's semantic recall clearly beat Hermes' built-in lexical
+search (0.66 vs. 0.53 answer accuracy) by finding the right messages even when the question was
+worded differently from the original conversation. For the full methodology, the
+per-question-type breakdown, and a reproducible harness, see the
+[blog post](https://www.lancedb.com/blog/semantic-memory-for-hermes-agent-with-lancedb) and the
+[benchmark harness](https://github.com/lancedb/hermes-agent-memory/tree/main/benchmarks).
+
+## Why this works well
+
+- **It's local-first and embedded.** The LanceDB memory table lives on your disk with no server to run;
+ the plugin installs as a dependency of Hermes' own environment.
+- **Recall survives paraphrasing.** Semantic search matches meaning, not spelling, so it handles
+ the rephrased queries that sink keyword-only session search.
+- **Memories are structured and traceable.** Each fact is a row with metadata and a link back
+ to the messages it came from, and `forget` always previews before it deletes.
+- **Nothing about it is a dead end.** As your needs grow, the same table abstraction carries
+ over to LanceDB [Enterprise](/enterprise) for automatic compaction, reindexing, and scale.
+
+To try it, install the plugin, enable it with `hermes memory setup`, and run the kind of
+workflow we walked through above.
diff --git a/docs/training/torch.mdx b/docs/training/torch.mdx
index 9a9dbe5b..c04d2760 100644
--- a/docs/training/torch.mdx
+++ b/docs/training/torch.mdx
@@ -17,13 +17,14 @@ The `Table` class in LanceDB implements a contract for a PyTorch
import lancedb
import torch
import pyarrow as pa
+from lancedb.util import tbl_to_tensor
mem_db = lancedb.connect("memory://")
table = mem_db.create_table("test_table", pa.table({"a": range(1000)}))
# Any LanceDB table can be used as a PyTorch Dataset
dataloader = torch.utils.data.DataLoader(
- table, batch_size=1024, shuffle=True
+ table, batch_size=1024, shuffle=True, collate_fn=tbl_to_tensor
)
for batch in dataloader:
@@ -42,12 +43,17 @@ dataloader = torch.utils.data.DataLoader(permutation)
## Output Formats
-By default, a `Table` data loader will emit a `pyarrow.RecordBatch`. To convert to a different format (such as a
-`pytorch.Tensor`), you will need to provide a custom collate function.
+By default, a `Table` data loader will emit Arrow data. `collate_fn` is PyTorch's batching hook: PyTorch calls it to
+turn the fetched items into one batch. PyTorch's default collate function only knows how to combine tensors, NumPy
+arrays, numbers, dicts, and lists, so it does not accept Arrow data directly. When using a `Table` directly, pass
+LanceDB's `lancedb.util.tbl_to_tensor` helper as PyTorch's `collate_fn`; it converts numeric Arrow columns into a
+column-major `torch.Tensor` with shape `(columns, rows)`.
-The `Permutation` class is more flexible. By default, the output will be a list of dicts. This is the default output
-format of standard data loaders and usually more convenient when you are getting started. However, there is a
-significant performance penalty converting from Arrow, Lance's internal representation, to this default format.
+`Permutation` works differently: its default output is a list of Python dicts, which PyTorch's default collate function
+can batch into a dict of tensors. This is usually more convenient when you are getting started. Use a direct `Table`
+with `collate_fn` when you want Arrow-to-tensor conversion, or a `Permutation` when you want the default PyTorch
+dict-of-tensors behavior. However, there is a significant performance penalty converting from Arrow, Lance's internal
+representation, to this default format.
To address this, the `Permutation` class provides a set of builtin transform functions that can be applied to map
the Arrow data in different ways. The `arrow` and `polars` formats will always avoid data copies. However, `numpy`,
@@ -96,3 +102,84 @@ dataloader = torch.utils.data.DataLoader(
for batch in dataloader:
print(batch.schema)
```
+
+## Using multiple DataLoader workers
+
+Set `num_workers > 0` to read from LanceDB in multiple PyTorch worker processes. LanceDB tables and `Permutation` objects are picklable, so each worker reopens the table after it starts.
+
+Prefer the `spawn` start method when using multiple workers: LanceDB uses internal threads, which do not survive a `fork`. See [the performance guide](/performance) for more multiprocessing guidance.
+
+```py Python icon=Python
+import torch
+from lancedb.permutation import Permutation
+
+permutation = Permutation.identity(table)
+dataloader = torch.utils.data.DataLoader(
+ permutation,
+ batch_size=1024,
+ shuffle=True,
+ num_workers=4,
+ multiprocessing_context="spawn",
+ persistent_workers=True,
+)
+```
+
+### Remote tables in DataLoader workers
+
+Remote LanceDB Enterprise tables (`db://...`) work the same way: workers reopen the table from the pickled connection state.
+
+```py Python icon=Python
+import lancedb
+import torch
+from lancedb.util import tbl_to_tensor
+
+db = lancedb.connect(
+ "db://my-database",
+ api_key="sk-...",
+ region="us-east-1",
+)
+table = db.open_table("my_table")
+
+dataloader = torch.utils.data.DataLoader(
+ table,
+ batch_size=512,
+ num_workers=4,
+ multiprocessing_context="spawn",
+ collate_fn=tbl_to_tensor,
+)
+```
+
+
+This sends the connection state, including the API key, to each worker. Use a connection factory if credentials should be loaded inside the worker or if your `client_config` contains a non-serializable `header_provider`.
+
+
+### Providing a custom connection factory
+
+`Permutation.with_connection_factory` lets each worker reopen the base table with custom logic. The factory takes the table name, returns a LanceDB table, and must be picklable.
+
+```py Python icon=Python
+import os
+import lancedb
+import torch
+from lancedb.permutation import Permutation
+
+def open_table(name: str):
+ db = lancedb.connect(
+ "db://my-database",
+ api_key=os.environ["LANCEDB_API_KEY"],
+ region="us-east-1",
+ )
+ return db.open_table(name)
+
+table = open_table("my_table")
+permutation = (
+ Permutation.identity(table)
+ .with_connection_factory(open_table)
+)
+dataloader = torch.utils.data.DataLoader(
+ permutation,
+ batch_size=512,
+ num_workers=4,
+ multiprocessing_context="spawn",
+)
+```