From d051177e0e607ec6b210d2630181792bced0601e Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 09:25:49 -0700
Subject: [PATCH 01/10] remove duplicate metadata table from SKILL.md

---
 skills/nemo-retriever/SKILL.md | 8 --------
 1 file changed, 8 deletions(-)
diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index 63a5aa8a4..0dc6e7056 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -5,14 +5,6 @@ license: Apache-2.0
 allowed-tools: Bash Write Read
 ---
 
-| name          | nemo-retriever |
-| :------------ | :-- |
-| description   | Use when the user wants to search, index, or answer questions over a folder of PDFs (or other documents) — including building a RAG / search index over PDFs, looking up information across many PDFs, or running the `retriever` CLI (ingest, query, pipeline, recall, eval, etc.). |
-| license       | Apache-2.0 |
-| compatibility | Designed for Claude Code, OpenCode, Codex, and Agent Skills-compatible tools. Requires Git, network access to GitHub |
-| metadata      |     |
-| allowed-tools | Bash Write Read |
-
 # nemo-retriever
 
 The `retriever` CLI indexes a folder of PDFs into LanceDB (`retriever ingest`) and serves vector search over it (`retriever query`). For any task about searching/answering questions across a folder of PDFs, use this CLI — do not write a custom RAG.

From 499943ab1f8f2bf8e705262fdbe7f706e5e26406 Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 09:51:08 -0700
Subject: [PATCH 02/10] incorporate Janisha's feedback

---
 skills/nemo-retriever/SKILL.md            | 4 ++--
 skills/nemo-retriever/references/query.md | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)
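
Reviewer note: the `Write` gating rule added in this patch can be read as a simple predicate over the prompt. The sketch below is only an illustration under stated assumptions: `wants_output_json` and `EVAL_TRIGGERS` are hypothetical names, and plain case-insensitive substring matching is an assumed approximation of "the prompt mentions"; the skill text itself leaves that judgment to the agent.

```python
# Trigger terms copied from the rule's wording; the matching strategy is an assumption.
EVAL_TRIGGERS = ("judge", "benchmark", "output schema", "output.json")


def wants_output_json(prompt: str) -> bool:
    """Return True when the prompt mentions any eval-harness trigger term."""
    lowered = prompt.lower()
    return any(term in lowered for term in EVAL_TRIGGERS)


if __name__ == "__main__":
    for p in ("What was revenue in 2023?", "Write your answer to ./output.json"):
        action = "Write ./output.json" if wants_output_json(p) else "answer in chat"
        print(f"{p!r}: {action}")
```

Substring matching is deliberately loose (for example, "judge" also matches "judgement"); a stricter check would need word boundaries.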

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index 0dc6e7056..4f1932bbd 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -18,7 +18,7 @@ If `command -v retriever` returns nothing, follow `references/install.md` to ins
 | Turn type | Read this once | Then execute |
 | :--- | :--- | :--- |
 | **Setup turn** (first turn — `./lancedb/nv-ingest.lance` doesn't exist) | `references/setup.md` | Build the index |
-| **Query turn** (every subsequent turn — user asks a question) | `references/query.md` | One `retriever query` call, then `Write` `./output.json` |
+| **Query turn** (every subsequent turn — user asks a question) | `references/query.md` | One `retriever query` call, then `Write` `./output.json` **only if** the prompt mentions a judge, benchmark, output schema, or `output.json`; otherwise answer in chat |
 | Anything errored or returned empty | `references/pitfalls.md` | Apply the named recovery; do not improvise |
 
 For the full `retriever ingest` / `retriever query` CLI specs, see `references/cli/ingest.md` and `references/cli/query.md`. You do not need these for routine turns — `<RETRIEVER_VENV>/bin/retriever <subcommand> --help` is faster.
@@ -26,7 +26,7 @@ For the full `retriever ingest` / `retriever query` CLI specs, see `references/c
 ## Hard limits (apply to every turn)
 
 - **Setup turn**: build the index in one shell command (see `references/setup.md`). STOP after the index lands.
-- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`. Then `Write` `./output.json` and STOP.
+- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`. Then `Write` `./output.json` and STOP. **General-use exception:** if the prompt doesn't mention a judge, benchmark, output schema, or `output.json`, skip the `Write` step and answer in chat — all other discipline (2-Bash-call budget, no-narration, chart/image hedging) still applies.
 - **No narration between tool calls.** Tokens you emit between calls become input + cached input for every later turn — quadratic cost. Go straight from reading the summary to writing the JSON file.
 - **Banned**: `TodoWrite`, Glob, Grep, `Read` of whole PDFs, re-running setup, spawning subagents, speculative "confirmation" calls.
 
diff --git a/skills/nemo-retriever/references/query.md b/skills/nemo-retriever/references/query.md
index fc98e08c5..990bbae00 100644
--- a/skills/nemo-retriever/references/query.md
+++ b/skills/nemo-retriever/references/query.md
@@ -1,5 +1,7 @@
 # Query turn — the WHOLE workflow
 
+**General-use callout:** if the prompt doesn't mention a judge, benchmark, output schema, or `output.json`, skip the `Write` `./output.json` step and answer in chat. All other discipline (2-Bash-call budget, no-narration, chart/image hedging) still applies.
+
 ## Filename fast path — try BEFORE `retriever query`
 
 If the user's question literally contains a PDF basename from `./pdfs/` (stem ≥6 chars, with or without `.pdf`, case-insensitive), skip semantic search. Direct pdfium extraction on the named file is faster and avoids semantic-search misses — the right doc is given, and pages rank by query-token overlap.

From 2a054b93f805d10ab08a8b51f1ac583e763dd256 Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 10:09:33 -0700
Subject: [PATCH 03/10] update description

---
 skills/nemo-retriever/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index 4f1932bbd..735867600 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: nemo-retriever
-description: Use when the user wants to search, index, or answer questions over a folder of PDFs (or other documents) — including building a RAG / search index over PDFs, looking up information across many PDFs, or running the `retriever` CLI (ingest, query, pipeline, recall, eval, etc.).
+description: Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or video (`.mp4` `.mov`). Prefer this over native Read / Grep for multi-file or non-PDF corpora. Not for: editing files, web browsing, single-file plain-text lookups, fine-tuning.
 license: Apache-2.0
 allowed-tools: Bash Write Read
 ---

From fb2379e7e9480dc092fafa6f1fc7e51c5bd8d8b7 Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 10:38:54 -0700
Subject: [PATCH 04/10] updated per Janisha's request

---
 skills/nemo-retriever/SKILL.md            | 2 +-
 skills/nemo-retriever/references/query.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index 410fe4801..2f465483d 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -30,7 +30,7 @@ Before ingesting a mixed folder, inventory extensions (`find <dir> -name '*.*' |
 ## Hard limits (apply to every turn)
 
 - **Setup turn**: build the index in one shell command (see `references/setup.md`). STOP after the index lands.
-- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`. Then `Write` `./output.json` and STOP. **General-use exception:** if the prompt doesn't mention a judge, benchmark, output schema, or `output.json`, skip the `Write` step and answer in chat — all other discipline (2-Bash-call budget, no-narration, chart/image hedging) still applies.
+- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`. Then `Write` `./output.json` (eval-harness only) or answer in chat (general use) and STOP.
 - **No narration between tool calls.** Tokens you emit between calls become input + cached input for every later turn — quadratic cost. Go straight from reading the summary to writing the JSON file.
 - **Banned**: `TodoWrite`, Glob, Grep, `Read` of whole PDFs, re-running setup, spawning subagents, speculative "confirmation" calls.
 
diff --git a/skills/nemo-retriever/references/query.md b/skills/nemo-retriever/references/query.md
index e7eaf30ef..728b371b2 100644
--- a/skills/nemo-retriever/references/query.md
+++ b/skills/nemo-retriever/references/query.md
@@ -1,6 +1,6 @@
 # Query turn — the WHOLE workflow
 
-**General-use callout:** if the prompt doesn't mention a judge, benchmark, output schema, or `output.json`, skip the `Write` `./output.json` step and answer in chat. All other discipline (2-Bash-call budget, no-narration, chart/image hedging) still applies.
+**General-use vs eval harness callout:** if the user prompt doesn't mention a judge, benchmark, output schema, or `output.json`, then just run the `retriever query` call and answer in chat.
 
 ## Filename fast path — try BEFORE `retriever query`
 

From 6b1cb2e4599b431694289db88aaf49923ccd0b8d Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 11:49:52 -0700
Subject: [PATCH 05/10] more updates from Janisha

---
 skills/nemo-retriever/SKILL.md            | 4 ++--
 skills/nemo-retriever/references/query.md | 8 +-------
 2 files changed, 3 insertions(+), 9 deletions(-)
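
Reviewer note: `references/query.md` describes `pdf_page` as a string composite key of the form `"<basename>_<page_number>"` that must not be treated as a number. A minimal sketch of splitting it safely, assuming the page number is always the part after the last underscore; `split_pdf_page` is a hypothetical helper, not part of the `retriever` CLI.

```python
def split_pdf_page(pdf_page: str) -> tuple[str, int]:
    """Split "<basename>_<page_number>" at the LAST underscore.

    Basenames may themselves contain underscores, so splitting on the
    first underscore would be wrong.
    """
    basename, sep, page = pdf_page.rpartition("_")
    if not sep or not page.isdigit():
        raise ValueError(f"not a '<basename>_<page_number>' key: {pdf_page!r}")
    return basename, int(page)


if __name__ == "__main__":
    print(split_pdf_page("annual_report_2023_12"))
```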

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index 2f465483d..dcbf0d9ac 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -20,7 +20,7 @@ If `command -v retriever` returns nothing, follow `references/install.md` to ins
 | Turn type | Read this once | Then execute |
 | :--- | :--- | :--- |
 | **Setup turn** (first turn — `./lancedb/nv-ingest.lance` doesn't exist) | `references/setup.md` | Build the index |
-| **Query turn** (every subsequent turn — user asks a question) | `references/query.md` | One `retriever query` call, then `Write` `./output.json` **only if** the prompt mentions a judge, benchmark, output schema, or `output.json`; otherwise answer in chat |
+| **Query turn** (every subsequent turn — user asks a question) | `references/query.md` | One `retriever query` call |
 | Anything errored or returned empty | `references/pitfalls.md` | Apply the named recovery; do not improvise |
 
 For the full `retriever ingest` / `retriever query` CLI specs, see `references/cli/ingest.md` and `references/cli/query.md`. You do not need these for routine turns — `<RETRIEVER_VENV>/bin/retriever <subcommand> --help` is faster.
@@ -30,7 +30,7 @@ Before ingesting a mixed folder, inventory extensions (`find <dir> -name '*.*' |
 ## Hard limits (apply to every turn)
 
 - **Setup turn**: build the index in one shell command (see `references/setup.md`). STOP after the index lands.
-- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`. Then `Write` `./output.json` (eval-harness only) or answer in chat (general use) and STOP.
+- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`.
 - **No narration between tool calls.** Tokens you emit between calls become input + cached input for every later turn — quadratic cost. Go straight from reading the summary to writing the JSON file.
 - **Banned**: `TodoWrite`, Glob, Grep, `Read` of whole PDFs, re-running setup, spawning subagents, speculative "confirmation" calls.
 
diff --git a/skills/nemo-retriever/references/query.md b/skills/nemo-retriever/references/query.md
index 728b371b2..4339e9f20 100644
--- a/skills/nemo-retriever/references/query.md
+++ b/skills/nemo-retriever/references/query.md
@@ -1,9 +1,5 @@
 # Query turn — the WHOLE workflow
 
-**General-use vs eval harness callout:** if the user prompt doesn't mention a judge, benchmark, output schema, or `output.json`, then just run the `retriever query` call and answer in chat.
-
-## Filename fast path — try BEFORE `retriever query`
-
 
 ```bash
 <RETRIEVER_VENV>/bin/retriever query "<the user's question>" --top-k 10 --embed-model-name nvidia/llama-nemotron-embed-1b-v2 --rerank \
@@ -15,7 +11,7 @@ Run that **exactly** as a single pipeline — do not split it into `HITS=$(...)`
 
 That's your FIRST tool call on every query turn. Do not Read, Glob, Grep, or list PDFs before this — those duplicate what `retriever query` already did.
 
-**No narration between tool calls.** Do not write "Let me search…", "I'll now analyze…", "The retriever returned…", or any other commentary. Every assistant token you emit between the `retriever query` Bash call and the `Write` of `./output.json` becomes input tokens (and cached input tokens) for every subsequent turn in this session — quadratic cost. Go straight from reading the summary to writing the JSON file. The only assistant text in a query turn should be the tool calls themselves.
+**No narration between tool calls.** Do not write "Let me search…", "I'll now analyze…", "The retriever returned…", or any other commentary. Every assistant token you emit alongside the `retriever query` Bash call becomes input tokens (and cached input tokens) for every subsequent turn in this session, so cost grows quadratically. Go straight from reading the summary to writing your reply. Apart from that reply, the only assistant output in a query turn should be the tool calls themselves.
 
 Each hit has: `text`, `pdf_basename`, `page_number` (int, **1-indexed**: the first page of a PDF is page `1`), `pdf_page` (string composite key `"<basename>_<page_number>"` — not a number, don't use it as one), `_distance`, and `metadata` (JSON with `type` ∈ `text|table|chart|image`).
 
@@ -53,8 +49,6 @@ If a question asks for an exact percentage or a directional claim **and the evid
 
 When both a chart hit and a text hit cover the same fact, always prefer the text hit's number.
 
-After writing `./output.json`, STOP. No print, no summary, no further tool calls.
-
 ## Non-semantic operations (use these, don't fall back to native tools)
 
 **Page filter** — "what's on page N of doc.pdf" → filter LanceDB directly, no `Read`:

From b43a935f64e88915bac0c66861f0a1766fea5e8e Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 12:34:26 -0700
Subject: [PATCH 06/10] all the remaining updates from Janisha

---
 skills/nemo-retriever/SKILL.md                      | 2 +-
 skills/nemo-retriever/references/query.md           | 3 ++-
 skills/nemo-retriever/references/troubleshooting.md | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)
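
Reviewer note: the `ranked_retrieved` rule in `references/query.md` (one entry per hit, up to 10, 1-indexed `page_number` unless the task schema asks for 0-indexed) can be sketched as below. This is an illustration only: `build_ranked_retrieved` and its `zero_indexed` flag are hypothetical names, and the hit dictionaries are assumed to carry the `pdf_basename` and `page_number` fields described in that file.

```python
from typing import Any


def build_ranked_retrieved(
    hits: list[dict[str, Any]],
    zero_indexed: bool = False,
    limit: int = 10,
) -> list[dict[str, Any]]:
    """Map retriever hits to ranked entries, preserving the returned order."""
    entries = []
    for i, hit in enumerate(hits[:limit]):
        basename = hit["pdf_basename"]
        # doc_id is the basename without a trailing ".pdf" (case-insensitive).
        doc_id = basename[:-4] if basename.lower().endswith(".pdf") else basename
        # The retriever's page_number is 1-indexed; shift only if the schema says so.
        page = hit["page_number"] - 1 if zero_indexed else hit["page_number"]
        entries.append({"doc_id": doc_id, "page_number": page, "rank": i + 1})
    return entries


if __name__ == "__main__":
    sample = [
        {"pdf_basename": "report.pdf", "page_number": 1},
        {"pdf_basename": "report.pdf", "page_number": 1},  # duplicate (doc, page) is fine
    ]
    print(build_ranked_retrieved(sample, zero_indexed=True))
```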

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index dcbf0d9ac..d4b25e792 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -30,7 +30,7 @@ Before ingesting a mixed folder, inventory extensions (`find <dir> -name '*.*' |
 ## Hard limits (apply to every turn)
 
 - **Setup turn**: build the index in one shell command (see `references/setup.md`). STOP after the index lands.
-- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`.
+- **Query turn**: at most **2 Bash calls** — 1 `retriever query`, +1 optional targeted text-extract per `references/query.md`. Reply and then STOP.
 - **No narration between tool calls.** Tokens you emit between calls become input + cached input for every later turn — quadratic cost. Go straight from reading the summary to writing the JSON file.
 - **Banned**: `TodoWrite`, Glob, Grep, `Read` of whole PDFs, re-running setup, spawning subagents, speculative "confirmation" calls.
 
diff --git a/skills/nemo-retriever/references/query.md b/skills/nemo-retriever/references/query.md
index 4339e9f20..b42cadae5 100644
--- a/skills/nemo-retriever/references/query.md
+++ b/skills/nemo-retriever/references/query.md
@@ -27,7 +27,7 @@ It scans the LanceDB table the retriever already built — no PDF re-extraction.
 
 Don't reach for `pdftotext`, `pdftohtml`, or `pdfgrep` — they're system tools that aren't guaranteed installed on the user's machine. The retriever venv bundles pdfium and `lancedb`; `grep_corpus.py` and `retriever pdf stage page-elements --method pdfium` cover the same use cases without that dependency.
 
-## Write `./output.json` directly from the hits
+## Compose your reply from the hits
 
 - `final_answer`: synthesize from the top hits' `text`. Include the exact number / name / date / row / column the question asks for, plus the source PDF and 0-indexed page. One paragraph. No restating the question, no hedging caveats. If the chunks talk *around* the fact but don't state it, run ONE `<RETRIEVER_VENV>/bin/retriever pdf stage page-elements ./pdfs --method pdfium --json-output-dir /tmp/pdf_text --compact-json` and `Read` `/tmp/pdf_text/<top_pdf>.pdf.pdf_extraction.json` for the rank-1 page (or rank-2 if rank-1 is metadata) — that almost always surfaces the exact figure. Then synthesize. **If after both calls the asked-for fact still isn't in the evidence, write `final_answer` that says so explicitly** — e.g. "The retrieved pages do not state [X] for [entity]; the closest content is [Y]." Do NOT invent, extrapolate, or generate plausible-sounding content from adjacent material. A confidently-wrong answer scores worse than an honest "not in the retrieved pages".
 - `ranked_retrieved`: one entry per hit in the order `retriever query` returned: `{"doc_id": "<pdf_basename without .pdf>", "page_number": <int>, "rank": <i+1>}`. Up to 10. Duplicate `(doc, page)` is fine. **Indexing:** the retriever's `page_number` is 1-indexed. If the task's output schema says 0-indexed (e.g. "first page is page 0"), emit `hit.page_number - 1`; if the task says 1-indexed or doesn't specify, emit `hit.page_number` as-is.
@@ -48,6 +48,7 @@ If a question asks for an exact percentage or a directional claim **and the evid
 3. If prose doesn't mention it, **quote the chart transcription verbatim with an explicit hedge in `final_answer`**: "The chart on page N indicates [verbatim phrase] (chart-derived, not verified against prose)." Do NOT restate the chart's number as a confident fact.
 
 When both a chart hit and a text hit cover the same fact, always prefer the text hit's number.
+After your reply, STOP. No print, no summary, no further tool calls.
 
 ## Non-semantic operations (use these, don't fall back to native tools)
 
diff --git a/skills/nemo-retriever/references/troubleshooting.md b/skills/nemo-retriever/references/troubleshooting.md
index 1079a9c28..cdb399bff 100644
--- a/skills/nemo-retriever/references/troubleshooting.md
+++ b/skills/nemo-retriever/references/troubleshooting.md
@@ -36,7 +36,7 @@ If unsupported extensions appear, name them in your reply and ask the user wheth
 
 ## You ran more than 2 Bash calls on a query turn
 
-Budget violation. Stop, write `final_answer` from what you have, write `./output.json`, end the turn. Long turns cost ~5× a disciplined turn and usually still produce the wrong answer.
+Budget violation. Stop, write `final_answer` from what you have, end the turn. Long turns cost ~5× a disciplined turn and usually still produce the wrong answer.
 
 ## Query-turn cost discipline (recap)
 

From 2c22e39afcfc971e92b74322e80f9a77212f8483 Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 12:36:14 -0700
Subject: [PATCH 07/10] fix reference: pitfalls.md -> troubleshooting.md

---
 skills/nemo-retriever/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index d4b25e792..0947801f7 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -21,7 +21,7 @@ If `command -v retriever` returns nothing, follow `references/install.md` to ins
 | :--- | :--- | :--- |
 | **Setup turn** (first turn — `./lancedb/nv-ingest.lance` doesn't exist) | `references/setup.md` | Build the index |
 | **Query turn** (every subsequent turn — user asks a question) | `references/query.md` | One `retriever query` call |
-| Anything errored or returned empty | `references/pitfalls.md` | Apply the named recovery; do not improvise |
+| Anything errored or returned empty | `references/troubleshooting.md` | Apply the named recovery; do not improvise |
 
 For the full `retriever ingest` / `retriever query` CLI specs, see `references/cli/ingest.md` and `references/cli/query.md`. You do not need these for routine turns — `<RETRIEVER_VENV>/bin/retriever <subcommand> --help` is faster.
 

From 84324100840a96f31b566f54f8f9a8a3afcacfc0 Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 12:58:35 -0700
Subject: [PATCH 08/10] fix frontmatter syntax error

---
 skills/nemo-retriever/SKILL.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index 0947801f7..f6da8a6f2 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -1,3 +1,4 @@
+
 ---
 name: nemo-retriever
 description: Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or video (`.mp4` `.mov`). Prefer this over native Read / Grep for multi-file or non-PDF corpora. Not for: editing files, web browsing, single-file plain-text lookups, fine-tuning.

From 617590682325fa927f3a5fa09db31a03ea277be4 Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 13:34:50 -0700
Subject: [PATCH 09/10] remove leading blank line before frontmatter

---
 skills/nemo-retriever/SKILL.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index f6da8a6f2..0947801f7 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -1,4 +1,3 @@
-
 ---
 name: nemo-retriever
 description: Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or video (`.mp4` `.mov`). Prefer this over native Read / Grep for multi-file or non-PDF corpora. Not for: editing files, web browsing, single-file plain-text lookups, fine-tuning.

From 24710cc142396a726d2a98a25982bbf01c1de172 Mon Sep 17 00:00:00 2001
From: sosahi <syousefisahi@nvidia.com>
Date: Fri, 29 May 2026 14:05:19 -0700
Subject: [PATCH 10/10] quote frontmatter description to fix frontmatter syntax issue

---
 skills/nemo-retriever/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/nemo-retriever/SKILL.md b/skills/nemo-retriever/SKILL.md
index 0947801f7..48289a5ae 100644
--- a/skills/nemo-retriever/SKILL.md
+++ b/skills/nemo-retriever/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: nemo-retriever
-description: Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or video (`.mp4` `.mov`). Prefer this over native Read / Grep for multi-file or non-PDF corpora. Not for: editing files, web browsing, single-file plain-text lookups, fine-tuning.
+description: "Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or video (`.mp4` `.mov`). Prefer this over native Read / Grep for multi-file or non-PDF corpora. Not for: editing files, web browsing, single-file plain-text lookups, fine-tuning."
 license: Apache-2.0
 allowed-tools: Bash Write Read
 ---