NVIDIA · kheiss-uwzoo · May 30, 2026 · greptile-apps
@@ -63,7 +63,7 @@ This pipeline enables retrieval at the speech segment level when you enable segm
 
 Use the following procedure to run the NIM on your own infrastructure. Self-hosted Parakeet runs on Kubernetes via the [NeMo Retriever Helm chart](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md). Enable the ASR NIM per [Optional Helm NIMs](prerequisites-support-matrix.md#optional-helm-nims-not-auto-wired-by-default) and the [Helm chart — NIM operator sub-stack](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md#nim-operator-sub-stack); pin the workload to a dedicated GPU and wire the ASR endpoint in your pipeline.
 
-!!! important
+After you deploy the NIM, call the pipeline from Python:
 
     Pin the Parakeet workload to the dedicated GPU with your Helm values or the [NIM Operator](https://docs.nvidia.com/nim-operator/latest/index.html) (for example, node selectors, resource limits, or device requests appropriate to your cluster).
 
@@ -87,15 +87,14 @@ Use the following procedure to run the NIM on your own infrastructure. Self-host
             asr_params=ASRParams(segment_audio=True),
         )
     )
-    ```
+)
+```
-        )
-    )
-    ```
-)
-```
+        )
+    )
 
+To generate one extracted element for each sentence-like ASR segment, include `extract_audio_params={"segment_audio": True}` when calling `.extract(...)`. This option applies when audio extraction runs against a self-hosted Parakeet NIM or build.nvidia.com hosted inference; it has no effect with the local Hugging Face Parakeet model.
 
     To generate one extracted element for each sentence-like ASR segment, pass `asr_params=ASRParams(segment_audio=True)` to `.extract_audio(...)`. This option applies when audio extraction runs with a self-hosted Parakeet NIM or using build.nvidia.com hosted inference, but has no effect when using the local Hugging Face Parakeet model.
 
-
-    !!! tip
-
-        For more Python examples, refer to [Python Quick Start Guide](https://github.com/NVIDIA/NeMo-Retriever/blob/main/client/client_examples/examples/python_client_usage.ipynb).
+    For more Python examples, refer to [Python Quick Start Guide](https://github.com/NVIDIA/NeMo-Retriever/blob/main/client/client_examples/examples/python_client_usage.ipynb).
 
 ## Parakeet with hosted inference (build.nvidia.com) { #parakeet-hosted-inference-build-nvidia }
 

@@ -36,6 +36,6 @@ Token-based splitting uses the Llama 3.2 1B tokenizer (default `meta-llama/Llama
 
 - **Library mode** — Run without the full container stack where appropriate; see [Deployment options](deployment-options.md).
 - **Kubernetes / Helm (self-hosted)** — See [Deploy (Helm chart)](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md) and [deployment options](deployment-options.md) for running the full microservices pipeline on your infrastructure.
-- **Notebooks** — [Jupyter examples](notebooks.md) for experimentation and RAG demos.
+- **Notebooks** — [Jupyter examples](notebooks/index.md) for experimentation and RAG demos.
 
 For a concise comparison, refer to [Deployment options](deployment-options.md).
@@ -1,74 +1,42 @@
-# Use Custom Metadata to Filter Search Results
+# Custom metadata and filtering
 
-You can upload custom metadata for documents during ingestion. 
-By uploading custom metadata you can attach additional information to documents, 
-and use it for filtering results during retrieval operations. 
-For example, you can add author metadata to your documents, and filter by author when you retrieve results. 
-To create filters at query time, use predicates supported by [LanceDB SQL](https://lancedb.github.io/lancedb/sql/) against your table schema (custom fields are serialized into the `metadata` column with your ingested chunks). For a worked example, see the repository notebook linked at the end of this page.
+Use this documentation to attach per-document metadata during ingestion and to narrow [LanceDB](vdbs.md) search results in [NeMo Retriever Library](overview.md). Implementation details live in the package [Vector DB operators and LanceDB](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/src/nemo_retriever/vdb#metadata-filtering) README.
 
-Use this documentation to use custom metadata to filter search results when you work with [NeMo Retriever Library](overview.md).
+## On this page { #on-this-page }
 
+- [Attach metadata at ingestion](#attach-metadata-at-ingestion)
+- [How metadata is stored](#how-metadata-is-stored)
+- [Filter results at query time](#filter-results-at-query-time)
+- [Writing `where` predicates](#writing-where-predicates)
+- [Server-side vs client-side filters](#server-side-vs-client-side-filters)
+- [Inspect hit metadata](#inspect-hit-metadata)
+- [Limitations](#limitations)
+- [Related content](#related-content)
 
-## Limitations
+## Attach metadata at ingestion { #attach-metadata-at-ingestion }
 
-The following are limitation when you use custom metadata:
+Pass a **sidecar metadata table** on `vdb_upload` so selected columns are merged into each chunk's `content_metadata` before LanceDB upload. All three parameters must be set together:
 
-- Metadata fields must be consistent across documents in the same collection.
-- Complex filter expressions may impact retrieval performance.
-- If you update your custom metadata, you must ingest your documents again to use the new metadata.
+| Parameter | Purpose |
+|-----------|---------|
+| `meta_dataframe` | Path to CSV, JSON, or Parquet, or an in-memory `pandas.DataFrame` |
+| `meta_source_field` | Column that identifies each document (must match ingest paths or basenames per `meta_join_key`) |
+| `meta_fields` | Non-empty list of column names to copy into `content_metadata` |
 
-
-
-## Add Custom Metadata During Ingestion
-
-You can add custom metadata during the document ingestion process. 
-You can specify metadata for each file, 
-and you can specify different metadata for different documents in the same ingestion batch.
-
-
-### Metadata Structure
-
-You specify custom metadata as a dataframe or a file (json, csv, or parquet). 
-
-The following example contains metadata fields for category, department, and timestamp. 
-You can create whatever metadata is helpful for your scenario.
+Optional `meta_join_key` controls how rows are matched to documents: `auto` (try full path then basename), `source_id` (full path), or `source_name` (basename only).
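
The three `meta_join_key` modes described above can be sketched in plain Python. This is only an illustration of the matching order, not the library's implementation: `find_metadata_row` is a hypothetical helper, the sidecar is modeled as a list of dicts rather than a DataFrame, and comparing basenames on both sides in `source_name` mode is an assumption.

```python
import os

VALID_JOIN_KEYS = {"auto", "source_id", "source_name"}


def find_metadata_row(doc_path, rows, source_field, join_key="auto"):
    """Return the first sidecar row that matches doc_path, or None.

    Hypothetical helper (not part of nemo_retriever): it only mirrors the
    documented modes. Assumption: in basename mode, both the ingest path
    and the sidecar value are reduced to their basenames before comparing.
    """
    if join_key not in VALID_JOIN_KEYS:
        raise ValueError(f"unknown meta_join_key: {join_key!r}")

    def by_full_path():
        return next((r for r in rows if r[source_field] == doc_path), None)

    def by_basename():
        name = os.path.basename(doc_path)
        return next(
            (r for r in rows if os.path.basename(r[source_field]) == name),
            None,
        )

    if join_key == "source_id":
        return by_full_path()
    if join_key == "source_name":
        return by_basename()
    # "auto": try the full path first, then fall back to the basename.
    row = by_full_path()
    return row if row is not None else by_basename()


# Sidecar rows shaped like the meta_df example in this hunk.
SIDECAR_ROWS = [
    {"source": "data/woods_frost.pdf", "meta_a": "alpha", "meta_b": 10},
    {"source": "data/multimodal_test.pdf", "meta_a": "bravo", "meta_b": 20},
]

# The ingest path differs from the sidecar path, so "auto" falls back to
# the basename while "source_id" finds nothing.
print(find_metadata_row("/mnt/ingest/woods_frost.pdf", SIDECAR_ROWS, "source"))
print(find_metadata_row("/mnt/ingest/woods_frost.pdf", SIDECAR_ROWS, "source", "source_id"))
```

If ingest paths are rewritten between machines (for example, mounted under a different root), `source_name` or `auto` is the more forgiving choice, at the cost of ambiguity when two documents share a basename.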
 
 ```python
 import pandas as pd
+from nemo_retriever import create_ingestor
 
 meta_df = pd.DataFrame(
     {
         "source": ["data/woods_frost.pdf", "data/multimodal_test.pdf"],
-        "category": ["Alpha", "Bravo"],
-        "department": ["Language", "Engineering"],
-        "timestamp": ["2025-05-01T00:00:00", "2025-05-02T00:00:00"]
+        "meta_a": ["alpha", "bravo"],
+        "meta_b": [10, 20],
     }
 )
 
-# Convert the dataframe to a csv file, 
-# to demonstrate how to ingest a metadata file in a later step.
-
-file_path = "./meta_file.csv"
-meta_df.to_csv(file_path)
-```
-
-
-### Example: Add Custom Metadata During Ingestion
-
-The following example adds custom metadata during ingestion. 
-For more information about `create_ingestor` and run modes, refer to [Use the Python API](nemo-retriever-api-reference.md).
-For more information about the `vdb_upload` method, refer to [Upload Data](vdbs.md).
-
-```python
-from nemo_retriever import create_ingestor
-
-# Service-backed pipeline: point `base_url` at your running retriever service.
-# For local graph execution instead, see [Use the Python API](nemo-retriever-api-reference.md).
-
-hostname = "localhost"
-table_name = "nemo_retriever_collection"
-lancedb_uri = "./lancedb_data"
-
 ingestor = (
     create_ingestor(run_mode="service", base_url=f"http://{hostname}:7670")
         .files(["data/woods_frost.pdf", "data/multimodal_test.pdf"])
-ingestor = (
-    create_ingestor(run_mode="service", base_url=f"http://{hostname}:7670")
-        .files(["data/woods_frost.pdf", "data/multimodal_test.pdf"])
+hostname = "localhost"
+table_name = "nemo_retriever_collection"
+lancedb_uri = "./lancedb_data"
+
+ingestor = (
+    create_ingestor(run_mode="service", base_url=f"http://{hostname}:7670")
+        .files(["data/woods_frost.pdf", "data/multimodal_test.pdf"])
@@ -150,9 +118,11 @@ hits = retriever.query(
 )
 ```
 
+For a runnable end-to-end flow (ingest, `Retriever.query`, and both filter modes), see [nemo_retriever_retriever_query_metadata_filter.ipynb](https://github.com/NVIDIA/NeMo-Retriever/blob/main/examples/nemo_retriever_retriever_query_metadata_filter.ipynb).
 
+When you ingest through the **retriever service**, upload the sidecar with [`POST /v1/ingest/sidecar`](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/src/nemo_retriever/service/routers/ingest.py#L1040-L1129) (multipart file; response [`SidecarUploadResponse`](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/src/nemo_retriever/service/models/responses.py#L60-L68)), then pass the returned `sidecar_id` as `meta_dataframe_id` with `meta_source_field` and `meta_fields` in `pipeline.vdb_upload_params` on [`POST /v1/ingest`](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/src/nemo_retriever/service/models/requests.py#L15-L32) ([`PipelineSpec`](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/src/nemo_retriever/service/models/pipeline_spec.py#L55-L78)). Request and response shapes, form fields, and auth headers are in the service OpenAPI UI at `/docs` (or `/openapi.json`) on your retriever base URL (for example `http://localhost:7670/docs` after `retriever service start`). Do not send a raw local path as `meta_dataframe` on the service spec.
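
As a rough sketch of the second step in the service flow above, the `vdb_upload_params` fragment can be assembled and validated before it is sent. The key names `meta_dataframe_id`, `meta_source_field`, and `meta_fields` come from the paragraph above; `build_vdb_upload_params` is a hypothetical helper, `"sidecar-123"` is a placeholder id, and the JSON nesting under `pipeline` is an assumption. The rest of the `POST /v1/ingest` body is not shown here, so check the OpenAPI UI at `/docs` for the authoritative shape.

```python
def build_vdb_upload_params(sidecar_id, meta_source_field, meta_fields):
    """Build the vdb_upload_params fragment for a service ingest request.

    Hypothetical helper: it enforces that the three metadata settings are
    supplied together and that meta_fields is a non-empty list of names.
    """
    if not sidecar_id:
        raise ValueError("sidecar_id (returned by the sidecar upload) is required")
    if not meta_source_field:
        raise ValueError("meta_source_field is required")
    if isinstance(meta_fields, str):
        raise TypeError("meta_fields must be a list of column names, not a string")
    fields = list(meta_fields)
    if not fields:
        raise ValueError("meta_fields must be a non-empty list")
    return {
        "meta_dataframe_id": sidecar_id,  # the id, never a raw local path
        "meta_source_field": meta_source_field,
        "meta_fields": fields,
    }


# Assumed nesting: the fragment sits under "pipeline" in the request body.
request_fragment = {
    "pipeline": {
        "vdb_upload_params": build_vdb_upload_params(
            "sidecar-123",  # placeholder for the returned sidecar_id
            "source",
            ["meta_a", "meta_b"],
        )
    }
}
print(request_fragment)
```

Validating on the client side keeps a missing `meta_fields` list or a stray local path from reaching the service as a malformed spec.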
 
-## Related Content
+## How metadata is stored { #how-metadata-is-stored }
 
 - [Vector databases](vdbs.md) — canonical LanceDB upload and retrieval guide
 - [metadata_and_filtered_search.ipynb](https://github.com/NVIDIA/NeMo-Retriever/blob/main/examples/metadata_and_filtered_search.ipynb) — CLI and graph ingest with sidecar metadata
@@ -32,7 +32,7 @@ environments), use a custom service image that already contains `ffmpeg` and
 
 ### I want examples and notebooks
 
-1. [Jupyter Notebooks](notebooks.md)
+1. [Jupyter Notebooks](notebooks/index.md)
 2. [Integrate with LangChain, LlamaIndex, Haystack](integrations-langchain-llamaindex-haystack.md)
 
 ### I need API details and keys

@@ -20,10 +20,12 @@ For more information, refer to [Vector databases](vdbs.md).
 
 For images that `nemoretriever-page-elements-v3` does not classify as tables, charts, or infographics,
 you can use our VLM caption task to create a dense caption of the detected image. 
-That caption is then be embedded along with the rest of your content. 
-For more information, refer to [Extract Captions from Images](nemo-retriever-api-reference.md).
+That caption is then embedded along with the rest of your content. 
+For chart-labeled PDF regions and other caption scope limits, see [Are PDF chart or figure regions captioned when Omni is enabled?](#are-pdf-chart-or-figure-regions-captioned-when-omni-is-enabled). For more information, refer to [Extract Captions from Images](nemo-retriever-api-reference.md).
 
+## Are PDF chart or figure regions captioned when Omni is enabled?
 
+No. Chart-labeled PDF regions are not routed through Omni captioning. See [Image captioning](prerequisites-support-matrix.md#image-captioning-2605) for scope, validation, and what the caption stage covers.
 
 ## When should I consider advanced visual parsing?
 

@@ -10,6 +10,6 @@ Typical order:
     - [Deployment options](deployment-options.md) for how to run NeMo Retriever Library
     - **Supported:** [Helm chart](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md) for Kubernetes, plus [NeMo Retriever Library install docs](https://docs.nvidia.com/nemo/retriever/latest/extraction/overview/) for the published charts
     - **Unsupported (developer-only):** [Docker Compose (local)](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/docker.md) — not a supported NIM deployment path
-4. Explore [Jupyter Notebooks](notebooks.md) for end-to-end examples.
+4. Explore [Jupyter Notebooks](notebooks/index.md) for end-to-end examples.
 
 If you are new to the product, read [What is NeMo Retriever Library?](overview.md) and [Concepts](concepts.md) under **Introduction** first.
@@ -9,7 +9,7 @@ The repository includes notebooks that demonstrate multimodal RAG patterns:
 - [Multimodal RAG with LangChain](https://github.com/NVIDIA/NeMo-Retriever/blob/main/examples/langchain_multimodal_rag.ipynb)
 - [Multimodal RAG with LlamaIndex](https://github.com/NVIDIA/NeMo-Retriever/blob/main/examples/llama_index_multimodal_rag.ipynb)
 
-These are also linked from [Jupyter Notebooks](notebooks.md) and the [FAQ](faq.md).
+These are also linked from [Jupyter Notebooks](notebooks/index.md) and the [FAQ](faq.md).
 
 ## Haystack
 

@@ -49,8 +49,9 @@ NeMo Retriever Library detects tables as structured page elements, processes the
 
 Charts and infographic regions are classified with other page layout elements (tables, text blocks, titles) and processed through layout detection and OCR. `extract_charts` and `extract_infographics` are enabled by default. Outputs use the same metadata schema as other extracted objects.
 
+Chart-labeled PDF regions are **not** routed through the Omni caption stage; they remain on the layout-and-OCR path. For scope and validation guidance, see [Image captioning](prerequisites-support-matrix.md#image-captioning-2605).
 
-For natural-language infographic descriptions, optionally enable [image captioning](#image-captioning).
+For natural-language infographic descriptions, optionally enable [image captioning](#image-captioning) and set `caption_infographics=True` when you need VLM captions on infographic regions.
 
 **Related**
 
@@ -62,7 +63,7 @@ For natural-language infographic descriptions, optionally enable [image captioni
 
 Scanned PDFs and image-only pages rely on OCR and hybrid paths that combine native text extraction with OCR when needed. For extract methods such as `ocr` and `pdfium_hybrid`, refer to the [Python API reference](nemo-retriever-api-reference.md).
 
-The default OCR engine is **Nemotron OCR v2**. When you run extraction **locally with HuggingFace models**, v2 operates in **multilingual** mode by default. For CLI flags and API parameters, see [Nemotron OCR v2 — language mode](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/docs/cli/README.md#nemotron-ocr-v2-language-mode). For Kubernetes installs, see [Nemotron OCR v2 — language mode](prerequisites-support-matrix.md#nemotron-ocr-v2-language-mode) in the support matrix.
+OCR artifacts depend on how you deploy. **Helm / NIM:** the production chart uses **Nemotron OCR v1** (`nvcr.io/nim/nvidia/nemotron-ocr-v1:1.3.0`). **Local Hugging Face inference:** the default engine is **Nemotron OCR v2**, which operates in **multilingual** mode by default. For CLI flags and API parameters, see [Nemotron OCR v2 — language mode](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/docs/cli/README.md#nemotron-ocr-v2-language-mode). For Kubernetes defaults and the Helm-vs-local split, see [OCR artifacts (Helm vs local Hugging Face)](prerequisites-support-matrix.md#nemotron-ocr-v2-language-mode) in the support matrix.
 
 **Related**
 

@@ -1,6 +1,6 @@
 # Notebooks for NeMo Retriever Library
 
-To get started using [NeMo Retriever Library](overview.md), you can try one of the ready-made notebooks that are available.
+To get started using [NeMo Retriever Library](../overview.md), you can try one of the ready-made notebooks that are available.
 
 ## Dataset Downloads for Benchmarking
 
@@ -23,11 +23,3 @@ For more advanced scenarios, try one of the following notebooks:
 - [Evaluate bo767 retrieval recall accuracy with NeMo Retriever Library](https://github.com/NVIDIA/NeMo-Retriever/blob/main/evaluation/bo767_recall.ipynb)
 - [Multimodal RAG with LangChain](https://github.com/NVIDIA/NeMo-Retriever/blob/main/examples/langchain_multimodal_rag.ipynb)
 - [Multimodal RAG with LlamaIndex](https://github.com/NVIDIA/NeMo-Retriever/blob/main/examples/llama_index_multimodal_rag.ipynb)
-
-
-
-## Related Topics
-
-- [Pre-Requisites & Support Matrix](prerequisites-support-matrix.md)
-- [Deployment options](deployment-options.md)
-- [Deploy with Helm](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md)
@@ -15,6 +15,7 @@ NeMo Retriever Library does the following:
 
 - Accept directories of input files and a series of configurable ingestion tasks to perform on that input
 - Allow the extracted content to be retrieved from a VDB containing discrete metadata elements
+- Support multiple extraction methods per document type—for example, PDFs can use **pdfium** or [Nemotron Parse](https://build.nvidia.com/nvidia/nemotron-parse) as an alternate method (`extract_method="nemotron_parse"`)
 - Support various types of pre- and post- processing operations, including text splitting and chunking, transform and filtering, embedding generation, and image offloading to storage.
 
 !!! note
@@ -49,5 +50,5 @@ NeMo Retriever Library supports the following file types:
 - [Deploy on Kubernetes with Helm](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md)
 - [NeMo Retriever Library — prerequisites / deployment](https://docs.nvidia.com/nemo/retriever/latest/extraction/overview/) (supported Helm charts)
 - [Docker Compose (unsupported, developer)](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/docker.md)
-- [Notebooks](notebooks.md)
+- [Notebooks](notebooks/index.md)
 - [NVIDIA AI Blueprints catalog](https://build.nvidia.com/explore/discover) — solution cards, enterprise RAG blueprints, and end-to-end patterns (including [Enterprise RAG — multimodal PDF data extraction](https://build.nvidia.com/nvidia/multimodal-pdf-data-extraction-for-enterprise-rag)); for integration pathways, refer to [Integrations](integrations-langchain-llamaindex-haystack.md).