Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 23 additions & 23 deletions docs/docs/extraction/releasenotes.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,36 +2,36 @@

This documentation contains the release notes for [NeMo Retriever Library](overview.md).

## 26.03 Release Notes (26.3.0)
## 26.05 Release Notes (26.5.0)

NVIDIA® NeMo Retriever Library version 26.03 adds broader hardware and software support along with many pipeline, evaluation, and deployment enhancements.
NVIDIA® NeMo Retriever Library version 26.05 builds on the 26.03 foundation with a graph-based ingest architecture, expanded multimodal and tabular capabilities, production-oriented service deployment, and documentation aligned to a Helm-first supported path.
Comment thread
kheiss-uwzoo marked this conversation as resolved.

To upgrade the Helm charts for this release, refer to the [NeMo Retriever Library Helm Charts](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md).

Highlights for the 26.03 release include:

- Legacy ingestion repository consolidated under NeMo-Retriever
- NeMo Retriever Extraction pipeline renamed to NeMo Retriever Library
- NeMo Retriever Library now supports two deployment options:
- A new no-container, pip-installable in-process library for development (available on PyPI)
- Existing production-ready Helm chart with NIMs
- Added documentation notes on Air-gapped deployment support
- Added documentation notes on OpenShift support
- Added support for RTX4500 Pro Blackwell SKU
- Added support for llama-nemotron-embed-vl-v2 in text and text+image modes
- New extract methods `pdfium_hybrid` and `ocr` target scanned PDFs to improve text and layout extraction from image-based pages
- VLM-based image caption enhancements:
- Infographics can be captioned
- Reasoning mode is configurable
- Enabled hybrid search with Lancedb
- Added retrieval_bench subfolder with generalizable agentic retrieval pipeline
- The project now uses UV as the primary environment and package manager instead of Conda, resulting in faster installs and simpler dependency handling
- Default TTL for long-running pipeline job state increased from 1–2 hours to 48 hours so long-running jobs (for example, VLM captioning) do not expire before completion
- NeMo Retriever Library currently does not support image captioning via VLM; this feature will be added in the next release
- Documentation: multimodal extraction is covered on one page with an in-page table of contents and redirects from the former per-topic URLs
Highlights for the 26.05 release include:

- **Graph-based ingest pipeline** — `graph_pipeline` and the graph stage registry are the canonical ingestion path; mode-specific example scripts are consolidated around this model
- **Root CLI** — `retriever ingest` and `retriever query` with NIM URL flags, batch tuning, and LanceDB controls (overwrite/append)
- **Retriever Service v2** — scalable multi-pod architecture with gateway, process isolation, and VectorDB integration
- **Nemotron OCR v2** — default OCR engine with CLI language selectors and unified OCR actors
- **VLM image captioning** — image captioning via vLLM (including Omni caption model profiles); addresses the capability deferred in 26.03
- **vLLM inference stack** — vLLM-backed text and vision-language embedders, multimodal VL reranker, and torch 2.11 stack for local GPU installs
- **Video retrieval pipeline** — frame extraction, OCR, audio-visual fusion, and text deduplication for video corpora
- **Text-to-SQL** — agent graph and tabular tooling for structured data retrieval
- **Live RAG SDK** — `Retriever.answer()` and optional batch operator graph via LiteLLM (`[llm]` extra)
- **Vector database** — VDB operators integrated directly in the pipeline; custom metadata support; LanceDB hybrid search guidance updated
- **Evaluation** — BEIR-centric evaluation overhaul; `retriever skill-eval` benchmark CLI for the NeMo Retriever skill
- **Packaging** — optional install extras (`[local]`, `[multimedia]`, `[llm]`, `[tabular]`, `[nemotron-parse]`, `[service]`, and others) including slim remote/NIM-only installs on Mac and Windows
- **Audio** — long-audio Parakeet chunking with time-aligned segments; punctuation-based audio segmenting
- **`allow_no_gpu`** — option to skip GPU requirement during ingest for CPU-only experimentation
- **Chunking API** — text splitting moved into `.extract(split_config=...)`
- **Documentation** — Helm-first deployment story; [Docker Compose for local development](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/docker.md) documented as **unsupported** developer tooling (not a production NIM deployment path)
- **Documentation** — duplicate user-defined stages page removed; UDF and custom stages guidance consolidated in the [graph README](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/src/nemo_retriever/graph#nemo-retriever-graph)
- **Documentation** — consolidated extraction concepts, ingest workflow, embeddings, and audio/video guides

## Release Notes for Previous Versions

| [26.03](https://docs.nvidia.com/nemo/retriever/26.3.0/extraction/releasenotes/)
| [26.1.2](https://docs.nvidia.com/nemo/retriever/26.1.2/extraction/releasenotes/)
| [26.1.1](https://docs.nvidia.com/nemo/retriever/26.1.1/extraction/releasenotes/)
| [25.9.0](https://docs.nvidia.com/nemo/retriever/25.9.0/extraction/releasenotes/)
Expand Down
Loading