docs(extraction): OCR v2 defaults, captioning link, B200 nemotron-parse (26.05, NVBug 6204537) #2103
base: 26.05
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -70,6 +70,14 @@ The production Helm chart enables these NIM microservices **by default** (for ex | |
| | `ocr` | [nemotron-ocr-v2](https://huggingface.co/nvidia/nemotron-ocr-v2) | Image OCR | | ||
| | `vlm_embed` | [llama-nemotron-embed-vl-1b-v2](https://huggingface.co/nvidia/llama-nemotron-embed-vl-1b-v2) | Multimodal (VL) embedding | | ||
|
|
||
| ### Nemotron OCR v2 language mode { #nemotron-ocr-v2-language-mode } | ||
|
|
||
| !!! note | ||
|
|
||
| **Local Hugging Face inference:** When you deploy locally with HuggingFace model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests. | ||
|
|
||
| **Helm / NIM (26.05):** The [NeMo Retriever Helm chart](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/README.md) deploys the core OCR NIM under [`nimOperator.ocr`](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/values.yaml#L817-L852). When that block targets **nemotron-ocr-v2** for your release, the deployed NIM also runs in multilingual mode by default. Confirm the `repository` and `tag` in `values.yaml` before you upgrade. | ||
|
Contributor
Path: docs/docs/extraction/prerequisites-support-matrix.md
Line: 79
**Hardcoded line-range anchor in `values.yaml` link will go stale**
The URL `values.yaml#L817-L852` pins specific line numbers that will drift the moment anyone adds or removes lines above that block in `values.yaml`. When the anchor breaks, readers land at the wrong section with no error. Consider linking to the file root (`values.yaml`) without the fragment, or to a named heading/comment in `values.yaml` that is stable across edits. |
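A minimal sketch of how line-pinned links like the one flagged here could be found and normalized automatically, assuming the docs are plain Markdown. The helper names `find_line_anchored_links` and `strip_line_anchor` are hypothetical and are not part of the NeMo Retriever repository.

```python
import re

# Matches GitHub blob URLs that end in a line fragment such as #L817 or #L817-L852.
LINE_ANCHOR_RE = re.compile(
    r"https://github\.com/[^\s)#]+/blob/[^\s)#]+#L\d+(?:-L\d+)?"
)


def find_line_anchored_links(markdown_text: str) -> list[str]:
    """Return every GitHub blob link in the text that pins line numbers."""
    return LINE_ANCHOR_RE.findall(markdown_text)


def strip_line_anchor(url: str) -> str:
    """Drop a trailing #L... fragment so the link targets the file root."""
    return re.sub(r"#L\d+(?:-L\d+)?$", "", url)
```

Running `find_line_anchored_links` over a docs page lists the links to review; `strip_line_anchor` applies the "link to the file root" option suggested in the comment.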
||
|
|
||
| Default VL embedder container and model for release deployments: | ||
|
|
||
| - **Image:** `nvcr.io/nim/nvidia/llama-nemotron-embed-vl-1b-v2:1.12.0` | ||
|
|
@@ -110,8 +118,8 @@ Model repositories and NIM references are linked in [Core and Advanced Pipeline | |
| | Core Features | — | Total Disk Space | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | | ||
| | Audio (parakeet-1-1b-ctc-en-us) | ~4.0 GiB (`model.safetensors`; the repo also ships `parakeet-ctc-1.1b.nemo` of similar size—use one format to avoid roughly doubling disk use) | Additional Dedicated GPUs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1¹ | | ||
| | Audio (parakeet-1-1b-ctc-en-us) | — | Additional Disk Space | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB¹ | | ||
| | nemotron-parse | ~3.5 GiB | Additional Dedicated GPUs | Not supported | Not supported | Not supported | 1 | 1 | 1 | 1 | 1 | Not supported² | | ||
| | nemotron-parse | — | Additional Disk Space | Not supported | Not supported | Not supported | ~16GB | ~16GB | ~16GB | ~16GB | ~16GB | Not supported² | | ||
| | nemotron-parse | ~3.5 GiB | Additional Dedicated GPUs | Not supported | 1 | Not supported | 1 | 1 | 1 | 1 | 1 | Not supported² | | ||
| | nemotron-parse | — | Additional Disk Space | Not supported | ~16GB | Not supported | ~16GB | ~16GB | ~16GB | ~16GB | ~16GB | Not supported² | | ||
| | Omni caption (nemotron-3-nano-omni-30b-a3b-reasoning) | ~62 GiB (BF16); ~33 GiB (FP8); ~21 GiB (NVFP4) | Additional Dedicated GPUs | 1 | 1 | 1 | 1 | 1 | Not supported | Not supported | 2 | Not supported³ | | ||
| | Omni caption (nemotron-3-nano-omni-30b-a3b-reasoning) | — | Additional Disk Space (HF) | ~21–62GB | ~21–62GB | ~21–62GB | ~21–62GB | ~21–62GB | Not supported | Not supported | ~21–62GB | Not supported³ | | ||
| | Omni caption (nemotron-3-nano-omni-30b-a3b-reasoning) | — | Additional Disk Space (NIM) | ~80GB | ~80GB | ~80GB | ~80GB | ~80GB | Not supported | Not supported | ~80GB | Not supported³ | | ||
|
|
||
**CLI link points to `main` instead of the `26.05` branch**
The anchor text says this is 26.05-specific guidance, but the CLI link resolves to https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli. If the CLI interface diverges between `main` and `26.05`, readers following this link from the versioned docs will see instructions that may not match their installed release. Consider pinning to `26.05` (or the appropriate release tag) for consistency with the rest of this section.
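A rough sketch of a check for this kind of branch mismatch, assuming versioned docs should only link to one Git ref and that the ref contains no slashes (a ref such as `release/26.05` would need a different pattern). The function name `links_off_branch` is hypothetical and does not exist in the repository.

```python
import re

# Captures the repository, link kind (tree or blob), and ref of a GitHub URL.
# Assumption: the ref is a single path segment, e.g. "main" or "26.05".
GITHUB_REF_RE = re.compile(
    r"https://github\.com/(?P<repo>[\w.-]+/[\w.-]+)/(?P<kind>tree|blob)/(?P<ref>[^/\s)]+)"
)


def links_off_branch(markdown_text: str, expected_ref: str) -> list[str]:
    """Return GitHub link prefixes whose ref differs from expected_ref."""
    return [
        match.group(0)
        for match in GITHUB_REF_RE.finditer(markdown_text)
        if match.group("ref") != expected_ref
    ]
```

Called with `expected_ref="26.05"` on the note text from this diff, the check would report the CLI link under `tree/main` and leave the `blob/26.05` Helm links alone.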