Asr fixes#2147
Conversation
When source-branch is empty, use the version input as the git checkout ref so Helm and other artifacts match the tagged release instead of the workflow UI branch. Only overlay ci/scripts when workflow and source refs differ.
Greptile SummaryThis PR introduces configurable ASR inference mode (
|
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/api/internal/primitives/nim/model_interface/parakeet.py | Adds AudioInferMode type alias, resolve_audio_infer_mode(), and infer_mode parameter to ParakeetClient and create_audio_inference_client, enabling offline Riva RPC alongside the existing streaming path. Logic is clean; the _StreamingResponseShim return shape is compatible with the offline RecognizeResponse. |
| nemo_retriever/src/nemo_retriever/pipeline/main.py | Refactors _build_ingestor to unify service and graph extraction paths via _attach_extract_stage; adds service-mode caption wiring with a helpful --caption-invoke-url warning. Gap: _split_config_for_input_type doesn't handle txt/html, silently dropping chunk params in service mode for those input types. |
| nemo_retriever/src/nemo_retriever/service_ingestor.py | Introduces _wire_client_stage_params and _filter_policy_allowed helpers; changes _params_to_dict to exclude_unset=True so only caller-set fields are forwarded to the service, preventing model defaults from tripping the server allowlist. Clean refactoring with well-documented intent. |
| nemo_retriever/helm/templates/nims/nemotron-ocr-v2.yaml | File deleted; it contained unresolved git merge conflict markers and was not usable — correct cleanup. |
| nemo_retriever/helm/values.yaml | Adds nimServiceName: nemotron-ocr-v1 under nimOperator.ocr and updates the Parakeet NIM_TAGS_SELECTOR from mode=ofl to mode=str,vad=default,diarizer=disabled, matching the docker-compose.yaml change. |
| nemo_retriever/tests/test_parakeet_infer_mode.py | New test file covering resolve_audio_infer_mode resolution, rejection of unknown modes, and all three dispatch paths (streaming self-hosted, streaming NVCF, offline explicit). Good coverage for the new infer-mode feature. |
Sequence Diagram
sequenceDiagram
participant CLI as CLI / __main__.py
participant AI as _attach_extract_stage
participant SI as ServiceIngestor
participant WCS as _wire_client_stage_params
participant PC as ParakeetClient
participant Riva as Riva gRPC
CLI->>AI: run_mode, input_type, extract_params, infer_mode, ...
AI->>SI: extract(extract_params, split_config, extraction_mode)
SI->>WCS: "merge params + filter allowlist (exclude_unset=True)"
WCS-->>SI: policy-filtered params dict
SI-->>CLI: ServiceIngestor (fluent)
Note over CLI,PC: Audio path (local graph mode)
CLI->>PC: transcribe(audio_content)
alt "infer_mode == offline"
PC->>Riva: offline_recognize(mono_audio_bytes, config)
Riva-->>PC: RecognizeResponse
else "infer_mode == online / auto"
PC->>Riva: StreamingRecognize(chunked PCM)
Riva-->>PC: streaming results
PC-->>PC: _StreamingResponseShim(.results)
end
PC-->>CLI: response
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
nemo_retriever/src/nemo_retriever/pipeline/__main__.py:531-543
**`txt`/`html` split_config silently dropped in service mode**
`_split_config_for_input_type` handles `pdf`/`doc`, `audio`, and `video`, but returns `None` for `txt` and `html`. When a user runs `--input-type txt --enable-text-chunk` in service mode, `chunk_dict` is computed correctly from `_service_text_chunk_dict`, but `_split_config_for_input_type("txt", chunk_dict)` returns `None`, so `ServiceIngestor.extract()` is called with `split_config=None` and the chunking params are silently discarded without any error or warning. If the service API supports `split_config` for text/html extraction (with a key like `"text"` or `"html"`), these cases should be added here; otherwise a warning or `NotImplementedError` should be raised so the caller knows the option has no effect in this mode.
Reviews (2): Last reviewed commit: "Revert back to nemotron-ocr-v1 since nem..." | Re-trigger Greptile
| def resolve_audio_infer_mode(mode: str, endpoint: str) -> ResolvedAudioInferMode: | ||
| """Pick offline vs streaming Riva RPC for a Parakeet endpoint. | ||
| NVCF (``grpc.nvcf.nvidia.com``) and the Helm chart Parakeet NIM (``mode=str``) | ||
| register streaming (online) models. Use ``audio_infer_mode='offline'`` only when | ||
| the NIM was deployed with an offline profile (``mode=ofl``). | ||
| """ | ||
| normalized = (mode or "auto").lower() | ||
| if normalized == "online": | ||
| return "online" | ||
| if normalized == "offline": | ||
| return "offline" | ||
| if normalized != "auto": | ||
| raise ValueError(f"audio_infer_mode must be 'auto', 'online', or 'offline', got {mode!r}") | ||
| return "online" |
There was a problem hiding this comment.
Dead
endpoint parameter in public function
resolve_audio_infer_mode accepts endpoint: str but never uses it — the auto branch unconditionally returns "online" regardless of the endpoint value. This misleads callers (and the docstring, which explicitly mentions NVCF endpoint detection) into believing the endpoint influences the decision. Any code passing a different endpoint expecting different behavior will be silently surprised. If endpoint-based auto-detection is planned for the future, a # TODO comment would clarify intent; if it was intentionally dropped, the parameter should be removed from the signature to keep the public contract honest.
Rule Used: Every public class and function in nemo_retriever ... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/api/internal/primitives/nim/model_interface/parakeet.py
Line: 56-70
Comment:
**Dead `endpoint` parameter in public function**
`resolve_audio_infer_mode` accepts `endpoint: str` but never uses it — the `auto` branch unconditionally returns `"online"` regardless of the endpoint value. This misleads callers (and the docstring, which explicitly mentions NVCF endpoint detection) into believing the endpoint influences the decision. Any code passing a different endpoint expecting different behavior will be silently surprised. If endpoint-based auto-detection is planned for the future, a `# TODO` comment would clarify intent; if it was intentionally dropped, the parameter should be removed from the signature to keep the public contract honest.
**Rule Used:** Every public class and function in nemo_retriever ... ([source](https://app.greptile.com/review/custom-context?memory=public-api-contract))
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| @@ -1,25 +1,10 @@ | |||
| {{- if and (and (.Capabilities.APIVersions.Has "apps.nvidia.com/v1alpha1") .Values.nims.enabled) (eq .Values.nimOperator.ocr.enabled true) -}} | |||
| <<<<<<< HEAD | |||
| {{- $name := .Values.nimOperator.ocr.nimServiceName | default "nemotron-ocr-v2" -}} | |||
| {{- $name := .Values.nimOperator.ocr.nimServiceName | default "nemotron-ocr-v1" -}} | |||
There was a problem hiding this comment.
The file is named
nemotron-ocr-v2.yaml, yet the default fallback for $name is "nemotron-ocr-v1". With values.yaml now explicitly setting nimServiceName: nemotron-ocr-v1, the default is rarely hit, but anyone deploying the Helm chart without setting nimServiceName will get a resource named nemotron-ocr-v1 from a template file named v2, which is confusing. The default should match what the file's name implies, or the file should be renamed.
| {{- $name := .Values.nimOperator.ocr.nimServiceName | default "nemotron-ocr-v1" -}} | |
| {{- $name := .Values.nimOperator.ocr.nimServiceName | default "nemotron-ocr-v2" -}} |
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/helm/templates/nims/nemotron-ocr-v2.yaml
Line: 2
Comment:
The file is named `nemotron-ocr-v2.yaml`, yet the default fallback for `$name` is `"nemotron-ocr-v1"`. With `values.yaml` now explicitly setting `nimServiceName: nemotron-ocr-v1`, the default is rarely hit, but anyone deploying the Helm chart without setting `nimServiceName` will get a resource named `nemotron-ocr-v1` from a template file named `v2`, which is confusing. The default should match what the file's name implies, or the file should be renamed.
```suggestion
{{- $name := .Values.nimOperator.ocr.nimServiceName | default "nemotron-ocr-v2" -}}
```
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
… downloading the model profile from NGC
Description
Checklist