
fix: fix nat crash in nvbug 6336437#417

Open
gabwow wants to merge 2 commits into main from nvbug6336437-fix-nat-crash/agabow

Conversation

@gabwow

@gabwow gabwow commented Jun 23, 2026

Contributor

Address https://nvbugspro.nvidia.com/bug/6336437 by pinning all nvidia-nat* packages to 1.7.0 and adding a very small test.

Summary by CodeRabbit

  • Chores
    • Updated the agent base image to install the full nvidia-nat* package family with consistent version pinning (1.7.0) to avoid installation/version drift.
    • Added an automated test that detects mismatched nvidia-nat* versions and fails with a clear report, ensuring alignment stays intact over time.

@gabwow gabwow requested review from a team as code owners June 23, 2026 19:05
@gabwow gabwow force-pushed the nvbug6336437-fix-nat-crash/agabow branch from 9aa00e4 to ad3e694 Compare June 23, 2026 19:08
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fb5d3672-5117-460c-860b-1152657bd193

📥 Commits

Reviewing files that changed from the base of the PR and between ad3e694 and 4dcfb06.

📒 Files selected for processing (2)
  • Dockerfile.agentic-base
  • tests/agentic-use/tests/test_nat_version_consistency.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • Dockerfile.agentic-base
  • tests/agentic-use/tests/test_nat_version_consistency.py

📝 Walkthrough

Walkthrough

All nvidia-nat* packages are now pinned to 1.7.0 in Dockerfile.agentic-base, replacing previously unpinned installs. A new pytest module enumerates installed nvidia-nat* distributions and fails if any version mismatch is detected.

Changes

nvidia-nat Version Pinning and Drift Guard

| Layer / File(s) | Summary |
| --- | --- |
| Pin nvidia-nat* to 1.7.0 (Dockerfile.agentic-base) | Replaces unpinned nvidia-nat[most], nvidia-nat-atif, nvidia-nat-eval, nvidia-nat-mcp with explicit ==1.7.0 pins; adds comments explaining the prior ImportError from version lag. |
| Version consistency test (tests/agentic-use/tests/test_nat_version_consistency.py) | _nat_distributions() scans installed packages for nvidia-nat* entries; test_nvidia_nat_family_versions_are_aligned() skips if none are found and fails with a per-package mismatch report if versions diverge. |
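
A minimal sketch of what such a drift guard might look like follows. It is not the code from this PR: the function bodies, the name normalization, and the `mismatch_report` helper are assumptions made for illustration; only the test function name and the `nvidia-nat*` prefix come from the walkthrough above.

```python
import re
from importlib import metadata


def nat_distributions() -> dict[str, str]:
    """Map each installed nvidia-nat* distribution name to its version."""
    found: dict[str, str] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue
        # Normalize separators and case so "nvidia_nat_eval" matches "nvidia-nat-eval".
        normalized = re.sub(r"[-_.]+", "-", name).lower()
        if normalized == "nvidia-nat" or normalized.startswith("nvidia-nat-"):
            found[normalized] = dist.version
    return found


def mismatch_report(versions: dict[str, str]) -> str:
    """Return "" if all versions agree, else one `name==version` line per package."""
    if len(set(versions.values())) <= 1:
        return ""
    return "\n".join(f"{name}=={version}" for name, version in sorted(versions.items()))


def test_nvidia_nat_family_versions_are_aligned() -> None:
    # pytest is imported lazily so the helpers above stay usable without it.
    import pytest

    versions = nat_distributions()
    if not versions:
        pytest.skip("no nvidia-nat* distributions installed")
    report = mismatch_report(versions)
    assert not report, f"nvidia-nat* versions diverge:\n{report}"
```

Keeping the comparison in a pure helper that takes a plain dict means the reporting logic can be exercised without installing any NAT packages.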
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the main change: pinning NAT to version 1.7 to fix a crash in nvbug 6336437. It accurately reflects the primary objective of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/agentic-use/tests/test_nat_version_consistency.py (1)

18-18: 📐 Maintainability & Code Quality | 🔵 Trivial

Remove `from __future__ import annotations` to align with concrete type hints.

Line 18 conflicts with the repo guideline to prefer concrete type hints. The file uses only concrete annotations (dict[str, str], None) and has no forward references requiring postponed evaluation.
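
As a small illustration of why the import is not needed here (a sketch assuming Python 3.9 or newer; the helper name is hypothetical and not taken from the PR), built-in generics such as `dict[str, str]` work as annotations without postponed evaluation and are stored as real type objects rather than strings:

```python
def versions_by_package() -> dict[str, str]:
    """Hypothetical helper annotated with a built-in generic."""
    return {"nvidia-nat": "1.7.0"}


# Without `from __future__ import annotations`, the stored annotation is the
# actual generic alias object, not the string "dict[str, str]".
return_annotation = versions_by_package.__annotations__["return"]
```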

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/agentic-use/tests/test_nat_version_consistency.py` at line 18, Remove
the `from __future__ import annotations` import statement from the file. Since
the test file only uses concrete type hints like `dict[str, str]` and `None`
with no forward references, the postponed evaluation feature is unnecessary and
conflicts with the repository's preference for concrete type annotations. Simply
delete line 18 containing this import.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/agentic-use/tests/test_nat_version_consistency.py`:
- Around line 6-8: The docstring in the test file incorrectly states that the
agentic-base image installs the NAT family "unpinned," but this is outdated
since Dockerfile.agentic-base now pins nvidia-nat to 1.7.0. Update the docstring
text starting with "The agentic-base image installs the NAT family unpinned" to
accurately reflect that NAT is currently pinned to version 1.7.0 in the
Dockerfile.

---

Nitpick comments:
In `@tests/agentic-use/tests/test_nat_version_consistency.py`:
- Line 18: Remove the `from __future__ import annotations` import statement from
the file. Since the test file only uses concrete type hints like `dict[str,
str]` and `None` with no forward references, the postponed evaluation feature is
unnecessary and conflicts with the repository's preference for concrete type
annotations. Simply delete line 18 containing this import.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3dc72581-4675-47ea-b0bb-75dd9fe5a55b

📥 Commits

Reviewing files that changed from the base of the PR and between d74f416 and 9aa00e4.

📒 Files selected for processing (2)
  • Dockerfile.agentic-base
  • tests/agentic-use/tests/test_nat_version_consistency.py

Comment on lines +6 to +8
The agentic-base image installs the NAT family unpinned (see
Dockerfile.agentic-base / tests/agentic-use/requirements-nat.txt). The CLI/meta
package historically lagged the plugin packages (e.g. ``nvidia-nat==1.4.3`` vs

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Update stale docstring claim about pinning.

Line 6 says NAT is installed “unpinned,” but Dockerfile.agentic-base now pins nvidia-nat* to 1.7.0. Update this text to reflect current behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/agentic-use/tests/test_nat_version_consistency.py` around lines 6 - 8,
The docstring in the test file incorrectly states that the agentic-base image
installs the NAT family "unpinned," but this is outdated since
Dockerfile.agentic-base now pins nvidia-nat to 1.7.0. Update the docstring text
starting with "The agentic-base image installs the NAT family unpinned" to
accurately reflect that NAT is currently pinned to version 1.7.0 in the
Dockerfile.

@gabwow gabwow changed the title from "Nvbug6336437 fix nat crash/agabow" to "fix: fix nat crash in nvbug 6336437" Jun 23, 2026
@github-actions github-actions Bot added the fix label Jun 23, 2026
gabwow added 2 commits June 23, 2026 15:12
Signed-off-by: Aaron Gabow <agabow@nvidia.com>
Signed-off-by: Aaron Gabow <agabow@nvidia.com>
@gabwow gabwow force-pushed the nvbug6336437-fix-nat-crash/agabow branch from ad3e694 to 4dcfb06 Compare June 23, 2026 19:13
@github-actions

Contributor
| Suite | Lines Covered | Line Rate | Branch Rate |
| --- | --- | --- | --- |
| Unit Tests | 21176/27762 | 76.3% | 61.2% |
| Integration Tests | 12216/26531 | 46.0% | 19.5% |

@ironcommit ironcommit left a comment

Contributor

LGTM

Comment thread Dockerfile.agentic-base
# nvidia-nat-core/eval/langchain==1.7.0), which caused ImportErrors at plugin
# discovery (e.g. register_dataset_loader) and crashed `nat start fastapi`.
RUN uv pip install --python /app/.venv/bin/python \
"nvidia-nat[most]==1.7.0" nvidia-nat-atif==1.7.0 nvidia-nat-eval==1.7.0 nvidia-nat-mcp==1.7.0 && \

Contributor

is there a way this can be enforced in one place, i.e., a text file or smth we copy in? this is good to fix though
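
One possible shape for a single source of truth, sketched under assumptions: the constant and function names below are hypothetical, and nothing here is part of the PR. The package list and the 1.7.0 version are copied from the install line quoted above. The idea is to render every pin from one version constant so the family cannot drift apart.

```python
NAT_VERSION = "1.7.0"  # the only place the pin would need to change

# Package names as they appear in the Dockerfile.agentic-base install line.
NAT_PACKAGES = (
    "nvidia-nat[most]",
    "nvidia-nat-atif",
    "nvidia-nat-eval",
    "nvidia-nat-mcp",
)


def pinned_requirements(version: str = NAT_VERSION) -> list[str]:
    """Render one `name==version` specifier per package in the NAT family."""
    return [f"{name}=={version}" for name in NAT_PACKAGES]


def render_requirements_file(version: str = NAT_VERSION) -> str:
    """Return requirements-file text; where it is written is left to the caller."""
    return "\n".join(pinned_requirements(version)) + "\n"
```

The rendered text could be written to a requirements file that the Dockerfile copies in and installs with `uv pip install -r`, though whether that fits this repository's build is untested here.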

3 participants