@willkill07
Member

@willkill07 willkill07 commented Feb 2, 2026

Description

Adds two new scripts:

  • ./ci/scripts/license_diff.py - shows the license and package changes between HEAD and a base branch (default: develop). Useful for finding new, updated, or removed packages in a PR. Output is written to standard output.
  • ./ci/scripts/sbom_list.py - shows the full package SBOM with package name, version, and license. The result is exported as sbom_list.tsv.

This PR also re-adds all examples to the top-level pyproject.toml so that every example's dependencies are tracked in the top-level uv.lock.

We also now pin a specific version of the uv containers to err on the side of caution.

Example output of license_diff.py for this very PR is below.

Notable output considerations:

  • a package's license is printed only if it has changed
  • source packages (i.e., those shipped with the repo) are indicated via (source)
Added packages:
- nat-adk-demo (source)
- nat-agents-examples (source)
- nat-agno-personal-finance (source)
- nat-alert-triage-agent (source)
- nat-autogen-demo (source)
- nat-automated-description-generation (source)
- nat-currency-agent-a2a (source)
- nat-documentation-guides (source)
- nat-dpo-tic-tac-toe (source)
- nat-email-phishing-analyzer (source)
- nat-haystack-deep-research-agent (source)
- nat-kaggle-mcp (source)
- nat-math-assistant-a2a (source)
- nat-math-assistant-a2a-protected (source)
- nat-multi-frameworks (source)
- nat-notebooks (source)
- nat-per-user-workflow (source)
- nat-plot-charts (source)
- nat-por-to-jiratickets (source)
- nat-profiler-agent (source)
- nat-react-benchmark-agent (source)
- nat-redis-example (source)
- nat-retail-agent (source)
- nat-rl-with-openpipe-art (source)
- nat-router-agent (source)
- nat-semantic-kernel-demo (source)
- nat-sequential-executor (source)
- nat-service-account-auth-mcp (source)
- nat-simple-auth (source)
- nat-simple-auth-mcp (source)
- nat-simple-calculator (source)
- nat-simple-calculator-custom-routes (source)
- nat-simple-calculator-eval (source)
- nat-simple-calculator-hitl (source)
- nat-simple-calculator-mcp (source)
- nat-simple-calculator-mcp-protected (source)
- nat-simple-calculator-observability (source)
- nat-simple-rag (source)
- nat-simple-web-query (source)
- nat-simple-web-query-eval (source)
- nat-strands-demo (source)
- nat-swe-bench (source)
- nat-user-report (source)
Changed packages:
- lxml 5.4.0 -> 6.0.2 (License :: OSI Approved :: BSD License -> BSD-3-Clause)
- openinference-instrumentation-langchain 0.1.29 -> 0.1.58 (License :: OSI Approved :: Apache Software License -> Apache-2.0)
- uvloop 0.21.0 -> 0.22.1
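
For readers who want a feel for how output like the above can be derived, here is a minimal sketch. It assumes each lockfile has already been reduced to a {name: version} mapping; the helper names diff_packages and format_change are hypothetical and are not taken from license_diff.py. It skips names starting with "nvidia-nat" in all three result sets and renders changes as base -> head, which are the behaviors discussed in review.

```python
# Hypothetical helpers for illustration; not the code in ci/scripts/license_diff.py.


def diff_packages(base: dict[str, str], head: dict[str, str],
                  skip_prefix: str = "nvidia-nat"):
    """Return (added, removed, changed) between two {name: version} mappings.

    Packages whose names start with skip_prefix are excluded from all three
    results so that internal packages are treated consistently.
    """

    def keep(name: str) -> bool:
        return not name.startswith(skip_prefix)

    added = sorted(n for n in head if n not in base and keep(n))
    removed = sorted(n for n in base if n not in head and keep(n))
    changed = {
        n: (base[n], head[n])  # (base_version, head_version)
        for n in sorted(base.keys() & head.keys())
        if keep(n) and base[n] != head[n]
    }
    return added, removed, changed


def format_change(name: str, base_version: str, head_version: str) -> str:
    """Render one changed package with the base version first."""
    return f"- {name} {base_version} -> {head_version}"


if __name__ == "__main__":
    base = {"lxml": "5.4.0", "uvloop": "0.21.0"}
    head = {"lxml": "6.0.2", "uvloop": "0.22.1"}
    _, _, changed = diff_packages(base, head)
    for name, (old, new) in changed.items():
        print(format_change(name, old, new))
```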

Closes

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

Release Notes

  • Documentation

    • Updated guidance for discovering available plugins and optional dependencies via project configuration.
  • New Features

    • Expanded examples catalog now available in project configuration for easier reference and discovery.
  • Chores

    • Updated container image and tooling versions across CI pipelines and build configurations.

Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 self-assigned this Feb 2, 2026
@willkill07 willkill07 requested a review from a team as a code owner February 2, 2026 20:49
@willkill07 willkill07 added the feature request New feature or request label Feb 2, 2026
@willkill07 willkill07 requested a review from a team as a code owner February 2, 2026 20:49
@willkill07 willkill07 added the non-breaking Non-breaking change label Feb 2, 2026
@coderabbitai

coderabbitai bot commented Feb 2, 2026

Walkthrough

This pull request introduces new CI scripts for license tracking, expands project metadata with example packages and UV sources, updates documentation and guidelines, and bumps Docker image versions for the uv tool across multiple configurations and Dockerfiles.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & Guidelines: .coderabbit.yaml, .cursor/rules/nat-setup/nat-toolkit-installation.mdc | Updated development guidelines to reference root pyproject.toml examples list; expanded plugin discovery documentation with guidance on inspecting optional dependencies and viewing extras via uv pip show. |
| CI/CD License Scripts: ci/scripts/license_diff.py, ci/scripts/sbom_list.py | Added two new Python scripts for software composition analysis: license_diff.py compares licenses between branches using PyPI metadata; sbom_list.py generates SBOM-like TSV output with name, version, and license data. Both include PyPI metadata fetching with fallback handling. |
| Project Metadata: pyproject.toml | Added examples optional dependency list (54+ NAT example identifiers) and expanded [tool.uv.sources] with entries for 25+ packages and 54+ example packages, each marked as editable. |
| CI Pipeline Configuration: .github/workflows/pr.yaml, .gitlab-ci.yml, ci/scripts/run_ci_local.sh | Updated container image references from unpinned tags to version 0.9.28 (e.g., ghcr.io/astral-sh/uv:0.9.28-python3.x-bookworm) for consistent CI environments. |
| Docker Image Updates: docker/Dockerfile, examples/evaluation_and_profiling/email_phishing_analyzer/Dockerfile, examples/frameworks/agno_personal_finance/Dockerfile, examples/frameworks/strands_demo/bedrock_agentcore/Dockerfile, examples/frameworks/strands_demo/bedrock_agentcore/README.md, examples/front_ends/simple_auth/Dockerfile, examples/getting_started/simple_calculator/Dockerfile, examples/getting_started/simple_web_query/Dockerfile | Updated uv tool image version from 0.9.15 to 0.9.28 in multi-stage Docker builds across root and example Dockerfiles. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(ci-scripts): utility scripts for license updates and SBOM' clearly and concisely describes the main changes: two new CI utility scripts for license tracking and SBOM generation. It uses imperative mood, follows the required format, and is 62 characters, well within the ~72 character limit. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@ci/scripts/license_diff.py`:
- Around line 86-89: The added_packages and removed_packages comprehensions
currently include internal packages despite the intent to skip
"nvidia-nat*"—update the comprehensions that build added_packages and
removed_packages to filter out packages whose names start with "nvidia-nat" the
same way changed_packages does (i.e., use the same pkg.startswith("nvidia-nat")
check), ensuring all three dictionaries consistently exclude internal packages;
locate the comprehensions that set added_packages, removed_packages, and
changed_packages and apply the filter to the first two.
- Around line 111-124: The change lines currently render "head_version ->
base_version" and "(head_license -> base_license)" which inverts the direction;
update the two formatted strings in the list_of_changes append calls to show
base first then head (i.e., use base_version -> head_version and base_license ->
head_license) so the diff reads "base -> head"; locate the code around the loop
using changed_packages, head_packages, base_packages and pypi_license and swap
the order of the version and license interpolations in both appended strings.
- Around line 42-45: The urllib.request.urlopen calls in pypi_license() (PyPI
metadata fetch using variable url) and in main (GitHub uv.lock fetch) lack
timeouts; either add timeout=10 to both urlopen(...) calls or, preferably,
replace these requests with an httpx.Client() usage (create a client with
default verify=True) and perform client.get(url, timeout=10) to fetch and
json.loads() the response content; update pypi_license() and the main fetch
logic to use the httpx client and ensure responses are checked for successful
status before parsing.
- Around line 134-137: Validate and sanitize the CLI input for --base-branch
(args.base_branch) after parsing to prevent malformed GitHub URLs; specifically,
restrict it to a safe character set (e.g., allow letters, digits, dot,
underscore, hyphen and slash via a regex like r'^[A-Za-z0-9._/-]+$') and call
parser.error(...) or exit with a clear message when the value fails validation,
or alternatively percent-encode the branch name before using it in the GitHub
API URL construction that interpolates args.base_branch.
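
The --base-branch validation suggested in the last item could be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual argument handling: the function name parse_args and the help text are hypothetical, while the default of develop and the allowed character set come from the discussion above.

```python
import argparse
import re

# Character set from the review comment: letters, digits, dot, underscore,
# hyphen, and slash. fullmatch() is used so a trailing newline cannot slip by.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9._/-]+")


def parse_args(argv=None) -> argparse.Namespace:
    """Parse CLI arguments and reject unsafe base branch names.

    Hypothetical helper; the real script may structure this differently.
    """
    parser = argparse.ArgumentParser(
        description="Show package and license changes against a base branch.")
    parser.add_argument("--base-branch", default="develop",
                        help="branch to compare HEAD against")
    args = parser.parse_args(argv)
    if not SAFE_BRANCH.fullmatch(args.base_branch):
        # parser.error() prints a usage message and exits with status 2.
        parser.error(f"invalid --base-branch value: {args.base_branch!r}")
    return args


if __name__ == "__main__":
    print(parse_args().base_branch)
```

Rejecting the value up front keeps the check in one place; percent-encoding the branch name before building the URL, as the comment also mentions, would be an alternative.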

In `@ci/scripts/sbom_list.py`:
- Around line 62-68: Rename the unused parameter base_name in the function
process_uvlock to _base_name to signal it's intentionally unused (update the
function signature accordingly), and update the docstring parameter section to
document _base_name instead of base_name while keeping the compatibility note;
verify there are no internal references to base_name that need changing and run
tests/linting to ensure no unused-parameter warnings remain.
- Around line 42-44: Replace the blocking urllib.request.urlopen call with the
project's preferred httpx synchronous client: create/reuse an httpx.Client in
main() (or the calling scope), fetch the PyPI URL via client.get(url,
timeout=10) and parse the JSON via response.json(), and ensure the client is
closed (use a with httpx.Client() as client or store and close it) so requests
to the URL in the sbom_list.py function replace urllib.request.urlopen(url) and
json.load(r) with client.get(...).json() using timeout and proper lifecycle
management.

In `@pyproject.toml`:
- Around line 109-154: The examples list contains "text_file_ingest" which lacks
the required nat_ prefix; update the examples array entry to
"nat_text_file_ingest" and also update the corresponding entry in
tool.uv.sources (where "text_file_ingest" is referenced) to
"nat_text_file_ingest" so both the examples list and tool.uv.sources use the
required nat_ prefix.

@Salonijain27 Salonijain27 left a comment

Approved from a dependency point of view

Signed-off-by: Will Killian <wkillian@nvidia.com>
Contributor

@ericevans-nv ericevans-nv left a comment

Approved conditionally, pending implementation of the CodeRabbit suggestions. Adding timeouts to the requests would be helpful to prevent CI from hanging indefinitely.

Signed-off-by: Will Killian <wkillian@nvidia.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@ci/scripts/license_diff.py`:
- Around line 43-48: The current broad except Exception around the network/json
fetch hides programming errors; replace it by catching only the expected
failures from urllib.request.urlopen and json.load: catch urllib.error.URLError
and urllib.error.HTTPError (or the umbrella urllib.error.URLError which covers
HTTPError) and json.JSONDecodeError (and optionally socket.timeout if timeouts
are used) and return "(License not found)" for those cases, but let other
exceptions propagate; ensure you import urllib.error and json.JSONDecodeError
and use an except (urllib.error.URLError, json.JSONDecodeError) as e block
around the urlopen/json.load sequence (referencing url, urllib.request.urlopen,
and json.load) so only network/JSON errors are swallowed.

In `@ci/scripts/sbom_list.py`:
- Around line 42-47: The try/except around the PyPI fetch is too broad; narrow
it to only handle expected network and JSON errors by catching
urllib.error.HTTPError and urllib.error.URLError from urllib.request.urlopen and
json.JSONDecodeError (and optionally ValueError) from json.load, return
"(License not found)" for those cases, and re-raise any other unexpected
exceptions so programming errors aren't swallowed; reference the existing url
construction, urllib.request.urlopen, and json.load when locating where to
replace the broad "except Exception" with these specific exception types.
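
A minimal sketch of the narrowed handling described in both comments, with a timeout added. The function name fetch_license is hypothetical, and reading the license from the "info" -> "license" field of the response is an assumption about the PyPI JSON shape, not a description of what either script does.

```python
import json
import urllib.error
import urllib.request

NOT_FOUND = "(License not found)"


def fetch_license(url: str, timeout: float = 10.0) -> str:
    """Fetch package metadata from url and return its license string.

    Only expected failures are swallowed: URLError (which also covers
    HTTPError), a timeout, or a body that is not valid JSON. Anything else,
    such as a programming error, propagates to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            data = json.load(response)
    except (urllib.error.URLError, TimeoutError, json.JSONDecodeError):
        return NOT_FOUND
    # Assumed response shape: {"info": {"license": "..."}}.
    return data.get("info", {}).get("license") or NOT_FOUND
```

Because the function takes a full URL rather than a package name, it can be exercised against a local file:// URL without touching the network.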
🧹 Nitpick comments (2)
ci/scripts/sbom_list.py (2)

80-88: Open TSV with newline="" and UTF-8 to avoid CSV quirks.

csv.writer recommends newline="" to prevent extra blank lines on Windows, and explicit UTF-8 avoids locale issues in license text.

♻️ Suggested change
-    with open("licenses.tsv", "w") as f:
+    with open("licenses.tsv", "w", newline="", encoding="utf-8") as f:
         writer = csv.writer(f, delimiter="\t")

104-111: Sort package names for deterministic sbom_list.tsv.

licenses.tsv is sorted; do the same here to make diffs stable across runs.

♻️ Suggested change
-    for pkg in tqdm(pkgs.keys(), desc="Processing packages", unit="packages"):
+    for pkg in tqdm(sorted(pkgs.keys()), desc="Processing packages", unit="packages"):
         try:
             sbom_list.append({
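
Both nitpicks can be illustrated together in a small standalone sketch. The function name write_sbom, the header row, and the {name: (version, license)} input shape are assumptions made for illustration; they are not taken from sbom_list.py.

```python
import csv


def write_sbom(rows: dict[str, tuple[str, str]], path: str) -> None:
    """Write {name: (version, license)} to a TSV file in sorted name order.

    newline="" lets csv.writer control line endings (avoiding blank lines on
    Windows), and an explicit UTF-8 encoding avoids locale-dependent output.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["name", "version", "license"])  # hypothetical header
        for name in sorted(rows):  # deterministic order across runs
            version, license_name = rows[name]
            writer.writerow([name, version, license_name])


if __name__ == "__main__":
    write_sbom(
        {
            "openinference-instrumentation-langchain": ("0.1.58", "Apache-2.0"),
            "lxml": ("6.0.2", "BSD-3-Clause"),
        },
        "sbom_list.tsv",
    )
```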

@willkill07
Member Author

willkill07 commented Feb 3, 2026

@ericevans-nv

> Adding timeouts to the requests would be helpful to prevent CI from hanging indefinitely.

This script is invoked manually by a user; it does not run in CI at all. I placed it here because it is maintainer-oriented rather than developer-oriented, and I didn't want to confuse consumers of the library.

Contributor

@mnajafian-nv mnajafian-nv left a comment

Great work. I made some suggestions for your review on exception handling, missing timeouts, and dead code.

@willkill07
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 77a0450 into NVIDIA:develop Feb 3, 2026
17 checks passed