
Test nim inference workflow and make minor updates #34

Merged
ktangsali merged 9 commits into NVIDIA:main from ktangsali:debug-nim-inference
May 8, 2026

@ktangsali
Collaborator

PhysicsNeMo-CFD Pull Request

Description

Tested the NIM inference workflow and made minor edits as part of the testing. l2_error_vs_sdf now clips first and then computes the metrics. Device handling is also improved.
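
To illustrate why the order matters, here is a minimal NumPy sketch of a clip-then-measure computation. It is not the code in l2_errors.py: the function name, its arguments, and the assumption that "clip" means restricting points to an axis-aligned bounding box are all hypothetical. The point is only that samples outside the clip region never reach the per-bin relative L2 error.

```python
import numpy as np


def l2_error_vs_sdf_sketch(points, sdf, true, pred, bounds, bin_edges):
    """Relative L2 error per SDF bin, computed only on points inside `bounds`.

    Hypothetical helper for illustration; not the physicsnemo-cfd implementation.
    points: (N, 3), sdf/true/pred: (N,), bounds: (lower_xyz, upper_xyz).
    """
    lower, upper = np.asarray(bounds[0]), np.asarray(bounds[1])

    # Step 1: clip. Keep only points inside the bounding box.
    inside = np.all((points >= lower) & (points <= upper), axis=1)
    sdf, true, pred = sdf[inside], true[inside], pred[inside]

    # Step 2: compute the metric on the clipped data, one value per SDF bin.
    bin_index = np.digitize(sdf, bin_edges) - 1
    errors = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        mask = bin_index == b
        denominator = np.linalg.norm(true[mask])
        if mask.any() and denominator > 0:
            errors[b] = np.linalg.norm(pred[mask] - true[mask]) / denominator
    return errors
```

If the metric were computed before clipping, a badly predicted point far outside the region of interest would still contribute to its SDF bin.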

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.

Dependencies

@ktangsali ktangsali requested a review from peterdsharpe May 7, 2026 00:16
@ktangsali
Collaborator Author

/blossom-ci

Collaborator

@peterdsharpe peterdsharpe left a comment

Submitting initial comments on everything except the notebooks (will review locally).

Other than the comments here, is it possible to save the PNG files (specifically resampled_volume_errors.png and variations_due_to_checkpoint.png) at a smaller file size? Right now they add ~25 MB to the repo in total. Maybe JPEG or a lower-resolution PNG?
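
One possible way to do the "lower-res PNG" option, sketched with Pillow. The helper name shrink_png and the default max_side and colors values are illustrative assumptions, not settings used in this PR; palette quantization is lossy, so the result should be checked visually before committing.

```python
from pathlib import Path

from PIL import Image


def shrink_png(src, dst, max_side=1600, colors=128):
    """Downscale and palette-quantize a PNG. Returns (old_bytes, new_bytes).

    Hypothetical helper; the defaults are illustrative, not tuned for these figures.
    """
    src, dst = Path(src), Path(dst)
    with Image.open(src) as image:
        image = image.convert("RGB")  # drop alpha so quantize() can use its default method
        image.thumbnail((max_side, max_side))  # in place, keeps aspect ratio, never upscales
        image = image.quantize(colors=colors)  # reduce to a palette of at most `colors` entries
        image.save(dst, format="PNG", optimize=True)
    return src.stat().st_size, dst.stat().st_size
```

Plots with flat colors and few gradients usually tolerate a small palette well; photographic or heavily shaded renders may be better served by JPEG.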

Comment thread physicsnemo/cfd/postprocessing_tools/metrics/l2_errors.py
Comment thread physicsnemo/cfd/postprocessing_tools/metrics/l2_errors.py Outdated
Comment thread physicsnemo/cfd/postprocessing_tools/metrics/l2_errors.py Outdated
@ktangsali
Collaborator Author

Thanks! Compressed the images.

@ktangsali
Collaborator Author

/blossom-ci

Comment thread .github/workflows/ci-precommit.yml
Comment thread physicsnemo/cfd/postprocessing_tools/metrics/l2_errors.py Outdated
Comment thread physicsnemo/cfd/postprocessing_tools/metrics/l2_errors.py Outdated
Comment thread physicsnemo/cfd/postprocessing_tools/metrics/l2_errors.py Outdated
Collaborator

@peterdsharpe peterdsharpe left a comment

Overall looks good. Leaving my notebook comments here, since GitHub won't let me add line-by-line annotations within *.ipynb files.


General items:

  • Please update the changelog (warp version bump), Dependencies section.

A blocker in our notebooks, relating to us pointing users to deprecated APIs:

  • The cell that runs NIM inference uses _create_nbrs_surface and _interpolate. These are explicitly deprecated in the same file, and we direct users to use interpolate_mesh_to_pc or physicsnemo.nn.functional.neighbors.knn instead. Can we fix our tutorial code here to use non-deprecated APIs? The NIM produces a sparse point cloud output_dict["surface_coordinates"][0] with fields. Wrap it as pv.PolyData(coords) with point_data populated, then call interpolate_mesh_to_pc with pc=mesh_cell_centers, mesh=nim_pc, mesh_dtype="point", device="gpu". About 5 lines of glue.
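
For readers unfamiliar with what this interpolation step does, here is a self-contained NumPy sketch of the underlying idea: k-nearest-neighbor, inverse-distance-weighted interpolation from a sparse point cloud onto target points such as mesh cell centers. It deliberately does not call interpolate_mesh_to_pc or any physicsnemo API, whose exact signatures are not shown in this thread; knn_interpolate and its parameters are hypothetical names.

```python
import numpy as np


def knn_interpolate(source_points, source_values, target_points, k=4, eps=1e-12):
    """Inverse-distance-weighted kNN interpolation (brute force).

    Hypothetical stand-in for illustration only; not the physicsnemo implementation.
    source_points: (S, 3), source_values: (S,) or (S, C), target_points: (T, 3).
    """
    # Pairwise distances between every target and every source point: (T, S).
    diff = target_points[:, None, :] - source_points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Indices and distances of the k nearest source points for each target.
    k = min(k, source_points.shape[0])
    nbr_idx = np.argpartition(dist, k - 1, axis=1)[:, :k]
    nbr_dist = np.take_along_axis(dist, nbr_idx, axis=1)

    # Closer neighbors get larger weights; eps avoids division by zero.
    weights = 1.0 / (nbr_dist + eps)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted average of neighbor values, for scalar or multi-channel fields.
    return np.einsum("tk,tk...->t...", weights, source_values[nbr_idx])
```

The brute-force distance matrix costs O(T × S) memory, so this form only suits small examples; a spatial index or a GPU kNN is the practical choice at mesh scale.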

Another blocker in our notebooks, related to credentials leaking:

The cells that run !wget "{url}" for drivaer_418.stl and drivaer_420.stl capture the redirect chain, including:

https://cas-bridge.xethub.hf.co/xet-bridge-us/.../drivaer_418.stl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=cas%2F...&X-Amz-Signature=155cd1bfd12bedc99e9a5c8eafca8602bb1edbbf108495138d9d0af5906e971d&...

These are pre-signed S3-style URLs. They expired ~1 hour after generation (so this PR's URLs are dead now), but committing them is bad hygiene: it normalizes embedding presigned URLs in public artifacts, and future re-runs will commit live ones. Concrete fixes, in order of preference:

  1. Replace !wget "{url}" -O "{filename}" with huggingface_hub.hf_hub_download (already a transitive dep via evaluation-hf). It logs nothing comparable.
  2. Use requests.get(url, stream=True) with iter_content and a small progress bar that doesn't echo the URL.
  3. At minimum, add --quiet to wget and clear the output cells before commit.
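
A minimal sketch of the second option's pattern, written with the standard library's urllib instead of requests so that it has no extra dependencies. The function name download_quietly is hypothetical. The key property is that the only thing printed is the local file name and byte count, so a redirect to a presigned URL never lands in a notebook output cell.

```python
import urllib.request
from pathlib import Path


def download_quietly(url, destination, chunk_size=1 << 20):
    """Stream `url` to `destination` without ever printing the URL.

    Hypothetical helper; a stdlib variant of the requests-based suggestion above.
    """
    destination = Path(destination)
    bytes_written = 0
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            bytes_written += len(chunk)
    # Report only the local name and size, never the (possibly presigned) URL.
    print(f"saved {destination.name} ({bytes_written} bytes)")
    return bytes_written
```

Exceptions raised by urlopen can still include the requested URL in their message, so a failing cell should be cleared before commit as well.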

@ktangsali
Collaborator Author

Thanks. I have added the --quiet flag to wget, updated the interpolation code, and added a changelog line.

@ktangsali
Collaborator Author

/blossom-ci

Collaborator

@peterdsharpe peterdsharpe left a comment

Approving after offline discussion.

One remaining item that we agreed to do before merge: re-running the notebooks to a) do end-to-end integration testing, and b) clean up outdated output cells that could mislead end users who rely on this as primary documentation.
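
For the cleanup half of that item, a small standard-library sketch that strips all stored outputs from a notebook before it is re-run. It assumes the usual nbformat 4 JSON layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"); the function name clear_notebook_outputs is hypothetical.

```python
import json
from pathlib import Path


def clear_notebook_outputs(path):
    """Remove outputs and execution counts from every code cell, in place.

    Hypothetical helper, assuming nbformat 4 JSON. Returns the number of
    code cells that had stored outputs.
    """
    path = Path(path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleared = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        if cell.get("outputs"):
            cleared += 1
        cell["outputs"] = []
        cell["execution_count"] = None
    path.write_text(
        json.dumps(notebook, indent=1, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return cleared
```

Running this first and then executing the notebook top to bottom means every output that remains comes from the current code.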

@ktangsali ktangsali merged commit 3f316b6 into NVIDIA:main May 8, 2026
2 checks passed
@ktangsali ktangsali deleted the debug-nim-inference branch May 8, 2026 01:01
2 participants