Avoids bug in `tensordict==0.12.x` by upper-bounding `tensordict` version by peterdsharpe · Pull Request #1658 · NVIDIA/physicsnemo

peterdsharpe · 2026-05-20T21:52:46Z

PhysicsNeMo Pull Request

Limits tensordict upper version to <0.12 until the following torch.compile regressions are fixed, merged, and released:

[BUG] @tensorclass field defaults are not applied under torch.compile, raising KeyError pytorch/tensordict#1710
[BUG] @tensorclass silently skips __post_init__ under torch.compile, producing wrong output pytorch/tensordict#1708

Also adds a test that would have caught this regression earlier.
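
As a rough illustration of the two invariants named in the linked issues (field defaults are applied, and `__post_init__` runs), here is a minimal sketch using a plain standard-library dataclass. It does not use `tensordict`, `@tensorclass`, or `torch.compile`; the `Points` class and `build_points` function are hypothetical stand-ins, not PhysicsNeMo's `Mesh` and not the test added in this PR. A real regression test would presumably perform the construction inside a `torch.compile`-wrapped function and compare against the eager result.

```python
from dataclasses import dataclass, field


@dataclass
class Points:
    """Hypothetical stand-in for a container with defaults and derived state."""

    coords: list
    # Field default: must be applied when the caller omits `labels`.
    labels: dict = field(default_factory=dict)
    # Derived in __post_init__; never passed by the caller.
    n_points: int = field(init=False, default=-1)

    def __post_init__(self):
        # If this hook were silently skipped, n_points would stay at -1.
        self.n_points = len(self.coords)


def build_points(coords):
    # In a real test, this construction would happen inside a compiled function.
    return Points(coords=coords)


p = build_points([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])

# Invariant 1: the omitted field received its default.
assert p.labels == {}
# Invariant 2: __post_init__ ran and populated the derived field.
assert p.n_points == 3
```

The point of asserting on both the default and the derived field is that one failure mode raises loudly (a missing default) while the other is silent (a skipped hook), so a test needs to check the output value and not just the absence of an exception.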

Description

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
The CHANGELOG.md is up to date with these changes.
An issue is linked to this pull request.
If I am implementing a new model or modifying any existing model, I have followed the Models Implementation Coding Standards.

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

… Mesh under torch.compile

- Adjusted the tensordict dependency in pyproject.toml to be upper-bounded due to regressions in version 0.12.x, with a note to drop the upper bound once the related PR is merged.
- Introduced a new test file for regression testing of the Mesh class to ensure compatibility with torch.compile, specifically addressing issues caused by the tensordict 0.12.x changes. The tests validate that cached properties and data fields behave correctly when compiled.

- Added a new entry in CHANGELOG detailing the fix for constructing a Mesh inside a torch.compile-traced function, addressing regressions from tensordict 0.12.0.
- Updated the mlflow and starlette package versions to 3.12.0 and 0.52.1 respectively, along with their corresponding source distribution and wheel URLs.
- Adjusted tensordict dependency constraints to ensure compatibility with the latest changes.

copy-pr-bot · 2026-05-20T21:52:50Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

peterdsharpe · 2026-05-20T21:54:02Z

/blossom-ci

greptile-apps · 2026-05-20T21:56:10Z

Greptile Summary

This hotfix pins tensordict to >=0.11.0,<0.12 to avoid two torch.compile regressions introduced in tensordict 0.12.x (pytorch/tensordict#1708, #1710), and adds a dedicated regression test suite that constructs a Mesh inside a compiled function to prevent silent re-introduction of the bug on a future pin bump.

pyproject.toml: tensordict pin tightened to >=0.11.0,<0.12; redundant datapipes-extras tensordict entry removed (core dependency now satisfies the >=0.11.0 requirement already).
test/mesh/mesh/test_compile.py: Comprehensive regression tests cover cached-property access, data-field defaults, and the __post_init__ → cache round-trip, all under torch.compile.
uv.lock: Expected tensordict downgrade to 0.11.0; incidental side-effects include mlflow 3.11.1 → 3.12.0 and a starlette 1.0.0 → 0.52.1 downgrade (driven by mlflow-skinny 3.12.0 adding starlette as a direct dependency with an implicit upper bound).
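
To make the effect of the `>=0.11.0,<0.12` pin concrete, the following is a minimal standard-library sketch of which release numbers fall inside the range. The helper names `parse_release` and `in_pinned_range` are hypothetical, and the parser handles only plain dotted releases (no pre-release or local suffixes); real tooling would normally rely on `packaging.specifiers.SpecifierSet` instead.

```python
def parse_release(version: str) -> tuple:
    """Parse a plain dotted release such as '0.11.0' into a 3-tuple of ints.

    Simplification: pre-release, post-release, and local suffixes are not handled.
    """
    parts = tuple(int(part) for part in version.split("."))
    # Pad short versions so that '0.11' compares equal to '0.11.0'.
    return (parts + (0, 0, 0))[:3]


def in_pinned_range(version: str) -> bool:
    """Return True if `version` satisfies >=0.11.0,<0.12."""
    return (0, 11, 0) <= parse_release(version) < (0, 12, 0)


for candidate in ["0.10.0", "0.11.0", "0.11.3", "0.12.0", "0.12.4"]:
    print(candidate, in_pinned_range(candidate))
```

Under this pin, every 0.12.x release is excluded, including any later patch release that fixes the regressions, which is why the bound has to be revisited once a fixed version ships.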

Important Files Changed

| Filename | Overview |
| --- | --- |
| pyproject.toml | Pins tensordict to >=0.11.0,<0.12 (well-commented) and removes the now-redundant datapipes-extras tensordict entry since core already covers >=0.11.0. |
| test/mesh/mesh/test_compile.py | New regression test file; well-structured, covers cached properties, data-field defaults, and the full __post_init__→cache round-trip under torch.compile. |
| uv.lock | tensordict downgraded to 0.11.0 (expected); incidental side-effects include mlflow 3.11.1→3.12.0 and starlette 1.0.0→0.52.1 downgrade driven by mlflow-skinny 3.12.0's new starlette dependency. |
| CHANGELOG.md | CHANGELOG updated with both a Fixed entry and a Dependencies entry, clearly explaining the tensordict pin and upstream issue references. |

Comments Outside Diff (1)

uv.lock, lines 7216-7232

Starlette version downgrade as lock-regen side-effect

starlette moved from 1.0.0 → 0.52.1 as a side effect of mlflow-skinny 3.12.0 now declaring starlette as an explicit dependency; mlflow-skinny likely constrains it below 1.0.0. Since starlette is not a direct dependency of physicsnemo, the practical risk is low, but it is worth confirming that no other transitive consumer of starlette in the environment expects the 1.x API (which introduced breaking changes relative to 0.x).
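
One way to carry out that confirmation is to list which installed distributions declare a requirement on starlette. The sketch below uses only the standard-library `importlib.metadata` module; the helper names `requirement_name` and `consumers_of` are hypothetical, and the requirement-name parsing is a simplification that ignores environment markers beyond stripping them.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    name = match.group(1) if match else ""
    # Normalize runs of '-', '_', '.' to a single '-' and lowercase the name.
    return re.sub(r"[-_.]+", "-", name).lower()


def consumers_of(package: str) -> dict:
    """Map each installed distribution to its declared requirements on `package`."""
    target = re.sub(r"[-_.]+", "-", package).lower()
    found = {}
    for dist in metadata.distributions():
        # `requires` is None when a distribution declares no requirements.
        for req in dist.requires or []:
            if requirement_name(req) == target:
                found.setdefault(dist.metadata["Name"], []).append(req)
    return found


for consumer, reqs in sorted(consumers_of("starlette").items(), key=lambda kv: str(kv[0])):
    print(consumer, reqs)
```

Each printed requirement string shows the version bounds that consumer declares, so any entry requiring `>=1.0` would indicate a conflict with the downgraded lock.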

Reviews (1): Last reviewed commit: "format"

ktangsali · 2026-05-20T22:13:48Z

/blossom-ci

peterdsharpe · 2026-05-21T20:47:50Z

FYI, tensordict is planning to cut a patch-version release to fix this; after that lands, we can pin >=0.12.4.

pytorch/tensordict#1709 (comment)

peterdsharpe · 2026-05-22T00:49:14Z

/blossom-ci

peterdsharpe · 2026-05-22T00:49:25Z

/blossom-ci

peterdsharpe added 3 commits May 20, 2026 17:30

format

c11317c

peterdsharpe requested review from coreyjadams and ktangsali as code owners May 20, 2026 21:52

peterdsharpe changed the title ~~Psharpe/tensordict compile hotfix~~ Avoids bug in tensordict==0.12.x by upper-bounding tensordict version May 20, 2026

peterdsharpe requested a review from abokov-nv May 20, 2026 21:53

abokov-nv approved these changes May 20, 2026

ktangsali approved these changes May 20, 2026

Merge branch '2.1.0-rc' into psharpe/tensordict-compile-hotfix

74378b3

peterdsharpe wants to merge 4 commits into NVIDIA:2.1.0-rc from peterdsharpe:psharpe/tensordict-compile-hotfix

3 participants
