[WIP] Refactor autograd function handling to utilize non_differentiable_idx #2802

mattteochen wants to merge 1 commit into main from non_differentiable_idx

…x` for managing gradient outputs. Removed `args_tensor_mask` references and updated related tests accordingly.
Pull request overview
This PR refactors the autograd function handling in Thunder to remove the args_tensor_mask parameter and utilize non_differentiable_idx instead. This change aligns with PyTorch's approach for marking outputs as non-differentiable in custom autograd functions.
Key Changes:
- Removed the `args_tensor_mask` parameter from the autograd function APIs
- Implemented logic that uses `non_differentiable_idx` to determine which output gradients should be `None` (see the sketch after this list)
- Added boundary validation for `non_differentiable_idx` indices
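As a rough illustration of the intended behavior (the helper below is hypothetical, not Thunder's actual implementation): gradients at the listed output indices become `None`, after the indices are bounds-checked against the number of outputs.

```python
from typing import Any, Sequence

def apply_non_differentiable_idx(
    grads: Sequence[Any], non_differentiable_idx: Sequence[int]
) -> list:
    """Replace grad-outputs at the given output indices with None.

    Hypothetical helper: instead of an args_tensor_mask over the inputs,
    a list of *output* indices marks which gradients must be None.
    """
    nd_set = set(non_differentiable_idx)
    # Boundary validation: every index must refer to an existing output.
    if not all(0 <= i < len(grads) for i in nd_set):
        raise ValueError(
            f"non_differentiable_idx {sorted(nd_set)} out of range for {len(grads)} outputs"
        )
    return [None if i in nd_set else g for i, g in enumerate(grads)]

# Two outputs, with the second marked non-differentiable:
print(apply_non_differentiable_idx(["grad_out0", "grad_out1"], non_differentiable_idx=[1]))
# -> ['grad_out0', None]
```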
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| thunder/torch/__init__.py | Removed args_tensor_mask parameter from autograd_function_apply, augmented_forward_autograd_function_apply, and backward_autograd_function_apply function signatures |
| thunder/tests/test_jit_general.py | Updated test calls to remove the args_tensor_mask=[True] argument from all autograd function apply invocations |
| thunder/core/jit_ext.py | Replaced args_tensor_mask logic with non_differentiable_idx handling; added boundary checks and updated gradient creation logic to set None for non-differentiable outputs |
In `thunder/core/jit_ext.py`:

```python
primal, residuals = interpret_trace(aliased_aug_fwd_trace, *args, **kwargs)
grads = tree_map(lambda t: get_grad(t), sequencify(primal))
primal_seq = tuple(sequencify(primal))
```
The code uses a list for `grads` and then mutates it by index assignment. This could raise an `IndexError` if any index in `nd_set` is >= `len(primal_seq)`. There is boundary checking earlier in the function (lines 978-980), but that check validates `non_differentiable_idx` against `output_seq` from the forward pass. In the `grad_transform`, the indices are applied to `primal_seq`, which should have the same length, but the correlation is not immediately clear. Consider adding an assertion or comment to make it explicit that `primal_seq` and `output_seq` have the same length.
Suggested change:

```python
primal_seq = tuple(sequencify(primal))
# NOTE: It is assumed that primal_seq and output_seq (from the forward pass) always have the same length.
# This is required for the index assignment below to be safe. If this invariant changes, this code may break.
# If output_seq is not available here, ensure that nd_set only contains valid indices for primal_seq.
```
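For context, a standalone sketch (dummy data, not the PR's actual code) of the invariant the suggested comment documents:

```python
# Outputs returned by the forward pass vs. outputs seen again in the grad transform;
# the index assignment below is only safe because both sequences have the same length.
output_seq = ("out0", "out1", "out2")
primal_seq = ("out0", "out1", "out2")
nd_set = {2}  # output indices declared non-differentiable

assert len(primal_seq) == len(output_seq)
assert all(0 <= i < len(primal_seq) for i in nd_set)

grads = [f"grad_{name}" for name in primal_seq]
for i in nd_set:
    grads[i] = None
print(grads)  # -> ['grad_out0', 'grad_out1', None]
```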
```python
# PyTorch uses `non_differentiable_idx` to indicate output indices for which the
# grad-output should be `None`.
# https://github.com/pytorch/pytorch/pull/166788
```
The comment mentions PyTorch PR #166788 but the earlier comment on line 902 references PR #38114ec. It would be helpful to ensure consistency in how PyTorch references are documented (either full PR numbers or commit hashes) and to verify that PR #166788 is the correct reference for this change.
In `thunder/tests/test_jit_general.py`:

```diff
     backward,
     x,
-    args_tensor_mask=[True],
+    non_differentiable_idx=[],
```
All test cases use non_differentiable_idx=[] (empty list), which doesn't test the core functionality being refactored in this PR. Consider adding test cases where non_differentiable_idx contains actual indices to verify that non-differentiable outputs are correctly handled (i.e., their gradients are set to None). For example, test a function that returns multiple outputs where some are marked as non-differentiable, and verify that gradients are only computed for differentiable outputs.
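For illustration only, here is a self-contained example using a plain `torch.autograd.Function` rather than Thunder's `autograd_function_apply`; it pins down the behavior such a test would cover: one output is marked non-differentiable, and gradients flow only through the other.

```python
import torch

class SortWithIndices(torch.autograd.Function):
    """Toy function with two outputs; the integer indices are non-differentiable."""

    @staticmethod
    def forward(ctx, x):
        values, indices = x.sort()
        ctx.mark_non_differentiable(indices)  # second output carries no gradient
        ctx.save_for_backward(indices)
        return values, indices

    @staticmethod
    def backward(ctx, grad_values, grad_indices):
        (indices,) = ctx.saved_tensors
        # Scatter the gradient of the sorted values back to the original positions.
        grad_x = torch.zeros_like(grad_values)
        grad_x.index_add_(0, indices, grad_values)
        return grad_x

x = torch.randn(4, requires_grad=True)
values, indices = SortWithIndices.apply(x)
assert not indices.requires_grad                   # non-differentiable output
values.sum().backward()
assert torch.allclose(x.grad, torch.ones_like(x))  # gradient flows only via the differentiable output
```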
Closing in favor of #2808 |