
Fix: Gradients don't propagate through array of structs #1207

Open

Adityakk9031 wants to merge 8 commits into NVIDIA:main from Adityakk9031:#1174

Conversation

Adityakk9031 (Contributor) commented Feb 1, 2026

Description
This PR fixes a long-standing bug where gradients fail to propagate backward through array-of-structs in automatic differentiation. The gradient correctly accumulates into the struct array's adjoint but doesn't flow back to the source array.

Problem
When using autodiff with array-of-structs, assignments like y[i].a = x[i] where y is an array of structs would not propagate gradients back to x, even though gradients correctly accumulated into y.grad.

Minimal Example
    @wp.struct
    class Scalar:
        a: wp.float32

    @wp.kernel
    def pack_kernel(x: wp.array(dtype=wp.float32), y: wp.array(dtype=Scalar)):
        i = wp.tid()
        y[i].a = x[i]  # gradient chain broke here

Before fix: x.grad = [0.] ❌
After fix: x.grad = [1.] ✅
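
For reference, here is a complete end-to-end repro sketch built around this kernel; the loss_kernel reduction, array sizes, and input values below are illustrative additions, not code from this PR:

    import warp as wp

    @wp.struct
    class Scalar:
        a: wp.float32

    @wp.kernel
    def pack_kernel(x: wp.array(dtype=wp.float32), y: wp.array(dtype=Scalar)):
        i = wp.tid()
        y[i].a = x[i]

    @wp.kernel
    def loss_kernel(y: wp.array(dtype=Scalar), loss: wp.array(dtype=wp.float32)):
        i = wp.tid()
        wp.atomic_add(loss, 0, y[i].a)  # reduce the struct fields into a scalar loss

    x = wp.array([1.0], dtype=wp.float32, requires_grad=True)
    y = wp.zeros(1, dtype=Scalar, requires_grad=True)
    loss = wp.zeros(1, dtype=wp.float32, requires_grad=True)

    tape = wp.Tape()
    with tape:
        wp.launch(pack_kernel, dim=1, inputs=[x], outputs=[y])
        wp.launch(loss_kernel, dim=1, inputs=[y], outputs=[loss])

    tape.backward(loss=loss)
    print(x.grad.numpy())  # [0.] before the fix, [1.] after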

Root Cause

In warp/_src/codegen.py, the emit_Assign function handles struct field assignments by calling the store builtin, but it failed to generate the reverse-mode adjoint code needed to propagate gradients from the RHS variable back through the struct field reference.

Changes

Modified Files

warp/_src/codegen.py
  • Added detection for struct array field assignments using type_is_struct()
  • Added the missing adj.add_reverse() call to generate gradient propagation code
  • Updated the warning logic to exclude this case (now differentiable)

warp/tests/test_grad_struct_array.py (new file)
  • Added regression test TestGradStructArray

warp/tests/unittest_suites.py
  • Registered TestGradStructArray in the default test suite for CI execution
Code Logic

    # Check if we're assigning to a struct field in an array element
    is_struct_array_field = (
        is_reference(aggregate.type) and
        type_is_struct(aggregate_type)
    )

    if is_reference(attr.type):
        adj.add_builtin_call("store", [attr, rhs])

    # Generate adjoint code to propagate gradients from the struct field back to the RHS
    if is_struct_array_field and adj.is_differentiable_value_type(strip_reference(rhs.type)):
        adj.add_reverse(f"{rhs.emit_adj()} += {attr.emit_adj()};")

Testing

Added a new test file warp/tests/test_grad_struct_array.py with the following test case:

    def test_struct_array_gradient_propagation(test, device):
        # ... setup x, y, loss ...

        tape.backward(loss=loss)

        # Check that gradients propagate correctly
        assert_np_equal(y.grad.numpy()[0][0], 1.0, tol=1e-5)
        assert_np_equal(x.grad.numpy()[0], 1.0, tol=1e-5)

Summary by CodeRabbit

  • Bug Fixes
    • Fixed autodiff for arrays of structs so gradients now propagate correctly from struct fields (scalars, vectors, matrices) back to inputs, removing incorrect nondifferentiability in these cases.
  • Tests
    • Added end-to-end tests validating gradient propagation through arrays of various struct types across devices to prevent regressions.

copy-pr-bot (bot) commented Feb 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai (bot) commented Feb 1, 2026

📝 Walkthrough

Special-cases assignments to struct fields inside array elements in the code generator, emitting reverse-pass adjoint accumulation to propagate gradients into the RHS for differentiable fields and suppressing the nondifferentiability warning. Adds tests validating gradient propagation through arrays of struct-backed fields and registers them in test suites.

Changes

Cohort / File(s) | Summary
Codegen: struct-array-field adjoint emission (warp/_src/codegen.py) | Adds detection for struct-field assignments inside array elements (is_struct_array_field, is_array_field) and emits reverse-pass adjoint accumulation to propagate gradients from struct fields back into the RHS when differentiable; suppresses the nondifferentiability runtime warning for this case while preserving existing behavior for other assignments.
Tests: gradient through arrays of structs (warp/tests/test_grad_struct_array.py) | New test module introducing public structs (ScalarStruct, Vec3Struct, Mat22Struct, ArrayFieldStruct), kernels that pack fields into struct arrays and compute losses, and per-device tests verifying gradient propagation for scalar, vec3, and mat22 fields. Includes unittest integration and device registration.
Test suite registration (warp/tests/unittest_suites.py) | Imports TestGradStructArray and adds it to the default_suite and kit_suite test class lists so the new tests run with the standard suites.

Sequence Diagram(s)

sequenceDiagram
    participant Kernel as Kernel (user)
    participant Codegen as Codegen
    participant Runtime as Runtime/Autodiff
    participant Adjoint as Adjoint Accumulator

    Kernel->>Codegen: emit forward assignment (y[i].field = x[i])
    Codegen->>Runtime: generate forward code and mark struct-array-field case
    Kernel->>Runtime: execute forward (compute loss)
    Runtime->>Runtime: start backward pass
    Runtime->>Codegen: request backward code for assignment
    Codegen->>Adjoint: emit adjoint accumulation for struct-field-in-array -> propagate gradient into RHS
    Adjoint->>Runtime: updated RHS adjoint values
    Runtime->>Kernel: backward complete (gradients available)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.36%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title directly and concisely describes the main fix: enabling gradient propagation through struct arrays, which is the core objective of the changeset.


greptile-apps (bot) commented Feb 1, 2026

Greptile Summary

This PR fixes a critical autodiff bug where gradients failed to propagate backward through array-of-structs. The fix adds manual reverse-mode adjoint code generation in codegen.py:3324 using adj.add_reverse() to accumulate gradients from struct field adjoints back to the RHS variable.

Key changes:

  • Adds gradient propagation for struct array field assignments (y[i].a = x[i])
  • Includes safety check to warn about unsupported array fields within structs during autodiff
  • Comprehensive test coverage validates fix for scalar, vec3, and mat22 struct fields
  • Properly integrated into test suite for CI execution

The implementation correctly identifies struct array field assignments and generates the necessary {rhs.emit_adj()} += {attr.emit_adj()}; statement to close the gradient chain. The fix is well-tested and addresses issue #1174.

Confidence Score: 4/5

  • Safe to merge with minimal risk - the fix addresses a real bug with appropriate tests
  • The fix is well-targeted and addresses a specific gradient propagation bug, and comprehensive tests validate the solution. The score is 4 (not 5) because the manual adjoint code generation bypasses the standard builtin differentiability mechanism, which could have edge cases
  • No files require special attention - all changes are well-implemented

Important Files Changed

Filename | Overview
warp/_src/codegen.py | Adds gradient propagation for struct array field assignments by manually generating reverse-mode adjoint code; includes edge case handling for unsupported array fields
warp/tests/test_grad_struct_array.py | Comprehensive test coverage for scalar, vec3, and mat22 struct fields with proper gradient validation

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant Kernel as pack_kernel
    participant Store as store builtin
    participant ReverseAdjoint as Reverse Mode (NEW)
    participant Tape as Tape.backward()
    
    User->>Kernel: wp.launch(pack_kernel, x, y)
    Kernel->>Store: y[i].a = x[i]
    Note over Store: Forward pass: stores value
    Note over Store: builtin is_differentiable=False
    
    User->>Tape: tape.backward(loss)
    Tape->>ReverseAdjoint: Execute adjoint code
    Note over ReverseAdjoint: NEW: Manual gradient propagation<br/>x.adj += y[i].a.adj
    ReverseAdjoint-->>User: Gradients flow to x.grad
    
    Note over User,ReverseAdjoint: Before fix: gradient chain broke at store<br/>After fix: adj.add_reverse() closes the chain

greptile-apps (bot) left a comment

1 file reviewed, 2 comments

Comment on lines 3310 to 3311:

    if is_struct_array_field and adj.is_differentiable_value_type(strip_reference(rhs.type)):
        adj.add_reverse(f"{rhs.emit_adj()} += {attr.emit_adj()};")

Add a test case for this fix (e.g., the minimal example from the PR description) to prevent regression.

greptile-apps (bot) commented Feb 1, 2026

Additional Comments (1)

warp/_src/codegen.py

Consider conditionally showing this warning only when the operation is actually non-differentiable, since is_struct_array_field assignments are now differentiable with this fix.

coderabbitai (bot) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@warp/_src/codegen.py`:
- Around line 3296-3312: The check for struct types in the is_struct_array_field
predicate is using an invalid symbol (isinstance(aggregate_type,
warp._src.types.struct)) so it always fails; update the predicate to use the
proper type check helper type_is_struct(aggregate_type) from warp._src.types
(i.e., replace the isinstance call with type_is_struct(aggregate_type)) so
is_struct_array_field correctly detects struct array fields and the
adj.add_reverse call inside the block (which emits the adjoint propagation for
rhs via attr.emit_adj()) will run when appropriate.

greptile-apps (bot) left a comment

1 file reviewed, 1 comment

Comment on lines 3310 to 3311:

    if is_struct_array_field and adj.is_differentiable_value_type(strip_reference(rhs.type)):
        adj.add_reverse(f"{rhs.emit_adj()} += {attr.emit_adj()};")

Check that a test case was added for this fix (e.g., the minimal example from the PR description) to prevent regression.

greptile-apps (bot) left a comment

2 files reviewed, 1 comment

Comment on lines 69 to 89:
add_function_test(TestGrad, "test_scalar_grad", test_scalar_grad, devices=devices)
add_function_test(TestGrad, "test_for_loop_grad", test_for_loop_grad, devices=devices)
add_function_test(TestGrad, "test_for_loop_graph_grad", test_for_loop_graph_grad, devices=devices, check_outputs=False)
add_function_test(TestGrad, "test_for_loop_nested_if_grad", test_for_loop_nested_if_grad, devices=devices)
add_function_test(TestGrad, "test_for_loop_nested_for_grad", test_for_loop_nested_for_grad, devices=devices)
add_function_test(TestGrad, "test_preserve_outputs_grad", test_preserve_outputs_grad, devices=devices)
add_function_test(TestGrad, "test_vector_math_grad", test_vector_math_grad, devices=devices)
add_function_test(TestGrad, "test_matrix_math_grad", test_matrix_math_grad, devices=devices)
add_function_test(TestGrad, "test_3d_math_grad", test_3d_math_grad, devices=devices)
add_function_test(TestGrad, "test_multi_valued_function_grad", test_multi_valued_function_grad, devices=devices)
add_function_test(TestGrad, "test_mesh_grad", test_mesh_grad, devices=devices)
add_function_test(TestGrad, "test_name_clash", test_name_clash, devices=devices)
add_function_test(TestGrad, "test_struct_attribute_gradient", test_struct_attribute_gradient, devices=devices)
add_function_test(TestGrad, "test_copy", test_copy, devices=devices)
add_function_test(TestGrad, "test_aliasing", test_aliasing, devices=devices)
add_function_test(TestGrad, "test_gradient_internal", test_gradient_internal, devices=devices)
add_function_test(TestGrad, "test_gradient_external", test_gradient_external, devices=devices)
add_function_test(TestGrad, "test_gradient_precedence", test_gradient_precedence, devices=devices)
add_function_test(TestGrad, "test_gradient_slice_2d", test_gradient_slice_2d, devices=devices)
add_function_test(TestGrad, "test_gradient_slice_3d_1d", test_gradient_slice_3d_1d, devices=devices)
add_function_test(TestGrad, "test_gradient_slice_3d_2d", test_gradient_slice_3d_2d, devices=devices)

These test functions (test_scalar_grad, test_for_loop_grad, etc.) are not defined in this file or imported from anywhere; they are defined in test_grad.py, not in unittest_utils. This will cause a NameError when the test module is loaded.

Remove these undefined test registrations and keep only test_struct_array_gradient_propagation:

Suggested change (the 21 registrations quoted above removed):

    add_function_test(TestGrad, "test_struct_array_gradient_propagation", test_struct_array_gradient_propagation, devices=devices)
coderabbitai (bot) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@warp/tests/test_grad_struct_array.py`:
- Around line 43-60: The test function test_struct_array_gradient_propagation
has an unused parameter named test which triggers Ruff ARG001; rename that
parameter to _test (or _) in the function signature to mark it intentionally
unused and update any internal references if present, ensuring the function name
and behavior (including use of wp.ScopedDevice, x/y/loss setup, tape.backward,
and assertions) remain unchanged.

greptile-apps (bot) left a comment

2 files reviewed, 1 comment

    # limitations under the License.

    import unittest
    from typing import Any

Remove unused import.

Suggested change:

    - from typing import Any
greptile-apps (bot) left a comment

2 files reviewed, no comments

coderabbitai (bot) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@warp/tests/test_grad_struct_array.py`:
- Around line 62-69: Add the missing test class TestGradStructArray to the
default unittest suite by importing TestGradStructArray from its test module and
appending it to the test_classes list inside the default_suite() function;
specifically, update default_suite() (the function that builds test_classes) to
include the TestGradStructArray symbol so the
"test_struct_array_gradient_propagation" tests are executed in CI.

greptile-apps (bot) left a comment

2 files reviewed, no comments

Adityakk9031 (Contributor, Author) commented:

@tabo please check this

Adityakk9031 (Contributor, Author) commented:

@shi-eric please check this

shi-eric requested a review from daedalus5 on February 1, 2026 at 20:11
    @wp.kernel
    def loss_from_struct_array_kernel(y: wp.array(dtype=ScalarStruct), loss: wp.array(dtype=wp.float32)):
        i = wp.tid()
        loss[i] = y[i].a
Contributor commented:

Should use wp.atomic_add() here and launch N > 1 threads.
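
A minimal sketch of that suggestion, assuming loss becomes a single-element array so the accumulation stays race-free at any launch dimension:

    @wp.kernel
    def loss_from_struct_array_kernel(y: wp.array(dtype=ScalarStruct), loss: wp.array(dtype=wp.float32)):
        i = wp.tid()
        # accumulate every element into one scalar so the kernel is correct for N > 1 threads
        wp.atomic_add(loss, 0, y[i].a)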


    tape = wp.Tape()
    with tape:
        wp.launch(kernel=pack_struct_array_kernel, dim=1, inputs=[x], outputs=[y])
Contributor commented:

Try with dim > 1 here, to test a more realistic case.
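
For example, a sketch with an illustrative N (ScalarStruct and pack_struct_array_kernel as defined in the test file):

    import numpy as np

    N = 8
    x = wp.array(np.arange(1, N + 1, dtype=np.float32), dtype=wp.float32, requires_grad=True)
    y = wp.zeros(N, dtype=ScalarStruct, requires_grad=True)

    tape = wp.Tape()
    with tape:
        wp.launch(kernel=pack_struct_array_kernel, dim=N, inputs=[x], outputs=[y])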

    # Test for issue #1174: Gradients not propagating through array of structs
    @wp.struct
    class ScalarStruct:
        a: wp.float32
Contributor commented:

Can you add more dtypes here to test a wider range of use cases? E.g. vec3, mat22.

I don't think this change supports arrays of structs that contain array fields. Could you test this case and add more guardrails in codegen.py? If there aren't any compiler errors, we should probably issue a warning if adj.used_by_backward_kernel is true and the user is attempting to write to an array field in a differentiable kernel.
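
For illustration, a sketch of one such additional case (vec3); the struct and kernel names and the dot-product loss are illustrative, not taken from the PR:

    @wp.struct
    class Vec3Struct:
        v: wp.vec3

    @wp.kernel
    def pack_vec3_kernel(x: wp.array(dtype=wp.vec3), y: wp.array(dtype=Vec3Struct)):
        i = wp.tid()
        y[i].v = x[i]

    @wp.kernel
    def vec3_loss_kernel(y: wp.array(dtype=Vec3Struct), loss: wp.array(dtype=wp.float32)):
        i = wp.tid()
        # loss = sum_i |v_i|^2, so d(loss)/d(x_i) = 2 * x_i is easy to verify
        wp.atomic_add(loss, 0, wp.dot(y[i].v, y[i].v))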

greptile-apps (bot) left a comment

2 files reviewed, 1 comment

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
greptile-apps (bot) left a comment

2 files reviewed, no comments
