fix(codegen): drop parent tile_view valid_shape in subview inference#1535
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughInferSubviewTileTypeComponents now derives subview result-type ChangesSubview tile type inference and sentinel valid_shape handling
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request addresses issue #1507 by ensuring that the matrix multiplication type deducers (DeduceTileMatMulType, DeduceTileMatMulAccType, and DeduceTileMatMulBiasType) correctly propagate the operand valid_shape (taking M from the left-hand side and N from the right-hand side) instead of defaulting to the full output shape. Corresponding unit tests have been added in test_tile_ops.py to verify this propagation behavior across tile.matmul, tile.matmul_acc, and tile.matmul_bias. I have no further feedback to provide as the changes are well-implemented and thoroughly tested.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/ut/ir/operators/test_tile_ops.py (1)
1401-1475: ⚡ Quick winAdd one GEMV regression to lock the shared deducer contract.
These tests validate matmul/matmul_acc/matmul_bias propagation well, but
tile.gemv*rely on separate op registrations even though they reuse the same deducers. A compacttile.gemvvalid-shape assertion would catch wiring drift early.Suggested test addition
+ def test_tile_gemv_propagates_operand_valid_shape(self): + """tile.gemv: output valid is [lhs.valid_M, rhs.valid_N] via shared matmul deducer.""" + lhs = self._make_matmul_lhs_with_valid_m(m=1, k=32, valid_m=1) + rhs = self._make_matmul_rhs_with_valid_n(k=32, n=64, valid_n=24) + call = tile.gemv(lhs, rhs) + result_type = call.type + assert isinstance(result_type, ir.TileType) + assert result_type.tile_view is not None + valid_shape = result_type.tile_view.valid_shape + assert isinstance(valid_shape[0], ir.ConstInt) and valid_shape[0].value == 1 + assert isinstance(valid_shape[1], ir.ConstInt) and valid_shape[1].value == 24🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ut/ir/operators/test_tile_ops.py` around lines 1401 - 1475, Add a compact regression test mirroring the matmul valid-shape checks: create lhs using _make_matmul_lhs_with_valid_m(m=16,k=32,valid_m=8) and rhs using _make_matmul_rhs_with_valid_n(k=32,n=64,valid_n=24), call tile.gemv(lhs, rhs), get call.type, assert it's an ir.TileType with tile_view not None and that tile_view.valid_shape has two ConstInt entries with values 8 and 24; name the test e.g. test_tile_gemv_propagates_operand_valid_shape to lock the deducer contract for tile.gemv.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/ut/ir/operators/test_tile_ops.py`:
- Around line 1401-1475: Add a compact regression test mirroring the matmul
valid-shape checks: create lhs using
_make_matmul_lhs_with_valid_m(m=16,k=32,valid_m=8) and rhs using
_make_matmul_rhs_with_valid_n(k=32,n=64,valid_n=24), call tile.gemv(lhs, rhs),
get call.type, assert it's an ir.TileType with tile_view not None and that
tile_view.valid_shape has two ConstInt entries with values 8 and 24; name the
test e.g. test_tile_gemv_propagates_operand_valid_shape to lock the deducer
contract for tile.gemv.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 04fbe298-d290-4b73-996c-b139edda698d
📒 Files selected for processing (2)
src/ir/op/tile_ops/matmul.cpptests/ut/ir/operators/test_tile_ops.py
ff93864 to
e7e41f5
Compare
Fixes hw-native-sys#1507 InferSubviewTileTypeComponents read source_tile_type.tile_view_.valid_shape to derive the subview result's static v_row / v_col. ExtractTileTypeInfo in pto_type_utils.cpp always renders non-subview tile types as v_row=?, v_col=? regardless of what valid_shape carries in the IR, so PTOAS infers the subview result purely from the slice's `sizes`. Reading the sentinel [0, 0] that SplitVectorKernel's WithZeroValidShape stamps onto lane1 clones therefore diverged from PTOAS: we emitted v_row=0, v_col=0 against a v_row=? parent and PTOAS rejected. Use source_tile_type.shape_ instead of tile_view.valid_shape. For in-bounds slices this matches PTOAS's "infer from sizes" rule. Runtime valid is unchanged — pto.alloc_tile still reads valid_row / valid_col from TileType.tile_view_.valid_shape, so lane1 elision still writes the [0, 0] sentinel where it should. Add a codegen regression test that materialises the lane1 sentinel via an explicit pl.TileView(valid_shape=[0, 0]) parent and asserts the resulting pto.subview type string does not carry v_row=0 / v_col=0.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/backend/common/pto_ops_common.cpp`:
- Around line 227-237: When initializing source_valid, special-case only the
zero-sentinel case instead of always using source_tile_type.shape_; detect if
source_tile_type.shape_ is the sentinel [0,0] (or equivalent zero-valid
sentinel) and only then set source_valid = source_tile_type.shape_, otherwise
preserve the parent's inferred static valid extents (e.g., use
source_tile_type.valid_shape or tile_view_.valid_shape) so nested subviews keep
their narrowed valid_box; update the assignment that creates source_valid to
branch on that sentinel check and pick the sentinel shape only when applicable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f039cdc5-0e40-459b-8109-bceb75c6ba2b
📒 Files selected for processing (2)
src/backend/common/pto_ops_common.cpptests/ut/codegen/test_pto_codegen_ops.py
Previously the fix in 484df88 always replaced the parent's `tile_view_.valid_shape` with `shape_` when inferring a subview's static valid. CodeRabbit pointed out that this also discards genuine narrow valid extents on parent subviews — when downstream propagation populates a real `v_row/v_col`, nested subviews would now widen back to `shape_` and mis-type their result valid. Restore the existing inference for any non-zero parent valid, and only fall through to `shape_` when the IR carries the all-zero lane1 sentinel (set by SplitVectorKernel's WithZeroValidShape). The original issue hw-native-sys#1507 is still fixed because the sentinel detection covers the exact case that PTOAS rejects. Add a counterpart test for the non-sentinel narrow-valid path so the preservation behaviour is locked in.
Summary
InferSubviewTileTypeComponents(src/backend/common/pto_ops_common.cpp:203) readsource_tile_type.tile_view_.valid_shapeto derive the subview result's staticv_row/v_col.ExtractTileTypeInfo(src/codegen/pto/pto_type_utils.cpp:142-145) renders every non-subview tile type asv_row=?, v_col=?regardless of whattile_view_.valid_shapeholds in the IR. PTOAS therefore infers the subview result's static valid from the slice'ssizes. Reading the sentinel[0, 0]thatSplitVectorKernel'sWithZeroValidShape(src/ir/transforms/split_vector_kernel_pass.cpp:228) stamps onto lane1 clones diverged from PTOAS: we emittedv_row=0, v_col=0against av_row=?parent and PTOAS rejected.source_tile_type.shape_. For in-bounds slices this matches PTOAS's "infer from sizes" rule. Runtime valid is unchanged —pto.alloc_tilestill readsvalid_row/valid_colfromTileType.tile_view_.valid_shape(pto_codegen.cpp:1078-1095), so lane1 elision still writes the[0, 0]sentinel where it should.Why this fixes #1507 (and why the previous matmul attempt didn't)
The first attempt on this branch propagated operand
valid_shapethroughDeduceTileMatMulType/MatMulAcc/MatMulBias. That brokehc_preaccuracy by 98–99% becausepto.alloc_tileconsumestile_view.valid_shapeas the physical write extent of the cube intrinsic — narrowing matmul outputvalid_shapeclippedpto.tmatmulto the narrow region, leaving the rest of the Acc as garbage that downstream consumers still read.The subview-inference fix only changes the static type string PTOAS verifies — it does not change
pto.alloc_tile's runtime arguments, the cube/vec intrinsic's actual write extent, or any tile op's runtime semantics.Testing
test_tile_slice_with_zero_valid_sentinel_parent_does_not_emit_zero_result_validintests/ut/codegen/test_pto_codegen_ops.py::TestTileSliceCodegen— materialises the lane1 sentinel via an explicitpl.TileView(valid_shape=[0, 0])parent and asserts thepto.subviewtype string does NOT carryv_row=0/v_col=0.tests/ut/codegen/+tests/ut/ir/transforms/test_split_vector_kernel.py— 359/359 pass.tests/ut/sweep — 5240 passed, 28 skipped, 0 failed.pypto-lib-model(DSV4hc_pre) accuracy via CI — must validate end-to-end.Related Issues
Fixes #1507