fix(language): accept narrow valid_shape in subscript-write#1528
fix(language): accept narrow valid_shape in subscript-write#1528YunjiQin wants to merge 5 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR makes the parser accept assemble and subscript-write sources when the source's logical ChangesNarrowed-source assemble and subscript-write
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces support for subscript-write assignments where the source tensor or tile has a narrowed valid_shape (e.g., due to ISA alignment padding) that matches the target window, even if its static_shape does not. It adds a helper function _get_source_valid_shape and updates _parse_subscript_assignment to validate using this logical shape, supported by comprehensive runtime and unit tests. The review feedback suggests adding defensive checks to prevent potential runtime errors: specifically, a None check on the result of get_effective_tile_view() to avoid an AttributeError, and a bounds check on source_valid_shape using src_axis to prevent an IndexError.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/ut/language/parser/test_subscript_syntax.py (1)
780-796: ⚡ Quick winAdd a dynamic
valid_shapetile subscript-write test for symmetry.This new tile coverage validates narrowed static
valid_shape, but not the dynamicvalid_shapetrust branch. A matching dynamic tile write case here would close that parser coverage gap and reduce regression risk.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ut/language/parser/test_subscript_syntax.py` around lines 780 - 796, Add a new unit test that mirrors test_tile_subscript_write_accepts_narrow_valid_shape but uses a dynamic valid_shape instead of compile-time constants: create a new `@pl.function` (e.g., narrow_tile_write_dynamic) that accepts x and src like narrow_tile_write, compute a runtime valid height/width (pass them as function params or compute from x) and call pl.set_validshape(src, dyn_h, dyn_w) before performing the tile subscript write t[0:dyn_h, 0:dyn_w] = narrowed and returning pl.store(t, [0,0], x); ensure the test asserts the generated Python contains "tile.assemble" to validate the dynamic-trust branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@python/pypto/language/parser/ast_parser.py`:
- Around line 1497-1522: The parser currently accepts a padded source because
source_valid_shape matched but later reshapes the RHS to the narrowed `extents`,
losing its padded `static_shape`; update the lift/reshape logic (where the RHS
is rewritten after the check using `extents`) to reinsert dropped unit axes into
the source's original `static_shape` (use
`_get_source_valid_shape`/`source_type.shape` as the base) so the full-rank
shape preserves padded static extents, and move the logical window sizes into
`valid_shape` (i.e., set `valid_shape` to `extents` for the lifted axes) instead
of overriding the static extents; keep the existing checks that use
`_fold_const_slice_extent` and only change the reshape/lift behavior that
currently replaces static extents with `extents`.
---
Nitpick comments:
In `@tests/ut/language/parser/test_subscript_syntax.py`:
- Around line 780-796: Add a new unit test that mirrors
test_tile_subscript_write_accepts_narrow_valid_shape but uses a dynamic
valid_shape instead of compile-time constants: create a new `@pl.function` (e.g.,
narrow_tile_write_dynamic) that accepts x and src like narrow_tile_write,
compute a runtime valid height/width (pass them as function params or compute
from x) and call pl.set_validshape(src, dyn_h, dyn_w) before performing the tile
subscript write t[0:dyn_h, 0:dyn_w] = narrowed and returning pl.store(t, [0,0],
x); ensure the test asserts the generated Python contains "tile.assemble" to
validate the dynamic-trust branch.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dd856401-153a-40ea-bd5b-4c05e4ea706d
📒 Files selected for processing (3)
python/pypto/language/parser/ast_parser.pytests/st/runtime/control_flow/test_assemble_narrow_valid.pytests/ut/language/parser/test_subscript_syntax.py
CodeRabbit on hw-native-sys#1528 flagged that the rank-lift reshape used `extents` (the window/slot sizes) as the target shape, which breaks when the source has narrow valid_shape: `tensor.reshape` rejects with a product mismatch (e.g. src [16, 16] -> [1, 8, 8] is 256 != 64). Lift the source's padded static_shape instead (insert unit dims at drop positions), and carry valid_shape forward via tensor.reshape's optional 3rd arg. Reject the tile + rank-reduce + narrow-valid combo explicitly — tile.reshape has no valid_shape parameter, so it cannot carry the narrowing across the lift; direct pl.store is the right path there. Extracted the lift into _lift_subscript_write_source_rank to keep the main function under the branch-count lint threshold. Backward compatible: when src.static_shape == window extents, lifted_static == list(extents), so existing rank-reducing writes (no valid_shape) emit the same reshape as before.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/ut/language/parser/test_subscript_syntax.py`:
- Line 824: The regex in the pytest.raises call currently uses "tile.reshape
cannot carry valid_shape" where the dot is unescaped and matches any character;
update the match pattern to escape the literal dot (e.g., use "tile\.reshape
cannot carry valid_shape" or a raw string r"tile\.reshape cannot carry
valid_shape") in the test in test_subscript_syntax.py so the assertion matches
the literal "tile.reshape" message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 3c6a2ee0-e992-46c6-85e1-e551ce4ba6db
📒 Files selected for processing (2)
python/pypto/language/parser/ast_parser.pytests/ut/language/parser/test_subscript_syntax.py
Address two follow-ups on hw-native-sys#1528: 1. Collapse the rank-lift in _lift_subscript_write_source_rank to one reshape call. The lift target now always derives from the source's own static_shape (skipping tile 2D-floor lead unit axes via source.shape[lead_units:]); whether to pass tensor.reshape's optional 3rd valid_shape arg is the only remaining branch. Extracted _lift_shape so static and valid go through the same axis-insertion logic. 2. As a side effect, fixes a latent bug for tile + rank-lift + lead_units>0 (e.g. a [1, 128] tile written into a 3D target via two scalar drops): the old new-path iteration treated the tile-floor's lead unit as a real kept axis and produced lifted_static = [1, 1, 1] instead of [1, 1, 128]. Regression UT: test_tile_subscript_write_rank_reduce_with_lead_unit. 3. Escape the literal dot in the UnsupportedFeatureError regex match pattern (RUF043, CodeRabbit minor).
Fixes hw-native-sys#1509 `_parse_subscript_assignment` rejected any source whose `static_shape` did not match the slot extent, even when the source carried a narrower `valid_shape` that did. This broke the natural pattern where ISA alignment forces `static_shape > valid_shape` and the source loses its Tile-ness through a `pl.range` loop carry, leaving subscript-write / `pl.assemble` as the only write paths. Relax the per-axis check to also accept when the source's `valid_shape` matches the window, mirroring the semantics `pl.store` already implements on the tile path. Downstream lowering is unchanged — the source flows through with its view metadata intact and `tensor.assemble` → `tile.store` reads `valid_shape` as before.
CodeRabbit on hw-native-sys#1528 flagged that the rank-lift reshape used `extents` (the window/slot sizes) as the target shape, which breaks when the source has narrow valid_shape: `tensor.reshape` rejects with a product mismatch (e.g. src [16, 16] -> [1, 8, 8] is 256 != 64). Lift the source's padded static_shape instead (insert unit dims at drop positions), and carry valid_shape forward via tensor.reshape's optional 3rd arg. Reject the tile + rank-reduce + narrow-valid combo explicitly — tile.reshape has no valid_shape parameter, so it cannot carry the narrowing across the lift; direct pl.store is the right path there. Extracted the lift into _lift_subscript_write_source_rank to keep the main function under the branch-count lint threshold. Backward compatible: when src.static_shape == window extents, lifted_static == list(extents), so existing rank-reducing writes (no valid_shape) emit the same reshape as before.
The iteration-1 fix unconditionally returned non-None for tile sources because get_effective_tile_view always surfaces an effective valid_shape (equal to static_shape for canonical implicit views). That pushed every tile rank-lift — including legitimate 1D-tile-to-2D-target writes with no actual narrowing — onto the issue hw-native-sys#1509 narrow path and triggered the "tile.reshape cannot carry valid_shape" reject. Detect actual narrowing in the helper (valid_shape structurally differs from static_shape on at least one axis) and return None for the canonical no-narrow case. Callers then take today's legacy non-narrow path, restoring 1D-tile rank-lift writes while keeping the iteration-1 fix in effect for real padded sources. Regression UT: test_tile_subscript_write_rank_reduce_1d_src_no_narrowing.
Address two follow-ups on hw-native-sys#1528: 1. Collapse the rank-lift in _lift_subscript_write_source_rank to one reshape call. The lift target now always derives from the source's own static_shape (skipping tile 2D-floor lead unit axes via source.shape[lead_units:]); whether to pass tensor.reshape's optional 3rd valid_shape arg is the only remaining branch. Extracted _lift_shape so static and valid go through the same axis-insertion logic. 2. As a side effect, fixes a latent bug for tile + rank-lift + lead_units>0 (e.g. a [1, 128] tile written into a 3D target via two scalar drops): the old new-path iteration treated the tile-floor's lead unit as a real kept axis and produced lifted_static = [1, 1, 1] instead of [1, 1, 128]. Regression UT: test_tile_subscript_write_rank_reduce_with_lead_unit. 3. Escape the literal dot in the UnsupportedFeatureError regex match pattern (RUF043, CodeRabbit minor).
5310524 to
dfb9563
Compare
The helper was overloaded — both fetching ``view.valid_shape`` and filtering out the canonical "valid == static" case to suppress false-narrow behavior at call sites. That conflated two unrelated concerns and made "why does this return None?" hard to follow. Split: - ``_get_source_valid_shape`` is now a pure accessor; returns whatever the view carries, or None when there is no view metadata at all. - The narrowing semantics live in the lift helper as a local ``is_narrowed = source_valid_shape is not None and not _shape_exprs_match(...)``. Per-axis check is unaffected: when valid == static across the board, the "static matches" continue branch fires before we ever consult valid.
dfb9563 to
16b0b1a
Compare
Summary
Fixes #1509.
_parse_subscript_assignmentrejected any source whosestatic_shapedid notmatch the slot extent, even when the source carried a narrower
valid_shapethat did. This blocked the natural pattern where ISA alignment forces
static_shape > valid_shape(e.g. FP32 Vec tiles padded to 32-byte rowalignment) and the source loses its Tile-ness through a
pl.rangeloop carry,leaving subscript-write /
pl.assembleas the only available write paths.The per-axis check is now relaxed to also accept when the source's
valid_shapematches the window — mirroring the tile-side semanticspl.storealready implements. Dynamic (folds-to-None)valid_shapeistrusted, symmetric to dynamic slot / dynamic
static_shape. Downstreamlowering is unchanged: the source flows through with its view metadata
intact and
tensor.assemble→tile.storereadsvalid_shapeas before.Behaviour table (window expects
Welements at axisi)static_shape[i]valid_shape[i]WW' ≠ W, unsetW' ≠ WWW' ≠ WW'' ≠ WW' ≠ WOut of scope (deferred)
DeduceTensorAssembleType/DeduceTileAssembleTypedo not validateshape compatibility at all today (a direct
pl.assemble(gm, src, [60, 60])with OOB offsets still passes type inference). Tightening that is a
separate enhancement, not required for this fix.
Test plan
tests/ut/language/parser/test_subscript_syntax.py:test_tensor_subscript_write_accepts_narrow_valid_shapetest_tensor_subscript_write_rejects_static_and_valid_mismatchtest_tensor_subscript_write_dynamic_valid_shape_trustedtest_tile_subscript_write_accepts_narrow_valid_shapetests/ut/language/suite (830 tests) passestests/st/runtime/control_flow/test_assemble_narrow_valid.pyadded; 4 cases × onboard platforms, covering both write paths in two shape
families (large dst with narrow slot at non-zero offset;
dst.shape == src.valid_shapewithoffset = [0, 0], where a non-valid_shape-respectingwrite would OOB). Run on hardware/simulator to validate end-to-end.