
[wip] Implement WidenInnerTiledReduction #23822

Closed
efric wants to merge 2 commits into users/efric/VDMFMA-1 from users/efric/VDMFMA-2

efric (Member) commented Mar 18, 2026

No description provided.

efric added 2 commits March 18, 2026 00:37
Introduce a pass that widens VDMFMA accumulator operands from the
collapsed (vector<2xT>) to the native hardware (vector<4xT>) form.
This hoists the expand/collapse out of the reduction loop, reducing
redundant interleave/deinterleave operations per iteration.

Key changes:
- IREEGPUAttrs.td: Add promotedAcc field to InnerTiledSemanticsAttr
- IREEGPUAttrs.cpp: promotedAcc handling in getTileTypes; accIsNative
  parameter in buildVDMFMAOps/buildUnderlyingOperations
- WidenInnerTiledReduction.cpp: New pass implementation
- Transforms.cpp: Use getTileTypes from semantics (not kind directly)
  to respect promotedAcc; update all InnerTiledSemanticsAttr::get sites
- Passes.td/Passes.cpp: Register and wire into pipeline
- Tests: widen_inner_tiled_reduction.mlir, promotedAcc=true
  test in lower_inner_tiled.mlir

Signed-off-by: Eric Feng <Eric.Feng@amd.com>
Signed-off-by: Eric Feng <Eric.Feng@amd.com>
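
The hoisting described in the commit message can be illustrated with a toy model. The sketch below is not the pass itself and does not reflect the real VDMFMA lane layout, which this PR page does not describe. It assumes a hypothetical `expand` that places two accumulator lanes into a four-lane vector padded with zeros, a hypothetical `collapse` that sums adjacent lane pairs, and a stand-in `native_mma` that simply adds an update to the accumulator. Under those assumptions, converting once outside the loop gives the same result as converting on every iteration, with fewer conversions.

```python
from typing import List, Tuple

Vec = List[int]


def expand(acc2: Vec) -> Vec:
    """Hypothetical collapsed (2 lanes) -> native (4 lanes) conversion."""
    return [acc2[0], 0, acc2[1], 0]


def collapse(acc4: Vec) -> Vec:
    """Hypothetical native (4 lanes) -> collapsed (2 lanes) conversion."""
    return [acc4[0] + acc4[1], acc4[2] + acc4[3]]


def native_mma(acc4: Vec, update4: Vec) -> Vec:
    """Stand-in for the native op: accumulate a 4-lane update (assumption)."""
    return [a + u for a, u in zip(acc4, update4)]


def reduce_per_iteration(init2: Vec, updates: List[Vec]) -> Tuple[Vec, int]:
    """Before widening: expand and collapse inside every loop iteration."""
    acc2 = list(init2)
    conversions = 0
    for update in updates:
        acc4 = expand(acc2)
        acc4 = native_mma(acc4, update)
        acc2 = collapse(acc4)
        conversions += 2  # one expand and one collapse per iteration
    return acc2, conversions


def reduce_hoisted(init2: Vec, updates: List[Vec]) -> Tuple[Vec, int]:
    """After widening: the loop carries the native 4-lane accumulator."""
    acc4 = expand(init2)  # hoisted above the loop
    for update in updates:
        acc4 = native_mma(acc4, update)
    return collapse(acc4), 2  # hoisted below the loop; two conversions total


if __name__ == "__main__":
    updates = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    print(reduce_per_iteration([100, 200], updates))
    print(reduce_hoisted([100, 200], updates))
```

The toy uses integers so that both loops agree exactly. With floating-point accumulators, changing where the collapse happens could reorder additions, so bit-exact equality would be a separate question for the real pass.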
@efric efric closed this Mar 24, 2026
