[AIEX] enable VextrBcstShfl with copy-propagate G_IMPLICIT_DEF #802

Merged
F-Stuckmann merged 3 commits into aie-public from stuckmann.vextrbcstshffl on Mar 12, 2026

Conversation

@F-Stuckmann F-Stuckmann commented Feb 26, 2026

Instruction matching cannot look through a COPY of a G_IMPLICIT_DEF when the pattern requires an implicit def operand.
This can happen when the implicit def is assigned to a register bank (e.g. fiforegbank) that differs from the one the VextrBcstShfl instruction matcher expects.

This PR replaces these copies.
Note: the big gains for this PR come from removing copies whose source and destination have the same register bank.
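
To make the matching problem concrete, here is a minimal, self-contained C++ sketch. It is a toy model and not LLVM code: `Inst`, `Opcode`, `isDirectUndef`, and `makeCopiedUndef` are hypothetical names that only mimic a matcher which inspects the defining opcode of an operand without looking through copies.

```cpp
#include <cassert>
#include <vector>

// Toy model (assumption, not the LLVM API): register i is defined by Insts[i].
enum class Opcode { ImplicitDef, Copy };

struct Inst {
  Opcode Op;
  int Src; // source register for a Copy, -1 otherwise
};

// Mimics a pattern that only accepts an operand whose defining
// instruction is itself an implicit def (no look-through of copies).
bool isDirectUndef(const std::vector<Inst> &Insts, int Reg) {
  return Insts[Reg].Op == Opcode::ImplicitDef;
}

// Builds the problematic shape: %0 = IMPLICIT_DEF ; %1 = COPY %0
std::vector<Inst> makeCopiedUndef() {
  return {{Opcode::ImplicitDef, -1}, {Opcode::Copy, 0}};
}
```

In this model `%0` is accepted but `%1` is rejected, even though both carry an undefined value. That is the situation the PR removes by rewriting the copy.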

[Figures: Core_Compute_Insn_Count_perf_sorted_reduced, Core_PMSize_perf_sorted_reduced_absolute, Core_StackSize_perf_sorted_reduced_absolute]

Comment thread llvm/lib/Target/AIE/AIE2.h
Comment thread llvm/lib/Target/AIE/AIECombinerHelper.cpp
Comment thread llvm/lib/Target/AIE/AIEPreISelCombiner.cpp
Comment thread llvm/lib/Target/AIE/AIEPreISelCombiner.cpp
Comment thread llvm/lib/Target/AIE/AIEPreISelCombiner.cpp Outdated
@F-Stuckmann F-Stuckmann force-pushed the stuckmann.vextrbcstshffl branch from 6b26173 to c324e22 Compare March 10, 2026 19:45
@F-Stuckmann F-Stuckmann requested a review from abnikant as a code owner March 10, 2026 19:45
const MachineInstr *SrcDef = MRI.getVRegDef(SrcReg);
if (!SrcDef || SrcDef->getOpcode() != TargetOpcode::G_IMPLICIT_DEF)
return false;
return MRI.getType(DstReg) == MRI.getType(SrcReg);

nit: G_IMPLICIT_DEF doesn't have type-specific semantics; we can create it in the required type. Hence we don't need this condition.

const RegisterBank *DstRB = MRI.getRegBankOrNull(DstReg);
const RegisterBank *SrcRB = MRI.getRegBankOrNull(SrcReg);
if (DstRB == SrcRB) {
MRI.replaceRegWith(DstReg, SrcReg);

nit: I don't think there's merit in reusing IMPLICIT_DEF. I'd just always create a new one. It will blindly fulfill single-use constraints etc.

@martien-de-jong martien-de-jong left a comment

Please adjust the title to something like copy-propagate G_IMPLICIT_DEF.
I would simplify it to always create a pristine G_IMPLICIT_DEF with the proper type and regbank. Fewer conditions, less required testing, no unnecessary multi-use nodes.
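
The suggested simplification can be sketched as follows. This is a toy model under stated assumptions, not the PR's code or the LLVM API: `Inst`, `Opcode`, and `freshenImplicitDefCopies` are hypothetical, and the bank is just a string. Every COPY of an implicit def becomes its own implicit def in the copy's bank, with no same-bank special case.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (assumption, not the LLVM API): register i is defined by Insts[i].
enum class Opcode { ImplicitDef, Copy };

struct Inst {
  Opcode Op;
  std::string Bank; // register bank of the defined register
  int Src;          // source register for a Copy, -1 otherwise
};

// Unconditionally turns each COPY of an implicit def into a fresh
// implicit def that keeps the copy's own bank.
// Returns the number of copies rewritten.
int freshenImplicitDefCopies(std::vector<Inst> &Insts) {
  int Changed = 0;
  for (Inst &I : Insts) {
    if (I.Op != Opcode::Copy || Insts[I.Src].Op != Opcode::ImplicitDef)
      continue;
    I.Op = Opcode::ImplicitDef; // the bank stays that of the destination
    I.Src = -1;
    ++Changed;
  }
  return Changed;
}
```

With a single code path, each former copy ends up with a private implicit def, which is what makes single-use constraints trivially satisfied in this model.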

@F-Stuckmann F-Stuckmann changed the title from "[AIEX] enable VextrBcstShfl when Implicit Def is copied." to "[AIEX] enable VextrBcstShfl with copy-propagate G_IMPLICIT_DEF" on Mar 11, 2026
@F-Stuckmann F-Stuckmann force-pushed the stuckmann.vextrbcstshffl branch from 6f2f7bf to 236ebed Compare March 12, 2026 12:41

F-Stuckmann commented Mar 12, 2026

@martien-de-jong note that the simplification breaks a CSE, as shown in the fixup commits.

Before, we had a common implicit def and CSE could delete common code. Now, with a fresh implicit def, we have to reproduce exactly the same behaviour twice.
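
The CSE effect described here can be illustrated with a small toy model. This is only a sketch under a strong assumption: the hypothetical `countAfterCSE` keys instructions on their opcode and operand registers and never treats two distinct undef registers as equal. It does not model the actual CSE in the compiler.

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <vector>

// Toy model (assumption, not the LLVM API): an instruction is an opcode
// plus two operand registers.
struct Inst {
  int Opcode;
  int Lhs;
  int Rhs;
};

// Counts the instructions that survive a naive CSE keyed on
// (opcode, operand registers). Two instructions are merged only if
// they use exactly the same registers.
int countAfterCSE(const std::vector<Inst> &Insts) {
  std::set<std::tuple<int, int, int>> Seen;
  int Kept = 0;
  for (const Inst &I : Insts)
    if (Seen.insert(std::make_tuple(I.Opcode, I.Lhs, I.Rhs)).second)
      ++Kept;
  return Kept;
}
```

Two identical operations that share one undef register collapse to a single instruction in this model, while the same operations with one fresh undef register each stay separate.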

@F-Stuckmann F-Stuckmann force-pushed the stuckmann.vextrbcstshffl branch from 236ebed to 45e0971 Compare March 12, 2026 12:59
@F-Stuckmann (Collaborator, Author) commented

Removed the simplification because touching CSE is outside the scope of this PR.

F-Stuckmann and others added 3 commits March 12, 2026 14:36
Add test cases to inst-select-vextbcstshfl.mir that exercise the
VEXTBCSTSHFL pattern when the second vshuffle operand is a COPY of
G_IMPLICIT_DEF through fiforegbank rather than a direct G_IMPLICIT_DEF.
The cross-bank copy prevents ISel from recognizing the operand as undef,
resulting in separate VEXTRACT + VBCST + VSHUFFLE instructions instead
of a single VEXTBCSTSHFL.

New test variants:
- 64/32/16-bit extract+broadcast+shuffle with cross-bank COPY of undef
- Multi-use: broadcast feeding two shuffles with different modes

After register bank selection, cross-bank copies of G_IMPLICIT_DEF
prevent ISel patterns from recognizing undef operands. For example,
the VEXTBCSTSHFL pattern requires its second vshuffle operand to be
undef, but a COPY from fiforegbank to vregbank makes it invisible
to the pattern matcher.

Add a new Pre-ISel combiner pass that runs between RegBankSelect and
InstructionSelect. It replaces COPY of G_IMPLICIT_DEF with a new
G_IMPLICIT_DEF in the destination register bank. For same-bank copies
the destination register is replaced directly with the source. For
cross-bank copies a new G_IMPLICIT_DEF is created with the correct
type and bank.
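
The two cases described above can be sketched with a self-contained toy model. This is not the combiner from the PR and does not use the LLVM API: `Inst`, `Opcode`, and `propagateImplicitDefCopies` are hypothetical, and register banks are plain strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (assumption, not the LLVM API): register i is defined by Insts[i].
enum class Opcode { ImplicitDef, Copy, Use };

struct Inst {
  Opcode Op;
  std::string Bank; // register bank of the defined register
  int Src;          // operand register, -1 if none
};

// Rewrites every COPY of an implicit def.
// Same bank: users of the copy are redirected to the original def,
// which leaves the copy itself without users.
// Cross bank: the copy is turned in place into a fresh implicit def
// that lives in the destination bank.
// Returns the number of copies handled.
int propagateImplicitDefCopies(std::vector<Inst> &Insts) {
  int Changed = 0;
  for (int I = 0; I < static_cast<int>(Insts.size()); ++I) {
    Inst &Copy = Insts[I];
    if (Copy.Op != Opcode::Copy || Insts[Copy.Src].Op != Opcode::ImplicitDef)
      continue;
    if (Copy.Bank == Insts[Copy.Src].Bank) {
      for (Inst &User : Insts)
        if (User.Src == I)
          User.Src = Copy.Src;
    } else {
      Copy.Op = Opcode::ImplicitDef;
      Copy.Src = -1;
    }
    ++Changed;
  }
  return Changed;
}
```

After the rewrite, a user of a cross-bank copy sees an implicit def directly, and a user of a same-bank copy reads the original implicit def.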

This enables 16 vextbcstshfl.64 instructions in Conv2D bf16 kernels
that were previously emitted as separate vbcst.64 + vshuffle pairs.

Add pipeline test entries and RUN lines for aie2ps target.
@F-Stuckmann F-Stuckmann force-pushed the stuckmann.vextrbcstshffl branch from 45e0971 to 69f8255 Compare March 12, 2026 13:36
@F-Stuckmann F-Stuckmann enabled auto-merge (rebase) March 12, 2026 13:36
@F-Stuckmann F-Stuckmann merged commit cae91ec into aie-public Mar 12, 2026
7 checks passed
@F-Stuckmann F-Stuckmann deleted the stuckmann.vextrbcstshffl branch March 12, 2026 14:22
2 participants