
[AIE2P] add Vaddmac pattern #781

Merged: martien-de-jong merged 2 commits into aie-public from stuckmann.vaddmac on Mar 9, 2026


@F-Stuckmann
Collaborator

The single performance regression we are seeing is caused by a MOV instruction inserted inside a loop to satisfy the dst == src0 requirement.

[Figures: Core_Compute_Insn_Count_perf_sorted_reduced, Core_PMSize_perf_sorted_reduced_absolute, Core_StackSize_perf_sorted_reduced_absolute]

Comment thread llvm/lib/Target/AIE/AIEBaseInstrPatterns.td Outdated
Comment thread llvm/lib/Target/AIE/aie2p/AIE2PInstrPatterns.td Outdated
Collaborator

@martien-de-jong martien-de-jong left a comment


Perhaps do it for AIE2PS too?

@F-Stuckmann
Collaborator Author

F-Stuckmann commented Mar 4, 2026

that's why I didn't merge the PR yet ;)

@F-Stuckmann F-Stuckmann force-pushed the stuckmann.vaddmac branch 4 times, most recently from 2ef1014 to 78f8ec4 Compare March 9, 2026 08:45

// Instruction Combining: Fuse MUL + ADD -> VADDMAC/VADDMSC
// When a MAC/MSC result feeds into an ADD(ZeroAddConf), the multiply and
// both accumulations are performed by a single VADDMAC/VADDMSC instruction.
Collaborator


nit: I think we replace an add with zero by a true add. Hence there are not two accumulations.
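For readers following the thread, the algebra behind the fusion can be sketched on plain scalars. This is only an illustrative model under loud assumptions: the function names `mac`, `msc`, `add`, `addmac` and `addmsc` are hypothetical stand-ins, the real instructions operate on vector accumulators with a configuration operand, and the sketch ignores the ZeroAddConf operand entirely. It does not reproduce the TableGen pattern added in this PR.

```cpp
#include <cstdint>

// Hypothetical scalar stand-ins for the vector operations discussed above.
// Assumption: MAC computes acc + a*b and MSC computes acc - a*b.
constexpr std::int64_t mac(std::int64_t acc, std::int64_t a, std::int64_t b) {
  return acc + a * b;
}
constexpr std::int64_t msc(std::int64_t acc, std::int64_t a, std::int64_t b) {
  return acc - a * b;
}

// Plain accumulator add (configuration operand not modelled).
constexpr std::int64_t add(std::int64_t x, std::int64_t y) { return x + y; }

// Assumed fused forms: two accumulator inputs plus (or minus) one product.
constexpr std::int64_t addmac(std::int64_t acc1, std::int64_t acc2,
                              std::int64_t a, std::int64_t b) {
  return acc1 + acc2 + a * b;
}
constexpr std::int64_t addmsc(std::int64_t acc1, std::int64_t acc2,
                              std::int64_t a, std::int64_t b) {
  return acc1 + acc2 - a * b;
}

// The two-instruction sequence and the fused form agree in this scalar model.
static_assert(add(mac(10, 3, 4), 5) == addmac(10, 5, 3, 4),
              "MAC followed by ADD matches the fused ADDMAC model");
static_assert(add(msc(10, 3, 4), 5) == addmsc(10, 5, 3, 4),
              "MSC followed by ADD matches the fused ADDMSC model");
```

In this model the fused form does the work of two instructions in one, which is the motivation the code comment describes. Whether that is best described as one accumulation or two is the point of the nit above.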

@martien-de-jong martien-de-jong merged commit 5235a6c into aie-public Mar 9, 2026
7 checks passed
@martien-de-jong martien-de-jong deleted the stuckmann.vaddmac branch March 9, 2026 11:22
2 participants