[TOSA] MultiheadAttention legalization #4382
Force-pushed from a98526f to cf45a2e
Lallapallooza left a comment
Thanks for the patch, a few comments.
projects/pt1/test/python/scaled_dot_product_attention_lowering.py
- Legalize Torch scaled_dot_product_attention into TOSA by adding the necessary patterns in TorchToTosa.cpp plus backend type-conversion hooks.
- Introduce a detailed decomposition path for multi-head attention within DecomposeComplexOps.cpp, preparing inputs for TOSA lowering.
- Expand the PT1 e2e suite with a dedicated multi-head attention MLIR/Python test and drop the corresponding xfails now that the path works.

Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I96c17aefd25b979f1cf6e897d91d5a29f0a2fa85
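For readers following along, the matmul/softmax pipeline that a scaled_dot_product_attention decomposition typically targets can be sketched in plain Python. This is only an illustrative reference, not the code in DecomposeComplexOps.cpp: the helper names (`matmul`, `transpose`, `softmax_rows`, `sdpa_reference`) are hypothetical, it covers a single head with no mask and no dropout, and it assumes the usual default scale of 1/sqrt(head_dim).

```python
import math


def matmul(a, b):
    """Multiply two row-major matrices given as lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap the two dimensions of a row-major matrix."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Numerically stable softmax over the last dimension."""
    out = []
    for row in m:
        peak = max(row)  # subtract the row maximum before exponentiating
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def sdpa_reference(q, k, v, scale=None):
    """Compute softmax(q @ k^T * scale) @ v for one head (hypothetical helper)."""
    if scale is None:
        # Assumption: default scale is 1/sqrt(head_dim), head_dim = len(q[0]).
        scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    scaled = [[s * scale for s in row] for row in scores]
    return matmul(softmax_rows(scaled), v)
```

A reference like this could serve as a numerical oracle when comparing the IR produced by different decomposition routes, since every route should agree with it up to floating-point tolerance.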
Force-pushed from cf45a2e to fd02d37
PatternRewriter &rewriter);

namespace {
// Decompose scaled dot product attention into matmul/softmax pipeline when
Does this decomposition produce IR that differs from what you get by decomposing sdpa with ExportedProgram.run_decompositions (https://docs.pytorch.org/docs/stable/export.html#export-ir-decompositions)? See https://discord.com/channels/636084430946959380/742573221882364009/1446121930922004623 for reference.
I am wondering if the sdpa op should be added to the default decomposition list in
DEFAULT_DECOMPOSITIONS = [