upload 2nd version of sdpa backward #546

Closed
yuankuns wants to merge 74 commits into intel:main from yuankuns:sdpabackward

yuankuns commented Oct 3, 2025

  1. Enable the FlashAttention-2 backward pass
  2. Layouts: bhsd and bshd
  3. Dtypes: fp16 and bf16
  4. is_causal: done
  5. GQA: done
  6. NUM_HEAD=1: done
  7. Atomic add: done
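
For readers checking the kernel's outputs, the gradients a scaled-dot-product-attention backward pass has to produce can be sketched in plain Python. This is only an illustrative reference under stated assumptions, not the code in this PR: the function names are hypothetical, tensors are nested lists in a heads-first layout, the softmax scale is taken to be 1/sqrt(D), the causal mask is assumed to be top-left aligned (query i sees keys 0..i), and GQA is modeled as consecutive query heads sharing one KV head.

```python
import math


def _attention_row(q_row, k_head, i, is_causal):
    """Softmax probabilities of query row i over the keys it may attend to."""
    dim = len(q_row)
    scale = 1.0 / math.sqrt(dim)  # assumed softmax scale
    # Assumption: top-left aligned causal mask, so query i sees keys 0..i.
    n_visible = min(i + 1, len(k_head)) if is_causal else len(k_head)
    scores = [scale * sum(q_row[d] * k_head[j][d] for d in range(dim))
              for j in range(n_visible)]
    peak = max(scores)  # subtract the row max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps], scale


def sdpa_forward_reference(q, k, v, is_causal=False):
    """q: [Hq][Sq][D]; k, v: [Hkv][Sk][D]. Returns O with the shape of q."""
    group = len(q) // len(k)  # query heads per KV head (GQA)
    out = []
    for h, q_head in enumerate(q):
        g = h // group
        rows = []
        for i, q_row in enumerate(q_head):
            p, _ = _attention_row(q_row, k[g], i, is_causal)
            rows.append([sum(p[j] * v[g][j][d] for j in range(len(p)))
                         for d in range(len(q_row))])
        out.append(rows)
    return out


def sdpa_backward_reference(q, k, v, d_out, is_causal=False):
    """Returns (dQ, dK, dV) for the upstream gradient d_out (shape of q)."""
    assert len(q) % len(k) == 0, "Hq must be a multiple of Hkv"
    group = len(q) // len(k)
    dim = len(q[0][0])
    dq = [[[0.0] * dim for _ in head] for head in q]
    dk = [[[0.0] * dim for _ in head] for head in k]
    dv = [[[0.0] * dim for _ in head] for head in v]
    for h, q_head in enumerate(q):
        g = h // group  # KV head shared by this query head
        for i, q_row in enumerate(q_head):
            p, scale = _attention_row(q_row, k[g], i, is_causal)
            do_row = d_out[h][i]
            # Recompute O_i, then delta_i = dO_i . O_i (softmax-gradient row term).
            o_row = [sum(p[j] * v[g][j][d] for j in range(len(p)))
                     for d in range(dim)]
            delta = sum(do_row[d] * o_row[d] for d in range(dim))
            for j in range(len(p)):
                dp = sum(do_row[d] * v[g][j][d] for d in range(dim))  # dP_ij
                ds = p[j] * (dp - delta)                              # dS_ij
                for d in range(dim):
                    dv[g][j][d] += p[j] * do_row[d]
                    dq[h][i][d] += scale * ds * k[g][j][d]
                    # dK and dV accumulate over every query head in the group.
                    dk[g][j][d] += scale * ds * q_row[d]
    return dq, dk, dv
```

The sketch recomputes the attention probabilities row by row instead of storing them, and sums dK and dV over all query heads that share a KV head. A tiled GPU kernel would organize the same arithmetic very differently, so this is meant only as a correctness oracle for small shapes.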

Antonyvance added the "redesign required" label (Implementation require a redesign) Oct 17, 2025

tdeng5 commented Mar 31, 2026

Closing this PR as invalid.

We have refined the SYCL-TLA architecture and APIs, so this PR is out of date.
If you still need this change, please submit a new PR based on the latest source code.

tdeng5 closed this Mar 31, 2026
tszulist-hbn pushed a commit to tszulist-hbn/torch-xpu-ops that referenced this pull request Apr 7, 2026
Currently, the SYCL-TLA-based FA2 backward kernels are built on the legacy
SYCL-TLA API, which is going to be deprecated. This PR switches to the new
CuTe API. The code is from intel/sycl-tla#546, latest commit
94c6fe4fcee01f47522b2df0f31fcf4c80410b81.

This change passes the SDPA-related UTs and gives a ~200% performance
improvement on PVC.

3 participants