update attention in fused_forward, head blocking and add prefillonly transform#857

Open
quic-mamta wants to merge 2 commits into mla_fusion from mla_fusion1
Conversation

@quic-mamta
Contributor

  1. update attention in `fused_forward`
  2. use head blocking
  3. add prefillonly transform
  4. update `min_masked_attention_value`
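For readers outside the PR, head blocking and the masked-score fill value can be sketched roughly as follows. This is a hypothetical NumPy illustration, not the PR's actual implementation: the function name `blocked_attention`, the block size, and the use of float32-min as the `min_masked_attention_value` stand-in are all assumptions.

```python
import numpy as np

# Assumed stand-in for the PR's min_masked_attention_value:
# a very negative score so softmax assigns ~0 weight to masked positions.
MIN_MASKED_ATTENTION_VALUE = np.finfo(np.float32).min

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def blocked_attention(q, k, v, mask, head_block_size=2):
    """q, k, v: (heads, seq, dim); mask: (seq, seq) bool, True = attend.

    Heads are processed in blocks so the (seq, seq) score matrix is only
    materialized for head_block_size heads at a time, bounding peak memory.
    """
    heads, seq, dim = q.shape
    scale = dim ** -0.5
    out = np.empty_like(q)
    for start in range(0, heads, head_block_size):
        end = min(start + head_block_size, heads)
        scores = q[start:end] @ k[start:end].transpose(0, 2, 1) * scale
        # Fill masked positions with the large negative value.
        scores = np.where(mask, scores, MIN_MASKED_ATTENTION_VALUE)
        out[start:end] = softmax(scores) @ v[start:end]
    return out
```

Because each head's attention is independent, the blocked loop should produce the same result as computing all heads at once; only peak memory changes.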

…rd, prefillonly transform

Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com>
Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com>
