
[SP] add SP deny list instead of allow#7887

Open
kashif wants to merge 1 commit into deepspeedai:master from kashif:sp_attn_deny

Conversation

@kashif
Contributor

@kashif kashif commented Mar 5, 2026

this way one can register kernels based flash-attn as well with SP

Signed-off-by: Kashif Rasul <kashif.rasul@gmail.com>
Collaborator

@tohtana tohtana left a comment


Hi @kashif,

Thank you for opening this PR! I think supporting HF hub kernels is a significant update.

Regarding the approach: we check whether core_attn_implementation is in ALL_ATTENTION_FUNCTIONS, but HF hub kernels like kernels-community/flash-attn2 are not in that list. So HF hub kernels still won't be available with this fix.

We probably need to do the proper registration steps:

  1. Reject known-bad impls explicitly: eager, flex_attention, and probably paged|eager.
  2. If core_attn_implementation is an HF hub kernel string, call the HF registration path first (using lazy_import_flash_attention(…)).
  3. Then read core_attn_function = ALL_ATTENTION_FUNCTIONS[core_attn_implementation].
  4. Build uattn from that original function.
  5. Replace that key with uattn_wrapper.

Does it make sense to you?
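A minimal sketch of the registration flow in the steps above. Note this is an illustration, not DeepSpeed code: ALL_ATTENTION_FUNCTIONS is stood in by a plain dict (in Transformers it is a registry in transformers.modeling_utils), and register_hub_kernel / make_uattn_wrapper / enable_sp are hypothetical names standing in for the actual HF registration path (e.g. lazy_import_flash_attention) and the Ulysses wrapper builder.

```python
# Step 1: known-bad implementations that SP should reject explicitly.
DENY_LIST = {"eager", "flex_attention", "paged|eager"}

# Stand-in for transformers' ALL_ATTENTION_FUNCTIONS registry.
ALL_ATTENTION_FUNCTIONS = {"sdpa": lambda *a, **kw: "sdpa_out"}


def register_hub_kernel(impl: str) -> None:
    """Stand-in for the HF hub-kernel registration path
    (e.g. lazy_import_flash_attention in Transformers)."""
    ALL_ATTENTION_FUNCTIONS[impl] = lambda *a, **kw: f"{impl}_out"


def make_uattn_wrapper(core_attn_function):
    """Step 4: build the Ulysses (SP) wrapper around the original function."""
    def uattn_wrapper(*args, **kwargs):
        # ...sequence-parallel all-to-all communication would happen here...
        return core_attn_function(*args, **kwargs)
    return uattn_wrapper


def enable_sp(core_attn_implementation: str):
    # Step 1: reject implementations known not to work with SP.
    if core_attn_implementation in DENY_LIST:
        raise ValueError(f"{core_attn_implementation} is not supported with SP")
    # Step 2: hub kernel strings like "kernels-community/flash-attn2" are not
    # in the registry yet, so run the registration path first.
    if core_attn_implementation not in ALL_ATTENTION_FUNCTIONS:
        register_hub_kernel(core_attn_implementation)
    # Step 3: read the original function from the registry.
    core_attn_function = ALL_ATTENTION_FUNCTIONS[core_attn_implementation]
    # Steps 4-5: wrap it and replace the registry entry with the wrapper.
    uattn_wrapper = make_uattn_wrapper(core_attn_function)
    ALL_ATTENTION_FUNCTIONS[core_attn_implementation] = uattn_wrapper
    return uattn_wrapper
```

The key point is the ordering: registration (step 2) must happen before the registry lookup (step 3), otherwise a hub-kernel string raises a KeyError even though it is a perfectly valid implementation.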

