
ggml-webgpu: support non-square subgroup matrix configs for Intel GPUs #21669

Open
SharmaRithik wants to merge 1 commit into ggml-org:master from SharmaRithik:intel-gpu-subgroup-support

Conversation

@SharmaRithik

Overview

Enable WebGPU subgroup matrix support for Intel GPUs (Xe2/Battlemage).

Intel GPUs report non-square subgroup matrix configurations (e.g. M=8, N=16, K=16) via Dawn's ChromiumExperimentalSubgroupMatrix feature. The existing filter only accepted square configs (M==N==K), rejecting Intel GPUs entirely despite full hardware and driver support.

Changes:

  • Relax the subgroup matrix config filter to accept any F16->F16 configuration, not just square ones
  • Fix swapped subgroup matrix type parameters in flash_attn.wgsl (hidden by square configs where M==N==K)
  • Add head dimension divisibility check in flash attention op support for non-square K/N dimensions

Tested on Intel Arc B580 (BMG/Xe2-HPG) with Mesa 25.2.8 and latest Dawn:

  • MUL_MAT: 829/829 passed
  • FLASH_ATTN_EXT: 2832/2832 passed

Additional information

Requires latest Dawn (April 2026+) which enables ChromiumExperimentalSubgroupMatrix for Intel Mesa drivers >= 25.2. Older Dawn versions do not expose this feature for Intel GPUs.


@SharmaRithik SharmaRithik requested a review from a team as a code owner April 9, 2026 10:43
@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and WebGPU labels Apr 9, 2026