rocBLAS updates for 1.13 #921

Merged
luraess merged 3 commits into main from lr/rocblas-1.13
May 26, 2026

@luraess
Member

@luraess luraess commented May 26, 2026

No description provided.

@luraess luraess marked this pull request as ready for review May 26, 2026 14:56
@luraess
Member Author

luraess commented May 26, 2026

The codegen failures on Julia 1.13 stem from a floating-point atomics issue and are unrelated to this PR.

@luraess
Member Author

luraess commented May 26, 2026

Fix scalar indexing in strided `Float16` matmul on Julia 1.13. Julia 1.13 changed the matmul dispatch chain so that element types not covered by BLAS fall through to `_generic_matmatmul_nonadjtrans!`, which performs scalar indexing. AMDGPU's `generic_matmatmul_wrapper!` override only matched `T<:ROCBLASFloat`, which misses `Float16`. Extending it to `T<:ROCBLASFloatWithHalf` routes `Float16` through the `generic_matmatmul!` override, which correctly falls back to `GPUArrays.generic_matmatmul!`.
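
For illustration, the dispatch change described above can be sketched in plain Julia without a GPU. This is a toy model, not the code from this PR: `ToyGPUArray`, `BlasFloatSketch`, `BlasFloatWithHalfSketch`, `route_before`, and `route_after` are all hypothetical names, and the union members are an assumption about what `ROCBLASFloat` and `ROCBLASFloatWithHalf` cover.

```julia
# Hypothetical stand-ins for the element-type unions named in the comment above.
# Assumption: the narrow union holds the four standard BLAS types and the wide
# union adds Float16. These are not AMDGPU.jl's actual definitions.
const BlasFloatSketch = Union{Float32, Float64, ComplexF32, ComplexF64}
const BlasFloatWithHalfSketch = Union{Float16, BlasFloatSketch}

# Toy array wrapper standing in for a GPU array type.
struct ToyGPUArray{T}
    data::Matrix{T}
end

# Before: the override only matches the narrow union, so a Float16 array
# lands on the generic method (the scalar-indexing path in the real code).
route_before(::ToyGPUArray) = :generic_scalar_fallback
route_before(::ToyGPUArray{T}) where {T<:BlasFloatSketch} = :gpu_override

# After: widening the type bound lets Float16 reach the override as well.
route_after(::ToyGPUArray) = :generic_scalar_fallback
route_after(::ToyGPUArray{T}) where {T<:BlasFloatWithHalfSketch} = :gpu_override

a16 = ToyGPUArray(zeros(Float16, 2, 2))
a32 = ToyGPUArray(zeros(Float32, 2, 2))

println(route_before(a16))  # Float16 misses the narrow override
println(route_after(a16))   # Float16 is caught by the widened override
println(route_after(a32))   # Float32 is matched either way
```

The sketch only shows which method Julia's dispatch selects. It does not model what the real override does once it is selected.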

@luraess luraess merged commit 30db2cc into main May 26, 2026
0 of 2 checks passed
@luraess luraess deleted the lr/rocblas-1.13 branch May 26, 2026 16:50