Skip to content

Conversation

@jishnub
Copy link
Member

@jishnub jishnub commented Dec 9, 2025

This should fix #1519. The issue on master is that we have specialized dispatches for arrays of the same eltype, and the complex-real matmul ends up in generic_matmatmul!. This adds an extra method to ensure that the complex-real case also reaches BLAS.

julia> A = ones(ComplexF64, 400, 400); B = ones(size(A)); C = similar(A);

julia> @btime mul!($C, $A, $B);
  57.043 ms (0 allocations: 0 bytes) # master
  1.608 ms (0 allocations: 0 bytes) # this PR

@jishnub jishnub added the backport 1.12 Change should be backported to release-1.12 label Dec 9, 2025
@dkarrasch
Copy link
Member

Is this redirecting to the "reinterpret-trick"? Did that get lost over the more recent changes?

@adienes
Copy link
Member

adienes commented Dec 9, 2025

just for posterity if it happens to be useful, I had just bisected this to JuliaLang/julia#55002

@codecov
Copy link

codecov bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.03%. Comparing base (f41bf65) to head (cbdb717).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1520      +/-   ##
==========================================
+ Coverage   93.94%   94.03%   +0.09%     
==========================================
  Files          34       34              
  Lines       15984    15980       -4     
==========================================
+ Hits        15016    15027      +11     
+ Misses        968      953      -15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jishnub
Copy link
Member Author

jishnub commented Dec 10, 2025

@dkarrasch Exactly, this had regressed from the previous behavior.

@jishnub
Copy link
Member Author

jishnub commented Dec 10, 2025

@adienes Thanks for the bisect! It appears that the fix required is a bit different. The method already existed, but the BLAS flag format had changed between releases and hadn't been updated.

α::Number, β::Number, ::Val{BlasFlag.GEMM}) where {T<:BlasReal}
gemm_wrapper!(C, tA, tB, A, B, α, β)
end
Base.@constprop :aggressive function generic_matmatmul_wrapper!(C::StridedVecOrMat{Complex{T}}, tA, tB, A::StridedVecOrMat{Complex{T}}, B::StridedVecOrMat{T},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this method is ever taken. BlasReal matrices would always use GEMM as the flag, and gemm_wrapper! may delegate to _generic_matmatmul! if necessary.

@jishnub jishnub merged commit eb007bb into master Dec 13, 2025
4 checks passed
@jishnub jishnub deleted the jishnub/gemm_dispatch_complex branch December 13, 2025 08:05
dkarrasch pushed a commit that referenced this pull request Dec 16, 2025
This should fix #1519. The issue on master is that we have specialized
dispatches for arrays of the same eltype, and the complex-real matmul
ends up in `generic_matmatmul!`. This adds an extra method to ensure
that the complex-real case also reaches BLAS.

```julia
julia> A = ones(ComplexF64, 400, 400); B = ones(size(A)); C = similar(A);

julia> @Btime mul!($C, $A, $B);
  57.043 ms (0 allocations: 0 bytes) # master
  1.608 ms (0 allocations: 0 bytes) # this PR
```
@dkarrasch dkarrasch mentioned this pull request Dec 16, 2025
6 tasks
dkarrasch pushed a commit that referenced this pull request Dec 16, 2025
This should fix #1519. The issue on master is that we have specialized
dispatches for arrays of the same eltype, and the complex-real matmul
ends up in `generic_matmatmul!`. This adds an extra method to ensure
that the complex-real case also reaches BLAS.

```julia
julia> A = ones(ComplexF64, 400, 400); B = ones(size(A)); C = similar(A);

julia> @Btime mul!($C, $A, $B);
  57.043 ms (0 allocations: 0 bytes) # master
  1.608 ms (0 allocations: 0 bytes) # this PR
```
@dkarrasch dkarrasch mentioned this pull request Dec 16, 2025
2 tasks
@dkarrasch dkarrasch removed backport 1.12 Change should be backported to release-1.12 backport 1.13 labels Dec 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Poor performance for product Matrix{ComplexF64} with Matrix{Float64}

3 participants