Releases: JuliaGPU/CUDA.jl
v6.1.0
CUDA v6.1.0
Merged pull requests:
- [CUSPARSE] create slices of sparse matrices using boolean masks (#3032) (@hexaeder)
- FFT plans reworked (#3052) (@RainerHeintzmann)
- Wrapper for Blocksparse CuTensor code (#3057) (@kmp5VT)
- Fix compat in profile (#3074) (@vchuravy)
- Simplify sparse type mappings (#3076) (@kshyatt)
- Implement fast_div using fast rcp (#3077) (@vchuravy)
- Add override for muladd and use LLVM intrinsic for fma (#3078) (@vchuravy)
- Add preference to use CUDA_Runtime_Discovery for the compiler binaries (#3080) (@apozharski)
- Backport #3080 to 5.11 (#3081) (@apozharski)
- Improve eager GC with unified memory (#3092) (@maleadt)
- Avoid Int128 in Float vs Rational comparisons on the device (#3093) (@maleadt)
- Generate `!invariant` loads instead of `ldg` calls (#3094) (@maleadt)
- Add `_throw_dmrs` device override for reshape of views (#3095) (@Abdelrahman912)
- Add tests for fast_div and fast rcp intrinsics (#3096) (@vchuravy)
- Add device overrides for `FastMath.pow_fast` with integer exponents (#3098) (@maleadt)
- CUSPARSE: fix COO scalar getindex crossing row boundaries (#3100) (#3101) (@maleadt)
- Switch to GPUArrays' new RNG (#3102) (@maleadt)
- Update headers. (#3107) (@maleadt)
- CUSPARSE: fix sparse(::Symmetric/::Hermitian) stack overflow (#3042). (#3108) (@maleadt)
- Reorganize and untangle subpackage tests (NFC) (#3109) (@maleadt)
- Switch to ParallelTestRunner (#3110) (@maleadt)
- Temporarily disable cross-device memory access from kernels (#3112) (@maleadt)
- Fix workspace size on 5.11 and CUDA 13.2, backport #3062 (#3113) (@kshyatt)
- Remove unset LD_LIBRARY_PATH from pipeline steps (#3116) (@maleadt)
- @nospecialize testf/_compare forwarders in CUDA tests. (#3117) (@maleadt)
- Bump PTX `.target` to match `--gpu-name` (#3120) (@AntonOresten)
- Add back-end flag to `@cuda` (#3121) (@maleadt)
Closed issues:
- CUFFT: support for arbitrary dims (#119)
- LLVM 20: Adapt to LDG removal (#2531)
- Julia 1.11: Views can return CPU SubArray (#2551)
- 1-element view not recognized as contiguous (#2653)
- Can't compare `Float32` with `Rational` on CUDA (#2681)
- Memory leak with unified memory? (#3013)
- Stack overflow on `sparse(::Symmetric)` (#3042)
- Scalar indexing on `+(::Symmetric, ::Symmetric)` (#3043)
- Performance improvement ideas for randn! Float32 (#3056)
- `Base.FastMath.pow_fast` fails to compile with integer exponent (#3065)
- normalize on CuArray fails due to scalar indexing (#3097)
- [cuSPARSE] Incorrect indexing for COO-formatted sparse arrays (#3100)
v5.11.2
CUDA v5.11.2
This release has been identified as a backport.
Automated changelogs for backports tend to be wildly incorrect.
Therefore, the list of issues and pull requests is hidden.
v5.11.1
CUDA v5.11.1
This release has been identified as a backport.
Automated changelogs for backports tend to be wildly incorrect.
Therefore, the list of issues and pull requests is hidden.
v6.0.0
CUDA v6.0.0
Breaking changes: Ideally none. CUDA.jl has been split into subpackages; because that is a major change, the major version was bumped out of caution. Deprecations have been introduced, though: for example, submodules such as CUBLAS are now proper packages (cuBLAS.jl).
This release has been identified as a backport.
Automated changelogs for backports tend to be wildly incorrect.
Therefore, the list of issues and pull requests is hidden.
v5.11.0
v5.10.1
CUDA v5.10.1
Merged pull requests:
Closed issues:
- Precompilation: "You are using CUDA 13.0.0, but CUDA.jl was precompiled for CUDA 13.1.0." (#3049)
v5.10.0
CUDA v5.10.0
- Support for CUDA 13.1
- Support for Julia 1.13
Merged pull requests:
- Support Julia 1.13 (#3020) (@eschnett)
- Add Base.min override for Float16 and extend LLVM version guard to v20. (#3038) (@maleadt)
- Fix sytrs! test: verify non-pivoting path directly instead of skipping (#3039) (@maleadt)
- Initial support for CUDA 13.1. (#3040) (@maleadt)
- Try unsetting LD_LIBRARY_PATH. (#3046) (@maleadt)
- Update README.md (#3047) (@kshyatt)
Closed issues:
v5.9.7
CUDA v5.9.7
Merged pull requests:
- Add autodocs for all libraries, take 2 [only docs] (#2972) (@gdalle)
- Actually pass neutral element into scan (#3011) (@kshyatt)
- Specialize transpose! for CuMatrix (#3015) (@oschulz)
- Extend LLVM 18 workaround to other float types. (#3016) (@maleadt)
- Add yet another workaround to avoid .NaN modifiers on LLVM 18. (#3025) (@maleadt)
- Switch to 1.12 for benchmarks (#3027) (@christiangnrd)
- Replace DataFrames in the profiler with NamedTuples (#3029) (@JamesWrigley)
- Bump version (#3035) (@kshyatt)
Closed issues:
v5.9.6
CUDA v5.9.6
Merged pull requests:
- Expand eigen() and add eigvals,vecs (#2787) (@matteosecli)
- Support new cuquantum version (#2887) (@kshyatt)
- Try integrating with the GPUArrays sparse migration (#2942) (@kshyatt)
- Add some packages to `versioninfo` output (#2983) (@christiangnrd)
- Bump actions/checkout from 5 to 6 (#2988) (@dependabot[bot])
- Fix undefined variable in memory_source error message (#2990) (@KaanKesginLW)
- Support log for Hermitian CuMatrix (#2993) (@kshyatt)
- fix and tests for zero-length arrays in level1 CUBLAS (#2994) (@kshyatt)
- Remove Requires.jl (#2999) (@JamesWrigley)
- Bump CUDA_Compiler_jll. (#3008) (@maleadt)
- Add BFloat16 WMMA (#3009) (@AntonOresten)
Closed issues:
- CUDA.jl takes much much longer to load on Julia v1.12 (#2982)
- Selectdim returns view - CuSparseMatrixCSC defaults to scalar operations on views (#2986)
- Non-symmetric eigendecomposition with CUDA (#2989)
- Could you provide suggestion for using Julia on multinode (#2992)
- Unable to add CUDA in Julia 1.12 on Jetson Orin (#2995)
- Unable to add CUDA in Julia 1.12 on Jetson Orin (#2996)
- Error registering a new package using CUDA (#3001)
v5.9.5
CUDA v5.9.5
Merged pull requests:
- Add some matrix functions for symm/herm (#2962) (@kshyatt)
- Ensure diagm preserves eltype (#2975) (@kshyatt)
- Remove old set of deps test (#2976) (@kshyatt)
- Support mul!(Diagonal, A, B) (#2977) (@kshyatt)
- Let cuTensorNet hard fail again (#2978) (@kshyatt)
- Remove diagm in favour of GPUArrays (#2979) (@kshyatt)
- Bump GPUArrays dep (#2984) (@kshyatt)