Releases · JuliaGPU/CUDA.jl

30 Apr 11:59

github-actions

v6.1.0

5e4118b

v6.1.0 Latest


CUDA v6.1.0

Diff since v6.0.0

Merged pull requests:

[CUSPARSE] create slices of sparse matrices using boolean masks (#3032) (@hexaeder)
FFT plans reworked (#3052) (@RainerHeintzmann)
Wrapper for Blocksparse CuTensor code (#3057) (@kmp5VT)
Fix compat in profile (#3074) (@vchuravy)
Simplify sparse type mappings (#3076) (@kshyatt)
Implement fast_div using fast rcp (#3077) (@vchuravy)
Add override for muladd and use LLVM intrinsic for fma (#3078) (@vchuravy)
Add preference to use CUDA_Runtime_Discovery for the compiler binaries (#3080) (@apozharski)
Backport #3080 to 5.11 (#3081) (@apozharski)
Improve eager GC with unified memory (#3092) (@maleadt)
Avoid Int128 in Float vs Rational comparisons on the device (#3093) (@maleadt)
Generate !invariant loads instead of ldg calls (#3094) (@maleadt)
Add _throw_dmrs device override for reshape of views (#3095) (@Abdelrahman912)
Add tests for fast_div and fast rcp intrinsics (#3096) (@vchuravy)
Add device overrides for FastMath.pow_fast with integer exponents (#3098) (@maleadt)
CUSPARSE: fix COO scalar getindex crossing row boundaries (#3100) (#3101) (@maleadt)
Switch to GPUArrays' new RNG (#3102) (@maleadt)
Update headers. (#3107) (@maleadt)
CUSPARSE: fix sparse(::Symmetric/::Hermitian) stack overflow (#3042). (#3108) (@maleadt)
Reorganize and untangle subpackage tests (NFC) (#3109) (@maleadt)
Switch to ParallelTestRunner (#3110) (@maleadt)
Temporarily disable cross-device memory access from kernels (#3112) (@maleadt)
Fix workspace size on 5.11 and CUDA 13.2, backport #3062 (#3113) (@kshyatt)
Remove unset LD_LIBRARY_PATH from pipeline steps (#3116) (@maleadt)
@nospecialize testf/_compare forwarders in CUDA tests. (#3117) (@maleadt)
Bump PTX .target to match --gpu-name (#3120) (@AntonOresten)
Add back-end flag to @cuda (#3121) (@maleadt)

Closed issues:

CUFFT: support for arbitrary dims (#119)
LLVM 20: Adapt to LDG removal (#2531)
Julia 1.11: Views can return CPU SubArray (#2551)
1-element view not recognized as contiguous (#2653)
Can't compare Float32 with Rational on CUDA (#2681)
Memory leak with unified memory? (#3013)
Stack overflow on sparse(::Symmetric) (#3042)
Scalar indexing on +(::Symmetric, ::Symmetric) (#3043)
Performance improvement ideas for randn! Float32 (#3056)
Base.FastMath.pow_fast fails to compile with integer exponent (#3065)
normalize on CuArray fails due to scalar indexing (#3097)
[cuSPARSE] Incorrect indexing for COO-formatted sparse arrays (#3100)

Contributors

vchuravy, maleadt, and 7 other contributors


22 Apr 20:35

github-actions

v5.11.2

95623c9

v5.11.2

CUDA v5.11.2

Diff since v5.11.1

This release has been identified as a backport.
Automated changelogs for backports tend to be wildly incorrect.
Therefore, the list of issues and pull requests is hidden.


21 Apr 10:40

github-actions

v5.11.1

8a60b84

v5.11.1

CUDA v5.11.1

Diff since v5.11.0

This release has been identified as a backport.
Automated changelogs for backports tend to be wildly incorrect.
Therefore, the list of issues and pull requests is hidden.


10 Apr 15:33

github-actions

v6.0.0

a9a687c

v6.0.0

CUDA v6.0.0

Diff since v5.11.0

Breaking changes: ideally none. CUDA.jl has been split into subpackages; because that is a major restructuring, the major version was bumped out of caution. Some deprecations have been introduced, however: for example, submodules such as CUBLAS are now standalone packages (cuBLAS.jl).

This release has been identified as a backport.
Automated changelogs for backports tend to be wildly incorrect.
Therefore, the list of issues and pull requests is hidden.


13 Mar 15:19

github-actions

v5.11.0

0567b9e

v5.11.0

CUDA v5.11.0

Diff since v5.10.1

Support for CUDA 13.2

Merged pull requests:

Initial support for CUDA 13.2. (#3053) (@maleadt)
Bump subpackages (#3054) (@maleadt)

Contributors

maleadt


09 Mar 12:41

github-actions

v5.10.1

5472295

v5.10.1

CUDA v5.10.1

Diff since v5.10.0

Merged pull requests:

Move nvtx initialization to first profile (#3050) (@wsmoses)

Closed issues:

Precompilation: "You are using CUDA 13.0.0, but CUDA.jl was precompiled for CUDA 13.1.0." (#3049)

Contributors

wsmoses


06 Mar 14:13

github-actions

v5.10.0

3405d7f

v5.10.0

CUDA v5.10.0

Diff since v5.9.7

Support for CUDA 13.1
Support for Julia 1.13

Merged pull requests:

Support Julia 1.13 (#3020) (@eschnett)
Add Base.min override for Float16 and extend LLVM version guard to v20. (#3038) (@maleadt)
Fix sytrs! test: verify non-pivoting path directly instead of skipping (#3039) (@maleadt)
Initial support for CUDA 13.1. (#3040) (@maleadt)
Try unsetting LD_LIBRARY_PATH. (#3046) (@maleadt)
Update README.md (#3047) (@kshyatt)

Closed issues:

DGX Spark GB10: PkgError: Package CUDA errored during testing (#2950)
Cannot load CUDA.jl with Julia 1.13 (#3019)
Julia 1.12: Misaligned address error (#3034)
gpuci: artifact toolkits pick up system libraries (#3045)

Contributors

eschnett, maleadt, and kshyatt


26 Feb 15:50

github-actions

v5.9.7

1810b7a

v5.9.7

CUDA v5.9.7

Diff since v5.9.6

Merged pull requests:

Add autodocs for all libraries, take 2 [only docs] (#2972) (@gdalle)
Actually pass neutral element into scan (#3011) (@kshyatt)
Specialize transpose! for CuMatrix (#3015) (@oschulz)
Extend LLVM 18 workaround to other float types. (#3016) (@maleadt)
Add yet another workaround to avoid .NaN modifiers on LLVM 18. (#3025) (@maleadt)
Switch to 1.12 for benchmarks (#3027) (@christiangnrd)
Replace DataFrames in the profiler with NamedTuples (#3029) (@JamesWrigley)
Bump version (#3035) (@kshyatt)

Closed issues:

Dependencies in profile.jl constitute a significant fraction of the load time (#2238)
PTX compile error: ".NaN requires .target sm_80 or higher" on Julia 1.12 (RTX 2080 / sm_75, works fine on Julia 1.11.7) (#2946)
Support for CUDA Tile (#2991)
Incorrect scalar-sparse matrix multiplication (#3010)

Contributors

maleadt, oschulz, and 4 other contributors


03 Jan 21:11

github-actions

v5.9.6

0c00b83

v5.9.6

CUDA v5.9.6

Diff since v5.9.5

Merged pull requests:

Expand eigen() and add eigvals,vecs (#2787) (@matteosecli)
Support new cuquantum version (#2887) (@kshyatt)
Try integrating with the GPUArrays sparse migration (#2942) (@kshyatt)
Add some packages to versioninfo output (#2983) (@christiangnrd)
Bump actions/checkout from 5 to 6 (#2988) (@dependabot[bot])
Fix undefined variable in memory_source error message (#2990) (@KaanKesginLW)
Support log for Hermitian CuMatrix (#2993) (@kshyatt)
Fix and add tests for zero-length arrays in level-1 CUBLAS (#2994) (@kshyatt)
Remove Requires.jl (#2999) (@JamesWrigley)
Bump CUDA_Compiler_jll. (#3008) (@maleadt)
Add BFloat16 WMMA (#3009) (@AntonOresten)

Closed issues:

CUDA.jl takes much much longer to load on Julia v1.12 (#2982)
Selectdim returns view - CuSparseMatrixCSC defaults to scalar operations on views (#2986)
Non-symmetric eigendecomposition with CUDA (#2989)
Could you provide suggestion for using Julia on multinode (#2992)
Unable to add CUDA in Julia 1.12 on Jetson Orin (#2995)
Unable to add CUDA in Julia 1.12 on Jetson Orin (#2996)
Error registering a new package using CUDA (#3001)

Contributors

maleadt, kshyatt, and 6 other contributors


26 Nov 08:15

github-actions

v5.9.5

c5145ab

v5.9.5

CUDA v5.9.5

Diff since v5.9.4

Merged pull requests:

Add some matrix functions for symm/herm (#2962) (@kshyatt)
Ensure diagm preserves eltype (#2975) (@kshyatt)
Remove old set of deps test (#2976) (@kshyatt)
Support mul!(Diagonal, A, B) (#2977) (@kshyatt)
Let cuTensorNet hard fail again (#2978) (@kshyatt)
Remove diagm in favour of GPUArrays (#2979) (@kshyatt)
Bump GPUArrays dep (#2984) (@kshyatt)

Contributors

kshyatt

