By: Tim Besard
Re-posted from: https://juliagpu.org/post/2025-05-14-cuda_5.8/index.html
CUDA.jl v5.8 brings several enhancements, most notably the introduction of broadcasting support for CuSparseVector. The release also includes support for CUDA 12.9, and updates to key CUDA libraries like cuTENSOR, cuQuantum, and cuDNN.
Broadcasting for CuSparseVector
A significant enhancement in CUDA.jl v5.8 is support for broadcasting CuSparseVector. Thanks to @kshyatt, it is now possible to use sparse GPU vectors in broadcast expressions, just as was already possible with sparse matrices:
julia> using CUDA, CUDA.CUSPARSE, SparseArrays

julia> x = cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
[2] = 0.459139
[3] = 0.964073
[8] = 0.904363
[9] = 0.721723

julia> # a zero-preserving elementwise operation
       x .* 2
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
[2] = 0.918278
[3] = 1.928146
[8] = 1.808726
[9] = 1.443446

julia> # a non-zero-preserving elementwise operation
       x .+ 1
10-element CuArray{Float32, 1, CUDA.DeviceMemory}:
1.0
1.4591388
1.9640732
1.0
1.0
1.0
1.0
1.9043632
1.7217231
1.0

julia> # combining multiple sparse inputs
       x .+ cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 6 stored entries:
[1] = 0.906
[2] = 0.583197
[3] = 0.964073
[4] = 0.259103
[8] = 0.904363
[9] = 0.935917
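The distinction between the two kinds of operations can also be seen on the CPU with SparseArrays: broadcasting a zero-preserving function keeps the result compactly stored, while a non-zero-preserving one forces every entry to be stored (CUDA.jl goes a step further in that case and returns a dense CuArray, as shown above). A minimal sketch, using illustrative values and requiring no GPU:

```julia
using SparseArrays

# sparse vector mirroring the example above (values are illustrative)
x = sparsevec([2, 3, 8, 9], Float32[0.459139, 0.964073, 0.904363, 0.721723], 10)

y = x .* 2   # zero-preserving: 2 * 0 == 0, so only 4 entries are stored
z = x .+ 1   # not zero-preserving: 0 + 1 != 0, so all 10 entries are stored

nnz(y)  # 4
nnz(z)  # 10
```

Whether a broadcast stays sparse thus depends on whether the function maps stored zeros to zero, which the broadcast machinery checks for you.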
Minor Changes
CUDA.jl 5.8 also includes several other useful updates:
- Added support for CUDA 12.9;
- Subpackages have been updated to cuDNN 9.10, cuTENSOR 2.2, and cuQuantum 25.03;
- CUSPARSE.gemm! now supports additional algorithm choices to limit memory usage;
- Symbols can now be passed to CUDA kernels and stored in CuArrays;
- CuTensor multiplication now preserves the memory type of the input tensors;
- Sparse CSR matrices are now interfaced with the SparseMatricesCSR.jl package.
As always, we encourage users to update to the latest version to benefit from these improvements and bug fixes. Check out the changelog for a full list of changes.