CUDA.jl 5.8: CuSparseVector broadcasting, CUDA 12.9, and more

By: Tim Besard

Re-posted from: https://juliagpu.org/post/2025-05-14-cuda_5.8/index.html

CUDA.jl v5.8 brings several enhancements, most notably the introduction of broadcasting support for CuSparseVector. The release also includes support for CUDA 12.9, and updates to key CUDA libraries like cuTENSOR, cuQuantum, and cuDNN.

Broadcasting for CuSparseVector

A significant enhancement in CUDA.jl v5.8 is support for broadcasting over CuSparseVector. Thanks to @kshyatt, it is now possible to use sparse GPU vectors in broadcast expressions, just as was already possible with sparse matrices:

julia> using CUDA, CUDA.CUSPARSE, SparseArrays

julia> x = cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
  [2]  =  0.459139
  [3]  =  0.964073
  [8]  =  0.904363
  [9]  =  0.721723

julia> # a zero-preserving elementwise operation
       x .* 2
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
  [2]  =  0.918278
  [3]  =  1.928146
  [8]  =  1.808726
  [9]  =  1.443446

julia> # a non-zero-preserving elementwise operation
       x .+ 1
10-element CuArray{Float32, 1, CUDA.DeviceMemory}:
 1.0
 1.4591388
 1.9640732
 1.0
 1.0
 1.0
 1.0
 1.9043632
 1.7217231
 1.0

julia> # combining multiple sparse inputs
       x .+ cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 6 stored entries:
  [1]  =  0.906
  [2]  =  0.583197
  [3]  =  0.964073
  [4]  =  0.259103
  [8]  =  0.904363
  [9]  =  0.935917
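As the transcript above suggests, whether the result stays sparse depends on whether the broadcasted function preserves zeros, i.e. whether `f(0) == 0`, mirroring the behavior of SparseArrays on the CPU. A minimal sketch of that rule of thumb (the specific functions used here are illustrative; exact sparsity behavior may vary by operation):

```julia
using CUDA, CUDA.CUSPARSE, SparseArrays

x = cu(sprand(Float32, 10, 0.3))

# sin(0) == 0, so this operation is zero-preserving and the
# result should remain a sparse CuSparseVector
y = sin.(x)

# exp(0) == 1, so every implicit zero becomes nonzero and the
# result densifies into an ordinary CuArray
z = exp.(x)
```

In other words, the broadcasting machinery only materializes a dense array when the operation forces the stored zeros to take on nonzero values.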

Minor Changes

Beyond the headline feature, CUDA.jl 5.8 adds support for CUDA 12.9 and updates the cuTENSOR, cuQuantum, and cuDNN artifacts, along with a number of smaller fixes.

As always, we encourage users to update to the latest version to benefit from these improvements and bug fixes. Check out the changelog for a full list of changes.