post/2025-05-14-cuda_5.8.md

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,84 @@
+++
title = "CUDA.jl 5.8: CuSparseVector broadcasting, CUDA 12.9, and more"
author = "Tim Besard"
abstract = """
CUDA.jl v5.8 brings several enhancements, most notably the introduction of broadcasting
support for `CuSparseVector`. The release also includes support for CUDA 12.9,
and updates to key CUDA libraries like cuTENSOR, cuQuantum, and cuDNN.
"""
+++

{{abstract}}


## Broadcasting for `CuSparseVector`

A significant enhancement in CUDA.jl v5.8 is the [support for broadcasting
`CuSparseVector`](https://github.com/JuliaGPU/CUDA.jl/pull/2733). Thanks to
[@kshyatt](https://github.com/kshyatt), it is now possible to use sparse GPU vectors in
broadcast expressions, just as was already possible with sparse matrices:

```julia-repl
julia> using CUDA, CUDA.CUSPARSE, SparseArrays

julia> x = cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
  [2]  =  0.459139
  [3]  =  0.964073
  [8]  =  0.904363
  [9]  =  0.721723

julia> # a zero-preserving elementwise operation
       x .* 2
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
  [2]  =  0.918278
  [3]  =  1.928146
  [8]  =  1.808726
  [9]  =  1.443446

julia> # a non-zero-preserving elementwise operation
       x .+ 1
10-element CuArray{Float32, 1, CUDA.DeviceMemory}:
 1.0
 1.4591388
 1.9640732
 1.0
 1.0
 1.0
 1.0
 1.9043632
 1.7217231
 1.0

julia> # combining multiple sparse inputs
       x .+ cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 6 stored entries:
  [1]  =  0.906
  [2]  =  0.583197
  [3]  =  0.964073
  [4]  =  0.259103
  [8]  =  0.904363
  [9]  =  0.935917
```
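Note how `x .* 2` keeps the sparse representation while `x .+ 1` returns a dense
`CuArray`. As a rough rule of thumb (our interpretation, not a statement from the release
notes), a broadcast can only stay sparse when the scalar function maps zero to zero, since
otherwise every implicit zero would need to be materialized. A quick CPU-side check:

```julia
# Zero-preserving means f(0) == 0: unstored (implicit zero) entries stay zero,
# so only the stored entries need to be transformed.
f = x -> x * 2f0    # f(0) == 0: the sparse structure can be reused
g = x -> x + 1f0    # g(0) == 1: every entry becomes nonzero, forcing a dense result

f(0f0)  # 0.0f0
g(0f0)  # 1.0f0
```

The same distinction already governed broadcasting over sparse CPU arrays and sparse GPU
matrices; v5.8 simply extends it to `CuSparseVector`.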


## Minor Changes

CUDA.jl 5.8 also includes several other useful updates:

- [Added support](https://github.com/JuliaGPU/CUDA.jl/pull/2772) for CUDA 12.9;
- Subpackages [have been updated](https://github.com/JuliaGPU/CUDA.jl/pull/2776) to cuDNN
  9.10, cuTENSOR 2.2, and cuQuantum 25.03;
- `CUSPARSE.gemm!` [now supports](https://github.com/JuliaGPU/CUDA.jl/pull/2769) additional
  algorithm choices to limit memory usage;
- Symbols [can now be passed](https://github.com/JuliaGPU/CUDA.jl/pull/2624) to CUDA kernels
  and stored in `CuArray`s;
- `CuTensor` multiplication [now preserves](https://github.com/JuliaGPU/CUDA.jl/pull/2775)
  the memory type of the input tensors;
- Sparse CSR matrices [are now interfaced
  with](https://github.com/JuliaGPU/CUDA.jl/pull/2720) the SparseMatricesCSR.jl package.
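To illustrate the new `Symbol` support, here is a minimal sketch of a kernel that stores a
`Symbol` argument into a `CuArray{Symbol}`. This is our own illustrative example, not code
from the release notes: the kernel name `fill_symbol!` is made up, and running it requires
a functional CUDA GPU.

```julia
using CUDA

# A trivial kernel taking a Symbol argument and writing it into device memory.
function fill_symbol!(out, s::Symbol)
    i = threadIdx().x
    if i <= length(out)
        @inbounds out[i] = s
    end
    return
end

out = CuArray{Symbol}(undef, 4)         # Symbols can now be stored in CuArrays
@cuda threads=4 fill_symbol!(out, :gpu) # and passed as kernel arguments
Array(out)
```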

As always, we encourage users to update to the latest version to benefit from these
improvements and bug fixes. Check out the
[changelog](https://github.com/JuliaGPU/CUDA.jl/releases/tag/v5.8.0) for a full list of
changes.
