Scalar indexing when mul! into SubArray #3041

@jonas-schulze

Describe the bug

mul!(::SubArray, ...) fails with a scalar indexing error when the destination is a view of a CuArray.

To reproduce

The Minimal Working Example (MWE) for this bug:

using CUDA, CUDA.CUSPARSE
using LinearAlgebra
using SparseArrays

n = 100
A = sprandn(n, n, 2/n)
A = cu(A)

X = CUDA.randn(n, 2)
Y = CUDA.zeros(n, 2)
Yv = @view Y[1:n, 1:2]

@assert Yv isa SubArray
mul!(Yv, A, X) # fails: scalar indexing
mul!(Yv, I, X) # only fails on Julia 1.12
mul!(Yv, 5, X) # only fails on Julia 1.12
# See https://github.com/JuliaGPU/CUDA.jl/issues/3041#issuecomment-3996816271

Project.zip

Expected behavior

All three mul! calls should succeed without scalar indexing.

Version info

Julia 1.10.9
Julia Version 1.10.9
Commit 5595d20a287 (2025-03-10 12:51 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × 12th Gen Intel(R) Core(TM) i5-12600K
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, alderlake)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)
Environment:
  JULIA_PROJECT = @.
CUDA toolchain: 
- runtime 13.0, artifact installation
- driver 580.126.9 for 13.1
- compiler 13.1

CUDA libraries: 
- CUBLAS: 13.1.0
- CURAND: 10.4.0
- CUFFT: 12.0.0
- CUSOLVER: 12.0.4
- CUSPARSE: 12.6.3
- CUPTI: 2025.3.1 (API 13.0.1)
- NVML: 13.0.0+580.126.9

Julia packages: 
- CUDA: 5.9.7
- GPUArrays: 11.4.1
- GPUCompiler: 1.8.2
- KernelAbstractions: 0.9.40
- CUDA_Driver_jll: 13.1.0+2
- CUDA_Compiler_jll: 0.4.1+1
- CUDA_Runtime_jll: 0.19.2+0

Toolchain:
- Julia: 1.10.9
- LLVM: 15.0.7

1 device:
  0: NVIDIA RTX A4000 (sm_86, 15.359 GiB / 15.992 GiB available)

Julia 1.12.5
Julia Version 1.12.5
Commit 5fe89b8ddc1 (2026-02-09 16:05 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × 12th Gen Intel(R) Core(TM) i5-12600K
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, alderlake)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 16 virtual cores)
Environment:
  JULIA_PROJECT = @.
CUDA toolchain: 
- runtime 13.0, artifact installation
- driver 580.126.9 for 13.1
- compiler 13.1

CUDA libraries: 
- CUBLAS: 13.1.0
- CURAND: 10.4.0
- CUFFT: 12.0.0
- CUSOLVER: 12.0.4
- CUSPARSE: 12.6.3
- CUPTI: 2025.3.1 (API 13.0.1)
- NVML: 13.0.0+580.126.9

Julia packages: 
- CUDA: 5.9.6
- GPUArrays: 11.4.1
- GPUCompiler: 1.8.2
- KernelAbstractions: 0.9.40
- CUDA_Driver_jll: 13.1.0+2
- CUDA_Compiler_jll: 0.4.1+1
- CUDA_Runtime_jll: 0.19.2+0

Toolchain:
- Julia: 1.12.5
- LLVM: 18.1.7

1 device:
  0: NVIDIA RTX A4000 (sm_86, 15.359 GiB / 15.992 GiB available)

Labels: bug (Something isn't working)
