On Julia v1.12 with AMDGPU.jl v1.3.4, running on an MI300A, I observe non-deterministic behavior when calling findall on a large GPU array. The number of returned indices is correct, but some of the returned indices are 0, which is not a valid index into a 1-based array. The problem does not reproduce on every run and appears more likely for very large n.
Note: This seems related to #41, but it still occurs with AMDGPU.jl v1.3.4.
cc @vchuravy
MWE 1:
using KernelAbstractions
using Adapt
using AMDGPU
n = 100_000_000
a_gpu = Adapt.adapt(ROCBackend(), fill(true, n))
any(i -> i == 0, findall(x -> x == true, a_gpu)) # sometimes returns true
MWE 2:
using KernelAbstractions
using Adapt
using AMDGPU
n_true = n_false = 100_000_000
a_gpu = Adapt.adapt(ROCBackend(), vcat(fill(true, n_true), fill(false, n_false)))
indices_true = findall(x -> x == true, a_gpu)
@show length(indices_true) # correct length
any(i -> i == 0, indices_true) # true
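
In case it helps with triage, here is a minimal sketch of how the bad result could be summarized on the host. `summarize_indices` is a hypothetical helper I made up for this report, not part of AMDGPU.jl or KernelAbstractions; it only assumes that the GPU result has been copied back with `Array(...)` and that a CPU `findall` on the same data is available as a reference. The demonstration at the bottom runs on the CPU with a hand-made corrupted vector, so it does not need a GPU.

```julia
# Hypothetical helper: compare an index vector (already copied to the host)
# against a CPU reference and report how the two differ.
function summarize_indices(idx, ref)
    same_length = length(idx) == length(ref)
    n_zero      = count(iszero, idx)       # entries equal to 0 (invalid 1-based index)
    first_zero  = findfirst(iszero, idx)   # `nothing` if there are no zeros
    last_zero   = findlast(iszero, idx)
    # Count differing positions over the common prefix of both vectors.
    m = min(length(idx), length(ref))
    n_mismatch  = count(i -> idx[i] != ref[i], 1:m)
    return (; same_length, n_zero, first_zero, last_zero, n_mismatch)
end

# CPU-only demonstration with an artificially corrupted result.
ref = findall(fill(true, 8))   # reference indices 1..8
bad = copy(ref)
bad[3:4] .= 0                  # simulate two zeroed entries
report = summarize_indices(bad, ref)
@show report
```

For MWE 2 this would be called as `summarize_indices(Array(indices_true), findall(Array(a_gpu)))`. Knowing where the zeros sit (`first_zero`, `last_zero`) and whether the non-zero entries still match the reference might help tell whether a contiguous chunk of the output is affected or the zeros are scattered.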