
Conversation

@amontoison (Member) commented on May 18, 2025

@frapac
I tested this PR with AMD GPUs and all tests passed! 🎉

I propose moving the content of src/gpu_utils.jl into a package extension tied to MadNLPGPU (see the sketch below).
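
One possible reading of this is a package extension of CompressedSensingIPM.jl that only loads when MadNLPGPU is available. A rough sketch, assuming that layout (the extension name, Project.toml entries, and UUID placeholder below are hypothetical, not the actual code in this PR):

```julia
# Project.toml (hypothetical excerpt)
# [weakdeps]
# MadNLPGPU = "<UUID of MadNLPGPU>"
#
# [extensions]
# CompressedSensingIPMMadNLPGPUExt = "MadNLPGPU"

# ext/CompressedSensingIPMMadNLPGPUExt.jl (hypothetical)
module CompressedSensingIPMMadNLPGPUExt

using CompressedSensingIPM
using MadNLPGPU

# The GPU-specific helpers currently in src/gpu_utils.jl would move here,
# so they are only compiled when MadNLPGPU is actually loaded.

end # module
```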

@amontoison (Member, Author) commented on May 18, 2025

Hmm, I ran into this error with some views in the benchmarks.

julia> include("crystal.jl")
Precompiling LaplaceInterpolation...
  1 dependency successfully precompiled in 2 seconds. 3 already precompiled.
Precompiling NPZ...
  3 dependencies successfully precompiled in 6 seconds. 24 already precompiled.
     Failure artifact: punched_pmn
  Downloaded artifact: punched_pmn
ERROR: LoadError: BoundsError: attempt to access 104222000-element ROCArray{Float64, 1, AMDGPU.Runtime.Mem.HIPBuffer} at index [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  104221991, 104221992, 104221993, 104221994, 104221995, 104221996, 104221997, 104221998, 104221999, 104222000]]
Stacktrace:
  [1] throw_boundserror(A::ROCArray{Float64, 1, AMDGPU.Runtime.Mem.HIPBuffer}, I::Tuple{ROCArray{Int64, 1, AMDGPU.Runtime.Mem.HIPBuffer}})
    @ Base ./essentials.jl:14
  [2] checkbounds
    @ ./abstractarray.jl:699 [inlined]
  [3] view
    @ ~/.julia/x86_64/packages/GPUArrays/uiVyU/src/host/base.jl:305 [inlined]
  [4] get_index_constraints(lvar::ROCArray{…}, uvar::ROCArray{…}, lcon::ROCArray{…}, ucon::ROCArray{…}; fixed_variable_treatment::Type, equality_treatment::Type)
    @ MadNLP ~/.julia/x86_64/packages/MadNLP/cJ39p/src/nlpmodels.jl:104
  [5] get_index_constraints
    @ ~/.julia/x86_64/packages/MadNLP/cJ39p/src/nlpmodels.jl:88 [inlined]
  [6] MadNLPSolver(nlp::FFTNLPModel{…}; kwargs::@Kwargs{})
    @ MadNLP ~/.julia/x86_64/packages/MadNLP/cJ39p/src/IPM/IPM.jl:137
  [7] crystal(z3d::Array{Float32, 3}; variant::Bool, gpu::Bool, gpu_arch::String, rdft::Bool)
    @ Main ~/Argonne/CompressedSensingIPM.jl/benchmarks/crystal.jl:119
  [8] top-level scope
    @ ~/Argonne/CompressedSensingIPM.jl/benchmarks/crystal.jl:141
  [9] include(fname::String)
    @ Main ./sysimg.jl:38
 [10] top-level scope
    @ REPL[1]:1
in expression starting at /home/amontoison/Argonne/CompressedSensingIPM.jl/benchmarks/crystal.jl:141
Some type information was truncated. Use `show(err)` to see complete types.

ERROR: LoadError: BoundsError: attempt to access 1048452-element ROCArray{Float64, 1, AMDGPU.Runtime.Mem.HIPBuffer} at index [[524289, 524290, 524291, 524292, 524293, 524294, 524295, 524296, 524297, 524298  …  1048443, 1048444, 1048445, 1048446, 1048447, 1048448, 1048449, 1048450, 1048451, 1048452]]
Stacktrace:
 [1] throw_boundserror(A::ROCArray{Float64, 1, AMDGPU.Runtime.Mem.HIPBuffer}, I::Tuple{ROCArray{Int64, 1, AMDGPU.Runtime.Mem.HIPBuffer}})
   @ Base ./essentials.jl:14
 [2] checkbounds
   @ ./abstractarray.jl:699 [inlined]
 [3] view
   @ ~/.julia/x86_64/packages/GPUArrays/uiVyU/src/host/base.jl:305 [inlined]
 [4] MadNLP.PrimalVector(::Type{ROCArray{…}}, nx::Int64, ns::Int64, ind_lb::ROCArray{Int64, 1, AMDGPU.Runtime.Mem.HIPBuffer}, ind_ub::ROCArray{Int64, 1, AMDGPU.Runtime.Mem.HIPBuffer})
   @ MadNLP ~/.julia/x86_64/packages/MadNLP/cJ39p/src/KKT/rhs.jl:173
 [5] MadNLPSolver(nlp::FFTNLPModel{…}; kwargs::@Kwargs{})
   @ MadNLP ~/.julia/x86_64/packages/MadNLP/cJ39p/src/IPM/IPM.jl:168
 [6] fft_example_3D(N1::Int64, N2::Int64, N3::Int64; gpu::Bool, gpu_arch::String, rdft::Bool, check::Bool)
   @ Main ~/Argonne/CompressedSensingIPM.jl/test/fft_example_3D.jl:72
 [7] top-level scope
   @ ~/Argonne/CompressedSensingIPM.jl/benchmarks/cpu_vs_gpu.jl:36
 [8] include(fname::String)
   @ Main ./sysimg.jl:38
 [9] top-level scope
   @ REPL[6]:1
in expression starting at /home/amontoison/Argonne/CompressedSensingIPM.jl/benchmarks/cpu_vs_gpu.jl:30
Some type information was truncated. Use `show(err)` to see complete types.
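
For context, both failures boil down to the same pattern: constructing a view of a ROCArray with a ROCArray of indices, which trips the bounds check in GPUArrays even though the indices shown in the error are all in range. A minimal sketch of that pattern, assuming AMDGPU.jl is installed and a GPU is available (the sizes are arbitrary, not taken from the benchmark):

```julia
using AMDGPU

x   = ROCArray(rand(Float64, 1000))   # data vector stored on the GPU
ind = ROCArray(collect(1:10))         # index vector also stored on the GPU

# MadNLP builds views like this in get_index_constraints and PrimalVector;
# with both the array and the indices on the device, the checkbounds call
# inside GPUArrays' view raised the BoundsError shown in the traces above.
v = view(x, ind)
```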

@amontoison merged commit 74a2a65 into main on May 26, 2025 (5 checks passed).
@amontoison deleted the amdgpu branch on May 26, 2025 at 18:40.
