Description
Questionnaire
- Does ROCm work for you outside of Julia, e.g. C/C++/Python? yes
- Post output of rocminfo.
ROCk module version 6.16.6 is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.18
Runtime Ext Version: 1.14
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
XNACK enabled: NO
DMAbuf Support: YES
VMM Support: YES
==========
HSA Agents
==========
*******
Agent 1
*******
Name: INTEL(R) XEON(R) PLATINUM 8568Y+
Uuid: CPU-XX
Marketing Name: INTEL(R) XEON(R) PLATINUM 8568Y+
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 0
BDFID: 0
Internal Node ID: 0
Compute Unit: 20
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 247409228(0xebf2a4c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 247409228(0xebf2a4c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 247409228(0xebf2a4c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 4
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 247409228(0xebf2a4c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx942
Uuid: GPU-72d9a6340dabceda
Marketing Name: AMD Instinct MI300X VF
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 4096(0x1000) KB
L3: 262144(0x40000) KB
Chip ID: 29877(0x74b5)
ASIC Revision: 1(0x1)
Cacheline Size: 128(0x80)
Max Clock Freq. (MHz): 2100
BDFID: 33536
Internal Node ID: 1
Compute Unit: 304
SIMDs per CU: 4
Shader Engines: 32
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 2048(0x800)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 189
SDMA engine uCode:: 25
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 200998912(0xbfb0000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 200998912(0xbfb0000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 200998912(0xbfb0000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 4
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack-
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack-
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
FBarrier Max Size: 32
*** Done ***
- Post output of AMDGPU.versioninfo() if possible.
julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬──────────────────────────────────────────────────────────────────────
│ Available │ Name │ Version │ Path ⋯
├───────────┼──────────────────┼───────────┼──────────────────────────────────────────────────────────────────────
│ + │ LLD │ - │ /opt/rocm-7.1.0/lib/llvm/bin/ld.lld ⋯
│ + │ Device Libraries │ - │ /root/.julia/artifacts/b46ab46ef568406312e5f500efb677511199c2f9/amd ⋯
│ + │ HIP │ 7.1.25424 │ /opt/rocm-7.1.0/lib/libamdhip64.so ⋯
│ + │ rocBLAS │ 5.1.0 │ /opt/rocm-7.1.0/lib/librocblas.so ⋯
│ + │ rocSOLVER │ 3.31.0 │ /opt/rocm-7.1.0/lib/librocsolver.so ⋯
│ + │ rocSPARSE │ 4.1.0 │ /opt/rocm-7.1.0/lib/librocsparse.so ⋯
│ + │ rocRAND │ 2.10.5 │ /opt/rocm-7.1.0/lib/librocrand.so ⋯
│ + │ rocFFT │ 1.0.35 │ /opt/rocm-7.1.0/lib/librocfft.so ⋯
│ + │ MIOpen │ 3.5.1 │ /opt/rocm-7.1.0/lib/libMIOpen.so ⋯
└───────────┴──────────────────┴───────────┴──────────────────────────────────────────────────────────────────────
1 column omitted
[ Info: AMDGPU devices
┌────┬────────────────────────┬────────────────────────┬───────────┬─────────────┬───────────────┐
│ Id │ Name │ GCN arch │ Wavefront │ Memory │ Shared Memory │
├────┼────────────────────────┼────────────────────────┼───────────┼─────────────┼───────────────┤
│ 1 │ AMD Instinct MI300X VF │ gfx942:sramecc+:xnack- │ 64 │ 191.688 GiB │ 64.000 KiB │
└────┴────────────────────────┴────────────────────────┴───────────┴─────────────┴───────────────┘
Reproducing the bug
- Describe what's not working.
It appears that some ROCm functions have not been implemented for Int64 indexing and fail with confusing errors on large arrays. I am not sure whether this should be documented, whether the error should be changed, or whether a default Julia function should be called in those cases. In other cases Julia crashes outright when it encounters the issue, which is more inconvenient.
- Provide MWE to reproduce it (if possible).
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.2 (2025-11-20)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org release
|__/                   |
julia> using AMDGPU, LinearAlgebra
julia> n=50000
50000
julia> a=AMDGPU.rand(n,n)
ERROR: InexactError: trunc(Int32, 2500000000)
Stacktrace:
[1] throw_inexacterror(func::Symbol, to::Type, val::Int64)
@ Core ./boot.jl:815
[2] checked_trunc_sint
@ ./boot.jl:829 [inlined]
[3] toInt32
@ ./boot.jl:866 [inlined]
[4] Int32
@ ./boot.jl:956 [inlined]
[5] convert
@ ./number.jl:7 [inlined]
[6] cconvert
@ ./essentials.jl:687 [inlined]
[7] macro expansion
@ ~/.julia/packages/AMDGPU/FyUG3/src/utils.jl:122 [inlined]
[8] rocrand_generate_uniform
@ ~/.julia/packages/AMDGPU/FyUG3/src/rand/librocrand.jl:28 [inlined]
[9] rand!
@ ~/.julia/packages/AMDGPU/FyUG3/src/rand/random.jl:51 [inlined]
[10] rand!(A::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer})
@ AMDGPU ~/.julia/packages/AMDGPU/FyUG3/src/random.jl:39
[11] rand(dim1::Int64, dims::Int64)
@ AMDGPU ~/.julia/packages/AMDGPU/FyUG3/src/random.jl:67
[12] top-level scope
@ REPL[4]:1
julia> a=AMDGPU.zeros(n,n).+1
50000×50000 ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}:
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
julia> norm(a)
ERROR: InexactError: trunc(Int32, 2500000000)
Stacktrace:
[1] throw_inexacterror(func::Symbol, to::Type, val::Int64)
@ Core ./boot.jl:815
[2] checked_trunc_sint
@ ./boot.jl:829 [inlined]
[3] toInt32
@ ./boot.jl:866 [inlined]
[4] Int32
@ ./boot.jl:956 [inlined]
[5] convert
@ ./number.jl:7 [inlined]
[6] cconvert
@ ./essentials.jl:687 [inlined]
[7] macro expansion
@ ~/.julia/packages/AMDGPU/FyUG3/src/utils.jl:122 [inlined]
[8] rocblas_snrm2(handle::Ptr{…}, n::Int64, x::ROCArray{…}, incx::Int64, result::Base.RefValue{…})
@ AMDGPU.rocBLAS ~/.julia/packages/AMDGPU/FyUG3/src/blas/librocblas.jl:2355
[9] nrm2
@ ~/.julia/packages/AMDGPU/FyUG3/src/blas/wrappers.jl:86 [inlined]
[10] norm(x::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer})
@ AMDGPU.rocBLAS ~/.julia/packages/AMDGPU/FyUG3/src/blas/highlevel.jl:49
[11] top-level scope
@ REPL[14]:1
Some type information was truncated. Use `show(err)` to see complete types.
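Both `InexactError`s above report the same value, 2500000000, which is the element count of a 50000×50000 array. The `cconvert` and `Int32` frames suggest that the wrappers for `rocrand_generate_uniform` and `rocblas_snrm2` convert that count to `Int32` before the call. A minimal sketch of the arithmetic in plain Julia, with no GPU required; treating the failing argument as the element count is an assumption drawn from the stack traces, not from the wrapper source:

```julia
n = 50_000
len = n * n                 # element count of an n×n array, an Int64 on 64-bit Julia
limit = typemax(Int32)      # largest value an Int32 argument can hold

println("elements    = ", len)
println("Int32 limit = ", limit)
println("fits        = ", len <= limit)

# Mimic the conversion seen in the stack trace (convert -> Int32).
err = try
    Int32(len)
    nothing
catch e
    e
end
println("conversion error: ", err isa InexactError)
```

The count exceeds the limit, so the checked conversion throws before any GPU library is reached, which matches the stack traces ending inside `boot.jl` rather than in a ROCm library.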
julia> svdvals(a)
[3636] signal 11 (2): Segmentation fault
in expression starting at REPL[6]:1
slange_64_ at /root/julia-1.12.2/bin/../lib/julia/libopenblas64_.so (unknown line)
sgesdd_64_ at /root/julia-1.12.2/bin/../lib/julia/libopenblas64_.so (unknown line)
gesdd! at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/lapack.jl:1706
svdvals! at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/svd.jl:218 [inlined]
svdvals at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/svd.jl:243
unknown function (ip: 0x7af09efe9242) at (unknown file)
jl_apply at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/julia.h:2391 [inlined]
do_call at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/interpreter.c:123
eval_value at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/interpreter.c:243
eval_stmt_value at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/interpreter.c:194 [inlined]
eval_body at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/interpreter.c:707
jl_interpret_toplevel_thunk at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/interpreter.c:898
jl_toplevel_eval_flex at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/toplevel.c:1035
__repl_entry_eval_expanded_with_loc at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:301
jl_apply at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/julia.h:2391 [inlined]
jl_f_invokelatest at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/builtins.c:881
toplevel_eval_with_hooks at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:308
toplevel_eval_with_hooks at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:312
toplevel_eval_with_hooks at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:305 [inlined]
eval_user_input at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:330
repl_backend_loop at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:452
#start_repl_backend#41 at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:427
start_repl_backend at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:424 [inlined]
#run_repl#50 at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:653
run_repl at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:639
jfptr_run_repl_19709.1 at /root/julia-1.12.2/share/julia/compiled/v1.12/REPL/u0gqU_DlLf6.so (unknown line)
run_std_repl at ./client.jl:478
jfptr_run_std_repl_36841.1 at /root/julia-1.12.2/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/julia.h:2391 [inlined]
jl_f_invokelatest at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/builtins.c:881
run_main_repl at ./client.jl:499
repl_main at ./client.jl:586 [inlined]
_start at ./client.jl:561
jfptr__start_47004.1 at /root/julia-1.12.2/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/julia.h:2391 [inlined]
true_main at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/jlapi.c:971
jl_repl_entrypoint at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/src/jlapi.c:1139
main at /cache/build/builder-amdci5-2/julialang/julia-release-1-dot-12/cli/loader_exe.c:58
unknown function (ip: 0x7af0dde2a1c9) at /lib/x86_64-linux-gnu/libc.so.6
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8) at /workspace/srcdir/glibc-2.17/csu/../sysdeps/x86_64/start.S
Allocations: 52830844 (Pool: 52830469; Big: 375); GC: 31
Segmentation fault (core dumped)
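One possible workaround for the `norm` case, until the wrappers accept larger counts, is to compute the norm over chunks whose length fits in `Int32` and combine the partial results. The sketch below is only an illustration: `chunked_norm` and its `maxlen` keyword are hypothetical names, it is exercised here on a plain CPU array, and whether `norm` on a view of a `ROCArray` reaches rocBLAS with the reduced length has not been tested.

```julia
using LinearAlgebra

# Hypothetical helper: 2-norm accumulated over chunks of at most `maxlen` elements.
function chunked_norm(x::AbstractArray; maxlen::Integer = typemax(Int32))
    step = Int(maxlen)
    v = vec(x)                         # treat the array as one long vector
    total = 0.0                        # accumulate squared norms in Float64
    for start in 1:step:length(v)
        stop = min(start + step - 1, length(v))
        total += Float64(norm(view(v, start:stop)))^2
    end
    return sqrt(total)
end

# Small CPU check with an artificially low chunk size.
x = rand(Float32, 1000)
println(isapprox(chunked_norm(x; maxlen = 64), norm(x); rtol = 1e-5))
```

With the default `maxlen`, the 50000×50000 array from the MWE would be split into two chunks. This does not address the `svdvals` crash, whose trace ends in `sgesdd_64_` from `libopenblas64_` rather than in a ROCm library.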