Description
Questionnaire
- Does ROCm work for you outside of Julia, e.g. C/C++/Python?
Yes
- Post output of rocminfo.
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
Runtime Ext Version: 1.7
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
XNACK enabled: NO
DMAbuf Support: YES
VMM Support: YES
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 9 7900 12-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 9 7900 12-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5485
BDFID: 0
Internal Node ID: 0
Compute Unit: 24
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 31940948(0x1e76154) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 31940948(0x1e76154) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 31940948(0x1e76154) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 4
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 31940948(0x1e76154) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1201
Uuid: GPU-27a7a7d52ee60cbb
Marketing Name: AMD Radeon RX 9070 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 8192(0x2000) KB
L3: 65536(0x10000) KB
Chip ID: 30032(0x7550)
ASIC Revision: 1(0x1)
Cacheline Size: 256(0x100)
Max Clock Freq. (MHz): 2400
BDFID: 768
Internal Node ID: 1
Compute Unit: 64
SIMDs per CU: 2
Shader Engines: 4
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 1012
SDMA engine uCode:: 838
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16695296(0xfec000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1201
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx12-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*******
Agent 3
*******
Name: gfx1036
Uuid: GPU-XX
Marketing Name: AMD Radeon Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 2
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
L2: 256(0x100) KB
Chip ID: 5710(0x164e)
ASIC Revision: 1(0x1)
Cacheline Size: 128(0x80)
Max Clock Freq. (MHz): 2200
BDFID: 6656
Internal Node ID: 2
Compute Unit: 2
SIMDs per CU: 2
Shader Engines: 1
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties: APU
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 22
SDMA engine uCode:: 9
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 15970472(0xf3b0a8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 15970472(0xf3b0a8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1036
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx10-3-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
- Post output of AMDGPU.versioninfo() if possible.
julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
:0:/usr/src/debug/hip-runtime/hip-runtime-clr/hipamd/src/hip_global.cpp:158 : 5486189594 us: Module not initialized
[85700] signal 6 (-6): Aborted
in expression starting at REPL[4]:1
unknown function (ip: 0x7f5e8d5ec74c)
gsignal at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x7f5de667921e)
unknown function (ip: 0x7f5de6870db3)
unknown function (ip: 0x7f5de68345c1)
unknown function (ip: 0x7f5de683a9da)
unknown function (ip: 0x7f5dade183a2)
rocsparse_create_handle at /opt/rocm/lib/librocsparse.so (unknown line)
macro expansion at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/sparse/error.jl:80 [inlined]
rocsparse_create_handle at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/sparse/librocsparse.jl:7
create_handle at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/sparse/rocSPARSE.jl:31 [inlined]
#5 at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/cache.jl:115 [inlined]
pop! at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/cache.jl:49
new_state at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/cache.jl:114
#9 at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/cache.jl:127 [inlined]
get! at ./dict.jl:458
library_state at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/cache.jl:127
lib_state at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/sparse/rocSPARSE.jl:37 [inlined]
handle at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/sparse/rocSPARSE.jl:41 [inlined]
version at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/sparse/rocSPARSE.jl:46
_ver at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/utils.jl:5 [inlined]
versioninfo at /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/utils.jl:6
unknown function (ip: 0x7f5e0df41a3f)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_stmt_value at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:174 [inlined]
eval_body at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:666
jl_interpret_toplevel_thunk at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/interpreter.c:824
jl_toplevel_eval_flex at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:943
jl_toplevel_eval_flex at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:886
ijl_toplevel_eval_in at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430 [inlined]
eval_user_input at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:261
repl_backend_loop at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:368
#start_repl_backend#59 at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:343
start_repl_backend at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:340
#run_repl#76 at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:500
run_repl at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:486
jfptr_run_repl_10123.1 at /home/aaruni/.julia/juliaup/julia-1.11.5+0.x64.linux.gnu/share/julia/compiled/v1.11/REPL/u0gqU_4x0TT.so (unknown line)
#1150 at ./client.jl:446
jfptr_YY.1150_14797.1 at /home/aaruni/.julia/juliaup/julia-1.11.5+0.x64.linux.gnu/share/julia/compiled/v1.11/REPL/u0gqU_4x0TT.so (unknown line)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
jl_f__call_latest at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/builtins.c:875
#invokelatest#2 at ./essentials.jl:1055 [inlined]
invokelatest at ./essentials.jl:1052 [inlined]
run_main_repl at ./client.jl:430
repl_main at ./client.jl:567 [inlined]
_start at ./client.jl:541
jfptr__start_73430.1 at /home/aaruni/.julia/juliaup/julia-1.11.5+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
true_main at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jlapi.c:900
jl_repl_entrypoint at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jlapi.c:1059
main at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/cli/loader_exe.c:58
unknown function (ip: 0x7f5e8d57c6b4)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 40378658 (Pool: 40374356; Big: 4302); GC: 48
[1] 85700 IOT instruction (core dumped) julia +release --project=./
Reproducing the bug
- Describe what's not working.
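AMDGPU.jl does not work at all with the RX 9070 XT (gfx1201): AMDGPU.versioninfo() aborts with "Module not initialized" and kernel compilation fails in LLVM. It possibly needs newer ROCm libraries; support for this GPU was added in ROCm 6.4.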
- Provide MWE to reproduce it (if possible).
On an RX 9070 XT:
julia> using AMDGPU
julia> AMDGPU.zeros(1)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
'gfx1201' is not a recognized processor for this target (ignoring processor)
ERROR: LLVM error: Cannot select: 0x23a4ac70: ch = store<(store (s32) into %ir.17, !tbaa !137, addrspace 1)> # D:1 0x23caf7a0, 0x2360ec50, 0x237f2b40, undef:i64, /home/aaruni/.julia/packages/LLVM/2JPxT/src/interop/base.jl:39 @[ none:0 @[ none:0 @[ /home/aaruni/.julia/packages/LLVM/2JPxT/src/interop/pointer.jl:88 @[ /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/device/gcn/array.jl:86 @[ /home/aaruni/.julia/packages/KernelAbstractions/C3nYQ/src/macros.jl:322 @[ none:0 ] ] ] ] ] ]
0x2360ec50: i32,ch = load<(dereferenceable invariant load (s32) from %ir..kernarg.offset3.cast, align 8, addrspace 4)> 0x23caf7a0, 0x237f2de0, undef:i64
0x237f2de0: i64 = add nuw 0x2360ee80, Constant:i64<136>
0x2360ee80: i64,ch = CopyFromReg 0x23caf7a0, Register:i64 %0
0x23664d20: i64 = Register %0
0x2360e710: i64 = Constant<136>
0x237f2ec0: i64 = undef
0x237f2b40: i64 = add # D:1 0x237f32b0, Constant:i64<-4>, /home/aaruni/.julia/packages/LLVM/2JPxT/src/interop/base.jl:39 @[ none:0 @[ none:0 @[ /home/aaruni/.julia/packages/LLVM/2JPxT/src/interop/pointer.jl:88 @[ /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/device/gcn/array.jl:86 @[ /home/aaruni/.julia/packages/KernelAbstractions/C3nYQ/src/macros.jl:322 @[ none:0 ] ] ] ] ] ]
0x237f32b0: i64 = add # D:1 0x23664c40, 0x237f2ad0, /home/aaruni/.julia/packages/LLVM/2JPxT/src/interop/base.jl:39 @[ none:0 @[ none:0 @[ /home/aaruni/.julia/packages/LLVM/2JPxT/src/interop/pointer.jl:88 @[ /home/aaruni/.julia/packages/AMDGPU/wH6SV/src/device/gcn/array.jl:86 @[ /home/aaruni/.julia/packages/KernelAbstractions/C3nYQ/src/macros.jl:322 @[ none:0 ] ] ] ] ] ]
0x23664c40: i64 = shl # D:1 0x237f3320, Constant:i32<2>
0x237f3320: i64,ch = CopyFromReg # D:1 0x23caf7a0, Register:i64 %1
0x2360f270: i64 = Register %1
0x237f3390: i32 = Constant<2>
0x237f2ad0: i64 = bitcast 0x237f2bb0
0x237f2bb0: v2i32,ch = load<(dereferenceable invariant load (s64) from %ir..kernarg.offset1.cast + 8, basealign 16, addrspace 4)> 0x23caf7a0, 0x237f3010, undef:i64
0x237f3010: i64 = add 0x2360ee80, Constant:i64<120>
0x2360ee80: i64,ch = CopyFromReg 0x23caf7a0, Register:i64 %0
0x23664d20: i64 = Register %0
0x2360eef0: i64 = Constant<120>
0x237f2ec0: i64 = undef
0x237f3080: i64 = Constant<-4>
0x237f2ec0: i64 = undef
In function: _Z16gpu_fill_kernel_16CompilerMetadataI11DynamicSize12DynamicCheckv16CartesianIndicesILi1E5TupleI5OneToI5Int64EEE7NDRangeILi1ES0_S0_S8_S8_EE14ROCDeviceArrayI7Float32Li1ELi1EESD_
Stacktrace:
[1] handle_error(reason::Cstring)
@ LLVM ~/.julia/packages/LLVM/2JPxT/src/core/context.jl:194
[2] LLVMTargetMachineEmitToMemoryBuffer(T::LLVM.TargetMachine, M::LLVM.Module, codegen::LLVM.API.LLVMCodeGenFileType, ErrorMessage::Base.RefValue{…}, OutMemBuf::Base.RefValue{…})
@ LLVM.API ~/.julia/packages/LLVM/2JPxT/lib/16/libLLVM.jl:11138
[3] emit(tm::LLVM.TargetMachine, mod::LLVM.Module, filetype::LLVM.API.LLVMCodeGenFileType)
@ LLVM ~/.julia/packages/LLVM/2JPxT/src/targetmachine.jl:118
[4] mcgen(job::GPUCompiler.CompilerJob, mod::LLVM.Module, format::LLVM.API.LLVMCodeGenFileType)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/mcgen.jl:75
[5] macro expansion
@ ~/.julia/packages/Tracy/GcShf/src/tracepoint.jl:158 [inlined]
[6] macro expansion
@ ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:404 [inlined]
[7] macro expansion
@ ~/.julia/packages/Tracy/GcShf/src/tracepoint.jl:158 [inlined]
[8] macro expansion
@ ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:401 [inlined]
[9] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module, format::LLVM.API.LLVMCodeGenFileType)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/utils.jl:116
[10] compile_unhooked(output::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:115
[11] compile_unhooked
@ ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:80 [inlined]
[12] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:67
[13] compile
@ ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:55 [inlined]
[14] #40
@ ~/.julia/packages/AMDGPU/wH6SV/src/compiler/codegen.jl:194 [inlined]
[15] JuliaContext(f::AMDGPU.Compiler.var"#40#41"{GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}}; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:34
[16] JuliaContext(f::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/driver.jl:25
[17] hipcompile(job::GPUCompiler.CompilerJob)
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/wH6SV/src/compiler/codegen.jl:193
[18] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(AMDGPU.Compiler.hipcompile), linker::typeof(AMDGPU.Compiler.hiplink))
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/execution.jl:245
[19] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Emuht/src/execution.jl:159
[20] macro expansion
@ ~/.julia/packages/AMDGPU/wH6SV/src/compiler/codegen.jl:161 [inlined]
[21] macro expansion
@ ./lock.jl:273 [inlined]
[22] hipfunction(f::GPUArrays.var"#gpu_fill_kernel!#3", tt::Type{Tuple{…}}; kwargs::@Kwargs{})
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/wH6SV/src/compiler/codegen.jl:155
[23] hipfunction(f::GPUArrays.var"#gpu_fill_kernel!#3", tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, AMDGPU.Device.ROCDeviceVector{…}, Float32}})
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/wH6SV/src/compiler/codegen.jl:154
[24] macro expansion
@ ~/.julia/packages/AMDGPU/wH6SV/src/highlevel.jl:155 [inlined]
[25] (::KernelAbstractions.Kernel{…})(::ROCArray{…}, ::Vararg{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
@ AMDGPU.ROCKernels ~/.julia/packages/AMDGPU/wH6SV/src/ROCKernels.jl:91
[26] fill!(A::ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, x::Float32)
@ GPUArrays ~/.julia/packages/GPUArrays/uiVyU/src/host/construction.jl:22
[27] zeros
@ ~/.julia/packages/AMDGPU/wH6SV/src/array.jl:245 [inlined]
[28] zeros(dims::Int64)
@ AMDGPU ~/.julia/packages/AMDGPU/wH6SV/src/array.jl:244
[29] top-level scope
@ REPL[2]:1
Some type information was truncated. Use `show(err)` to see complete types.
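One way to narrow this down, as a sketch only: the "not a recognized processor" warning is printed by the LLVM code generator, and the stack trace goes through LLVM.jl's `lib/16` bindings, so the LLVM bundled with Julia may simply predate gfx1201. That is an assumption, not something confirmed here. The snippet below prints Julia's bundled LLVM version next to the gfx targets that ROCm reports. `gfx_targets` is a hypothetical helper written for this report, not part of AMDGPU.jl, and the sample text is a trimmed copy of the `rocminfo` output above.

```julia
# Hypothetical helper (not part of AMDGPU.jl): collect the gfx targets that
# appear on agent "Name:" lines of rocminfo output.
function gfx_targets(rocminfo_output::AbstractString)
    targets = String[]
    for line in eachline(IOBuffer(rocminfo_output))
        # Match agent names such as "Name: gfx1201"; ISA names such as
        # "Name: amdgcn-amd-amdhsa--gfx1201" are deliberately skipped.
        m = match(r"^\s*Name:\s*(gfx[0-9a-f]+)\s*$", line)
        m === nothing && continue
        push!(targets, m.captures[1])
    end
    return unique(targets)
end

# Trimmed sample taken from the rocminfo output in this report.
sample = """
Name: AMD Ryzen 9 7900 12-Core Processor
Name: gfx1201
Name: amdgcn-amd-amdhsa--gfx1201
Name: gfx1036
"""

# Base.libllvm_version is the version of the LLVM library Julia was built with.
println("Julia's bundled LLVM: ", Base.libllvm_version)
println("GPU targets reported by ROCm: ", gfx_targets(sample))
```

On a real machine the sample string could be replaced with `read(`rocminfo`, String)`, assuming `rocminfo` is on the `PATH`.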