Description
🐛 Describe the bug
In file included from /home/mario/code_store/pytorch_store/pytorch/aten/src/ATen/native/hip/LegacyThrustHelpers.hip:14:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/device_ptr.h:26:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/memory.h:26:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/detail/reference.h:28:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/detail/generic/memory.h:69:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/detail/generic/memory.inl:22:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/detail/adl/malloc_and_free.h:44:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/hip/detail/malloc_and_free.h:31:
/opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/hip/detail/util.h:254:69: error: reference to host function 'hip_category' in host device function
254 | (throw thrust::system_error(status, thrust::hip_category(), msg);),
| ^
/opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/hip/error.h:129:30: note: 'hip_category' declared here
129 | inline const error_category &hip_category(void);
| ^
In file included from /home/mario/code_store/pytorch_store/pytorch/aten/src/ATen/native/hip/LegacyThrustHelpers.hip:14:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/device_ptr.h:26:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/memory.h:26:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/detail/reference.h:28:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/detail/generic/memory.h:69:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/detail/generic/memory.inl:22:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/detail/adl/malloc_and_free.h:44:
In file included from /opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/hip/detail/malloc_and_free.h:31:
/opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/hip/detail/util.h:254:32: error: reference to host function 'system_error' in host device function
254 | (throw thrust::system_error(status, thrust::hip_category(), msg);),
| ^
/opt/rocm-6.4.1/lib/llvm/bin/../../../include/thrust/system/detail/system_error.inl:63:5: note: 'system_error' declared here
63 | ::system_error(int ev, const error_category &ecat, const char *what_arg)
| ^
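For context, clang reports "reference to host function ... in host device function" when a function marked `__host__ __device__` calls a host-only function and that call is still visible during the device compilation pass. The sketch below illustrates the general pattern and one common way to guard it. It is not the rocThrust code and not a confirmed fix for this build: `host_only_category`, `throw_on_error`, and `host_path_throws` are hypothetical names, and the fallback macro definitions exist only so the sketch also builds with a plain host C++ compiler.

```cpp
#include <stdexcept>
#include <string>

// Assumption: when no HIP/CUDA compiler is in use, define the attributes
// away so this sketch still compiles as ordinary host C++.
#if !defined(__HIPCC__) && !defined(__CUDACC__)
#define __host__
#define __device__
#endif

// Hypothetical host-only helper (it carries no __device__ attribute),
// standing in for functions such as thrust::hip_category() in the log.
inline const char* host_only_category() { return "hip"; }

// Hypothetical host/device error check. Calling host_only_category() or
// throwing unconditionally here would trigger the diagnostic in the device
// pass, so the host-only code is compiled only when not targeting the device.
__host__ __device__ inline void throw_on_error(int status, const char* msg) {
#if !defined(__HIP_DEVICE_COMPILE__) && !defined(__CUDA_ARCH__)
  if (status != 0) {
    throw std::runtime_error(std::string(host_only_category()) + ": " + msg);
  }
#else
  // Device pass: exceptions and host-only calls are unavailable here.
  (void)status;
  (void)msg;
#endif
}

// Host-side usage: reports whether a non-zero status throws.
inline bool host_path_throws() {
  try {
    throw_on_error(1, "boom");
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```

If the guard around the `throw` in the installed header evaluates differently than expected (for example, because a target-detection macro is defined inconsistently), the host-only branch would be compiled for the device and produce errors of the kind shown above. That is only one possible explanation and has not been verified against this setup.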
Versions
os : Linux ap 6.11.0-29-generic #29~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jun 26 14:16:59 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
rocminfo : =====================
HSA System Attributes
Runtime Version: 1.15
Runtime Ext Version: 1.7
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
XNACK enabled: NO
DMAbuf Support: YES
VMM Support: YES
==========
HSA Agents
Agent 1
Name: 12th Gen Intel(R) Core(TM) i7-12700
Uuid: CPU-XX
Marketing Name: 12th Gen Intel(R) Core(TM) i7-12700
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 49152(0xc000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 4800
BDFID: 0
Internal Node ID: 0
Compute Unit: 20
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 65601000(0x3e8fde8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 65601000(0x3e8fde8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 65601000(0x3e8fde8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 4
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 65601000(0x3e8fde8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
Agent 2
Name: gfx1100
Uuid: GPU-bcf20e2c59bbf5d9
Marketing Name: Radeon RX 7900 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 81920(0x14000) KB
Chip ID: 29772(0x744c)
ASIC Revision: 0(0x0)
Cacheline Size: 128(0x80)
Max Clock Freq. (MHz): 2175
BDFID: 768
Internal Node ID: 1
Compute Unit: 84
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 542
SDMA engine uCode:: 24
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 20955136(0x13fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 20955136(0x13fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx11-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
Agent 3
Name: gfx1101
Uuid: GPU-0bd71e5453d07d5e
Marketing Name: AMD Radeon RX 7700 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 2
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 4096(0x1000) KB
L3: 49152(0xc000) KB
Chip ID: 29822(0x747e)
ASIC Revision: 0(0x0)
Cacheline Size: 128(0x80)
Max Clock Freq. (MHz): 2171
BDFID: 2304
Internal Node ID: 2
Compute Unit: 54
SIMDs per CU: 2
Shader Engines: 3
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 542
SDMA engine uCode:: 27
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 12566528(0xbfc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 12566528(0xbfc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1101
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx11-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***