@@ -3592,7 +3592,7 @@ The fields used by CP for code objects before V3 also match those specified in
aligned.
351:272 20 Reserved, must be 0.
bytes
- 383:352 4 bytes COMPUTE_PGM_RSRC3 GFX6-9
+ 383:352 4 bytes COMPUTE_PGM_RSRC3 GFX6-GFX9
Reserved, must be 0.
GFX10
Compute Shader (CS)
@@ -3639,7 +3639,7 @@ The fields used by CP for code objects before V3 also match those specified in
>454 1 bit ENABLE_SGPR_PRIVATE_SEGMENT
_SIZE
457:455 3 bits Reserved, must be 0.
- 458 1 bit ENABLE_WAVEFRONT_SIZE32 GFX6-9
+ 458 1 bit ENABLE_WAVEFRONT_SIZE32 GFX6-GFX9
Reserved, must be 0.
GFX10
- If 0 execute in
@@ -3905,7 +3905,7 @@ The fields used by CP for code objects before V3 also match those specified in
Used by CP to set up
``COMPUTE_PGM_RSRC1.WGP_MODE``.
- 30 1 bit MEM_ORDERED GFX6-9
+ 30 1 bit MEM_ORDERED GFX6-GFX9
Reserved, must be 0.
GFX10
Controls the behavior of the
@@ -3928,7 +3928,7 @@ The fields used by CP for code objects before V3 also match those specified in
Used by CP to set up
``COMPUTE_PGM_RSRC1.MEM_ORDERED``.
- 31 1 bit FWD_PROGRESS GFX6-9
+ 31 1 bit FWD_PROGRESS GFX6-GFX9
Reserved, must be 0.
GFX10
- If 0 execute SIMD wavefronts
@@ -4750,7 +4750,7 @@ in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx6-gfx9-table`.
============ ============ ============== ========== ================================
LLVM Instr LLVM Memory LLVM Memory AMDGPU AMDGPU Machine Code
- Ordering Sync Scope Address GFX6-9
+ Ordering Sync Scope Address GFX6-GFX9
Space
============ ============ ============== ========== ================================
**Non-Atomic**
@@ -8079,41 +8079,41 @@ supports the ``s_trap`` instruction. For usage see:
.. table:: AMDGPU Trap Handler for AMDHSA OS Code Object V4
:name: amdgpu-trap-handler-for-amdhsa-os-v4-table
- =================== =============== =============== ============== =======================================
- Usage Code Sequence GFX6-8 Inputs GFX9-10 Inputs Description
- =================== =============== =============== ============== =======================================
- reserved ``s_trap 0x00`` Reserved by hardware.
- debugger breakpoint ``s_trap 0x01`` *none* *none* Reserved for debugger to use for
- breakpoints. Causes wave to be halted
- with the PC at the trap instruction.
- The debugger is responsible to resume
- the wave, including the instruction
- that the breakpoint overwrote.
- ``llvm.trap`` ``s_trap 0x02`` ``SGPR0-1``: *none* Causes wave to be halted with the PC at
- ``queue_ptr`` the trap instruction. The associated
- queue is signalled to put it into the
- error state. When the queue is put in
- the error state, the waves executing
- dispatches on the queue will be
- terminated.
- ``llvm.debugtrap`` ``s_trap 0x03`` *none* *none* - If debugger not enabled then behaves
- as a no-operation. The trap handler
- is entered and immediately returns to
- continue execution of the wavefront.
- - If the debugger is enabled, causes
- the debug trap to be reported by the
- debugger and the wavefront is put in
- the halt state with the PC at the
- instruction. The debugger must
- increment the PC and resume the wave.
- reserved ``s_trap 0x04`` Reserved.
- reserved ``s_trap 0x05`` Reserved.
- reserved ``s_trap 0x06`` Reserved.
- reserved ``s_trap 0x07`` Reserved.
- reserved ``s_trap 0x08`` Reserved.
- reserved ``s_trap 0xfe`` Reserved.
- reserved ``s_trap 0xff`` Reserved.
- =================== =============== =============== ============== =======================================
+ =================== =============== ================ ================= =======================================
+ Usage Code Sequence GFX6-GFX8 Inputs GFX9-GFX10 Inputs Description
+ =================== =============== ================ ================= =======================================
+ reserved ``s_trap 0x00`` Reserved by hardware.
+ debugger breakpoint ``s_trap 0x01`` *none* *none* Reserved for debugger to use for
+ breakpoints. Causes wave to be halted
+ with the PC at the trap instruction.
+ The debugger is responsible to resume
+ the wave, including the instruction
+ that the breakpoint overwrote.
+ ``llvm.trap`` ``s_trap 0x02`` ``SGPR0-1``: *none* Causes wave to be halted with the PC at
+ ``queue_ptr`` the trap instruction. The associated
+ queue is signalled to put it into the
+ error state. When the queue is put in
+ the error state, the waves executing
+ dispatches on the queue will be
+ terminated.
+ ``llvm.debugtrap`` ``s_trap 0x03`` *none* *none* - If debugger not enabled then behaves
+ as a no-operation. The trap handler
+ is entered and immediately returns to
+ continue execution of the wavefront.
+ - If the debugger is enabled, causes
+ the debug trap to be reported by the
+ debugger and the wavefront is put in
+ the halt state with the PC at the
+ instruction. The debugger must
+ increment the PC and resume the wave.
+ reserved ``s_trap 0x04`` Reserved.
+ reserved ``s_trap 0x05`` Reserved.
+ reserved ``s_trap 0x06`` Reserved.
+ reserved ``s_trap 0x07`` Reserved.
+ reserved ``s_trap 0x08`` Reserved.
+ reserved ``s_trap 0xfe`` Reserved.
+ reserved ``s_trap 0xff`` Reserved.
+ =================== =============== ================ ================= =======================================
.. _amdgpu-amdhsa-function-call-convention:
@@ -8179,7 +8179,7 @@ On entry to a function:
2. The FLAT_SCRATCH register pair is setup. See
:ref:`amdgpu-amdhsa-kernel-prolog-flat-scratch`.
- 3. GFX6-8 : M0 register set to the size of LDS in bytes. See
+ 3. GFX6-GFX8 : M0 register set to the size of LDS in bytes. See
:ref:`amdgpu-amdhsa-kernel-prolog-m0`.
4. The EXEC register is set to the lanes active on entry to the function.
5. MODE register: *TBD*
@@ -8237,7 +8237,7 @@ On exit from a function:
* FLAT_SCRATCH
* EXEC
- * GFX6-8 : M0
+ * GFX6-GFX8 : M0
* All SGPR registers except the clobbered registers of SGPR4-31.
* VGPR40-47
VGPR56-63