Skip to content
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 25 additions & 1 deletion llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2103,7 +2103,7 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
getActionDefinitionsBuilder({G_MEMCPY, G_MEMCPY_INLINE, G_MEMMOVE, G_MEMSET})
.lower();

getActionDefinitionsBuilder({G_TRAP, G_DEBUGTRAP}).custom();
getActionDefinitionsBuilder({G_TRAP, G_DEBUGTRAP, G_UBSANTRAP}).custom();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should add the default lower() action to LegalizerHelper for G_DEBUGTRAP and G_UBSANTRAP, which just replace the opcode with G_TRAP (this is what LegalizeDAG does as the default action)

Copy link
Contributor Author

@amansharma612 amansharma612 May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we replace the custom() call to lower() or add lower() as a fallback option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly. G_TRAP would still be custom, G_DEBUGTRAP and G_UBSANTRAP would lower. The default lowering would just replace those with regular G_TRAP


getActionDefinitionsBuilder({G_VASTART, G_VAARG, G_BRJT, G_JUMP_TABLE,
G_INDEXED_LOAD, G_INDEXED_SEXTLOAD,
Expand Down Expand Up @@ -2222,6 +2222,8 @@ bool AMDGPULegalizerInfo::legalizeCustom(
return legalizeTrap(MI, MRI, B);
case TargetOpcode::G_DEBUGTRAP:
return legalizeDebugTrap(MI, MRI, B);
case TargetOpcode::G_UBSANTRAP:
return legalizeUbsanTrap(MI, MRI, B);
default:
return false;
}
Expand Down Expand Up @@ -7045,6 +7047,28 @@ bool AMDGPULegalizerInfo::legalizeDebugTrap(MachineInstr &MI,
return true;
}

bool AMDGPULegalizerInfo::legalizeUbsanTrap(MachineInstr &MI,
MachineRegisterInfo &MRI,
MachineIRBuilder &B) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is identical to legalizeDebugTrap above, they should just use the same implementation function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trap ID differs in these functions. Is it still required to merge the definitions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can just rename that function above to "legalizeDebugUbsanTrap" and select the TrapID based on the opcode. The warning can also just say "debugtrap/ubsantrap handler not supported".

// Is non-HSA path or trap-handler disabled? Then, report a warning
// accordingly
if (!ST.isTrapHandlerEnabled() ||
ST.getTrapHandlerAbi() != GCNSubtarget::TrapHandlerAbi::AMDHSA) {
DiagnosticInfoUnsupported NoTrap(B.getMF().getFunction(),
"ubsantrap handler not supported",
MI.getDebugLoc(), DS_Warning);
LLVMContext &Ctx = B.getContext();
Ctx.diagnose(NoTrap);
} else {
// Insert trap instruction
B.buildInstr(AMDGPU::S_TRAP)
.addImm(static_cast<unsigned>(GCNSubtarget::TrapID::LLVMAMDHSATrap));
}

MI.eraseFromParent();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want it to be removed even trap handler is disabled?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has to be modified or erased otherwise it will be a legalization loop

return true;
}

bool AMDGPULegalizerInfo::legalizeBVHIntersectRayIntrinsic(
MachineInstr &MI, MachineIRBuilder &B) const {
MachineRegisterInfo &MRI = *B.getMRI();
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
Original file line number Diff line number Diff line change
Expand Up @@ -244,6 +244,9 @@ class AMDGPULegalizerInfo final : public LegalizerInfo {
bool legalizeDebugTrap(MachineInstr &MI, MachineRegisterInfo &MRI,
MachineIRBuilder &B) const;

bool legalizeUbsanTrap(MachineInstr &MI, MachineRegisterInfo &MRI,
MachineIRBuilder &B) const;

bool legalizeIntrinsic(LegalizerHelper &Helper,
MachineInstr &MI) const override;
};
Expand Down
129 changes: 129 additions & 0 deletions llvm/test/CodeGen/AMDGPU/trap-abis.ll
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@

declare void @llvm.trap() #0
declare void @llvm.debugtrap() #1
declare void @llvm.ubsantrap(i8) #2

define amdgpu_kernel void @trap(ptr addrspace(1) nocapture readonly %arg0) {
; NOHSA-TRAP-GFX900-LABEL: trap:
Expand Down Expand Up @@ -482,6 +483,134 @@ define amdgpu_kernel void @debugtrap(ptr addrspace(1) nocapture readonly %arg0)
ret void
}

define void @ubsan_trap(ptr addrspace(1) nocapture readonly %arg0) {
; CHECK-LABEL: ubsan_trap:
; CHECK: ; %bb.0:
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; CHECK-NEXT: s_trap 2
; CHECK-NEXT: s_setpc_b64 s[30:31]
; NOHSA-TRAP-GFX900-LABEL: ubsan_trap:
; NOHSA-TRAP-GFX900: ; %bb.0:
; NOHSA-TRAP-GFX900-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; NOHSA-TRAP-GFX900-NEXT: v_mov_b32_e32 v2, 1
; NOHSA-TRAP-GFX900-NEXT: global_store_dword v[0:1], v2, off
; NOHSA-TRAP-GFX900-NEXT: s_waitcnt vmcnt(0)
; NOHSA-TRAP-GFX900-NEXT: s_cbranch_execnz .LBB4_2
; NOHSA-TRAP-GFX900-NEXT: ; %bb.1:
; NOHSA-TRAP-GFX900-NEXT: v_mov_b32_e32 v2, 2
; NOHSA-TRAP-GFX900-NEXT: global_store_dword v[0:1], v2, off
; NOHSA-TRAP-GFX900-NEXT: s_waitcnt vmcnt(0)
; NOHSA-TRAP-GFX900-NEXT: s_setpc_b64 s[30:31]
; NOHSA-TRAP-GFX900-NEXT: .LBB4_2:
; NOHSA-TRAP-GFX900-NEXT: s_endpgm
;
; HSA-TRAP-GFX803-LABEL: ubsan_trap:
; HSA-TRAP-GFX803: ; %bb.0:
; HSA-TRAP-GFX803-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; HSA-TRAP-GFX803-NEXT: v_mov_b32_e32 v2, 1
; HSA-TRAP-GFX803-NEXT: flat_store_dword v[0:1], v2
; HSA-TRAP-GFX803-NEXT: s_waitcnt vmcnt(0)
; HSA-TRAP-GFX803-NEXT: v_mov_b32_e32 v2, 2
; HSA-TRAP-GFX803-NEXT: s_mov_b64 s[0:1], s[6:7]
; HSA-TRAP-GFX803-NEXT: s_trap 2
; HSA-TRAP-GFX803-NEXT: flat_store_dword v[0:1], v2
; HSA-TRAP-GFX803-NEXT: s_waitcnt vmcnt(0)
; HSA-TRAP-GFX803-NEXT: s_setpc_b64 s[30:31]
;
; HSA-TRAP-GFX900-LABEL: ubsan_trap:
; HSA-TRAP-GFX900: ; %bb.0:
; HSA-TRAP-GFX900-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; HSA-TRAP-GFX900-NEXT: v_mov_b32_e32 v2, 1
; HSA-TRAP-GFX900-NEXT: global_store_dword v[0:1], v2, off
; HSA-TRAP-GFX900-NEXT: s_waitcnt vmcnt(0)
; HSA-TRAP-GFX900-NEXT: v_mov_b32_e32 v2, 2
; HSA-TRAP-GFX900-NEXT: s_trap 2
; HSA-TRAP-GFX900-NEXT: global_store_dword v[0:1], v2, off
; HSA-TRAP-GFX900-NEXT: s_waitcnt vmcnt(0)
; HSA-TRAP-GFX900-NEXT: s_setpc_b64 s[30:31]
;
; HSA-NOTRAP-GFX900-LABEL: ubsan_trap:
; HSA-NOTRAP-GFX900: ; %bb.0:
; HSA-NOTRAP-GFX900-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; HSA-NOTRAP-GFX900-NEXT: v_mov_b32_e32 v2, 1
; HSA-NOTRAP-GFX900-NEXT: global_store_dword v[0:1], v2, off
; HSA-NOTRAP-GFX900-NEXT: s_waitcnt vmcnt(0)
; HSA-NOTRAP-GFX900-NEXT: s_cbranch_execnz .LBB4_2
; HSA-NOTRAP-GFX900-NEXT: ; %bb.1:
; HSA-NOTRAP-GFX900-NEXT: v_mov_b32_e32 v2, 2
; HSA-NOTRAP-GFX900-NEXT: global_store_dword v[0:1], v2, off
; HSA-NOTRAP-GFX900-NEXT: s_waitcnt vmcnt(0)
; HSA-NOTRAP-GFX900-NEXT: s_setpc_b64 s[30:31]
; HSA-NOTRAP-GFX900-NEXT: .LBB4_2:
; HSA-NOTRAP-GFX900-NEXT: s_endpgm
;
; HSA-TRAP-GFX1100-LABEL: ubsan_trap:
; HSA-TRAP-GFX1100: ; %bb.0:
; HSA-TRAP-GFX1100-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; HSA-TRAP-GFX1100-NEXT: v_mov_b32_e32 v2, 1
; HSA-TRAP-GFX1100-NEXT: global_store_b32 v[0:1], v2, off dlc
; HSA-TRAP-GFX1100-NEXT: s_waitcnt_vscnt null, 0x0
; HSA-TRAP-GFX1100-NEXT: s_cbranch_execnz .LBB4_2
; HSA-TRAP-GFX1100-NEXT: ; %bb.1:
; HSA-TRAP-GFX1100-NEXT: v_mov_b32_e32 v2, 2
; HSA-TRAP-GFX1100-NEXT: global_store_b32 v[0:1], v2, off dlc
; HSA-TRAP-GFX1100-NEXT: s_waitcnt_vscnt null, 0x0
; HSA-TRAP-GFX1100-NEXT: s_setpc_b64 s[30:31]
; HSA-TRAP-GFX1100-NEXT: .LBB4_2:
; HSA-TRAP-GFX1100-NEXT: s_trap 2
; HSA-TRAP-GFX1100-NEXT: s_sendmsg_rtn_b32 s0, sendmsg(MSG_RTN_GET_DOORBELL)
; HSA-TRAP-GFX1100-NEXT: s_mov_b32 ttmp2, m0
; HSA-TRAP-GFX1100-NEXT: s_waitcnt lgkmcnt(0)
; HSA-TRAP-GFX1100-NEXT: s_and_b32 s0, s0, 0x3ff
; HSA-TRAP-GFX1100-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
; HSA-TRAP-GFX1100-NEXT: s_bitset1_b32 s0, 10
; HSA-TRAP-GFX1100-NEXT: s_mov_b32 m0, s0
; HSA-TRAP-GFX1100-NEXT: s_sendmsg sendmsg(MSG_INTERRUPT)
; HSA-TRAP-GFX1100-NEXT: s_mov_b32 m0, ttmp2
; HSA-TRAP-GFX1100-NEXT: .LBB4_3: ; =>This Inner Loop Header: Depth=1
; HSA-TRAP-GFX1100-NEXT: s_sethalt 5
; HSA-TRAP-GFX1100-NEXT: s_branch .LBB4_3
;
; HSA-TRAP-GFX1100-O0-LABEL: ubsan_trap:
; HSA-TRAP-GFX1100-O0: ; %bb.0:
; HSA-TRAP-GFX1100-O0-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; HSA-TRAP-GFX1100-O0-NEXT: v_mov_b32_e32 v2, v1
; HSA-TRAP-GFX1100-O0-NEXT: ; implicit-def: $sgpr0
; HSA-TRAP-GFX1100-O0-NEXT: ; implicit-def: $sgpr0
; HSA-TRAP-GFX1100-O0-NEXT: ; kill: def $vgpr0 killed $vgpr0 def $vgpr0_vgpr1 killed $exec
; HSA-TRAP-GFX1100-O0-NEXT: v_mov_b32_e32 v1, v2
; HSA-TRAP-GFX1100-O0-NEXT: scratch_store_b64 off, v[0:1], s32 ; 8-byte Folded Spill
; HSA-TRAP-GFX1100-O0-NEXT: ; implicit-def: $sgpr0_sgpr1
; HSA-TRAP-GFX1100-O0-NEXT: v_mov_b32_e32 v2, 1
; HSA-TRAP-GFX1100-O0-NEXT: global_store_b32 v[0:1], v2, off dlc
; HSA-TRAP-GFX1100-O0-NEXT: s_waitcnt_vscnt null, 0x0
; HSA-TRAP-GFX1100-O0-NEXT: s_cbranch_execnz .LBB4_2
; HSA-TRAP-GFX1100-O0-NEXT: ; %bb.1:
; HSA-TRAP-GFX1100-O0-NEXT: scratch_load_b64 v[0:1], off, s32 ; 8-byte Folded Reload
; HSA-TRAP-GFX1100-O0-NEXT: v_mov_b32_e32 v2, 2
; HSA-TRAP-GFX1100-O0-NEXT: s_waitcnt vmcnt(0)
; HSA-TRAP-GFX1100-O0-NEXT: global_store_b32 v[0:1], v2, off dlc
; HSA-TRAP-GFX1100-O0-NEXT: s_waitcnt_vscnt null, 0x0
; HSA-TRAP-GFX1100-O0-NEXT: s_setpc_b64 s[30:31]
; HSA-TRAP-GFX1100-O0-NEXT: .LBB4_2:
; HSA-TRAP-GFX1100-O0-NEXT: s_trap 2
; HSA-TRAP-GFX1100-O0-NEXT: s_sendmsg_rtn_b32 s0, sendmsg(MSG_RTN_GET_DOORBELL)
; HSA-TRAP-GFX1100-O0-NEXT: s_mov_b32 ttmp2, m0
; HSA-TRAP-GFX1100-O0-NEXT: s_waitcnt lgkmcnt(0)
; HSA-TRAP-GFX1100-O0-NEXT: s_and_b32 s0, s0, 0x3ff
; HSA-TRAP-GFX1100-O0-NEXT: s_or_b32 s0, s0, 0x400
; HSA-TRAP-GFX1100-O0-NEXT: s_mov_b32 m0, s0
; HSA-TRAP-GFX1100-O0-NEXT: s_sendmsg sendmsg(MSG_INTERRUPT)
; HSA-TRAP-GFX1100-O0-NEXT: s_mov_b32 m0, ttmp2
; HSA-TRAP-GFX1100-O0-NEXT: .LBB4_3: ; =>This Inner Loop Header: Depth=1
; HSA-TRAP-GFX1100-O0-NEXT: s_sethalt 5
; HSA-TRAP-GFX1100-O0-NEXT: s_branch .LBB4_3
store volatile i32 1, ptr addrspace(1) %arg0
call void @llvm.ubsantrap(i8 0)
store volatile i32 2, ptr addrspace(1) %arg0
ret void
}

attributes #0 = { nounwind noreturn }
attributes #1 = { nounwind }

Expand Down
19 changes: 19 additions & 0 deletions llvm/test/CodeGen/AMDGPU/trap.ll
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --function ubsantrap --version 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was not autogenerated, it needs to be manually updated
it looks like the script did not do anything anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should any specific manual checks be added to the ubsantrap function in this case?

; RUN: llc -global-isel=0 -mtriple=amdgcn--amdhsa -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=HSA-TRAP %s
; RUN: llc -global-isel=1 -mtriple=amdgcn--amdhsa -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=HSA-TRAP %s

Expand Down Expand Up @@ -28,6 +29,7 @@

declare void @llvm.trap() #0
declare void @llvm.debugtrap() #1
declare void @llvm.ubsantrap(i8) #2

; MESA-TRAP: .section .AMDGPU.config
; MESA-TRAP: .long 47180
Expand Down Expand Up @@ -83,6 +85,13 @@ define amdgpu_kernel void @hsa_debugtrap(ptr addrspace(1) nocapture readonly %ar
ret void
}

define amdgpu_kernel void @ubsantrap(ptr addrspace(1) nocapture readonly %arg0) {
store volatile i32 1, ptr addrspace(1) %arg0
call void @llvm.ubsantrap(i8 0)
store volatile i32 2, ptr addrspace(1) %arg0
ret void
}

; For non-HSA path
; GCN-LABEL: {{^}}trap:
; TRAP-BIT: enable_trap_handler = 1
Expand Down Expand Up @@ -147,3 +156,13 @@ attributes #1 = { nounwind }

!llvm.module.flags = !{!0}
!0 = !{i32 1, !"amdhsa_code_object_version", i32 400}
;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
; GCN: {{.*}}
; GCN-WARNING: {{.*}}
; HSA-TRAP: {{.*}}
; MESA-TRAP: {{.*}}
; NO-HSA-TRAP: {{.*}}
; NO-MESA-TRAP: {{.*}}
; NO-TRAP-BIT: {{.*}}
; NOMESA-TRAP: {{.*}}
; TRAP-BIT: {{.*}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something went wrong here, and there are no checks on the other function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arsenm I am unaware of how/what checks to add manually for this function, would appreciate inputs regarding it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a special case you can't generate. You should be able to copy the checks from a debugtrap function