Commit efa7ab0

jinhuang1102 and Jin Huang authored
[profcheck] Add unknown branch weights to expanded cmpxchg loop. (#165841)
The AtomicExpandPass is responsible for lowering high-level atomic operations (like `atomicrmw fadd`) that are unsupported by the target hardware into a cmpxchg retry loop. Because precise branch weights for the loop's back edge cannot be determined empirically, the pass uses the `setExplicitlyUnknownBranchWeightsIfProfiled` function to explicitly add "unknown" (50/50) branch weights to this branch.

This PR includes fixes for the following tests:

```
Transforms/AtomicExpand/AArch64/atomicrmw-fp.ll
Transforms/AtomicExpand/AArch64/pcsections.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-f32-agent.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-f32-system.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-f64-agent.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-f64-system.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-nand.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-simplify-cfg-CAS-block.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-v2bf16-agent.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-v2bf16-system.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-v2f16-agent.ll
Transforms/AtomicExpand/AMDGPU/expand-atomic-v2f16-system.ll
Transforms/AtomicExpand/AMDGPU/expand-atomicrmw-fp-vector.ll
Transforms/AtomicExpand/ARM/atomicrmw-fp.ll
Transforms/AtomicExpand/LoongArch/atomicrmw-fp.ll
Transforms/AtomicExpand/Mips/atomicrmw-fp.ll
Transforms/AtomicExpand/PowerPC/atomicrmw-fp.ll
Transforms/AtomicExpand/RISCV/atomicrmw-fp.ll
Transforms/AtomicExpand/SPARC/libcalls.ll
Transforms/AtomicExpand/X86/expand-atomic-rmw-fp.ll
Transforms/AtomicExpand/X86/expand-atomic-rmw-initial-load.ll
Transforms/AtomicExpand/X86/expand-atomic-xchg-fp.ll
```

Co-authored-by: Jin Huang <[email protected]>
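For readers unfamiliar with the shape of a cmpxchg retry loop, here is a minimal source-level sketch in C++. It is only an analogy for the IR the pass emits, not the pass's own code: `atomic_fadd_via_cas` is a hypothetical helper name, and `std::atomic<float>` stands in for the memory location targeted by `atomicrmw fadd`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of LLVM): emulates `atomicrmw fadd` on a
// float with a compare-exchange retry loop and returns the value that was
// stored before the addition.
inline float atomic_fadd_via_cas(std::atomic<float> &target, float value) {
  // Initial load, performed once before entering the loop.
  float loaded = target.load();
  // On failure, compare_exchange_weak writes the current contents of
  // `target` back into `loaded`, which plays the role of the loop's phi.
  // The loop condition corresponds to the conditional branch on "success".
  while (!target.compare_exchange_weak(loaded, loaded + value)) {
    // Another thread changed the value (or the weak CAS failed spuriously);
    // retry with the freshly observed `loaded`.
  }
  return loaded;
}
```

How often the loop retries depends on runtime contention, which is why no static weight for the exit-versus-retry branch is obviously right.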
1 parent cb41408 commit efa7ab0

2 files changed: +17 -4 lines changed

llvm/lib/CodeGen/AtomicExpandPass.cpp

Lines changed: 6 additions & 1 deletion
@@ -1686,7 +1686,12 @@ Value *AtomicExpandImpl::insertRMWCmpXchgLoop(
 
   Loaded->addIncoming(NewLoaded, LoopBB);
 
-  Builder.CreateCondBr(Success, ExitBB, LoopBB);
+  Instruction *CondBr = Builder.CreateCondBr(Success, ExitBB, LoopBB);
+
+  // Atomic RMW expands to a cmpxchg loop, Since precise branch weights
+  // cannot be easily determined here, we mark the branch as "unknown" (50/50)
+  // to prevent misleading optimizations.
+  setExplicitlyUnknownBranchWeightsIfProfiled(*CondBr, *F, DEBUG_TYPE);
 
   Builder.SetInsertPoint(ExitBB, ExitBB->begin());
   return NewLoaded;

llvm/test/Transforms/AtomicExpand/AArch64/atomicrmw-fp.ll

Lines changed: 11 additions & 3 deletions
@@ -1,7 +1,7 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals
 ; RUN: opt -S -mtriple=aarch64-linux-gnu -passes=atomic-expand %s | FileCheck %s
 
-define float @test_atomicrmw_fadd_f32(ptr %ptr, float %value) {
+define float @test_atomicrmw_fadd_f32(ptr %ptr, float %value) !prof !0 {
 ; CHECK-LABEL: @test_atomicrmw_fadd_f32(
 ; CHECK-NEXT: [[TMP1:%.*]] = load float, ptr [[PTR:%.*]], align 4
 ; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
@@ -14,7 +14,7 @@ define float @test_atomicrmw_fadd_f32(ptr %ptr, float %value) {
 ; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP4]], 1
 ; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP4]], 0
 ; CHECK-NEXT: [[TMP5]] = bitcast i32 [[NEWLOADED]] to float
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1:![0-9]+]]
 ; CHECK: atomicrmw.end:
 ; CHECK-NEXT: ret float [[TMP5]]
 ;
@@ -336,3 +336,11 @@ define <2 x half> @atomicrmw_fminimum_2_x_half(ptr %ptr, <2 x half> %val) {
   %res = atomicrmw fminimum ptr %ptr, <2 x half> %val seq_cst
   ret <2 x half> %res
 }
+
+!0 = !{!"function_entry_count", i64 1000}
+;.
+; CHECK: attributes #[[ATTR0:[0-9]+]] = { nocallback nocreateundeforpoison nofree nosync nounwind speculatable willreturn memory(none) }
+;.
+; CHECK: [[META0:![0-9]+]] = !{!"function_entry_count", i64 1000}
+; CHECK: [[PROF1]] = !{!"unknown", !"atomic-expand"}
+;.
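The test change above suggests the behavior being checked: once the function carries a `function_entry_count`, the loop's conditional branch gains `!{!"unknown", !"atomic-expand"}`. The toy model below sketches that "only if profiled" rule in plain C++. It is an illustration under assumptions, not LLVM's implementation: `ToyFunction`, `ToyBranch`, and `mark_unknown_if_profiled` are hypothetical names, and treating any present entry count as "profiled" is a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; these are NOT LLVM types.
struct ToyFunction {
  // Models !{!"function_entry_count", i64 N}; empty means "no profile".
  std::optional<std::uint64_t> entry_count;
};

struct ToyBranch {
  // Models the branch's !prof node as a (kind, pass-name) pair.
  std::optional<std::pair<std::string, std::string>> prof;
};

// Hypothetical analogue of the "IfProfiled" helper: annotate the branch as
// explicitly unknown only when the enclosing function has profile data.
// Assumption: any present entry count counts as "profiled".
inline void mark_unknown_if_profiled(ToyBranch &branch, const ToyFunction &fn,
                                     const std::string &pass_name) {
  if (!fn.entry_count)
    return; // Unprofiled function: leave the branch without !prof.
  branch.prof = std::make_pair(std::string("unknown"), pass_name);
}
```

In this model an unprofiled function is left untouched, which would explain why the test had to add `!prof !0` to the function before the new annotation could show up in the CHECK lines.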
