[profcheck] Add unknown branch weights to the expanded LL/SC loop. #166273
Conversation
@llvm/pr-subscribers-llvm-transforms
Author: Jin Huang (jinhuang1102)
Changes: As a follow-up to PR #165841, AtomicExpandPass lowers high-level atomic operations (such as atomicrmw xchg) into LL/SC retry loops on targets without LSE; the conditional branch created for the store-conditional retry was emitted without branch-weight metadata. This patch marks that branch with explicitly unknown weights.
Full diff: https://github.com/llvm/llvm-project/pull/166273.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp b/llvm/lib/CodeGen/AtomicExpandPass.cpp
index 53f1cfe24a68d..c22f5ff9e444b 100644
--- a/llvm/lib/CodeGen/AtomicExpandPass.cpp
+++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp
@@ -38,6 +38,7 @@
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/MemoryModelRelaxationAnnotations.h"
#include "llvm/IR/Module.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"
@@ -1259,8 +1260,7 @@ Value *AtomicExpandImpl::insertRMWLLSCLoop(
BasicBlock *BB = Builder.GetInsertBlock();
Function *F = BB->getParent();
- assert(AddrAlign >=
- F->getDataLayout().getTypeStoreSize(ResultTy) &&
+ assert(AddrAlign >= F->getDataLayout().getTypeStoreSize(ResultTy) &&
"Expected at least natural alignment at this point.");
// Given: atomicrmw some_op iN* %addr, iN %incr ordering
@@ -1295,7 +1295,13 @@ Value *AtomicExpandImpl::insertRMWLLSCLoop(
TLI->emitStoreConditional(Builder, NewVal, Addr, MemOpOrder);
Value *TryAgain = Builder.CreateICmpNE(
StoreSuccess, ConstantInt::get(IntegerType::get(Ctx, 32), 0), "tryagain");
- Builder.CreateCondBr(TryAgain, LoopBB, ExitBB);
+
+ Instruction *CondBr = Builder.CreateCondBr(TryAgain, LoopBB, ExitBB);
+
+  // Atomic RMW expands to an icmp loop; because it is hard to predict precise
+  // branch weights, we mark the branch as "unknown" (50/50) to prevent
+  // misleading optimizations.
+ setExplicitlyUnknownBranchWeightsIfProfiled(*CondBr, *F, DEBUG_TYPE);
Builder.SetInsertPoint(ExitBB, ExitBB->begin());
return Loaded;
diff --git a/llvm/test/Transforms/AtomicExpand/AArch64/expand-atomicrmw-xchg-fp.ll b/llvm/test/Transforms/AtomicExpand/AArch64/expand-atomicrmw-xchg-fp.ll
index 95a52aa0f7f52..b509b2469cfdc 100644
--- a/llvm/test/Transforms/AtomicExpand/AArch64/expand-atomicrmw-xchg-fp.ll
+++ b/llvm/test/Transforms/AtomicExpand/AArch64/expand-atomicrmw-xchg-fp.ll
@@ -1,8 +1,8 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals
; RUN: opt -codegen-opt-level=1 -S -mtriple=aarch64-- -passes=atomic-expand %s | FileCheck %s
; RUN: opt -codegen-opt-level=1 -S -mtriple=aarch64-- -mattr=+outline-atomics -passes=atomic-expand %s | FileCheck %s --check-prefix=OUTLINE-ATOMICS
-define void @atomic_swap_f16(ptr %ptr, half %val) nounwind {
+define void @atomic_swap_f16(ptr %ptr, half %val) !prof !0 {
; CHECK-LABEL: @atomic_swap_f16(
; CHECK-NEXT: [[TMP1:%.*]] = bitcast half [[VAL:%.*]] to i16
; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
@@ -12,7 +12,7 @@ define void @atomic_swap_f16(ptr %ptr, half %val) nounwind {
; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[TMP1]] to i64
; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.aarch64.stxr.p0(i64 [[TMP4]], ptr elementtype(i16) [[PTR]])
; CHECK-NEXT: [[TRYAGAIN:%.*]] = icmp ne i32 [[TMP5]], 0
-; CHECK-NEXT: br i1 [[TRYAGAIN]], label [[ATOMICRMW_START]], label [[ATOMICRMW_END:%.*]]
+; CHECK-NEXT: br i1 [[TRYAGAIN]], label [[ATOMICRMW_START]], label [[ATOMICRMW_END:%.*]], !prof [[PROF1:![0-9]+]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: [[TMP6:%.*]] = bitcast i16 [[TMP3]] to half
; CHECK-NEXT: ret void
@@ -27,7 +27,7 @@ define void @atomic_swap_f16(ptr %ptr, half %val) nounwind {
ret void
}
-define void @atomic_swap_f32(ptr %ptr, float %val) nounwind {
+define void @atomic_swap_f32(ptr %ptr, float %val) nounwind !prof !0 {
; CHECK-LABEL: @atomic_swap_f32(
; CHECK-NEXT: [[TMP1:%.*]] = bitcast float [[VAL:%.*]] to i32
; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
@@ -37,7 +37,7 @@ define void @atomic_swap_f32(ptr %ptr, float %val) nounwind {
; CHECK-NEXT: [[TMP4:%.*]] = zext i32 [[TMP1]] to i64
; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.aarch64.stxr.p0(i64 [[TMP4]], ptr elementtype(i32) [[PTR]])
; CHECK-NEXT: [[TRYAGAIN:%.*]] = icmp ne i32 [[TMP5]], 0
-; CHECK-NEXT: br i1 [[TRYAGAIN]], label [[ATOMICRMW_START]], label [[ATOMICRMW_END:%.*]]
+; CHECK-NEXT: br i1 [[TRYAGAIN]], label [[ATOMICRMW_START]], label [[ATOMICRMW_END:%.*]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: [[TMP6:%.*]] = bitcast i32 [[TMP3]] to float
; CHECK-NEXT: ret void
@@ -52,7 +52,7 @@ define void @atomic_swap_f32(ptr %ptr, float %val) nounwind {
ret void
}
-define void @atomic_swap_f64(ptr %ptr, double %val) nounwind {
+define void @atomic_swap_f64(ptr %ptr, double %val) nounwind !prof !0 {
; CHECK-LABEL: @atomic_swap_f64(
; CHECK-NEXT: [[TMP1:%.*]] = bitcast double [[VAL:%.*]] to i64
; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
@@ -60,7 +60,7 @@ define void @atomic_swap_f64(ptr %ptr, double %val) nounwind {
; CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.aarch64.ldaxr.p0(ptr elementtype(i64) [[PTR:%.*]])
; CHECK-NEXT: [[TMP3:%.*]] = call i32 @llvm.aarch64.stxr.p0(i64 [[TMP1]], ptr elementtype(i64) [[PTR]])
; CHECK-NEXT: [[TRYAGAIN:%.*]] = icmp ne i32 [[TMP3]], 0
-; CHECK-NEXT: br i1 [[TRYAGAIN]], label [[ATOMICRMW_START]], label [[ATOMICRMW_END:%.*]]
+; CHECK-NEXT: br i1 [[TRYAGAIN]], label [[ATOMICRMW_START]], label [[ATOMICRMW_END:%.*]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: [[TMP4:%.*]] = bitcast i64 [[TMP2]] to double
; CHECK-NEXT: ret void
@@ -74,3 +74,17 @@ define void @atomic_swap_f64(ptr %ptr, double %val) nounwind {
%t1 = atomicrmw xchg ptr %ptr, double %val acquire
ret void
}
+
+!0 = !{!"function_entry_count", i64 1000}
+;.
+; CHECK: attributes #[[ATTR0:[0-9]+]] = { nounwind }
+; CHECK: attributes #[[ATTR1:[0-9]+]] = { nofree nounwind willreturn }
+;.
+; OUTLINE-ATOMICS: attributes #[[ATTR0:[0-9]+]] = { "target-features"="+outline-atomics" }
+; OUTLINE-ATOMICS: attributes #[[ATTR1:[0-9]+]] = { nounwind "target-features"="+outline-atomics" }
+;.
+; CHECK: [[META0:![0-9]+]] = !{!"function_entry_count", i64 1000}
+; CHECK: [[PROF1]] = !{!"unknown", !"atomic-expand"}
+;.
+; OUTLINE-ATOMICS: [[META0:![0-9]+]] = !{!"function_entry_count", i64 1000}
+;.
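Distilled from the test diff above, the before/after this patch pins down (the expanded branch and metadata lines are quoted from the CHECK lines; the exact numbering is illustrative):

; Input: a profiled function containing an atomicrmw.
define void @atomic_swap_f16(ptr %ptr, half %val) !prof !0 {
  %t1 = atomicrmw xchg ptr %ptr, half %val acquire
  ret void
}

!0 = !{!"function_entry_count", i64 1000}

; After atomic-expand, the ldaxr/stxr retry branch carries explicit
; "unknown" weights rather than no !prof at all:
;   br i1 %tryagain, label %atomicrmw.start, label %atomicrmw.end, !prof !1
;   !1 = !{!"unknown", !"atomic-expand"}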
mtrofin left a comment:
LGTM
arsenm left a comment:
Title is wrong "icmp loop"?
Rewrite the title and description. Thx!
Review comment on the new lines:
    Instruction *CondBr = Builder.CreateCondBr(TryAgain, LoopBB, ExitBB);
    // Atomic RMW expands to an icmp loop; because it is hard to predict precise
This still needs to be updated.
Force-pushed from c1bbb4e to 03aa99e.
As a follow-up to PR #165841, this change addresses prof_md metadata loss in AtomicExpandPass when lowering atomicrmw xchg to a Load-Linked/Store-Conditional (LL/SC) loop. This path is distinct from the LSE path addressed previously:
- PR #165841 (and its tests) used -mtriple=aarch64-linux-gnu, which targets a modern ARMv8.1+ architecture. That architecture supports Large System Extensions (LSE), allowing atomicrmw to be lowered directly to a more efficient hardware instruction.
- This PR (and its tests) uses -mtriple=aarch64-- or -mtriple=armv8-linux-gnueabihf, which indicates an ARMv8.0-or-lower architecture that does not support LSE. On these targets, the pass must fall back to synthesizing a manual LL/SC loop using the ldaxr/stxr instruction pair (see the sketch after this list).
-mtriple=aarch64--or-mtriple=armv8-linux-gnueabihf. This indicates anARMv8.0 or lower architecture that does not support LSE. On these targets, the pass must fall back to synthesizing a manual LL/SC loop using theldaxr/stxrinstruction pair.Similar to previous issue, the new conditional branch was failin to inherit the
prof_mdmetadata. Theis PR correctly fix the branch weights to the newly created branch within the LL/SC loop, ensuring profile information is preserved.This PR includes fixes for the following tests:
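On the test side, the assertions were autogenerated; judging from the UTC_ARGS header in the diff, regeneration looks roughly like this (command and paths are assumptions about an llvm-project checkout, not quoted from the PR):

; Assumed regeneration command:
;   llvm/utils/update_test_checks.py --check-globals \
;     llvm/test/Transforms/AtomicExpand/AArch64/expand-atomicrmw-xchg-fp.ll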