Skip to content

Conversation

@gulfemsavrun
Copy link
Contributor

@gulfemsavrun gulfemsavrun commented Jan 23, 2025

Currently, ConstantMergePass merges an unnamed_addr with a non-unnamed_addr constant as it is explained in LangRef.

"Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant."

https://llvm.org/docs/LangRef.html#global-variables

This can result in a situation where Clang might violate C semantics, and here is a small reproducer to explain the problem:

const char foo_string[] = "foo";
const char* foo_func(void) { return "foo"; }
int is_foo(const char* p) { return p == foo_string; }

int main() {
  printf("is_foo: %d\n", is_foo("foo"));
}

When we compile with -O0, where ConstantMerge is not applied, Clang and GCC have the same result.

clang -O0 foo.c -o foo
./foo
is_foo: 0

gcc -O0 foo.c -o foo
./foo
is_foo: 0

When we compile with -O1 and higher, where ConstantMerge is applied, Clang and GCC have different results.

clang -O3 foo.c -o foo
./foo
is_foo: 1

gcc -O3 foo.c -o foo
./foo
is_foo: 0

Here's the IR before ConstantMergePass pass:

@.str = private unnamed_addr constant [4 x i8] c"foo\00", align 1 @_ZL10foo_string = internal constant [4 x i8] c"foo\00", align 1 @.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00", align 1

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 {
 ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64
 0)
 }

 ; Function Attrs: mustprogress nofree norecurse nosync nounwind
 readnone
 uwtable willreturn
 define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0)
 local_unnamed_addr #0 {
 %2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]*
 @_ZL10foo_string, i64 0, i64 0)
 %3 = zext i1 %2 to i32
 ret i32 %3
 }

 ; Function Attrs: mustprogress nofree norecurse nounwind uwtable
 define noundef i32 @main() local_unnamed_addr #1 {
 %1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull
 dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]*
 @.str.1, i64 0, i64 0), i32 noundef zext (i1 icmp eq (i8* getelementptr
 inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i8* getelementptr
 inbounds ([4 x i8], [4 x i8]* @_ZL10foo_string, i64 0, i64 0)) to i32))
 ret i32 0
}

MergeConstantPass merges _ZL10foo_string into .str, where it merges a non-unnamed_addr constant into an unnamed_addr constant.

IR Dump After ConstantMergePass on [module] ***

@.str = private constant [4 x i8] c"foo\00", align 1 @.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00", align 1

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 { ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0) }

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0) local_unnamed_addr #0 {
%2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0)
%3 = zext i1 %2 to i32
ret i32 %3
}

; Function Attrs: mustprogress nofree norecurse nounwind uwtable define noundef i32 @main() local_unnamed_addr #1 { %1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]* @.str.1, i64 0, i64 0), i32 noundef 1)
ret i32 0
}

This transformation might violate the following C pointer semantics:

"Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space."
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf

This patch changes ConstantMerge pass to only allow merging when when a constant is marked with unnamed_addr attribute. I also found an old GitHub issue (#9299) where a similar issue about invalid constant merging is explained.

Currently, ConstantMergePass merges an unnamed_addr with a
non-unnamed_addr constant as it is explained in LangRef.

"Note that a constant with significant address can be
merged with a unnamed_addr constant, the result being a
constant whose address is significant."

https://llvm.org/docs/LangRef.html#global-variables

This can result in a situation where Clang might vioalate C
semantics, and here is a small reproducer to explain the
problem:

const char foo_string[] = "foo";
const char* foo_func(void) { return "foo"; }
int is_foo(const char* p) { return p == foo_string; }

int main() {
  printf("is_foo: %d\n", is_foo("foo"));
}

When we compile with -O0, where ConstantMerge is not
applied, Clang and GCC have the same result.

clang -O0 foo.c -o foo
./foo
is_foo: 0

gcc -O0 foo.c -o foo
./foo
is_foo: 0

When we compile -O1 and higher, where ConstantMerge is
applied, Clang and GCC have different results.

clang -O3 foo.c -o foo
./foo
is_foo: 1

gcc -O3 foo.c -o foo
./foo
is_foo: 0

Here's the IR before ConstantMergePass pass:
@.str = private unnamed_addr constant [4 x i8] c"foo\00", align 1
@_ZL10foo_string = internal constant [4 x i8] c"foo\00", align 1
@.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00",
align 1

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone
uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 {
 ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64
 0)
 }

 ; Function Attrs: mustprogress nofree norecurse nosync nounwind
 readnone
 uwtable willreturn
 define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0)
 local_unnamed_addr #0 {
 %2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]*
 @_ZL10foo_string, i64 0, i64 0)
 %3 = zext i1 %2 to i32
 ret i32 %3
 }

 ; Function Attrs: mustprogress nofree norecurse nounwind uwtable
 define noundef i32 @main() local_unnamed_addr llvm#1 {
 %1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull
 dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]*
 @.str.1, i64 0, i64 0), i32 noundef zext (i1 icmp eq (i8* getelementptr
 inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i8* getelementptr
 inbounds ([4 x i8], [4 x i8]* @_ZL10foo_string, i64 0, i64 0)) to i32))
 ret i32 0
}

MergeConstantPass merges _ZL10foo_string into .str, where it merges
a non-unnamed_addr constant into an unnamed_addr constant.

IR Dump After ConstantMergePass on [module] ***

@.str = private constant [4 x i8] c"foo\00", align 1
@.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00",
align 1

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone
uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 {
ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0)
}

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone
uwtable willreturn
define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0)
local_unnamed_addr #0 {
%2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]* @.str,
i64 0, i64 0)
%3 = zext i1 %2 to i32
ret i32 %3
}

; Function Attrs: mustprogress nofree norecurse nounwind uwtable
define noundef i32 @main() local_unnamed_addr llvm#1 {
%1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull
dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]*
@.str.1, i64 0, i64 0), i32 noundef 1)
ret i32 0
}

This transformation might violate the following C pointer
semantics:

"Two pointers compare equal if and only if both are null
pointers, both are pointers to the same object (including
a pointer to an object and a subobject at its beginning)
or function, both are pointers to one past the last element
of the same array object, or one is a pointer to one past
the end of one array object and the other is a pointer to
the start of a different array object that happens to
immediately follow the first array object in the address
space."
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf

So, this patch changes ConstantMerge pass to only allow
merging when when a constant is marked with unnamed_addr
attribute. I also found an old GitHub issue where a similar
issue about invalid constant merging is explained.
llvm#9299
@llvmbot
Copy link
Member

llvmbot commented Jan 23, 2025

@llvm/pr-subscribers-llvm-transforms

Author: None (gulfemsavrun)

Changes

Currently, ConstantMergePass merges an unnamed_addr with a non-unnamed_addr constant as it is explained in LangRef.

"Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant."

https://llvm.org/docs/LangRef.html#global-variables

This can result in a situation where Clang might vioalate C semantics, and here is a small reproducer to explain the problem:

const char foo_string[] = "foo";
const char* foo_func(void) { return "foo"; }
int is_foo(const char* p) { return p == foo_string; }

int main() {
printf("is_foo: %d\n", is_foo("foo"));
}

When we compile with -O0, where ConstantMerge is not applied, Clang and GCC have the same result.

clang -O0 foo.c -o foo
./foo
is_foo: 0

gcc -O0 foo.c -o foo
./foo
is_foo: 0

When we compile -O1 and higher, where ConstantMerge is applied, Clang and GCC have different results.

clang -O3 foo.c -o foo
./foo
is_foo: 1

gcc -O3 foo.c -o foo
./foo
is_foo: 0

Here's the IR before ConstantMergePass pass:
@.str = private unnamed_addr constant [4 x i8] c"foo\00", align 1 @_ZL10foo_string = internal constant [4 x i8] c"foo\00", align 1 @.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00", align 1

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 {
ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64
0)
}

; Function Attrs: mustprogress nofree norecurse nosync nounwind
readnone
uwtable willreturn
define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0)
local_unnamed_addr #0 {
%2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]*
@_ZL10foo_string, i64 0, i64 0)
%3 = zext i1 %2 to i32
ret i32 %3
}

; Function Attrs: mustprogress nofree norecurse nounwind uwtable
define noundef i32 @main() local_unnamed_addr #1 {
%1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull
dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]*
@.str.1, i64 0, i64 0), i32 noundef zext (i1 icmp eq (i8* getelementptr
inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i8* getelementptr
inbounds ([4 x i8], [4 x i8]* @_ZL10foo_string, i64 0, i64 0)) to i32))
ret i32 0
}

MergeConstantPass merges _ZL10foo_string into .str, where it merges a non-unnamed_addr constant into an unnamed_addr constant.

IR Dump After ConstantMergePass on [module] ***

@.str = private constant [4 x i8] c"foo\00", align 1 @.str.1 = private unnamed_addr constant [12 x i8] c"is_foo: %d\0A\00", align 1

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i8* @_Z8foo_funcv() local_unnamed_addr #0 { ret i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0) }

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define noundef i32 @_Z6is_fooPKc(i8* noundef readnone %0) local_unnamed_addr #0 {
%2 = icmp eq i8* %0, getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0)
%3 = zext i1 %2 to i32
ret i32 %3
}

; Function Attrs: mustprogress nofree norecurse nounwind uwtable define noundef i32 @main() local_unnamed_addr #1 { %1 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([12 x i8], [12 x i8]* @.str.1, i64 0, i64 0), i32 noundef 1)
ret i32 0
}

This transformation might violate the following C pointer semantics:

"Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space."
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf

So, this patch changes ConstantMerge pass to only allow merging when when a constant is marked with unnamed_addr attribute. I also found an old GitHub issue where a similar issue about invalid constant merging is explained. #9299


Full diff: https://github.com/llvm/llvm-project/pull/124050.diff

5 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/ConstantMerge.cpp (+1-1)
  • (modified) llvm/test/Transforms/ConstantMerge/2011-01-15-EitherOrder.ll (+1-1)
  • (modified) llvm/test/Transforms/ConstantMerge/merge-both.ll (+9-8)
  • (modified) llvm/test/Transforms/ConstantMerge/merge-dbg.ll (+3-3)
  • (modified) llvm/test/Transforms/ConstantMerge/unnamed-addr.ll (+5-3)
diff --git a/llvm/lib/Transforms/IPO/ConstantMerge.cpp b/llvm/lib/Transforms/IPO/ConstantMerge.cpp
index a1face0a6a9c3a..03b831526f4076 100644
--- a/llvm/lib/Transforms/IPO/ConstantMerge.cpp
+++ b/llvm/lib/Transforms/IPO/ConstantMerge.cpp
@@ -102,7 +102,7 @@ isUnmergeableGlobal(GlobalVariable *GV,
 
 enum class CanMerge { No, Yes };
 static CanMerge makeMergeable(GlobalVariable *Old, GlobalVariable *New) {
-  if (!Old->hasGlobalUnnamedAddr() && !New->hasGlobalUnnamedAddr())
+  if (!Old->hasGlobalUnnamedAddr() || !New->hasGlobalUnnamedAddr())
     return CanMerge::No;
   if (hasMetadataOtherThanDebugLoc(Old))
     return CanMerge::No;
diff --git a/llvm/test/Transforms/ConstantMerge/2011-01-15-EitherOrder.ll b/llvm/test/Transforms/ConstantMerge/2011-01-15-EitherOrder.ll
index 7742459ed4659b..ff73909514ca3d 100644
--- a/llvm/test/Transforms/ConstantMerge/2011-01-15-EitherOrder.ll
+++ b/llvm/test/Transforms/ConstantMerge/2011-01-15-EitherOrder.ll
@@ -6,7 +6,7 @@ declare i32 @zed(ptr, ptr)
 %struct.foobar = type { i32 }
 ; CHECK: bar.d
 @bar.d =  unnamed_addr constant %struct.foobar zeroinitializer, align 4
-; CHECK-NOT: foo.d
+; CHECK: foo.d
 @foo.d = internal constant %struct.foobar zeroinitializer, align 4
 define i32 @main() nounwind ssp {
 entry:
diff --git a/llvm/test/Transforms/ConstantMerge/merge-both.ll b/llvm/test/Transforms/ConstantMerge/merge-both.ll
index ab442e17b47405..0ed0bdb7609dc2 100644
--- a/llvm/test/Transforms/ConstantMerge/merge-both.ll
+++ b/llvm/test/Transforms/ConstantMerge/merge-both.ll
@@ -1,20 +1,21 @@
 ; RUN: opt -S < %s -passes=constmerge | FileCheck %s
-; Test that in one run var3 is merged into var2 and var1 into var4.
-; Test that we merge @var5 and @var6 into one with the higher alignment
+; Test that in one run var2 is merged into var4 and var6 is merged into var8.
+; Test that we merge @var6 and @var8 into one with the higher alignment
 
 declare void @zed(ptr, ptr)
 
 %struct.foobar = type { i32 }
 
 @var1 = internal constant %struct.foobar { i32 2 }
-@var2 = unnamed_addr constant %struct.foobar { i32 2 }
+@var2 = private unnamed_addr constant %struct.foobar { i32 2 }
 @var3 = internal constant %struct.foobar { i32 2 }
-@var4 = unnamed_addr constant %struct.foobar { i32 2 }
+@var4 = private unnamed_addr constant %struct.foobar { i32 2 }
 
 ; CHECK:      %struct.foobar = type { i32 }
 ; CHECK-NOT: @
-; CHECK: @var2 = constant %struct.foobar { i32 2 }
-; CHECK-NEXT: @var4 = constant %struct.foobar { i32 2 }
+; CHECK: @var1 = internal constant %struct.foobar { i32 2 }
+; CHECK-NEXT: @var3 = internal constant %struct.foobar { i32 2 }
+; CHECK-NEXT: @var4 = private unnamed_addr constant %struct.foobar { i32 2 }
 
 declare void @helper(ptr)
 @var5 = internal constant [16 x i8] c"foo1bar2foo3bar\00", align 16
@@ -22,8 +23,9 @@ declare void @helper(ptr)
 @var7 = internal constant [16 x i8] c"foo1bar2foo3bar\00"
 @var8 = private unnamed_addr constant [16 x i8] c"foo1bar2foo3bar\00"
 
+; CHECK: @var5 = internal constant [16 x i8] c"foo1bar2foo3bar\00", align 16
 ; CHECK-NEXT: @var7 = internal constant [16 x i8] c"foo1bar2foo3bar\00"
-; CHECK-NEXT: @var8 = private constant [16 x i8] c"foo1bar2foo3bar\00", align 16
+; CHECK-NEXT: @var8 = private unnamed_addr constant [16 x i8] c"foo1bar2foo3bar\00", align 1
 
 @var4a = alias %struct.foobar, ptr @var4
 @llvm.used = appending global [1 x ptr] [ptr @var4a], section "llvm.metadata"
@@ -38,4 +40,3 @@ entry:
   call void @helper(ptr @var8)
   ret i32 0
 }
-
diff --git a/llvm/test/Transforms/ConstantMerge/merge-dbg.ll b/llvm/test/Transforms/ConstantMerge/merge-dbg.ll
index de83ffbbda058a..5b396f16a20d2e 100644
--- a/llvm/test/Transforms/ConstantMerge/merge-dbg.ll
+++ b/llvm/test/Transforms/ConstantMerge/merge-dbg.ll
@@ -1,8 +1,8 @@
 ; RUN: opt < %s -passes=constmerge -S | FileCheck %s
 
-; CHECK: = constant i32 1, !dbg [[A:![0-9]+]], !dbg [[B:![0-9]+]]
-@a = internal constant i32 1, !dbg !0
-@b = unnamed_addr constant i32 1, !dbg !9
+; CHECK: = private unnamed_addr constant i32 1, !dbg [[A:![0-9]+]], !dbg [[B:![0-9]+]]
+@a = private unnamed_addr constant i32 1, !dbg !0
+@b = private unnamed_addr constant i32 1, !dbg !9
 
 define void @test1(ptr %P1, ptr %P2) {
   store ptr @a, ptr %P1
diff --git a/llvm/test/Transforms/ConstantMerge/unnamed-addr.ll b/llvm/test/Transforms/ConstantMerge/unnamed-addr.ll
index 2490e484c95260..f2b99bae5f89ac 100644
--- a/llvm/test/Transforms/ConstantMerge/unnamed-addr.ll
+++ b/llvm/test/Transforms/ConstantMerge/unnamed-addr.ll
@@ -1,6 +1,6 @@
 ; RUN: opt -passes=constmerge -S < %s | FileCheck %s
-; Test which corresponding x and y are merged and that unnamed_addr
-; is correctly set.
+; Test which corresponding x and y are merged when they both are marked with
+; unnamed_addr attribute.
 
 declare void @zed(ptr, ptr)
 
@@ -23,7 +23,9 @@ declare void @zed(ptr, ptr)
 ; CHECK-NOT: @
 ; CHECK: @test1.x = internal constant %struct.foobar { i32 1 }
 ; CHECK-NEXT: @test1.y = constant %struct.foobar { i32 1 }
-; CHECK-NEXT: @test2.y = constant %struct.foobar { i32 2 }
+; CHECK-NEXT: @test2.x = internal constant %struct.foobar { i32 2 }
+; CHECK-NEXT: @test2.y = unnamed_addr constant %struct.foobar { i32 2 }
+; CHECK-NEXT: @test3.x = internal unnamed_addr constant %struct.foobar { i32 3 }
 ; CHECK-NEXT: @test3.y = constant %struct.foobar { i32 3 }
 ; CHECK-NEXT: @test4.y = unnamed_addr constant %struct.foobar { i32 4 }
 ; CHECK-NOT: @

@gulfemsavrun
Copy link
Contributor Author

gulfemsavrun commented Jan 23, 2025

This issue was discovered when we attempt to add RelLookupTableConverterPass to LTO prelink pass pipeline as in #124052. RelLookupTableConverterPass converts lookup tables to relative lookup tables to make them PIC-friendly.

The issue originally reported in https://reviews.llvm.org/D94355?id=315562#2717531, where we cannot build Clang with LTO when RelLookupTableConverterPass is added to LTO prelink passes. We disabled this pass in LTO prelink passes to avoid the issue at that time. But, we would like re-enable this pass in the LTO pipeline because we currently build with LTO and need our code to be PIC-friendly. Here's the issue that happens when we are building Clang with LTO.

There is a lookup table defined in https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/TargetInfo.cpp#L274:

const char *TargetInfo::getTypeFormatModifier(IntType T) {
  switch (T) {
  default: llvm_unreachable("not an integer!");
  case SignedChar:
  case UnsignedChar:     return "hh";
  case SignedShort:
  case UnsignedShort:    return "h";
  case SignedInt:
  case UnsignedInt:      return "";
  case SignedLong:
  case UnsignedLong:     return "l";
  case SignedLongLong:
  case UnsignedLongLong: return "ll";
  }
}

This lookup table is converted to a relative lookup table via RelLookupTableConverter pass.

There is a constant that is initialized in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/Chrono.cpp#L18

const char llvm::detail::unit<std::ratio<3600>>::value[] = "h";

When built with LTO, ConstantMerge pass decides to merge constants that are initialized to "h". One of these constants are the elements of the the relative lookup table. This merge results in a relocation error in lld:

ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol llvm::detail::unit<std::ratio<3600l, 1l> >::value; recompile with -fPIC
>>> defined in lib/libclang-cpp.so.20.0git.lto.o
>>> referenced by ld-temp.o
>>>       

There are two ways that I can think of resolving this issue:

  1. Change ConstantMerge pass as I implemented in this PR if we agree that this kind of constant merging is a violation of C semantics and GCC and Clang should be consistent in their results. This requires guidance from maintainers that have more expertise in C semantics.

  2. ConstantMerge pass can have a side effect on RelLookupTableConverter pass, so add RelLookupTableConverterPass to post-link optimization pipeline after ConstantMerge as in [PassBuilder] Add RelLookupTableConverterPass to LTO #124053.

I've tagged some people who might provide feedback on this issue, and please feel free to pull in more people.

@nikic
Copy link
Contributor

nikic commented Jan 23, 2025

For the C semantics issue, I think we'd have to add a new unnamed_addr variant, not change the meaning of the existing one. See #63628 for related issue (though I feel like I saw a longer discussion on this somewhere else...)

I don't think your actual problem is really related though. LLVM shouldn't do transforms that ultimately result in a relocation error. Can you explain in more detail why it happens, maybe with a minimal test case? I'm not sure I understand where the problem originates.

@gulfemsavrun gulfemsavrun requested a review from PiJoules January 23, 2025 16:35
@efriedma-quic
Copy link
Collaborator

though I feel like I saw a longer discussion on this somewhere else...

I don't recall anything else. At least not for that particular aspect of it; there's discussion elsewhere about LTO making unnamed_addr constants not have a unique address, but that's only loosely related.

I don't think your actual problem is really related though

I agree. There isn't any direct connection between the linkage of a global, and whether it's unnamed_addr. Maybe constant merging needs a check to ensure it doesn't change the linkage of globals used by a pointer subtraction?

@gulfemsavrun
Copy link
Contributor Author

gulfemsavrun commented Jan 23, 2025

For the C semantics issue, I think we'd have to add a new unnamed_addr variant, not change the meaning of the existing one. See #63628 for related issue (though I feel like I saw a longer discussion on this somewhere else...)

I don't think your actual problem is really related though. LLVM shouldn't do transforms that ultimately result in a relocation error. Can you explain in more detail why it happens, maybe with a minimal test case? I'm not sure I understand where the problem originates.

Here's a small reproducer to explain the issue with RelLookupTableConverterPass:

a.c:

const char *getTypeFormatModifier(int T) {
  switch (T) {
  case 0:
  case 1:     return "hh";
  case 2:
  case 3:    return "h";
  }
  
  return "";
}

b.c:

extern const char value[] = "h";
clang -fPIC -flto -O3 -c a.cc -o a.o
clang -fPIC -flto -O3 -c b.cc -o b.o 
clang -fPIC -flto -O3 -fuse-ld=./build/stage1/bin/ld.lld -shared -o output-lto a.o b.o

We convert switch lookup table into a relative lookup table, where we put relative offsets into the relative lookup table.

; *** IR Dump Before RelLookupTableConverterPass on [module] ***

@.str = private unnamed_addr constant [3 x i8] c"hh\00", align 1
@.str.1 = private unnamed_addr constant [2 x i8] c"h\00", align 1
@.str.2 = private unnamed_addr constant [1 x i8] zeroinitializer, align 1
@switch.table._Z21getTypeFormatModifieri = private unnamed_addr constant [4 x ptr] [ptr @.str, ptr @.str, ptr @.str.1, ptr @.str.1], align 8

; *** IR Dump After RelLookupTableConverterPass on [module] ***

@.str = private unnamed_addr constant [3 x i8] c"hh\00", align 1
@.str.1 = private unnamed_addr constant [2 x i8] c"h\00", align 1
@.str.2 = private unnamed_addr constant [1 x i8] zeroinitializer, align 1
@switch.table._Z21getTypeFormatModifieri.rel = private unnamed_addr constant [4 x i32] [i32 trunc (i64 sub (i64 ptrtoint (ptr @.str to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32), i32 trunc (i64 sub (i64 ptrtoint (ptr @.str to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32), i32 trunc (i64 sub (i64 ptrtoint (ptr @.str.1 to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32), i32 trunc (i64 sub (i64 ptrtoint (ptr @.str.1 to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32)], align 4

Later in post-link LTO phase, ConstantMerge pass decides to merge @.str.1 = private unnamed_addr constant [2 x i8] c"h\00" into @value = local_unnamed_addr constant [2 x i8] c"h\00"

@.str = private unnamed_addr constant [3 x i8] c"hh\00", align 1
@.str.2 = private unnamed_addr constant [1 x i8] zeroinitializer, align 1
@switch.table._Z21getTypeFormatModifieri.rel = private unnamed_addr constant [4 x i32] [i32 trunc (i64 sub (i64 ptrtoint (ptr @.str to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32), i32 trunc (i64 sub (i64 ptrtoint (ptr @.str to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32), i32 trunc (i64 sub (i64 ptrtoint (ptr @value to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32), i32 trunc (i64 sub (i64 ptrtoint (ptr @value to i64), i64 ptrtoint (ptr @switch.table._Z21getTypeFormatModifieri.rel to i64)) to i32)], align 4
@value = local_unnamed_addr constant [2 x i8] c"h\00", align 1

We apply RelLookupTableConverterPass so that the relative lookup table can be put into .rodata section. But, in this case the relative lookup table was put into .data.rel.ro section because one of its values points to a relative address between local_unnamed_addr constant and table now.

ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'value'; recompile with -fPIC
>>> defined in output-lto.lto.o
>>> referenced by ld-temp.o
>>>               output-lto.lto.o:(.data.rel.ro..Lswitch.table._Z21getTypeFormatModifieri.rel+0x8)

@gulfemsavrun
Copy link
Contributor Author

I think the easier way to unblock this issue is to move RelLookupTableConverterPass after ConstantMerge in the LTO post-link pipeline as in #124053. Running this pass in LTO post-link step might also help this pass discover more opportunities.

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Jan 24, 2025

Okay, that's what I expected: pointer subtraction like that only works if both symbols are "local" (in the ELF sense). ConstantMerge isn't aware of the pointer subtraction, so it blindly merges without checking for that.

unnamed_addr is a red herring: unnamed_addr is correlated with symbol linkage for C strings work, but it's not directly connected to the issue. clang generates unnamed_addr variables with default linkage in some cases.

@gulfemsavrun
Copy link
Contributor Author

gulfemsavrun commented Jan 24, 2025

I agree. There isn't any direct connection between the linkage of a global, and whether it's unnamed_addr. Maybe constant merging needs a check to ensure it doesn't change the linkage of globals used by a pointer subtraction?

Okay, that's what I expected: pointer subtraction like that only works if both symbols are "local" (in the ELF sense). ConstantMerge isn't aware of the pointer subtraction, so it blindly merges without checking for that.

unnamed_addr is a red herring: unnamed_addr is correlated with symbol linkage for C strings work, but it's not directly connected to the issue. clang generates unnamed_addr variables with default linkage in some cases.

Thanks for the feedback @efriedma-quic. It is becoming more clear that the issue in ConstantMerge pass is more about linkage type rather than unnamed_addr. Merging a non-unnamed_addr constant into an unnamed_addr constant might still be questionable in terms of language semantics, but that's kind of separate. How do you feel about adding a local linkage check into ConstantMerge like in #124220? Is it too conservative? This might be another solution for the problem that we ran into with RelLookupTableConverterPass.

@efriedma-quic
Copy link
Collaborator

#124220 is basically the right idea, yes. Maybe we could do something more fancy, at least on ELF targets: if we make the external variable an alias for the internal variable, we can perform the merge without modifying either symbol's linkage. (See GlobalMerge for an example of creating aliases like this.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants