-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[LLD][COFF] Fix handling of weak aliases referencing lazy symbols #112243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The assumption that a symbol is either `Defined` or `Undefined` is not always true for some cases. For example, `mangleMaybe` may create a weak alias to a lazy archive symbol.
|
@llvm/pr-subscribers-lld Author: Jacek Caban (cjacek) ChangesThe assumption that a symbol is either Full diff: https://github.com/llvm/llvm-project/pull/112243.diff 4 Files Affected:
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 85a58a3677181a..12e1ae62811270 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -1340,7 +1340,7 @@ void LinkerDriver::maybeCreateECExportThunk(StringRef name, Symbol *&sym) {
if (!sym)
return;
if (auto undef = dyn_cast<Undefined>(sym))
- def = undef->getWeakAlias();
+ def = undef->getDefinedWeakAlias();
else
def = dyn_cast<Defined>(sym);
if (!def)
@@ -1376,7 +1376,7 @@ void LinkerDriver::createECExportThunks() {
continue;
Defined *targetSym;
if (auto undef = dyn_cast<Undefined>(sym))
- targetSym = undef->getWeakAlias();
+ targetSym = undef->getDefinedWeakAlias();
else
targetSym = dyn_cast<Defined>(sym);
if (!targetSym)
diff --git a/lld/COFF/Symbols.cpp b/lld/COFF/Symbols.cpp
index 567c2b93776c94..89f2da02bdcf42 100644
--- a/lld/COFF/Symbols.cpp
+++ b/lld/COFF/Symbols.cpp
@@ -112,12 +112,12 @@ DefinedImportThunk::DefinedImportThunk(COFFLinkerContext &ctx, StringRef name,
ImportThunkChunk *chunk)
: Defined(DefinedImportThunkKind, name), wrappedSym(s), data(chunk) {}
-Defined *Undefined::getWeakAlias() {
+Symbol *Undefined::getWeakAlias() {
// A weak alias may be a weak alias to another symbol, so check recursively.
DenseSet<Symbol *> weakChain;
for (Symbol *a = weakAlias; a; a = cast<Undefined>(a)->weakAlias) {
- if (auto *d = dyn_cast<Defined>(a))
- return d;
+ if (!isa<Undefined>(a))
+ return a;
if (!weakChain.insert(a).second)
break; // We have a cycle.
}
@@ -125,7 +125,7 @@ Defined *Undefined::getWeakAlias() {
}
bool Undefined::resolveWeakAlias() {
- Defined *d = getWeakAlias();
+ Defined *d = getDefinedWeakAlias();
if (!d)
return false;
diff --git a/lld/COFF/Symbols.h b/lld/COFF/Symbols.h
index 9b21e09bf83a42..a898ebf05fd809 100644
--- a/lld/COFF/Symbols.h
+++ b/lld/COFF/Symbols.h
@@ -340,7 +340,10 @@ class Undefined : public Symbol {
// If this symbol is external weak, try to resolve it to a defined
// symbol by searching the chain of fallback symbols. Returns the symbol if
// successful, otherwise returns null.
- Defined *getWeakAlias();
+ Symbol *getWeakAlias();
+ Defined *getDefinedWeakAlias() {
+ return dyn_cast_or_null<Defined>(getWeakAlias());
+ }
// If this symbol is external weak, replace this object with aliased symbol.
bool resolveWeakAlias();
diff --git a/lld/test/COFF/weak-lazy.s b/lld/test/COFF/weak-lazy.s
new file mode 100644
index 00000000000000..2812ba7af8b5f8
--- /dev/null
+++ b/lld/test/COFF/weak-lazy.s
@@ -0,0 +1,18 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -filetype=obj -triple=i686-windows %s -o %t.obj
+# RUN: llvm-lib -machine:x86 -out:%t-func.lib %t.obj
+
+# -export:func creates a weak alias to a lazy symbol. Make sure we can handle that when processing -export:func2=func.
+# RUN: lld-link -dll -noentry -machine:x86 -out:%t.dll %t-func.lib -export:func -export:func2=func
+
+ .text
+ .def @feat.00;
+ .scl 3;
+ .type 0;
+ .endef
+ .globl @feat.00
+.set @feat.00, 1
+ .globl _func@0
+_func@0:
+ retl
|
|
@llvm/pr-subscribers-lld-coff Author: Jacek Caban (cjacek) ChangesThe assumption that a symbol is either Full diff: https://github.com/llvm/llvm-project/pull/112243.diff 4 Files Affected:
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 85a58a3677181a..12e1ae62811270 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -1340,7 +1340,7 @@ void LinkerDriver::maybeCreateECExportThunk(StringRef name, Symbol *&sym) {
if (!sym)
return;
if (auto undef = dyn_cast<Undefined>(sym))
- def = undef->getWeakAlias();
+ def = undef->getDefinedWeakAlias();
else
def = dyn_cast<Defined>(sym);
if (!def)
@@ -1376,7 +1376,7 @@ void LinkerDriver::createECExportThunks() {
continue;
Defined *targetSym;
if (auto undef = dyn_cast<Undefined>(sym))
- targetSym = undef->getWeakAlias();
+ targetSym = undef->getDefinedWeakAlias();
else
targetSym = dyn_cast<Defined>(sym);
if (!targetSym)
diff --git a/lld/COFF/Symbols.cpp b/lld/COFF/Symbols.cpp
index 567c2b93776c94..89f2da02bdcf42 100644
--- a/lld/COFF/Symbols.cpp
+++ b/lld/COFF/Symbols.cpp
@@ -112,12 +112,12 @@ DefinedImportThunk::DefinedImportThunk(COFFLinkerContext &ctx, StringRef name,
ImportThunkChunk *chunk)
: Defined(DefinedImportThunkKind, name), wrappedSym(s), data(chunk) {}
-Defined *Undefined::getWeakAlias() {
+Symbol *Undefined::getWeakAlias() {
// A weak alias may be a weak alias to another symbol, so check recursively.
DenseSet<Symbol *> weakChain;
for (Symbol *a = weakAlias; a; a = cast<Undefined>(a)->weakAlias) {
- if (auto *d = dyn_cast<Defined>(a))
- return d;
+ if (!isa<Undefined>(a))
+ return a;
if (!weakChain.insert(a).second)
break; // We have a cycle.
}
@@ -125,7 +125,7 @@ Defined *Undefined::getWeakAlias() {
}
bool Undefined::resolveWeakAlias() {
- Defined *d = getWeakAlias();
+ Defined *d = getDefinedWeakAlias();
if (!d)
return false;
diff --git a/lld/COFF/Symbols.h b/lld/COFF/Symbols.h
index 9b21e09bf83a42..a898ebf05fd809 100644
--- a/lld/COFF/Symbols.h
+++ b/lld/COFF/Symbols.h
@@ -340,7 +340,10 @@ class Undefined : public Symbol {
// If this symbol is external weak, try to resolve it to a defined
// symbol by searching the chain of fallback symbols. Returns the symbol if
// successful, otherwise returns null.
- Defined *getWeakAlias();
+ Symbol *getWeakAlias();
+ Defined *getDefinedWeakAlias() {
+ return dyn_cast_or_null<Defined>(getWeakAlias());
+ }
// If this symbol is external weak, replace this object with aliased symbol.
bool resolveWeakAlias();
diff --git a/lld/test/COFF/weak-lazy.s b/lld/test/COFF/weak-lazy.s
new file mode 100644
index 00000000000000..2812ba7af8b5f8
--- /dev/null
+++ b/lld/test/COFF/weak-lazy.s
@@ -0,0 +1,18 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -filetype=obj -triple=i686-windows %s -o %t.obj
+# RUN: llvm-lib -machine:x86 -out:%t-func.lib %t.obj
+
+# -export:func creates a weak alias to a lazy symbol. Make sure we can handle that when processing -export:func2=func.
+# RUN: lld-link -dll -noentry -machine:x86 -out:%t.dll %t-func.lib -export:func -export:func2=func
+
+ .text
+ .def @feat.00;
+ .scl 3;
+ .type 0;
+ .endef
+ .globl @feat.00
+.set @feat.00, 1
+ .globl _func@0
+_func@0:
+ retl
|
|
@llvm/pr-subscribers-platform-windows Author: Jacek Caban (cjacek) ChangesThe assumption that a symbol is either Full diff: https://github.com/llvm/llvm-project/pull/112243.diff 4 Files Affected:
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 85a58a3677181a..12e1ae62811270 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -1340,7 +1340,7 @@ void LinkerDriver::maybeCreateECExportThunk(StringRef name, Symbol *&sym) {
if (!sym)
return;
if (auto undef = dyn_cast<Undefined>(sym))
- def = undef->getWeakAlias();
+ def = undef->getDefinedWeakAlias();
else
def = dyn_cast<Defined>(sym);
if (!def)
@@ -1376,7 +1376,7 @@ void LinkerDriver::createECExportThunks() {
continue;
Defined *targetSym;
if (auto undef = dyn_cast<Undefined>(sym))
- targetSym = undef->getWeakAlias();
+ targetSym = undef->getDefinedWeakAlias();
else
targetSym = dyn_cast<Defined>(sym);
if (!targetSym)
diff --git a/lld/COFF/Symbols.cpp b/lld/COFF/Symbols.cpp
index 567c2b93776c94..89f2da02bdcf42 100644
--- a/lld/COFF/Symbols.cpp
+++ b/lld/COFF/Symbols.cpp
@@ -112,12 +112,12 @@ DefinedImportThunk::DefinedImportThunk(COFFLinkerContext &ctx, StringRef name,
ImportThunkChunk *chunk)
: Defined(DefinedImportThunkKind, name), wrappedSym(s), data(chunk) {}
-Defined *Undefined::getWeakAlias() {
+Symbol *Undefined::getWeakAlias() {
// A weak alias may be a weak alias to another symbol, so check recursively.
DenseSet<Symbol *> weakChain;
for (Symbol *a = weakAlias; a; a = cast<Undefined>(a)->weakAlias) {
- if (auto *d = dyn_cast<Defined>(a))
- return d;
+ if (!isa<Undefined>(a))
+ return a;
if (!weakChain.insert(a).second)
break; // We have a cycle.
}
@@ -125,7 +125,7 @@ Defined *Undefined::getWeakAlias() {
}
bool Undefined::resolveWeakAlias() {
- Defined *d = getWeakAlias();
+ Defined *d = getDefinedWeakAlias();
if (!d)
return false;
diff --git a/lld/COFF/Symbols.h b/lld/COFF/Symbols.h
index 9b21e09bf83a42..a898ebf05fd809 100644
--- a/lld/COFF/Symbols.h
+++ b/lld/COFF/Symbols.h
@@ -340,7 +340,10 @@ class Undefined : public Symbol {
// If this symbol is external weak, try to resolve it to a defined
// symbol by searching the chain of fallback symbols. Returns the symbol if
// successful, otherwise returns null.
- Defined *getWeakAlias();
+ Symbol *getWeakAlias();
+ Defined *getDefinedWeakAlias() {
+ return dyn_cast_or_null<Defined>(getWeakAlias());
+ }
// If this symbol is external weak, replace this object with aliased symbol.
bool resolveWeakAlias();
diff --git a/lld/test/COFF/weak-lazy.s b/lld/test/COFF/weak-lazy.s
new file mode 100644
index 00000000000000..2812ba7af8b5f8
--- /dev/null
+++ b/lld/test/COFF/weak-lazy.s
@@ -0,0 +1,18 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -filetype=obj -triple=i686-windows %s -o %t.obj
+# RUN: llvm-lib -machine:x86 -out:%t-func.lib %t.obj
+
+# -export:func creates a weak alias to a lazy symbol. Make sure we can handle that when processing -export:func2=func.
+# RUN: lld-link -dll -noentry -machine:x86 -out:%t.dll %t-func.lib -export:func -export:func2=func
+
+ .text
+ .def @feat.00;
+ .scl 3;
+ .type 0;
+ .endef
+ .globl @feat.00
+.set @feat.00, 1
+ .globl _func@0
+_func@0:
+ retl
|
|
I encountered this issue while working with ARM64EC code, which makes heavy use of weak (anti-dependency) aliases. However, the problem also affects other targets. The test in this PR fails on an assertion during a cast in |
mstorsjo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes LGTM
| Symbol *Undefined::getWeakAlias() { | ||
| // A weak alias may be a weak alias to another symbol, so check recursively. | ||
| DenseSet<Symbol *> weakChain; | ||
| for (Symbol *a = weakAlias; a; a = cast<Undefined>(a)->weakAlias) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So... previously, this cast<Undefined>(a) was broken (invoking UB etc), if we had a weak alias which pointed at something which wasn't either Undefined or Defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
The assumption that a symbol is either Undefined or Defined is true for some callers, but not all. See the other comment for details.
| // symbol by searching the chain of fallback symbols. Returns the symbol if | ||
| // successful, otherwise returns null. | ||
| Defined *getWeakAlias(); | ||
| Symbol *getWeakAlias(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In trying to read the diff and the effects of it - I see that a number of callers are switched from getWeakAlias() to getDefinedWeakAlias() - but what I don't see from the point of view of the diff, is what remaining callers there are, that still are calling getWeakAlias()?
Because any caller that is switched to getDefinedWeakAlias() will essentially be doing the same as before (minus the UB in cast<Undefined>()), but which callers do get the new behaviour of getWeakAlias() where it potentially returns something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remaining callers use getWeakAlias to check if a symbol can be resolved. After the run() call, all such symbols should be replaced with a Defined symbol (assuming valid input, though a malformed archive index could disrupt this). This means that for many callers, getWeakAlias isn’t expected to handle lazy symbols, making it functionally equivalent to getDefinedWeakAlias. They only differ if new weak aliases are defined between run() calls.
In the test example from this PR, using getDefinedWeakAlias in findMangle would mostly work as well, but it's a bit more fragile. Processing -export:func would create a weak alias to _func@0, but -export:func2=func would still consider it unresolvable and attempt to mangle it again. On the other hand, getWeakAlias returns the lazy symbol, so we know it will be resolvable and can skip the unnecessary mangling attempt.
…vm#112243) The assumption that a symbol is either `Defined` or `Undefined` is not always true for some cases. For example, `mangleMaybe` may create a weak alias to a lazy archive symbol.
The assumption that a symbol is either
DefinedorUndefinedis not always true for some cases. For example,mangleMaybemay create a weak alias to a lazy archive symbol.