[llvm-symbolizer] Recognize AIX big archive #150401
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -535,16 +535,20 @@ MACH-O SPECIFIC OPTIONS | |
.. option:: --default-arch <arch> | ||
|
||
If a binary contains object files for multiple architectures (e.g. it is a | ||
Mach-O universal binary), symbolize the object file for a given architecture. | ||
You can also specify the architecture by writing ``binary_name:arch_name`` in | ||
the input (see example below). If the architecture is not specified in either | ||
way, the address will not be symbolized. Defaults to empty string. | ||
Mach-O universal binary or an AIX archive with architecture variants), | ||
symbolize the object file for a given architecture. You can also specify | ||
the architecture by writing ``binary_name:arch_name`` in the input (see | ||
example below). For AIX archives, the format ``archive.a(member.o):arch`` | ||
Comment: The docs here still refer to AIX archives specifically, yet the behaviour change is generic.
Reply: Shared objects are normally stored in (and loaded from) archives on AIX. By convention, the same (big format) archive contains both 32-bit and 64-bit objects. This behaviour is AIX-specific, hence the documentation is more specific to AIX.
Reply: Adding members that share an object name but differ in architecture to the same archive is specific to AIX. |
||
is also supported. If the architecture is not specified, | ||
the address will not be symbolized. Defaults to empty string. | ||
|
||
.. code-block:: console | ||
|
||
$ cat addr.txt | ||
/tmp/mach_universal_binary:i386 0x1f84 | ||
/tmp/mach_universal_binary:x86_64 0x100000f24 | ||
/tmp/archive.a(member.o):ppc 0x1000 | ||
/tmp/archive.a(member.o):ppc64 0x2000 | ||
|
||
|
||
$ llvm-symbolizer < addr.txt | ||
_main | ||
|
@@ -553,6 +557,12 @@ MACH-O SPECIFIC OPTIONS | |
_main | ||
/tmp/source_x86_64.cc:8 | ||
|
||
_foo | ||
/tmp/source_ppc.cc:12 | ||
|
||
_foo | ||
/tmp/source_ppc64.cc:12 | ||
|
||
.. option:: --dsym-hint <path/to/file.dSYM> | ||
|
||
If the debug info for a binary isn't present in the default location, look for | ||
|
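The input format described in the documentation hunk above, `archive.a(member.o):arch`, can be illustrated with a small standalone sketch. It is not the code llvm-symbolizer uses: `ModuleSpec` and `splitModuleSpec` are hypothetical names, and the rule for recognising the `:arch` suffix (text after the last colon that contains no path separator) is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical holder for the parts of an input such as
// "/tmp/archive.a(member.o):ppc64". Not an LLVM type.
struct ModuleSpec {
  std::string Path;   // archive or binary path
  std::string Member; // empty when there is no "(member)" suffix
  std::string Arch;   // empty when there is no ":arch" suffix
};

// Split "path(member):arch" into its parts.
// Assumption: the architecture is whatever follows the last ':' as long as
// it contains no path separator.
ModuleSpec splitModuleSpec(std::string_view Input) {
  ModuleSpec Spec;
  std::size_t Colon = Input.rfind(':');
  if (Colon != std::string_view::npos &&
      Input.substr(Colon + 1).find_first_of("/\\") == std::string_view::npos) {
    Spec.Arch = std::string(Input.substr(Colon + 1));
    Input = Input.substr(0, Colon);
  }
  // A trailing ')' and the last '(' before it delimit the member name.
  if (!Input.empty() && Input.back() == ')') {
    std::size_t Open = Input.rfind('(');
    if (Open != std::string_view::npos) {
      Spec.Member =
          std::string(Input.substr(Open + 1, Input.size() - Open - 2));
      Input = Input.substr(0, Open);
    }
  }
  Spec.Path = std::string(Input);
  return Spec;
}

// Usage on the two kinds of input shown in the documentation example.
void usageExample() {
  ModuleSpec A = splitModuleSpec("/tmp/archive.a(member.o):ppc64");
  assert(A.Path == "/tmp/archive.a" && A.Member == "member.o" &&
         A.Arch == "ppc64");
  ModuleSpec B = splitModuleSpec("/tmp/mach_universal_binary:i386");
  assert(B.Path == "/tmp/mach_universal_binary" && B.Member.empty() &&
         B.Arch == "i386");
}
```

Inputs without a colon or without parentheses fall through with the corresponding fields left empty, which matches the documented default of an empty architecture.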
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,6 +21,7 @@ | |
#include "llvm/DebugInfo/PDB/PDBContext.h" | ||
#include "llvm/DebugInfo/Symbolize/SymbolizableObjectFile.h" | ||
#include "llvm/Demangle/Demangle.h" | ||
#include "llvm/Object/Archive.h" | ||
#include "llvm/Object/BuildID.h" | ||
#include "llvm/Object/COFF.h" | ||
#include "llvm/Object/ELFObjectFile.h" | ||
|
@@ -285,7 +286,7 @@ LLVMSymbolizer::findSymbol(ArrayRef<uint8_t> BuildID, StringRef Symbol, | |
} | ||
|
||
void LLVMSymbolizer::flush() { | ||
ObjectForUBPathAndArch.clear(); | ||
ObjectFileCache.clear(); | ||
LRUBinaries.clear(); | ||
CacheSize = 0; | ||
BinaryForPath.clear(); | ||
|
@@ -436,13 +437,13 @@ bool LLVMSymbolizer::findDebugBinary(const std::string &OrigPath, | |
SmallString<16> OrigDir(OrigPath); | ||
llvm::sys::path::remove_filename(OrigDir); | ||
SmallString<16> DebugPath = OrigDir; | ||
// Try relative/path/to/original_binary/debuglink_name | ||
// Try relative/path/to/original_binary/debuglink_name. | ||
|
||
llvm::sys::path::append(DebugPath, DebuglinkName); | ||
if (checkFileCRC(DebugPath, CRCHash)) { | ||
Result = std::string(DebugPath); | ||
return true; | ||
} | ||
// Try relative/path/to/original_binary/.debug/debuglink_name | ||
// Try relative/path/to/original_binary/.debug/debuglink_name. | ||
DebugPath = OrigDir; | ||
llvm::sys::path::append(DebugPath, ".debug", DebuglinkName); | ||
if (checkFileCRC(DebugPath, CRCHash)) { | ||
|
@@ -451,17 +452,17 @@ bool LLVMSymbolizer::findDebugBinary(const std::string &OrigPath, | |
} | ||
// Make the path absolute so that lookups will go to | ||
// "/usr/lib/debug/full/path/to/debug", not | ||
// "/usr/lib/debug/to/debug" | ||
// "/usr/lib/debug/to/debug". | ||
llvm::sys::fs::make_absolute(OrigDir); | ||
if (!Opts.FallbackDebugPath.empty()) { | ||
// Try <FallbackDebugPath>/absolute/path/to/original_binary/debuglink_name | ||
// Try <FallbackDebugPath>/absolute/path/to/original_binary/debuglink_name. | ||
DebugPath = Opts.FallbackDebugPath; | ||
} else { | ||
#if defined(__NetBSD__) | ||
// Try /usr/libdata/debug/absolute/path/to/original_binary/debuglink_name | ||
// Try /usr/libdata/debug/absolute/path/to/original_binary/debuglink_name. | ||
DebugPath = "/usr/libdata/debug"; | ||
#else | ||
// Try /usr/lib/debug/absolute/path/to/original_binary/debuglink_name | ||
// Try /usr/lib/debug/absolute/path/to/original_binary/debuglink_name. | ||
DebugPath = "/usr/lib/debug"; | ||
#endif | ||
} | ||
|
@@ -510,11 +511,11 @@ std::string LLVMSymbolizer::lookUpGsymFile(const std::string &Path) { | |
return !EC && !llvm::sys::fs::is_directory(Status); | ||
}; | ||
|
||
// First, look beside the binary file | ||
// First, look beside the binary file. | ||
if (const auto GsymPath = Path + ".gsym"; CheckGsymFile(GsymPath)) | ||
return GsymPath; | ||
|
||
// Then, look in the directories specified by GsymFileDirectory | ||
// Then, look in the directories specified by GsymFileDirectory. | ||
|
||
for (const auto &Directory : Opts.GsymFileDirectory) { | ||
SmallString<16> GsymPath = llvm::StringRef{Directory}; | ||
|
@@ -557,57 +558,149 @@ LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path, | |
if (!DbgObj) | ||
DbgObj = Obj; | ||
ObjectPair Res = std::make_pair(Obj, DbgObj); | ||
std::string DbgObjPath = DbgObj->getFileName().str(); | ||
auto Pair = | ||
ObjectPairForPathArch.emplace(std::make_pair(Path, ArchName), Res); | ||
BinaryForPath.find(DbgObjPath)->second.pushEvictor([this, I = Pair.first]() { | ||
ObjectPairForPathArch.erase(I); | ||
}); | ||
std::string DbgObjPath = DbgObj->getFileName().str(); | ||
auto BinIter = BinaryForPath.find(DbgObjPath); | ||
if (BinIter != BinaryForPath.end()) { | ||
Comment: Could I get an explanation for why this behaviour has changed, please. Also, I assume you have test coverage for it?
Reply: Not answered.
Reply: In some iterations, the [...]
Reply: What do you mean by "in some iterations"? Is this actually related to your addition of big archive support?
Reply: As in the other comments, please explain in detail from the entry point what the code path is that could hit here without the binary being cached, please, because I don't see it.
Reply: For a plain object file (t.o) the [...] Updating [...] Hope this helps
Reply: Thanks. It's going to take me some time to go over this and determine if there's a better approach. I suspect there is, but I'm not sure yet what that is. |
||
BinIter->second.pushEvictor( | ||
[this, I = Pair.first]() { ObjectPairForPathArch.erase(I); }); | ||
} | ||
return Res; | ||
} | ||
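The change questioned in the thread above replaces an unconditional `BinaryForPath.find(...)->second` with a lookup that is checked against `end()` first. A minimal standalone sketch of that guard follows. `CachedEntry` and `registerEvictor` are hypothetical simplifications, not LLVM's `CachedBinary`, and the sketch says nothing about whether the missing-entry case can be reached in the real code path, which is the open question in the review.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplification of a cached binary: it keeps callbacks that
// remove dependent cache entries when the binary itself is dropped.
struct CachedEntry {
  std::vector<std::function<void()>> Evictors;
  void pushEvictor(std::function<void()> F) {
    Evictors.push_back(std::move(F));
  }
  void evict() {
    for (auto &F : Evictors)
      F();
    Evictors.clear();
  }
};

// Register Evictor on the entry cached under Path, if there is one.
// Returns false instead of dereferencing end() when nothing is cached.
bool registerEvictor(std::map<std::string, CachedEntry> &BinaryForPath,
                     const std::string &Path, std::function<void()> Evictor) {
  auto It = BinaryForPath.find(Path);
  if (It == BinaryForPath.end())
    return false;
  It->second.pushEvictor(std::move(Evictor));
  return true;
}

void usageExample() {
  std::map<std::string, CachedEntry> BinaryForPath;
  BinaryForPath["t.o"]; // a binary cached under its own path
  bool DependentAlive = true;
  assert(registerEvictor(BinaryForPath, "t.o",
                         [&] { DependentAlive = false; }));
  // No entry under this key: the guard skips registration.
  assert(!registerEvictor(BinaryForPath, "a.a(t.o)", [] {}));
  BinaryForPath["t.o"].evict();
  assert(!DependentAlive);
}
```

One consequence visible in the sketch: when the lookup misses, no evictor is registered, so the dependent entry is not removed when the binary is evicted. A guard of this kind avoids undefined behaviour but trades it for a silently skipped registration.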
|
||
Expected<ObjectFile *> | ||
LLVMSymbolizer::getOrCreateObject(const std::string &Path, | ||
const std::string &ArchName) { | ||
Binary *Bin; | ||
Expected<object::Binary *> | ||
LLVMSymbolizer::loadOrGetBinary(const std::string &Path) { | ||
auto Pair = BinaryForPath.emplace(Path, OwningBinary<Binary>()); | ||
if (!Pair.second) { | ||
Bin = Pair.first->second->getBinary(); | ||
recordAccess(Pair.first->second); | ||
} else { | ||
Expected<OwningBinary<Binary>> BinOrErr = createBinary(Path); | ||
if (!BinOrErr) | ||
return BinOrErr.takeError(); | ||
return Pair.first->second->getBinary(); | ||
} | ||
|
||
CachedBinary &CachedBin = Pair.first->second; | ||
CachedBin = std::move(BinOrErr.get()); | ||
CachedBin.pushEvictor([this, I = Pair.first]() { BinaryForPath.erase(I); }); | ||
LRUBinaries.push_back(CachedBin); | ||
CacheSize += CachedBin.size(); | ||
Bin = CachedBin->getBinary(); | ||
Expected<OwningBinary<Binary>> BinOrErr = createBinary(Path); | ||
if (!BinOrErr) { | ||
BinaryForPath.erase(Pair.first); | ||
|
||
return BinOrErr.takeError(); | ||
} | ||
|
||
if (!Bin) | ||
return static_cast<ObjectFile *>(nullptr); | ||
CachedBinary &CachedBin = Pair.first->second; | ||
CachedBin = std::move(*BinOrErr); | ||
CachedBin.pushEvictor([this, I = Pair.first]() { BinaryForPath.erase(I); }); | ||
LRUBinaries.push_back(CachedBin); | ||
CacheSize += CachedBin.size(); | ||
return CachedBin->getBinary(); | ||
} | ||
|
||
if (MachOUniversalBinary *UB = dyn_cast_or_null<MachOUniversalBinary>(Bin)) { | ||
auto I = ObjectForUBPathAndArch.find(std::make_pair(Path, ArchName)); | ||
if (I != ObjectForUBPathAndArch.end()) | ||
return I->second.get(); | ||
|
||
Expected<std::unique_ptr<ObjectFile>> ObjOrErr = | ||
UB->getMachOObjectForArch(ArchName); | ||
if (!ObjOrErr) { | ||
ObjectForUBPathAndArch.emplace(std::make_pair(Path, ArchName), | ||
std::unique_ptr<ObjectFile>()); | ||
return ObjOrErr.takeError(); | ||
Expected<ObjectFile *> LLVMSymbolizer::findOrCacheObject( | ||
const ArchiveCacheKey &Key, | ||
llvm::function_ref<Expected<std::unique_ptr<ObjectFile>>()> Loader, | ||
const std::string &PathForBinaryCache) { | ||
|
||
|
||
auto It = ObjectFileCache.find(Key); | ||
if (It != ObjectFileCache.end()) | ||
return It->second.get(); | ||
|
||
Expected<std::unique_ptr<ObjectFile>> ObjOrErr = Loader(); | ||
if (!ObjOrErr) { | ||
ObjectFileCache.emplace(Key, std::unique_ptr<ObjectFile>()); | ||
return ObjOrErr.takeError(); | ||
} | ||
|
||
ObjectFile *Res = ObjOrErr->get(); | ||
auto NewEntry = ObjectFileCache.emplace(Key, std::move(*ObjOrErr)); | ||
auto CacheIter = BinaryForPath.find(PathForBinaryCache); | ||
if (CacheIter != BinaryForPath.end()) | ||
CacheIter->second.pushEvictor( | ||
[this, Iter = NewEntry.first]() { ObjectFileCache.erase(Iter); }); | ||
return Res; | ||
} | ||
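The lookup-then-load shape of `findOrCacheObject` can be sketched outside LLVM as follows. The types here (`Key`, `Object`, `ObjectCache`) are hypothetical stand-ins, since the PR's `ArchiveCacheKey` definition is not part of the hunks shown, and error handling is reduced to a null pointer. The sketch keeps one detail from the function above: a failed load is stored as an empty entry, so the loader is not run again for the same key.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical three-part key: archive path, member name, architecture.
struct Key {
  std::string ArchivePath, MemberName, ArchName;
  bool operator<(const Key &O) const {
    return std::tie(ArchivePath, MemberName, ArchName) <
           std::tie(O.ArchivePath, O.MemberName, O.ArchName);
  }
};

// Hypothetical stand-in for an object file.
struct Object {
  int Id;
};

class ObjectCache {
  std::map<Key, std::unique_ptr<Object>> Cache;

public:
  // Return the cached object for K, or run Loader once and cache its result.
  // A failed load (null) is cached too, so later lookups return null
  // without calling Loader again.
  Object *findOrCache(const Key &K,
                      const std::function<std::unique_ptr<Object>()> &Loader) {
    auto It = Cache.find(K);
    if (It != Cache.end())
      return It->second.get();
    std::unique_ptr<Object> Obj = Loader();
    Object *Res = Obj.get();
    Cache.emplace(K, std::move(Obj));
    return Res;
  }
  std::size_t size() const { return Cache.size(); }
};

void usageExample() {
  ObjectCache C;
  int Calls = 0;
  auto Load = [&] {
    ++Calls;
    return std::make_unique<Object>(Object{7});
  };
  Key K32{"a.a", "m.o", "ppc"}, K64{"a.a", "m.o", "ppc64"};
  Object *First = C.findOrCache(K32, Load);
  assert(First && First->Id == 7);
  assert(C.findOrCache(K32, Load) == First && Calls == 1); // cache hit
  C.findOrCache(K64, Load); // same member name, different architecture
  assert(Calls == 2 && C.size() == 2);
}
```

Including the architecture in the key is what lets two same-named members of one archive coexist in the cache.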
|
||
Expected<ObjectFile *> LLVMSymbolizer::getOrCreateObjectFromArchive( | ||
StringRef ArchivePath, StringRef MemberName, StringRef ArchName) { | ||
Expected<object::Binary *> BinOrErr = loadOrGetBinary(ArchivePath.str()); | ||
if (!BinOrErr) | ||
return BinOrErr.takeError(); | ||
object::Binary *Bin = *BinOrErr; | ||
|
||
object::Archive *Archive = dyn_cast_if_present<object::Archive>(Bin); | ||
if (!Archive) | ||
return createStringError(std::errc::invalid_argument, | ||
|
||
"'%s' is not a valid archive", | ||
ArchivePath.str().c_str()); | ||
|
||
Error Err = Error::success(); | ||
// On AIX, archives can contain multiple members with the same name but | ||
// different types. We need to check all matches and find one that matches | ||
// both name and architecture. | ||
for (auto &Child : Archive->children(Err, /*SkipInternal=*/true)) { | ||
Expected<StringRef> NameOrErr = Child.getName(); | ||
if (!NameOrErr) | ||
continue; | ||
if (*NameOrErr == sys::path::filename(MemberName)) { | ||
|
||
Expected<std::unique_ptr<object::Binary>> MemberOrErr = | ||
Child.getAsBinary(); | ||
if (!MemberOrErr) | ||
continue; | ||
|
||
std::unique_ptr<object::Binary> Binary = std::move(*MemberOrErr); | ||
if (auto *Obj = dyn_cast<object::ObjectFile>(Binary.get())) { | ||
Triple::ArchType ObjArch = Obj->makeTriple().getArch(); | ||
Triple RequestedTriple; | ||
RequestedTriple.setArch(Triple::getArchTypeForLLVMName(ArchName)); | ||
if (ObjArch != RequestedTriple.getArch()) | ||
continue; | ||
|
||
ArchiveCacheKey CacheKey{ArchivePath.str(), MemberName.str(), | ||
ArchName.str()}; | ||
Expected<ObjectFile *> Res = findOrCacheObject( | ||
CacheKey, | ||
[O = std::unique_ptr<ObjectFile>( | ||
Obj)]() mutable -> Expected<std::unique_ptr<ObjectFile>> { | ||
return std::move(O); | ||
}, | ||
ArchivePath.str()); | ||
Binary.release(); | ||
return Res; | ||
} | ||
} | ||
} | ||
if (Err) | ||
return std::move(Err); | ||
return createStringError(std::errc::invalid_argument, | ||
"no matching member '%s' with arch '%s' in '%s'", | ||
MemberName.str().c_str(), ArchName.str().c_str(), | ||
ArchivePath.str().c_str()); | ||
} | ||
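The member search in `getOrCreateObjectFromArchive` matches on both name and architecture, because an AIX big archive may hold same-named 32-bit and 64-bit members. A simplified sketch of that selection follows, using a plain vector in place of `Archive::children`. `Member`, `baseName`, and `findMember` are hypothetical names, and architectures are compared as strings rather than as `Triple::ArchType` values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical archive member: a name plus the architecture it was built for.
struct Member {
  std::string Name;
  std::string Arch;
};

// Strip any directory prefix, mirroring the comparison against the
// filename part of the requested member name.
std::string_view baseName(std::string_view Path) {
  std::size_t Slash = Path.rfind('/');
  return Slash == std::string_view::npos ? Path : Path.substr(Slash + 1);
}

// Return the first member whose name and architecture both match, or
// nullptr when there is none.
const Member *findMember(const std::vector<Member> &Members,
                         std::string_view Name, std::string_view Arch) {
  std::string_view Wanted = baseName(Name);
  for (const Member &M : Members)
    if (M.Name == Wanted && M.Arch == Arch)
      return &M;
  return nullptr;
}

void usageExample() {
  std::vector<Member> Archive = {{"shr.o", "ppc"}, {"shr.o", "ppc64"}};
  assert(findMember(Archive, "shr.o", "ppc64") == &Archive[1]);
  assert(findMember(Archive, "shr.o", "x86_64") == nullptr);
}
```

Matching on the name alone would always return the first `shr.o`, which is why the loop keeps scanning after a name match with the wrong architecture.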
|
||
Expected<ObjectFile *> | ||
LLVMSymbolizer::getOrCreateObject(const std::string &Path, | ||
const std::string &ArchName) { | ||
// First check for archive(member) format - more efficient to check closing | ||
// paren first. | ||
size_t CloseParen = Path.rfind(')'); | ||
if (CloseParen != std::string::npos && CloseParen == Path.length() - 1) { | ||
|
||
size_t OpenParen = Path.rfind('(', CloseParen); | ||
if (OpenParen != std::string::npos) { | ||
StringRef ArchivePath = StringRef(Path).substr(0, OpenParen); | ||
StringRef MemberName = | ||
StringRef(Path).substr(OpenParen + 1, CloseParen - OpenParen - 1); | ||
|
||
return getOrCreateObjectFromArchive(ArchivePath, MemberName, ArchName); | ||
} | ||
ObjectFile *Res = ObjOrErr->get(); | ||
auto Pair = ObjectForUBPathAndArch.emplace(std::make_pair(Path, ArchName), | ||
std::move(ObjOrErr.get())); | ||
BinaryForPath.find(Path)->second.pushEvictor( | ||
[this, Iter = Pair.first]() { ObjectForUBPathAndArch.erase(Iter); }); | ||
return Res; | ||
} | ||
|
||
Expected<object::Binary *> BinOrErr = loadOrGetBinary(Path); | ||
if (!BinOrErr) | ||
return BinOrErr.takeError(); | ||
object::Binary *Bin = *BinOrErr; | ||
|
||
if (MachOUniversalBinary *UB = dyn_cast_or_null<MachOUniversalBinary>(Bin)) { | ||
ArchiveCacheKey CacheKey{Path, "", ArchName}; | ||
return findOrCacheObject( | ||
CacheKey, | ||
[UB, ArchName]() -> Expected<std::unique_ptr<ObjectFile>> { | ||
return UB->getMachOObjectForArch(ArchName); | ||
}, | ||
Path); | ||
} | ||
if (Bin->isObject()) { | ||
return cast<ObjectFile>(Bin); | ||
|
@@ -648,7 +741,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) { | |
|
||
auto I = Modules.find(ModuleName); | ||
if (I != Modules.end()) { | ||
recordAccess(BinaryForPath.find(BinaryName)->second); | ||
auto BinIter = BinaryForPath.find(BinaryName); | ||
if (BinIter != BinaryForPath.end()) | ||
Comment: Same comment: what's the purpose of the logic change and what test coverage is there for it?
Reply: Not answered.
Reply: In some iteration the [...]
Reply: What is the new code path that triggers the crash that you're talking about? In other words, why wasn't this a problem before and now is?
Reply: For a plain object file (t.o) the [...] Updating [...] Hope this helps |
||
recordAccess(BinIter->second); | ||
return I->second.get(); | ||
} | ||
|
||
|
@@ -701,7 +796,7 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) { | |
if (auto Err = loadDataForEXE(ReaderType, Objects.first->getFileName(), | ||
Session)) { | ||
Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>()); | ||
// Return along the PDB filename to provide more context | ||
// Return along the PDB filename to provide more context. | ||
|
||
return createFileError(PDBFileName, std::move(Err)); | ||
} | ||
Context.reset(new PDBContext(*CoffObject, std::move(Session))); | ||
|
@@ -716,9 +811,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) { | |
createModuleInfo(Objects.first, std::move(Context), ModuleName); | ||
if (ModuleOrErr) { | ||
auto I = Modules.find(ModuleName); | ||
BinaryForPath.find(BinaryName)->second.pushEvictor([this, I]() { | ||
Modules.erase(I); | ||
}); | ||
auto BinIter = BinaryForPath.find(BinaryName); | ||
if (BinIter != BinaryForPath.end()) | ||
Comment: Same comment: what's the purpose of the logic change and what test coverage is there for it?
Reply: In some iterations the [...]
Reply: As above, what is the new code path that causes the crash that you've discussed? I've looked at the code and I can't see it: [...]
Reply: For a plain object file (t.o) the [...] Updating [...] Hope this helps |
||
BinIter->second.pushEvictor([this, I]() { Modules.erase(I); }); | ||
} | ||
return ModuleOrErr; | ||
} | ||
|
@@ -742,7 +837,7 @@ LLVMSymbolizer::getOrCreateModuleInfo(const ObjectFile &Obj) { | |
Context = BTFContext::create(Obj); | ||
else | ||
Context = DWARFContext::create(Obj); | ||
// FIXME: handle COFF object with PDB info to use PDBContext | ||
// FIXME: handle COFF object with PDB info to use PDBContext. | ||
return createModuleInfo(&Obj, std::move(Context), ObjName); | ||
} | ||
|
||
|