-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[llvm-symbolizer] Recognize AIX big archive #150401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[llvm-symbolizer] Recognize AIX big archive #150401
Conversation
…des7/llvm-project into midhun7/big-archive-recognition
|
@llvm/pr-subscribers-llvm-binary-utilities Author: Midhunesh (midhuncodes7) ChangesThis PR implements big archive recognition by the symbolizer. Patch is 26.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150401.diff 10 Files Affected:
diff --git a/llvm/docs/CommandGuide/llvm-symbolizer.rst b/llvm/docs/CommandGuide/llvm-symbolizer.rst
index 2da1b2470a83e..8f3a132139fe9 100644
--- a/llvm/docs/CommandGuide/llvm-symbolizer.rst
+++ b/llvm/docs/CommandGuide/llvm-symbolizer.rst
@@ -535,16 +535,20 @@ MACH-O SPECIFIC OPTIONS
.. option:: --default-arch <arch>
If a binary contains object files for multiple architectures (e.g. it is a
- Mach-O universal binary), symbolize the object file for a given architecture.
- You can also specify the architecture by writing ``binary_name:arch_name`` in
- the input (see example below). If the architecture is not specified in either
- way, the address will not be symbolized. Defaults to empty string.
+ Mach-O universal binary or an AIX archive with architecture variants),
+ symbolize the object file for a given architecture. You can also specify
+ the architecture by writing ``binary_name:arch_name`` in the input (see
+ example below). For AIX archives, the format ``archive.a(member.o):arch``
+ is also supported. If the architecture is not specified in either way,
+ the address will not be symbolized. Defaults to empty string.
.. code-block:: console
$ cat addr.txt
/tmp/mach_universal_binary:i386 0x1f84
/tmp/mach_universal_binary:x86_64 0x100000f24
+ /tmp/archive.a(member.o):ppc 0x1000
+ /tmp/archive.a(member.o):ppc64 0x2000
$ llvm-symbolizer < addr.txt
_main
@@ -553,6 +557,12 @@ MACH-O SPECIFIC OPTIONS
_main
/tmp/source_x86_64.cc:8
+ _foo
+ /tmp/source_ppc.cc:12
+
+ _foo
+ /tmp/source_ppc64.cc:12
+
.. option:: --dsym-hint <path/to/file.dSYM>
If the debug info for a binary isn't present in the default location, look for
diff --git a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
index fb8f3d8af6b1b..5144085f3e23c 100644
--- a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
+++ b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
@@ -29,6 +29,12 @@
#include <utility>
#include <vector>
+#if defined(_AIX)
+# define SYMBOLIZE_AIX 1
+#else
+# define SYMBOLIZE_AIX 0
+#endif
+
namespace llvm {
namespace object {
class ELFObjectFileBase;
@@ -202,6 +208,12 @@ class LLVMSymbolizer {
Expected<ObjectFile *> getOrCreateObject(const std::string &Path,
const std::string &ArchName);
+ /// Return a pointer to object file at specified path, for a specified
+ /// architecture that is present inside an archive file
+ Expected<ObjectFile *> getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName);
+
/// Update the LRU cache order when a binary is accessed.
void recordAccess(CachedBinary &Bin);
@@ -226,6 +238,20 @@ class LLVMSymbolizer {
std::map<std::pair<std::string, std::string>, std::unique_ptr<ObjectFile>>
ObjectForUBPathAndArch;
+ struct ArchiveCacheKey {
+ std::string ArchivePath; // Storage for StringRef
+ std::string MemberName; // Storage for StringRef
+ std::string ArchName; // Storage for StringRef
+
+ // Required for map comparison
+ bool operator<(const ArchiveCacheKey &Other) const {
+ return std::tie(ArchivePath, MemberName, ArchName) <
+ std::tie(Other.ArchivePath, Other.MemberName, Other.ArchName);
+ }
+ };
+
+ std::map<ArchiveCacheKey, std::unique_ptr<ObjectFile>> ObjectForArchivePathAndArch;
+
Options Opts;
std::unique_ptr<BuildIDFetcher> BIDFetcher;
diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 56527719da51f..6dddc3a709239 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -33,6 +33,7 @@
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/Path.h"
+#include "llvm/Object/Archive.h"
#include <cassert>
#include <cstring>
@@ -286,6 +287,7 @@ LLVMSymbolizer::findSymbol(ArrayRef<uint8_t> BuildID, StringRef Symbol,
void LLVMSymbolizer::flush() {
ObjectForUBPathAndArch.clear();
+ ObjectForArchivePathAndArch.clear();
LRUBinaries.clear();
CacheSize = 0;
BinaryForPath.clear();
@@ -321,7 +323,7 @@ bool checkFileCRC(StringRef Path, uint32_t CRCHash) {
bool getGNUDebuglinkContents(const ObjectFile *Obj, std::string &DebugName,
uint32_t &CRCHash) {
- if (!Obj)
+ if (!Obj || !isa<ObjectFile>(Obj))
return false;
for (const SectionRef &Section : Obj->sections()) {
StringRef Name;
@@ -557,19 +559,101 @@ LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path,
if (!DbgObj)
DbgObj = Obj;
ObjectPair Res = std::make_pair(Obj, DbgObj);
- std::string DbgObjPath = DbgObj->getFileName().str();
auto Pair =
ObjectPairForPathArch.emplace(std::make_pair(Path, ArchName), Res);
- BinaryForPath.find(DbgObjPath)->second.pushEvictor([this, I = Pair.first]() {
+ std::string DbgObjPath = DbgObj->getFileName().str();
+ auto BinIter = BinaryForPath.find(DbgObjPath);
+ if (BinIter != BinaryForPath.end()) {
+ BinIter->second.pushEvictor([this, I = Pair.first]() {
ObjectPairForPathArch.erase(I);
- });
+ });
+ }
return Res;
}
+Expected<ObjectFile *> LLVMSymbolizer::getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName) {
+ Binary *Bin = nullptr;
+ auto Pair = BinaryForPath.emplace(ArchivePath.str(), OwningBinary<Binary>());
+ if (!Pair.second) {
+ Bin = Pair.first->second->getBinary();
+ recordAccess(Pair.first->second);
+ } else {
+ Expected<OwningBinary<Binary>> ArchiveOrErr = createBinary(ArchivePath);
+ if (!ArchiveOrErr) {
+ return ArchiveOrErr.takeError();
+ }
+
+ CachedBinary &CachedBin = Pair.first->second;
+ CachedBin = std::move(ArchiveOrErr.get());
+ CachedBin.pushEvictor([this, I = Pair.first]() { BinaryForPath.erase(I); });
+ LRUBinaries.push_back(CachedBin);
+ CacheSize += CachedBin.size();
+ Bin = CachedBin->getBinary();
+ }
+
+ if (!Bin || !isa<object::Archive>(Bin))
+ return errorCodeToError(object_error::invalid_file_type);
+
+ object::Archive *Archive = cast<object::Archive>(Bin);
+ Error Err = Error::success();
+
+ // On AIX, archives can contain multiple members with same name but different types
+ // We need to check all matches and find one that matches both name and architecture
+ for (auto &Child : Archive->children(Err, /*SkipInternal=*/true)) {
+ Expected<StringRef> NameOrErr = Child.getName();
+ if (!NameOrErr)
+ continue;
+ if (*NameOrErr == llvm::sys::path::filename(MemberName)) {
+ Expected<std::unique_ptr<object::Binary>> MemberOrErr = Child.getAsBinary();
+ if (!MemberOrErr)
+ continue;
+
+ std::unique_ptr<object::Binary> Binary = std::move(*MemberOrErr);
+ if (auto *Obj = dyn_cast<object::ObjectFile>(Binary.get())) {
+#if defined(_AIX)
+ Triple::ArchType ObjArch = Obj->makeTriple().getArch();
+ Triple RequestedTriple;
+ RequestedTriple.setArch(Triple::getArchTypeForLLVMName(ArchName));
+ if (ObjArch != RequestedTriple.getArch())
+ continue;
+#endif
+ ArchiveCacheKey CacheKey{ArchivePath.str(), MemberName.str(), ArchName};
+ auto I = ObjectForArchivePathAndArch.find(CacheKey);
+ if (I != ObjectForArchivePathAndArch.end())
+ return I->second.get();
+
+ auto CachedObj = std::unique_ptr<ObjectFile>(Obj);
+ auto NewEntry = ObjectForArchivePathAndArch.emplace(
+ CacheKey, std::move(CachedObj));
+ Binary.release();
+ BinaryForPath.find(ArchivePath.str())->second.pushEvictor(
+ [this, Iter = NewEntry.first]() { ObjectForArchivePathAndArch.erase(Iter); });
+ return NewEntry.first->second.get();
+ }
+ }
+ }
+ if (Err)
+ return std::move(Err);
+ return errorCodeToError(object_error::arch_not_found);
+}
+
Expected<ObjectFile *>
LLVMSymbolizer::getOrCreateObject(const std::string &Path,
const std::string &ArchName) {
- Binary *Bin;
+ // First check for archive(member) format - more efficient to check closing paren first
+ size_t CloseParen = Path.rfind(')');
+ if (CloseParen != std::string::npos && CloseParen == Path.length() - 1) {
+ size_t OpenParen = Path.rfind('(', CloseParen);
+ if (OpenParen != std::string::npos) {
+ StringRef ArchivePath = StringRef(Path).substr(0, OpenParen);
+ StringRef MemberName = StringRef(Path).substr(OpenParen + 1, CloseParen - OpenParen - 1);
+ return getOrCreateObjectFromArchive(ArchivePath, MemberName, ArchName);
+ }
+ }
+
+ Binary *Bin = nullptr;
auto Pair = BinaryForPath.emplace(Path, OwningBinary<Binary>());
if (!Pair.second) {
Bin = Pair.first->second->getBinary();
@@ -648,7 +732,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
auto I = Modules.find(ModuleName);
if (I != Modules.end()) {
- recordAccess(BinaryForPath.find(BinaryName)->second);
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ recordAccess(BinIter->second);
return I->second.get();
}
@@ -716,7 +802,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
createModuleInfo(Objects.first, std::move(Context), ModuleName);
if (ModuleOrErr) {
auto I = Modules.find(ModuleName);
- BinaryForPath.find(BinaryName)->second.pushEvictor([this, I]() {
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ BinIter->second.pushEvictor([this, I]() {
Modules.erase(I);
});
}
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-32.yaml b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
new file mode 100644
index 0000000000000..2080607a1a88c
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
@@ -0,0 +1,119 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1DF
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xA0
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0x64
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x1C
+ Size: 0xC
+ FileOffsetToData: 0x80
+ FileOffsetToRelocations: 0x8C
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000002800000000'
+ Relocations:
+ - Address: 0x1C
+ Symbol: 0x5
+ Info: 0x1F
+ Type: 0x0
+ - Address: 0x20
+ Symbol: 0x9
+ Info: 0x1F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (145c02cece3630765e6412e6820bc446ddb4e138)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 25
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 3
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: foo
+ Value: 0x1C
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_DS
+ SectionOrLength: 12
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: TOC
+ Value: 0x28
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLength: 0
+ StabInfoIndex: 0
+ StabSectNum: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-64.yaml b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
new file mode 100644
index 0000000000000..9bbb1107555e0
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
@@ -0,0 +1,115 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1F7
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xF8
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0xA8
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x20
+ Size: 0x18
+ FileOffsetToData: 0xC4
+ FileOffsetToRelocations: 0xDC
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000000000000000000000380000000000000000'
+ Relocations:
+ - Address: 0x20
+ Symbol: 0x5
+ Info: 0x3F
+ Type: 0x0
+ - Address: 0x28
+ Symbol: 0x9
+ Info: 0x3F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (5ca72bc8d2e87445649eab1825dffd2a047440ba)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 25
+ SectionOrLengthHi: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 3
+ SectionOrLengthHi: 0
+ - Name: foo
+ Value: 0x20
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 3
+ StorageMappingClass: XMC_DS
+ SectionOrLengthLo: 24
+ SectionOrLengthHi: 0
+ - Name: TOC
+ Value: 0x38
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLengthLo: 0
+ SectionOrLengthHi: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
new file mode 100644
index 0000000000000..8e5c929e82878
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
@@ -0,0 +1,68 @@
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_REL
+ Machine: EM_PPC64
+ Flags: [ ]
+ SectionHeaderStringTable: .strtab
+Sections:
+ - Name: .text
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
+ AddressAlign: 0x10
+ Content: '2000804E000000000000000000000000'
+ - Name: .comment
+ Type: SHT_PROGBITS
+ Flags: [ SHF_MERGE, SHF_STRINGS ]
+ AddressAlign: 0x1
+ EntSize: 0x1
+ Content: 0049424D204F70656E20584C20432F432B2B20666F72204C696E7578206F6E20506F7765722031372E312E322028353732352D4337322C20353736352D4A3230292C2076657273696F6E2031372E312E322E302C20636C616E672076657273696F6E2032312E302E306769742028676974406769746875622E69626D2E636F6D3A636F6D70696C65722F6C6C766D2D70726F6A6563742E67697420653165653233663838333532623937333563363735386661396335653035313366626234393361322900
+ - Name: .note.GNU-stack
+ Type: SHT_PROGBITS
+ AddressAlign: 0x1
+ - Name: .eh_frame
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC ]
+ AddressAlign: 0x8
+ Content: 1000000000000000017A5200047841011B0C01001000000018000000000000001000000000000000
+ - Name: .rela.eh_frame
+ Type: SHT_RELA
+ Flags: [ SHF_INFO_LINK ]
+ Link: .symtab
+ AddressAlign: 0x8
+ Info: .eh_frame
+ Relocations:
+ - Offset: 0x1C
+ Symbol: .text
+ Type: R_PPC64_REL32
+ - Name: .llvm_addrsig
+ Type: SHT_LLVM_ADDRSIG
+ Flags: [ SHF_EXCLUDE ]
+ Link: .symtab
+ AddressAlign: 0x1
+ Offset: 0x1B8
+ Symbols: [ ]
+ - Type: SectionHeaderTable
+ Sections:
+ - Name: .strtab
+ - Name: .text
+ - Name: .comment
+ - Name: .note.GNU-stack
+ - Name: .eh_frame
+ - Name: .rela.eh_frame
+ - Name: .llvm_addrsig
+ - Name: .symtab
+Symbols:
+ - Name: foo1.c
+ Type: STT_FILE
+ Index: SHN_ABS
+ - Name: .text
+ Type: STT_SECTION
+ Section: .text
+ - Name: foo1
+ Type: STT_FUNC
+ Section: .text
+ Binding: STB_GLOBAL
+ Size: 0x10
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml
new file mode 100644
index 00000000...
[truncated]
|
|
@llvm/pr-subscribers-debuginfo Author: Midhunesh (midhuncodes7) ChangesThis PR implements big archive recognition by the symbolizer. Patch is 26.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150401.diff 10 Files Affected:
diff --git a/llvm/docs/CommandGuide/llvm-symbolizer.rst b/llvm/docs/CommandGuide/llvm-symbolizer.rst
index 2da1b2470a83e..8f3a132139fe9 100644
--- a/llvm/docs/CommandGuide/llvm-symbolizer.rst
+++ b/llvm/docs/CommandGuide/llvm-symbolizer.rst
@@ -535,16 +535,20 @@ MACH-O SPECIFIC OPTIONS
.. option:: --default-arch <arch>
If a binary contains object files for multiple architectures (e.g. it is a
- Mach-O universal binary), symbolize the object file for a given architecture.
- You can also specify the architecture by writing ``binary_name:arch_name`` in
- the input (see example below). If the architecture is not specified in either
- way, the address will not be symbolized. Defaults to empty string.
+ Mach-O universal binary or an AIX archive with architecture variants),
+ symbolize the object file for a given architecture. You can also specify
+ the architecture by writing ``binary_name:arch_name`` in the input (see
+ example below). For AIX archives, the format ``archive.a(member.o):arch``
+ is also supported. If the architecture is not specified in either way,
+ the address will not be symbolized. Defaults to empty string.
.. code-block:: console
$ cat addr.txt
/tmp/mach_universal_binary:i386 0x1f84
/tmp/mach_universal_binary:x86_64 0x100000f24
+ /tmp/archive.a(member.o):ppc 0x1000
+ /tmp/archive.a(member.o):ppc64 0x2000
$ llvm-symbolizer < addr.txt
_main
@@ -553,6 +557,12 @@ MACH-O SPECIFIC OPTIONS
_main
/tmp/source_x86_64.cc:8
+ _foo
+ /tmp/source_ppc.cc:12
+
+ _foo
+ /tmp/source_ppc64.cc:12
+
.. option:: --dsym-hint <path/to/file.dSYM>
If the debug info for a binary isn't present in the default location, look for
diff --git a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
index fb8f3d8af6b1b..5144085f3e23c 100644
--- a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
+++ b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
@@ -29,6 +29,12 @@
#include <utility>
#include <vector>
+#if defined(_AIX)
+# define SYMBOLIZE_AIX 1
+#else
+# define SYMBOLIZE_AIX 0
+#endif
+
namespace llvm {
namespace object {
class ELFObjectFileBase;
@@ -202,6 +208,12 @@ class LLVMSymbolizer {
Expected<ObjectFile *> getOrCreateObject(const std::string &Path,
const std::string &ArchName);
+ /// Return a pointer to object file at specified path, for a specified
+ /// architecture that is present inside an archive file
+ Expected<ObjectFile *> getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName);
+
/// Update the LRU cache order when a binary is accessed.
void recordAccess(CachedBinary &Bin);
@@ -226,6 +238,20 @@ class LLVMSymbolizer {
std::map<std::pair<std::string, std::string>, std::unique_ptr<ObjectFile>>
ObjectForUBPathAndArch;
+ struct ArchiveCacheKey {
+ std::string ArchivePath; // Storage for StringRef
+ std::string MemberName; // Storage for StringRef
+ std::string ArchName; // Storage for StringRef
+
+ // Required for map comparison
+ bool operator<(const ArchiveCacheKey &Other) const {
+ return std::tie(ArchivePath, MemberName, ArchName) <
+ std::tie(Other.ArchivePath, Other.MemberName, Other.ArchName);
+ }
+ };
+
+ std::map<ArchiveCacheKey, std::unique_ptr<ObjectFile>> ObjectForArchivePathAndArch;
+
Options Opts;
std::unique_ptr<BuildIDFetcher> BIDFetcher;
diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 56527719da51f..6dddc3a709239 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -33,6 +33,7 @@
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/Path.h"
+#include "llvm/Object/Archive.h"
#include <cassert>
#include <cstring>
@@ -286,6 +287,7 @@ LLVMSymbolizer::findSymbol(ArrayRef<uint8_t> BuildID, StringRef Symbol,
void LLVMSymbolizer::flush() {
ObjectForUBPathAndArch.clear();
+ ObjectForArchivePathAndArch.clear();
LRUBinaries.clear();
CacheSize = 0;
BinaryForPath.clear();
@@ -321,7 +323,7 @@ bool checkFileCRC(StringRef Path, uint32_t CRCHash) {
bool getGNUDebuglinkContents(const ObjectFile *Obj, std::string &DebugName,
uint32_t &CRCHash) {
- if (!Obj)
+ if (!Obj || !isa<ObjectFile>(Obj))
return false;
for (const SectionRef &Section : Obj->sections()) {
StringRef Name;
@@ -557,19 +559,101 @@ LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path,
if (!DbgObj)
DbgObj = Obj;
ObjectPair Res = std::make_pair(Obj, DbgObj);
- std::string DbgObjPath = DbgObj->getFileName().str();
auto Pair =
ObjectPairForPathArch.emplace(std::make_pair(Path, ArchName), Res);
- BinaryForPath.find(DbgObjPath)->second.pushEvictor([this, I = Pair.first]() {
+ std::string DbgObjPath = DbgObj->getFileName().str();
+ auto BinIter = BinaryForPath.find(DbgObjPath);
+ if (BinIter != BinaryForPath.end()) {
+ BinIter->second.pushEvictor([this, I = Pair.first]() {
ObjectPairForPathArch.erase(I);
- });
+ });
+ }
return Res;
}
+Expected<ObjectFile *> LLVMSymbolizer::getOrCreateObjectFromArchive(StringRef ArchivePath,
+ StringRef MemberName,
+ const std::string &ArchName) {
+ Binary *Bin = nullptr;
+ auto Pair = BinaryForPath.emplace(ArchivePath.str(), OwningBinary<Binary>());
+ if (!Pair.second) {
+ Bin = Pair.first->second->getBinary();
+ recordAccess(Pair.first->second);
+ } else {
+ Expected<OwningBinary<Binary>> ArchiveOrErr = createBinary(ArchivePath);
+ if (!ArchiveOrErr) {
+ return ArchiveOrErr.takeError();
+ }
+
+ CachedBinary &CachedBin = Pair.first->second;
+ CachedBin = std::move(ArchiveOrErr.get());
+ CachedBin.pushEvictor([this, I = Pair.first]() { BinaryForPath.erase(I); });
+ LRUBinaries.push_back(CachedBin);
+ CacheSize += CachedBin.size();
+ Bin = CachedBin->getBinary();
+ }
+
+ if (!Bin || !isa<object::Archive>(Bin))
+ return errorCodeToError(object_error::invalid_file_type);
+
+ object::Archive *Archive = cast<object::Archive>(Bin);
+ Error Err = Error::success();
+
+ // On AIX, archives can contain multiple members with same name but different types
+ // We need to check all matches and find one that matches both name and architecture
+ for (auto &Child : Archive->children(Err, /*SkipInternal=*/true)) {
+ Expected<StringRef> NameOrErr = Child.getName();
+ if (!NameOrErr)
+ continue;
+ if (*NameOrErr == llvm::sys::path::filename(MemberName)) {
+ Expected<std::unique_ptr<object::Binary>> MemberOrErr = Child.getAsBinary();
+ if (!MemberOrErr)
+ continue;
+
+ std::unique_ptr<object::Binary> Binary = std::move(*MemberOrErr);
+ if (auto *Obj = dyn_cast<object::ObjectFile>(Binary.get())) {
+#if defined(_AIX)
+ Triple::ArchType ObjArch = Obj->makeTriple().getArch();
+ Triple RequestedTriple;
+ RequestedTriple.setArch(Triple::getArchTypeForLLVMName(ArchName));
+ if (ObjArch != RequestedTriple.getArch())
+ continue;
+#endif
+ ArchiveCacheKey CacheKey{ArchivePath.str(), MemberName.str(), ArchName};
+ auto I = ObjectForArchivePathAndArch.find(CacheKey);
+ if (I != ObjectForArchivePathAndArch.end())
+ return I->second.get();
+
+ auto CachedObj = std::unique_ptr<ObjectFile>(Obj);
+ auto NewEntry = ObjectForArchivePathAndArch.emplace(
+ CacheKey, std::move(CachedObj));
+ Binary.release();
+ BinaryForPath.find(ArchivePath.str())->second.pushEvictor(
+ [this, Iter = NewEntry.first]() { ObjectForArchivePathAndArch.erase(Iter); });
+ return NewEntry.first->second.get();
+ }
+ }
+ }
+ if (Err)
+ return std::move(Err);
+ return errorCodeToError(object_error::arch_not_found);
+}
+
Expected<ObjectFile *>
LLVMSymbolizer::getOrCreateObject(const std::string &Path,
const std::string &ArchName) {
- Binary *Bin;
+ // First check for archive(member) format - more efficient to check closing paren first
+ size_t CloseParen = Path.rfind(')');
+ if (CloseParen != std::string::npos && CloseParen == Path.length() - 1) {
+ size_t OpenParen = Path.rfind('(', CloseParen);
+ if (OpenParen != std::string::npos) {
+ StringRef ArchivePath = StringRef(Path).substr(0, OpenParen);
+ StringRef MemberName = StringRef(Path).substr(OpenParen + 1, CloseParen - OpenParen - 1);
+ return getOrCreateObjectFromArchive(ArchivePath, MemberName, ArchName);
+ }
+ }
+
+ Binary *Bin = nullptr;
auto Pair = BinaryForPath.emplace(Path, OwningBinary<Binary>());
if (!Pair.second) {
Bin = Pair.first->second->getBinary();
@@ -648,7 +732,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
auto I = Modules.find(ModuleName);
if (I != Modules.end()) {
- recordAccess(BinaryForPath.find(BinaryName)->second);
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ recordAccess(BinIter->second);
return I->second.get();
}
@@ -716,7 +802,9 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
createModuleInfo(Objects.first, std::move(Context), ModuleName);
if (ModuleOrErr) {
auto I = Modules.find(ModuleName);
- BinaryForPath.find(BinaryName)->second.pushEvictor([this, I]() {
+ auto BinIter = BinaryForPath.find(BinaryName);
+ if (BinIter != BinaryForPath.end())
+ BinIter->second.pushEvictor([this, I]() {
Modules.erase(I);
});
}
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-32.yaml b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
new file mode 100644
index 0000000000000..2080607a1a88c
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-32.yaml
@@ -0,0 +1,119 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1DF
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xA0
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0x64
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x1C
+ Size: 0xC
+ FileOffsetToData: 0x80
+ FileOffsetToRelocations: 0x8C
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000002800000000'
+ Relocations:
+ - Address: 0x1C
+ Symbol: 0x5
+ Info: 0x1F
+ Type: 0x0
+ - Address: 0x20
+ Symbol: 0x9
+ Info: 0x1F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (145c02cece3630765e6412e6820bc446ddb4e138)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 25
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLength: 3
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: foo
+ Value: 0x1C
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_DS
+ SectionOrLength: 12
+ StabInfoIndex: 0
+ StabSectNum: 0
+ - Name: TOC
+ Value: 0x28
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLength: 0
+ StabInfoIndex: 0
+ StabSectNum: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-64.yaml b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
new file mode 100644
index 0000000000000..9bbb1107555e0
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-64.yaml
@@ -0,0 +1,115 @@
+--- !XCOFF
+FileHeader:
+ MagicNumber: 0x1F7
+ NumberOfSections: 2
+ CreationTime: 0
+ OffsetToSymbolTable: 0xF8
+ EntriesInSymbolTable: 11
+ AuxiliaryHeaderSize: 0
+ Flags: 0x0
+Sections:
+ - Name: .text
+ Address: 0x0
+ Size: 0x1C
+ FileOffsetToData: 0xA8
+ FileOffsetToRelocations: 0x0
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x0
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_TEXT ]
+ SectionData: 4E800020000000000009204000000001000000040003666F6F000000
+ - Name: .data
+ Address: 0x20
+ Size: 0x18
+ FileOffsetToData: 0xC4
+ FileOffsetToRelocations: 0xDC
+ FileOffsetToLineNumbers: 0x0
+ NumberOfRelocations: 0x2
+ NumberOfLineNumbers: 0x0
+ Flags: [ STYP_DATA ]
+ SectionData: '000000000000000000000000000000380000000000000000'
+ Relocations:
+ - Address: 0x20
+ Symbol: 0x5
+ Info: 0x3F
+ Type: 0x0
+ - Address: 0x28
+ Symbol: 0x9
+ Info: 0x3F
+ Type: 0x0
+Symbols:
+ - Name: .file
+ Value: 0x0
+ Section: N_DEBUG
+ Type: 0x18
+ StorageClass: C_FILE
+ NumberOfAuxEntries: 2
+ AuxEntries:
+ - Type: AUX_FILE
+ FileNameOrString: foo.c
+ FileStringType: XFT_FN
+ - Type: AUX_FILE
+ FileNameOrString: 'IBM Open XL C/C++ for AIX 17.1.3 (5725-C72, 5765-J18), version 17.1.3.0, LLVM version 21.0.0git (5ca72bc8d2e87445649eab1825dffd2a047440ba)'
+ FileStringType: XFT_CV
+ - Name: ''
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 5
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 25
+ SectionOrLengthHi: 0
+ - Name: .foo
+ Value: 0x0
+ Section: .text
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_LD
+ SymbolAlignment: 0
+ StorageMappingClass: XMC_PR
+ SectionOrLengthLo: 3
+ SectionOrLengthHi: 0
+ - Name: foo
+ Value: 0x20
+ Section: .data
+ Type: 0x0
+ StorageClass: C_EXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 3
+ StorageMappingClass: XMC_DS
+ SectionOrLengthLo: 24
+ SectionOrLengthHi: 0
+ - Name: TOC
+ Value: 0x38
+ Section: .data
+ Type: 0x0
+ StorageClass: C_HIDEXT
+ NumberOfAuxEntries: 1
+ AuxEntries:
+ - Type: AUX_CSECT
+ ParameterHashIndex: 0
+ TypeChkSectNum: 0
+ SymbolType: XTY_SD
+ SymbolAlignment: 2
+ StorageMappingClass: XMC_TC0
+ SectionOrLengthLo: 0
+ SectionOrLengthHi: 0
+StringTable: {}
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
new file mode 100644
index 0000000000000..8e5c929e82878
--- /dev/null
+++ b/llvm/test/DebugInfo/Inputs/big-archive-elf-1.yaml
@@ -0,0 +1,68 @@
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_REL
+ Machine: EM_PPC64
+ Flags: [ ]
+ SectionHeaderStringTable: .strtab
+Sections:
+ - Name: .text
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
+ AddressAlign: 0x10
+ Content: '2000804E000000000000000000000000'
+ - Name: .comment
+ Type: SHT_PROGBITS
+ Flags: [ SHF_MERGE, SHF_STRINGS ]
+ AddressAlign: 0x1
+ EntSize: 0x1
+ Content: 0049424D204F70656E20584C20432F432B2B20666F72204C696E7578206F6E20506F7765722031372E312E322028353732352D4337322C20353736352D4A3230292C2076657273696F6E2031372E312E322E302C20636C616E672076657273696F6E2032312E302E306769742028676974406769746875622E69626D2E636F6D3A636F6D70696C65722F6C6C766D2D70726F6A6563742E67697420653165653233663838333532623937333563363735386661396335653035313366626234393361322900
+ - Name: .note.GNU-stack
+ Type: SHT_PROGBITS
+ AddressAlign: 0x1
+ - Name: .eh_frame
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC ]
+ AddressAlign: 0x8
+ Content: 1000000000000000017A5200047841011B0C01001000000018000000000000001000000000000000
+ - Name: .rela.eh_frame
+ Type: SHT_RELA
+ Flags: [ SHF_INFO_LINK ]
+ Link: .symtab
+ AddressAlign: 0x8
+ Info: .eh_frame
+ Relocations:
+ - Offset: 0x1C
+ Symbol: .text
+ Type: R_PPC64_REL32
+ - Name: .llvm_addrsig
+ Type: SHT_LLVM_ADDRSIG
+ Flags: [ SHF_EXCLUDE ]
+ Link: .symtab
+ AddressAlign: 0x1
+ Offset: 0x1B8
+ Symbols: [ ]
+ - Type: SectionHeaderTable
+ Sections:
+ - Name: .strtab
+ - Name: .text
+ - Name: .comment
+ - Name: .note.GNU-stack
+ - Name: .eh_frame
+ - Name: .rela.eh_frame
+ - Name: .llvm_addrsig
+ - Name: .symtab
+Symbols:
+ - Name: foo1.c
+ Type: STT_FILE
+ Index: SHN_ABS
+ - Name: .text
+ Type: STT_SECTION
+ Section: .text
+ - Name: foo1
+ Type: STT_FUNC
+ Section: .text
+ Binding: STB_GLOBAL
+ Size: 0x10
+...
diff --git a/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml b/llvm/test/DebugInfo/Inputs/big-archive-elf-2.yaml
new file mode 100644
index 00000000...
[truncated]
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
| LLVMSymbolizer::getOrCreateObject(const std::string &Path, | ||
| const std::string &ArchName) { | ||
| Binary *Bin; | ||
| // First check for archive(member) format - more efficient to check closing paren first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // First check for archive(member) format - more efficient to check closing paren first | |
| // First check for archive(member) format - more efficient to check closing paren first. |
Nit.
This behaviour change isn't guarded to Big Archive format as stated in the documentation. It looks to me like a UNIX archive would be impacted by this too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping. The second part of this comment hasn't been addressed as far as I can tell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we intend to allow naming archive members in non-AIX contexts (as that may be helpful). The documentation will be made more generic like how it used to be before.
…des7/llvm-project into midhun7/big-archive-recognition
|
I'm still catching up on reviews left over from my time off last week. I'll be taking a look again at this hopefully tomorrow. |
jh7370
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't mark conversation threads that I have initiated as resolved. I need the "resolved" status to help me spot things I have and have not checked, because it's not unheard of for people to mark things as resolved without actually addressing them. You can find more context on https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178.
|
@jh7370 any updates from your end? |
Sorry, I've had some time off and have been unwell too. I'm starting to catch up on things now, but it'll likely be next week before I can take a look. |
| ArchivePath.str().c_str()); | ||
|
|
||
| Error Err = Error::success(); | ||
| // On AIX, archives can contain multiple members with the same name but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor point: it's quite straightforward to create an archive with multiple members of the same name, using llvm-ar q. I don't think this impacts things here specifically though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed the comment as it is not specific
| std::map<std::pair<std::string, std::string>, std::unique_ptr<ObjectFile>> | ||
| ObjectForUBPathAndArch; | ||
| struct ArchiveCacheKey { | ||
| std::string ArchivePath; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: references to Archive in the names here are misleading, because they might be referring to a Mach-O universal binary, if I'm not mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed as container to be more generic
| }); | ||
| std::string DbgObjPath = DbgObj->getFileName().str(); | ||
| auto BinIter = BinaryForPath.find(DbgObjPath); | ||
| if (BinIter != BinaryForPath.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given it some more thought and I'm fairly convinced the problem is that you're storing the wrong thing in BinaryForPath when dealing with archive members. Specifically, this should store the final binary (i.e. the archive child binary) and should be keyed on t.a(t.o) (or equivalent).
This will require some reorganisation of the code from what you've done, so that loadOrGetBinary knows that it's expecting an archive and fetches the child out of that.
Does this make sense?
|
@jh7370 requesting your inputs on the recent changes |
|
@jh7370 any updates on this review will be much appreciated |
| std::map<std::string, CachedBinary, std::less<>> BinaryForPath; | ||
|
|
||
| /// Store the archive path for the object file | ||
| std::unordered_map<const object::ObjectFile *, std::string> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please review the section on map types in the LLVM programmer's manual.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used DenseMap instead
| }; | ||
|
|
||
| /// Parsed object file for path/object/architecture pair, where | ||
| /// "path" refers to Mach-O universal binary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment up-to-date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the comment
| /// Helper function to load binary. | ||
| Expected<object::Binary *> | ||
| loadOrGetBinary(const std::string &OpenPath, | ||
| const std::string *FullPathKeyOpt = nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use std::optional for optional types, not pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used std::optional
| /// "path" refers to Mach-O universal binary. | ||
| std::map<ContainerCacheKey, std::unique_ptr<ObjectFile>> ObjectFileCache; | ||
|
|
||
| /// Helper function to load binary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly how is this comment aiding readability? What does it add that the function name itself doesn't add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed comment
| loadOrGetBinary(const std::string &OpenPath, | ||
| const std::string *FullPathKeyOpt = nullptr); | ||
|
|
||
| /// Helper function to find and get object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above re. comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed comment
| /// Contains parsed binary for each path, or parsing error. | ||
| std::map<std::string, CachedBinary, std::less<>> BinaryForPath; | ||
|
|
||
| /// Store the archive path for the object file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// Store the archive path for the object file | |
| /// Store the archive path for the object file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
| if (It != ObjectToArchivePath.end()) { | ||
| StringRef ArchivePath = It->second; | ||
| StringRef MemberName = sys::path::filename(DbgObj->getFileName()); | ||
| FullDbgObjKey = (Twine(ArchivePath) + "(" + MemberName + ")").str(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to wrap ArchivePath in a Twine constructor explicitly. The result of adding a StringRef and a string literal is a Twine anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed Twine
| Expected<object::Binary *> | ||
| LLVMSymbolizer::loadOrGetBinary(const std::string &OpenPath, | ||
| const std::string *FullPathKeyOpt) { | ||
| // If no separate cache key is provided, use the open path itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is an "open path"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
| CacheSize += CachedBin.size(); | ||
| Bin = CachedBin->getBinary(); | ||
| Expected<OwningBinary<Binary>> BinOrErr = createBinary(OpenPath); | ||
| if (!BinOrErr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're still using the term "open path" in your variable names. Please don't make up terminology.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed and replaced as ArchivePathKey
🐧 Linux x64 Test Results
|
jh7370
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code is mostly looking okay now. I'll look at the tests in my next pass.
c74382e to
d16b8d4
Compare
| @@ -247,8 +246,8 @@ class LLVMSymbolizer { | |||
| }; | |||
|
|
|||
| /// Parsed object file for each path/member/architecture triple. | |||
| /// Used to cache objects extracted from containers (Mach-O universal | |||
| /// binaries, archives). | |||
| /// Used to cache objects extracted from containers (e.g: Mach-O | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "e.g." or "e.g.," (depending on whether you follow American or British English style respectively, I believe American is more common in LLVM). ":" is not appropriate here.
| @@ -0,0 +1,119 @@ | |||
| --- !XCOFF | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming the file "big-archive" doesn't make sense, because the output file is an XCOFF file, not an archive file. Just call it xcoff-32.yaml or similar.
Do we really need so much bloat in the file? The idea of YAML inputs is generally to have small, focused inputs where we can easily see what's relevant to the test. From what I can tell, all you need to have is the minimal file header info, one symbol, and a section for that symbol to be in (and even the section is potentially unnecessary).
If we can minimise the size, it would then make more sense to put the files inline in the test, rather than separate, to keep everything related to a test together.
| @@ -0,0 +1,68 @@ | |||
| --- !ELF | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments apply to the ELF file as the XCOFF file. Further, we should be able to drop the second file entirely and replace it with parameterisation using yaml2obj's -D option to choose the class/machine values as needed. You can see plenty of examples of minimal ELF YAML inputs in other tests.
| @@ -0,0 +1,25 @@ | |||
| // Test archive member recognition by name (ELF format). | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to put newer llvm-symbolizer tests that have some relevance to the llvm-symbolizer command-line handling in the test/tools/llvm-symbolizer directory, rather than the DebugInfo directory.
We also tend to use # for comments and no markers for RUN commands/FileCheck prefixes when we don't need to worry about code. If you do move the YAML inline, you should instead use ## for comments and # for lit and FIleCheck stuff.
| // RUN: rm -f %t.a | ||
| // RUN: llvm-ar crv %t.a %t-1.o %t-2.o | ||
|
|
||
| // Test symbolization by member name (using just base names). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"using just base names" is an unnecessary addition to the comment and suggests there's another method that can be used (which we've ruled out previously).
| // CHECK-ERROR: error: {{.*}}no matching member{{.*}} | ||
| // RUN: not llvm-symbolizer --obj="%t-1.o(%{t:stem}.tmp-1.o)" 0x1000 2>&1 | FileCheck %s --check-prefix=CHECK-NOTARCHIVE | ||
| // CHECK-NOTARCHIVE: error: '{{.*}}' is not a valid archive | ||
| // RUN: not llvm-symbolizer --obj="%t.a()" 0x1000 2>&1 | FileCheck %s --check-prefix=CHECK-EMPTY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use CHECK-EMPTY as a prefix, because -EMPTY is already a special FileCheck suffix and this will just cause confusion.
| @@ -0,0 +1,33 @@ | |||
| // Test big archive recognition and error handling in llvm-symbolizer. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments as the other test apply to this one.
Do we actually need distinct ELF and XCOFF test cases? How is the behaviour different?
| // RUN: cp %t-64.o %t.o | ||
| // RUN: llvm-ar %if system-aix %{-X64 rv%} %else %{qv%} %t.a %t.o | ||
|
|
||
| // Verify archive contains two members with same name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you checking this?
| // CHECK-32: foo | ||
| // CHECK-64: foo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the check patterns being identical, how do you know that the behaviour is actually correct? In other words, how do you know that the right object is being found for symbolization?
| // Test error cases. | ||
| // RUN: not llvm-symbolizer --obj="%t.a(nonexistent.o)" 0x1000 2>&1 | FileCheck %s --check-prefix=CHECK-ERROR | ||
| // CHECK-ERROR: error: {{.*}}no matching member{{.*}} | ||
| // RUN: not llvm-symbolizer --obj="%t.o(%{t:stem}.tmp.o)" 0x1000 2>&1 | FileCheck %s --check-prefix=CHECK-NOTARCHIVE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could simplify some of these patterns by creating a directory and cd-ing into it at the start of the test before creating the files. That way you can just use names like "1.o" and "2.o" and "archive.a" or whatever throughout the test.
This PR implements big archive recognition by the symbolizer.
The archive input format should be in archive.a(member.o) format