Fix getting section info in large mach-o files. #165940
Mach-O has 32-bit file offsets in the MachO::section_64 structs. dSYM files can contain sections whose start offset exceeds UINT32_MAX, which means MachO::section_64.offset gets truncated. We can detect when this happens and adjust the section offset to be 64-bit safe. This lets tools get the correct section contents for large dSYM files and allows tools that parse DWARF, like llvm-gsymutil, to load and convert these files correctly.
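To make the truncation concrete, here is a minimal standalone sketch. It is not code from this PR: `storedOffset` and `recoverOffset` are hypothetical helpers, and the recovery assumes the section starts at or after the true end of the preceding section and less than 4 GiB beyond it.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they are not part of LLVM.

// A 32-bit offset field keeps only the low 32 bits of a 64-bit file offset.
inline uint32_t storedOffset(uint64_t TrueOffset) {
  return static_cast<uint32_t>(TrueOffset);
}

// Rebuild a 64-bit offset from the stored 32-bit value.
// Assumption: the section begins at or after PrevEnd (the true end of the
// preceding section) and less than 4 GiB past it.
inline uint64_t recoverOffset(uint32_t Stored, uint64_t PrevEnd) {
  const uint64_t HighMask = 0xFFFFFFFF00000000ull;
  // Start in the same 4 GiB window as the end of the previous section.
  uint64_t Candidate = (PrevEnd & HighMask) | Stored;
  // If that lands before PrevEnd, the stored value wrapped into the next window.
  if (Candidate < PrevEnd)
    Candidate += uint64_t{1} << 32;
  return Candidate;
}
```

For example, a section that really starts at 0x100001000 is stored as 0x1000; given a preceding section that ends at 0xFFFFF000, the sketch recovers 0x100001000.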
@llvm/pr-subscribers-llvm-binary-utilities
Author: Greg Clayton (clayborg)
Full diff: https://github.com/llvm/llvm-project/pull/165940.diff
2 Files Affected:
diff --git a/llvm/include/llvm/Object/MachO.h b/llvm/include/llvm/Object/MachO.h
index 01e7c6b07dd36..f4c1e30b097ee 100644
--- a/llvm/include/llvm/Object/MachO.h
+++ b/llvm/include/llvm/Object/MachO.h
@@ -447,7 +447,7 @@ class LLVM_ABI MachOObjectFile : public ObjectFile {
uint64_t getSectionAddress(DataRefImpl Sec) const override;
uint64_t getSectionIndex(DataRefImpl Sec) const override;
uint64_t getSectionSize(DataRefImpl Sec) const override;
- ArrayRef<uint8_t> getSectionContents(uint32_t Offset, uint64_t Size) const;
+ ArrayRef<uint8_t> getSectionContents(uint64_t Offset, uint64_t Size) const;
Expected<ArrayRef<uint8_t>>
getSectionContents(DataRefImpl Sec) const override;
uint64_t getSectionAlignment(DataRefImpl Sec) const override;
diff --git a/llvm/lib/Object/MachOObjectFile.cpp b/llvm/lib/Object/MachOObjectFile.cpp
index e09dc947c2779..300a5f7ed2a48 100644
--- a/llvm/lib/Object/MachOObjectFile.cpp
+++ b/llvm/lib/Object/MachOObjectFile.cpp
@@ -1978,20 +1978,34 @@ uint64_t MachOObjectFile::getSectionSize(DataRefImpl Sec) const {
return SectSize;
}
-ArrayRef<uint8_t> MachOObjectFile::getSectionContents(uint32_t Offset,
+ArrayRef<uint8_t> MachOObjectFile::getSectionContents(uint64_t Offset,
uint64_t Size) const {
return arrayRefFromStringRef(getData().substr(Offset, Size));
}
Expected<ArrayRef<uint8_t>>
MachOObjectFile::getSectionContents(DataRefImpl Sec) const {
- uint32_t Offset;
+ uint64_t Offset;
uint64_t Size;
if (is64Bit()) {
MachO::section_64 Sect = getSection64(Sec);
Offset = Sect.offset;
Size = Sect.size;
+ // Check for large mach-o files where the section contents might exceed
+ // 4GB. MachO::section_64 objects only have 32 bit file offsets to the
+ // section contents and can overflow in dSYM files. We can track this and
+ // adjust the section offset to be 64 bit safe.
+ uint64_t SectOffsetAdjust = 0;
+ for (uint32_t SectIdx=0; SectIdx<Sec.d.a; ++SectIdx) {
+ MachO::section_64 CurrSect =
+ getStruct<MachO::section_64>(*this, Sections[SectIdx]);
+ const uint64_t EndSectFileOffset =
+ (uint64_t)CurrSect.offset + CurrSect.size;
+ if (EndSectFileOffset >= UINT32_MAX)
+ SectOffsetAdjust += EndSectFileOffset & 0xFFFFFFFF00000000ull;
+ }
+ Offset += SectOffsetAdjust;
} else {
MachO::section Sect = getSection(Sec);
Offset = Sect.offset;
✅ With the latest revision this PR passed the C/C++ code formatter.
We now return an error if a section file offset exceeds 4GB and the sections are not ordered in the Mach-O file. If the sections are not ordered, we cannot assume that an offset overflow seen in one section applies to the sections after it; if they are ordered, we can.
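The ordering check described above can be sketched in isolation as follows. This is an assumption-laden illustration, not the revised patch: `SectionInfo` and `trueSectionOffset` are hypothetical names, `std::nullopt` stands in for the malformed-file error, and deriving the adjustment from the upper 32 bits of the previous section's end is one plausible choice that this page does not confirm.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the two MachO::section_64 fields used here.
struct SectionInfo {
  uint32_t offset; // 32-bit file offset as stored on disk (possibly truncated)
  uint64_t size;   // 64-bit section size
};

// Returns the 64-bit file offset of Sections[Index], or std::nullopt when an
// overflow has been seen and the sections are not in ascending file order.
inline std::optional<uint64_t>
trueSectionOffset(const std::vector<SectionInfo> &Sections, size_t Index) {
  uint64_t SectOffsetAdjust = 0; // upper 32 bits carried from earlier sections
  uint64_t PrevTrueOffset = 0;
  for (size_t SectIdx = 0; SectIdx <= Index && SectIdx < Sections.size();
       ++SectIdx) {
    const SectionInfo &CurrSect = Sections[SectIdx];
    uint64_t CurrTrueOffset =
        static_cast<uint64_t>(CurrSect.offset) + SectOffsetAdjust;
    // Once an overflow has been applied, an offset that moves backwards means
    // the carried adjustment cannot be trusted for this section.
    if (SectOffsetAdjust > 0 && PrevTrueOffset > CurrTrueOffset)
      return std::nullopt;
    if (SectIdx == Index)
      return CurrTrueOffset;
    // Assumption: carry the upper 32 bits of this section's end offset.
    uint64_t EndSectFileOffset = CurrTrueOffset + CurrSect.size;
    SectOffsetAdjust = EndSectFileOffset & 0xFFFFFFFF00000000ull;
    PrevTrueOffset = CurrTrueOffset;
  }
  return std::nullopt; // Index out of range
}
```

Returning an empty optional keeps the sketch free of LLVM dependencies; in the actual patch the failure path is a `malformedError(...)`, as shown in the review snippet.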
457d287 to 350328a
getStruct<MachO::section_64>(*this, Sections[SectIdx]);
uint64_t CurrTrueOffset = (uint64_t)CurrSect.offset + SectOffsetAdjust;
if ((SectOffsetAdjust > 0) && (PrevTrueOffset > CurrTrueOffset))
return malformedError("section data exceeds 4GB and are not ordered");
- return malformedError("section data exceeds 4GB and are not ordered");
+ return malformedError("section data exceeds 4GB and is not ordered");
Thanks for fixing! Can we add a test?