-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[llvm-objdump] Add preliminary support for decoding binary files #115667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-llvm-binary-utilities Author: Michael Clark (michaeljclark) Changesthis pull request occurred because I frequently use
Patch is 24.80 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115667.diff 12 Files Affected:
diff --git a/llvm/include/llvm/Object/Binary.h b/llvm/include/llvm/Object/Binary.h
index ce870e25acafe0..112c3ef66c7363 100644
--- a/llvm/include/llvm/Object/Binary.h
+++ b/llvm/include/llvm/Object/Binary.h
@@ -71,6 +71,7 @@ class Binary {
ID_GOFF,
ID_Wasm,
+ ID_Binary,
ID_EndObjects
};
@@ -191,7 +192,8 @@ DEFINE_ISA_CONVERSION_FUNCTIONS(Binary, LLVMBinaryRef)
/// @param Source The data to create the Binary from.
Expected<std::unique_ptr<Binary>> createBinary(MemoryBufferRef Source,
LLVMContext *Context = nullptr,
- bool InitContent = true);
+ bool InitContent = true,
+ bool RawBinary = false);
template <typename T> class OwningBinary {
std::unique_ptr<T> Bin;
@@ -243,7 +245,8 @@ template <typename T> const T* OwningBinary<T>::getBinary() const {
Expected<OwningBinary<Binary>> createBinary(StringRef Path,
LLVMContext *Context = nullptr,
- bool InitContent = true);
+ bool InitContent = true,
+ bool RawBinary = false);
} // end namespace object
diff --git a/llvm/include/llvm/Object/BinaryObjectFile.h b/llvm/include/llvm/Object/BinaryObjectFile.h
new file mode 100644
index 00000000000000..622c7fa4fbc63c
--- /dev/null
+++ b/llvm/include/llvm/Object/BinaryObjectFile.h
@@ -0,0 +1,130 @@
+//===- BinaryObjectFile.h - Binary object file implementation ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file declares the BinaryObjectFile class.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_OBJECT_RAWOBJECTFILE_H
+#define LLVM_OBJECT_RAWOBJECTFILE_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Object/Binary.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Object/ObjectFile.h"
+#include "llvm/Object/SymbolicFile.h"
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/MemoryBufferRef.h"
+#include "llvm/Support/ScopedPrinter.h"
+#include "llvm/TargetParser/SubtargetFeature.h"
+#include "llvm/TargetParser/Triple.h"
+#include <cassert>
+#include <cstdint>
+
+namespace llvm {
+
+template <typename T> class SmallVectorImpl;
+
+namespace object {
+
+struct BinarySymbol {
+ uint32_t Flags = 0;
+ uint64_t Value = 0;
+ StringRef Name;
+};
+
+struct BinaryRelocation {
+ uint64_t Offset = 0;
+ uint64_t Symbol = 0;
+ uint64_t Type = 0;
+};
+
+struct BinarySection {
+ uint64_t Offset = 0;
+ uint64_t Index = 0;
+ uint64_t Address = 0;
+ uint64_t Size = 0;
+ StringRef Name;
+ std::vector<BinaryRelocation> Relocations;
+};
+
+class BinaryObjectFile : public ObjectFile {
+private:
+ std::vector<BinarySymbol> Symbols;
+ std::vector<BinarySection> Sections;
+
+public:
+ BinaryObjectFile(MemoryBufferRef Source);
+
+ bool is64Bit() const override;
+
+ basic_symbol_iterator symbol_begin() const override;
+ basic_symbol_iterator symbol_end() const override;
+ section_iterator section_begin() const override;
+ section_iterator section_end() const override;
+
+ const BinarySymbol &getBinarySymbol(const DataRefImpl &Symb) const;
+ const BinarySymbol &getBinarySymbol(const SymbolRef &Symb) const;
+
+ void moveSymbolNext(DataRefImpl &Symb) const override;
+ Expected<StringRef> getSymbolName(DataRefImpl Symb) const override;
+ Expected<uint32_t> getSymbolFlags(DataRefImpl Symb) const override;
+ Expected<uint64_t> getSymbolAddress(DataRefImpl Symb) const override;
+ uint64_t getSymbolValueImpl(DataRefImpl Symb) const override;
+ uint64_t getCommonSymbolSizeImpl(DataRefImpl Symb) const override;
+ Expected<SymbolRef::Type> getSymbolType(DataRefImpl Symb) const override;
+ Expected<section_iterator> getSymbolSection(DataRefImpl Symb) const override;
+
+ const BinarySection &getBinarySection(const DataRefImpl Ref) const;
+ const BinarySection &getBinarySection(const SectionRef &Section) const;
+
+ void moveSectionNext(DataRefImpl &Sec) const override;
+ Expected<StringRef> getSectionName(DataRefImpl Sec) const override;
+ uint64_t getSectionAddress(DataRefImpl Sec) const override;
+ uint64_t getSectionIndex(DataRefImpl Sec) const override;
+ uint64_t getSectionSize(DataRefImpl Sec) const override;
+ Expected<ArrayRef<uint8_t>>
+ getSectionContents(DataRefImpl Sec) const override;
+ uint64_t getSectionAlignment(DataRefImpl Sec) const override;
+ bool isSectionCompressed(DataRefImpl Sec) const override;
+ bool isSectionText(DataRefImpl Sec) const override;
+ bool isSectionData(DataRefImpl Sec) const override;
+ bool isSectionBSS(DataRefImpl Sec) const override;
+ bool isSectionVirtual(DataRefImpl Sec) const override;
+
+ relocation_iterator section_rel_begin(DataRefImpl Sec) const override;
+ relocation_iterator section_rel_end(DataRefImpl Sec) const override;
+
+ const BinaryRelocation &getBinaryRelocation(const RelocationRef &Ref) const;
+ const BinaryRelocation &getBinaryRelocation(DataRefImpl Ref) const;
+
+ // Overrides from RelocationRef.
+ void moveRelocationNext(DataRefImpl &Rel) const override;
+ uint64_t getRelocationOffset(DataRefImpl Rel) const override;
+ symbol_iterator getRelocationSymbol(DataRefImpl Rel) const override;
+ uint64_t getRelocationType(DataRefImpl Rel) const override;
+ void getRelocationTypeName(DataRefImpl Rel,
+ SmallVectorImpl<char> &Result) const override;
+
+ uint8_t getBytesInAddress() const override;
+ StringRef getFileFormatName() const override;
+ Triple::ArchType getArch() const override;
+ Expected<SubtargetFeatures> getFeatures() const override;
+ std::optional<StringRef> tryGetCPUName() const override;
+ bool isRelocatableObject() const override;
+};
+
+} // end namespace object
+} // end namespace llvm
+
+#endif // LLVM_OBJECT_RAWOBJECTFILE_H
diff --git a/llvm/include/llvm/Object/ObjectFile.h b/llvm/include/llvm/Object/ObjectFile.h
index f49763e31a9c76..886cd9be2b018d 100644
--- a/llvm/include/llvm/Object/ObjectFile.h
+++ b/llvm/include/llvm/Object/ObjectFile.h
@@ -43,6 +43,7 @@ class SectionRef;
class SymbolRef;
class symbol_iterator;
class WasmObjectFile;
+class BinaryObjectFile;
using section_iterator = content_iterator<SectionRef>;
@@ -400,6 +401,9 @@ class ObjectFile : public SymbolicFile {
static Expected<std::unique_ptr<WasmObjectFile>>
createWasmObjectFile(MemoryBufferRef Object);
+
+ static Expected<std::unique_ptr<BinaryObjectFile>>
+ createBinaryObjectFile(MemoryBufferRef Object);
};
/// A filtered iterator for SectionRefs that skips sections based on some given
diff --git a/llvm/lib/Object/Binary.cpp b/llvm/lib/Object/Binary.cpp
index 2dfae8ab5d3c64..740e5169fc91ca 100644
--- a/llvm/lib/Object/Binary.cpp
+++ b/llvm/lib/Object/Binary.cpp
@@ -14,6 +14,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/BinaryFormat/Magic.h"
#include "llvm/Object/Archive.h"
+#include "llvm/Object/BinaryObjectFile.h"
#include "llvm/Object/Error.h"
#include "llvm/Object/MachOUniversal.h"
#include "llvm/Object/Minidump.h"
@@ -44,9 +45,14 @@ MemoryBufferRef Binary::getMemoryBufferRef() const { return Data; }
Expected<std::unique_ptr<Binary>> object::createBinary(MemoryBufferRef Buffer,
LLVMContext *Context,
- bool InitContent) {
+ bool InitContent,
+ bool RawBinary) {
file_magic Type = identify_magic(Buffer.getBuffer());
+ if (RawBinary) {
+ return ObjectFile::createBinaryObjectFile(Buffer);
+ }
+
switch (Type) {
case file_magic::archive:
return Archive::create(Buffer);
@@ -103,8 +109,10 @@ Expected<std::unique_ptr<Binary>> object::createBinary(MemoryBufferRef Buffer,
llvm_unreachable("Unexpected Binary File Type");
}
-Expected<OwningBinary<Binary>>
-object::createBinary(StringRef Path, LLVMContext *Context, bool InitContent) {
+Expected<OwningBinary<Binary>> object::createBinary(StringRef Path,
+ LLVMContext *Context,
+ bool InitContent,
+ bool RawBinary) {
ErrorOr<std::unique_ptr<MemoryBuffer>> FileOrErr =
MemoryBuffer::getFileOrSTDIN(Path, /*IsText=*/false,
/*RequiresNullTerminator=*/false);
@@ -113,7 +121,7 @@ object::createBinary(StringRef Path, LLVMContext *Context, bool InitContent) {
std::unique_ptr<MemoryBuffer> &Buffer = FileOrErr.get();
Expected<std::unique_ptr<Binary>> BinOrErr =
- createBinary(Buffer->getMemBufferRef(), Context, InitContent);
+ createBinary(Buffer->getMemBufferRef(), Context, InitContent, RawBinary);
if (!BinOrErr)
return BinOrErr.takeError();
std::unique_ptr<Binary> &Bin = BinOrErr.get();
diff --git a/llvm/lib/Object/BinaryObjectFile.cpp b/llvm/lib/Object/BinaryObjectFile.cpp
new file mode 100644
index 00000000000000..a3a3846fa822db
--- /dev/null
+++ b/llvm/lib/Object/BinaryObjectFile.cpp
@@ -0,0 +1,242 @@
+//===- BinaryObjectFile.cpp - Binary object file implementation -----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Part of the BinaryObjectFile class implementation.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Object/BinaryObjectFile.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/TargetParser/SubtargetFeature.h"
+#include "llvm/TargetParser/Triple.h"
+#include <algorithm>
+#include <cstddef>
+#include <cstdint>
+#include <memory>
+#include <optional>
+#include <string>
+#include <utility>
+
+using namespace llvm;
+using namespace object;
+
+BinaryObjectFile::BinaryObjectFile(MemoryBufferRef Source)
+ : ObjectFile(ID_Binary, Source), Symbols(), Sections() {
+ Symbols.push_back(BinarySymbol{});
+ Symbols[0].Name = ".data";
+ Sections.push_back(BinarySection{});
+ Sections[0].Size = Source.getBufferSize();
+}
+
+Expected<std::unique_ptr<BinaryObjectFile>>
+ObjectFile::createBinaryObjectFile(MemoryBufferRef Obj) {
+ Error Err = Error::success();
+ auto ObjectFile = std::make_unique<BinaryObjectFile>(Obj);
+ return std::move(ObjectFile);
+}
+
+bool BinaryObjectFile::is64Bit() const { return true; }
+
+basic_symbol_iterator BinaryObjectFile::symbol_begin() const {
+ DataRefImpl Ref;
+ Ref.d.a = 1; // Arbitrary non-zero value so that Ref.p is non-null
+ Ref.d.b = 0; // Symbol index
+ return BasicSymbolRef(Ref, this);
+}
+
+basic_symbol_iterator BinaryObjectFile::symbol_end() const {
+ DataRefImpl Ref;
+ Ref.d.a = 1; // Arbitrary non-zero value so that Ref.p is non-null
+ Ref.d.b = Symbols.size(); // Symbol index
+ return BasicSymbolRef(Ref, this);
+}
+
+void BinaryObjectFile::moveSymbolNext(DataRefImpl &Symb) const { Symb.d.b++; }
+
+const BinarySymbol &
+BinaryObjectFile::getBinarySymbol(const DataRefImpl &Symb) const {
+ assert(Symb.d.b < Symbols.size());
+ return Symbols[Symb.d.b];
+}
+
+const BinarySymbol &
+BinaryObjectFile::getBinarySymbol(const SymbolRef &Symb) const {
+ return getBinarySymbol(Symb.getRawDataRefImpl());
+}
+
+const BinarySection &
+BinaryObjectFile::getBinarySection(const DataRefImpl Ref) const {
+ assert(Ref.d.a < Sections.size());
+ return Sections[Ref.d.a];
+}
+
+const BinarySection &
+BinaryObjectFile::getBinarySection(const SectionRef &Section) const {
+ return getBinarySection(Section.getRawDataRefImpl());
+}
+
+const BinaryRelocation &
+BinaryObjectFile::getBinaryRelocation(const RelocationRef &Ref) const {
+ return getBinaryRelocation(Ref.getRawDataRefImpl());
+}
+
+const BinaryRelocation &
+BinaryObjectFile::getBinaryRelocation(DataRefImpl Ref) const {
+ assert(Ref.d.a < Sections.size());
+ const BinarySection &Sec = Sections[Ref.d.a];
+ assert(Ref.d.b < Sec.Relocations.size());
+ return Sec.Relocations[Ref.d.b];
+}
+
+Expected<StringRef> BinaryObjectFile::getSymbolName(DataRefImpl Symb) const {
+ return getBinarySymbol(Symb).Name;
+}
+
+Expected<uint32_t> BinaryObjectFile::getSymbolFlags(DataRefImpl Symb) const {
+ return getBinarySymbol(Symb).Flags;
+}
+
+uint64_t BinaryObjectFile::getSymbolValueImpl(DataRefImpl Symb) const {
+ return getBinarySymbol(Symb).Value;
+}
+
+uint64_t BinaryObjectFile::getCommonSymbolSizeImpl(DataRefImpl Symb) const {
+ llvm_unreachable("not yet implemented");
+ return 0;
+}
+
+Expected<SymbolRef::Type>
+BinaryObjectFile::getSymbolType(DataRefImpl Symb) const {
+ return SymbolRef::ST_Other;
+}
+
+Expected<section_iterator>
+BinaryObjectFile::getSymbolSection(DataRefImpl Symb) const {
+ DataRefImpl Ref;
+ Ref.d.a = 0;
+ return section_iterator(SectionRef(Ref, this));
+}
+
+Expected<uint64_t> BinaryObjectFile::getSymbolAddress(DataRefImpl Sym) const {
+ return getSymbolValue(Sym);
+}
+
+section_iterator BinaryObjectFile::section_begin() const {
+ DataRefImpl Ref;
+ Ref.d.a = 0;
+ return section_iterator(SectionRef(Ref, this));
+}
+
+section_iterator BinaryObjectFile::section_end() const {
+ DataRefImpl Ref;
+ Ref.d.a = Sections.size();
+ return section_iterator(SectionRef(Ref, this));
+}
+void BinaryObjectFile::moveSectionNext(DataRefImpl &Sec) const { Sec.d.a++; }
+
+Expected<StringRef> BinaryObjectFile::getSectionName(DataRefImpl Ref) const {
+ return getBinarySection(Ref).Name;
+}
+
+uint64_t BinaryObjectFile::getSectionAddress(DataRefImpl Ref) const {
+ return getBinarySection(Ref).Address;
+}
+uint64_t BinaryObjectFile::getSectionIndex(DataRefImpl Ref) const {
+ return getBinarySection(Ref).Index;
+}
+uint64_t BinaryObjectFile::getSectionSize(DataRefImpl Ref) const {
+ return getBinarySection(Ref).Size;
+}
+
+Expected<ArrayRef<uint8_t>>
+BinaryObjectFile::getSectionContents(DataRefImpl Sec) const {
+ return ArrayRef<uint8_t>((const uint8_t *)Data.getBuffer().data(),
+ Data.getBufferSize());
+}
+
+uint64_t BinaryObjectFile::getSectionAlignment(DataRefImpl Sec) const {
+ return 1;
+}
+
+bool BinaryObjectFile::isSectionCompressed(DataRefImpl Sec) const {
+ return false;
+}
+bool BinaryObjectFile::isSectionText(DataRefImpl Sec) const { return true; }
+bool BinaryObjectFile::isSectionData(DataRefImpl Sec) const { return false; }
+bool BinaryObjectFile::isSectionBSS(DataRefImpl Sec) const { return false; }
+bool BinaryObjectFile::isSectionVirtual(DataRefImpl Sec) const { return false; }
+
+relocation_iterator BinaryObjectFile::section_rel_begin(DataRefImpl Ref) const {
+ DataRefImpl RelocRef;
+ RelocRef.d.a = Ref.d.a;
+ RelocRef.d.b = 0;
+ return relocation_iterator(RelocationRef(RelocRef, this));
+}
+
+relocation_iterator BinaryObjectFile::section_rel_end(DataRefImpl Ref) const {
+ const BinarySection &Sec = getBinarySection(Ref);
+ DataRefImpl RelocRef;
+ RelocRef.d.a = Ref.d.a;
+ RelocRef.d.b = Sec.Relocations.size();
+ return relocation_iterator(RelocationRef(RelocRef, this));
+}
+
+void BinaryObjectFile::moveRelocationNext(DataRefImpl &Rel) const { Rel.d.b++; }
+
+uint64_t BinaryObjectFile::getRelocationOffset(DataRefImpl Ref) const {
+ const BinaryRelocation &Rel = getBinaryRelocation(Ref);
+ return Rel.Offset;
+}
+
+symbol_iterator BinaryObjectFile::getRelocationSymbol(DataRefImpl Ref) const {
+ const BinaryRelocation &Rel = getBinaryRelocation(Ref);
+ DataRefImpl Sym;
+ Sym.d.a = 1;
+ Sym.d.b = Rel.Symbol;
+ return symbol_iterator(SymbolRef(Sym, this));
+}
+
+uint64_t BinaryObjectFile::getRelocationType(DataRefImpl Ref) const {
+ const BinaryRelocation &Rel = getBinaryRelocation(Ref);
+ return Rel.Type;
+}
+
+void BinaryObjectFile::getRelocationTypeName(
+ DataRefImpl Ref, SmallVectorImpl<char> &Result) const {
+ const BinaryRelocation &Rel = getBinaryRelocation(Ref);
+ StringRef Res;
+ switch (Rel.Type) {
+ case 0:
+ default:
+ Res = "unknown";
+ break;
+ }
+ Result.append(Res.begin(), Res.end());
+}
+
+uint8_t BinaryObjectFile::getBytesInAddress() const {
+ return is64Bit() ? 8 : 4;
+}
+
+StringRef BinaryObjectFile::getFileFormatName() const { return "binary"; }
+
+Triple::ArchType BinaryObjectFile::getArch() const {
+ return Triple::UnknownArch;
+}
+
+Expected<SubtargetFeatures> BinaryObjectFile::getFeatures() const {
+ return SubtargetFeatures();
+}
+
+std::optional<StringRef> BinaryObjectFile::tryGetCPUName() const {
+ return std::nullopt;
+}
+
+bool BinaryObjectFile::isRelocatableObject() const { return false; }
diff --git a/llvm/lib/Object/CMakeLists.txt b/llvm/lib/Object/CMakeLists.txt
index bfb420e57a7f49..0b1a3d32b0ef0d 100644
--- a/llvm/lib/Object/CMakeLists.txt
+++ b/llvm/lib/Object/CMakeLists.txt
@@ -33,6 +33,7 @@ add_llvm_component_library(LLVMObject
WindowsMachineFlag.cpp
WindowsResource.cpp
XCOFFObjectFile.cpp
+ BinaryObjectFile.cpp
ADDITIONAL_HEADER_DIRS
${LLVM_MAIN_INCLUDE_DIR}/llvm/Object
diff --git a/llvm/tools/llvm-objdump/BinaryDump.cpp b/llvm/tools/llvm-objdump/BinaryDump.cpp
new file mode 100644
index 00000000000000..6c2c691ef1d74e
--- /dev/null
+++ b/llvm/tools/llvm-objdump/BinaryDump.cpp
@@ -0,0 +1,42 @@
+//===-- BinaryDump.cpp - raw-binary dumper ----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// This file implements the raw binary dumper for llvm-objdump.
+///
+//===----------------------------------------------------------------------===//
+
+#include "BinaryDump.h"
+
+#include "llvm-objdump.h"
+#include "llvm/Object/BinaryObjectFile.h"
+#include "llvm/Object/ObjectFile.h"
+#include "llvm/Support/Error.h"
+
+using namespace llvm;
+using namespace llvm::object;
+
+namespace {
+class BinaryDumper : public objdump::Dumper {
+ const BinaryObjectFile &Obj;
+
+public:
+ BinaryDumper(const BinaryObjectFile &O) : Dumper(O), Obj(O) {}
+};
+} // namespace
+
+std::unique_ptr<objdump::Dumper>
+objdump::createBinaryDumper(const BinaryObjectFile &Obj) {
+ return std::make_unique<BinaryDumper>(Obj);
+}
+
+Error objdump::getBinaryRelocationValueString(const BinaryObjectFile *Obj,
+ const RelocationRef &RelRef,
+ SmallVectorImpl<char> &Result) {
+ return Error::success();
+}
diff --git a/llvm/tools/llvm-objdump/BinaryDump.h b/llvm/tools/llvm-objdump/BinaryDump.h
new file mode 100644
index 00000000000000..f8c76e11b20088
--- /dev/null
+++ b/llvm/tools/llvm-objdump/BinaryDump.h
@@ -0,0 +1,30 @@
+//===-- BinaryDump.h - raw-binary dumper ------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TOOLS_LLVM_OBJDUMP_BINARYDUMP_H
+#define LLVM_TOOLS_LLVM_OBJDUMP_BINARYDUMP_H
+
+#include "llvm/ADT/SmallVector.h"
+
+namespace llvm {
+class Error;
+namespace object {
+class BinaryObjectFile;
+class RelocationRef;
+} // namespace object
+
+namespace objdump {
+
+Error getBinaryRelocationValueString(const object::BinaryObjectFile *Obj,
+ const object::RelocationRef &RelRef,
+ ...
[truncated]
|
jh7370
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should aim for this new binary option (not to be confused with the triple option) to be command-line compatible with the equivalent GNU option, i.e. to have the same name. This is similar to our existing option behaviour. NB: for textual output, it doesn't need to be byte-for-byte identical.
That being said, internally, I'd like to reduce the ambiguity caused by the term "binary" since any file on disk is a binary file, just usually with some specific structure (i.e. ELF/COFF etc). This is particularly an issue because we already have a Binary class, which reflects this relationship. Could we refer to it as RawBinary everywhere in file/variable/function names etc as appropriate?
Is this intended to have a functional impact, as there is no testing included.
| bool RawBinary) { | ||
| file_magic Type = identify_magic(Buffer.getBuffer()); | ||
|
|
||
| if (RawBinary) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no need for braces for single line if.
| // | ||
| //===----------------------------------------------------------------------===// | ||
| // | ||
| // This file declares the BinaryObjectFile class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be really helpful to have a summary of what a BinaryObjectFile actually represents/how it represents it.
| } | ||
|
|
||
| uint64_t BinaryObjectFile::getCommonSymbolSizeImpl(DataRefImpl Symb) const { | ||
| llvm_unreachable("not yet implemented"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"not yet implemented" implies to me that this code could be reached, but just hasn't been written yet, which means llvm_unreachable would be inappropriate. Should this just be a regular assert?
| bool BinaryObjectFile::isSectionCompressed(DataRefImpl Sec) const { | ||
| return false; | ||
| } | ||
| bool BinaryObjectFile::isSectionText(DataRefImpl Sec) const { return true; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: for all of these trivial functions, I'd inline them into the header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isSectionText => false, isSectionData => true to be similar to GNU objdump.
-D disassembles .data but .d doesn't
| //===----------------------------------------------------------------------===// | ||
|
|
||
| #include "BinaryDump.h" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we don't usually have blank lines separating #include directives.
| } // namespace object | ||
|
|
||
| namespace objdump { | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you're being a bit hit-and-miss with where your blank lines are when starting/ending namespaces.
| if (FilterSections.empty() && !DisassembleAll && | ||
| (!Section.isText() || Section.isVirtual())) | ||
| continue; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: please reinstate this blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jh,
I am fine with most of the changes you suggested except for command-line compatibility with GNU objdump. the reasoning is we want to be consistent with --triple and if we implement -b as per GNU then you have now widened the scope of this patch to add manual detection for file tables formats with "well defined" labels in their magic. I have never found the need to specify a file type for a file that already has file magic. I like -b and --binary. but it's your call.
although I especially agree with reinstating blank lines which is why I am replying here. I guess I can't blame clang-format-diff.py, for this but I don't remember seeing this although my cursor was definitely there. 😄
I don't have time right now to go through a respin with the rename or to add tests because I have moved on to something else. I can return to this when I have free time. I started off with Raw but didn't like the sound of it so I went back to Binary after I discovered that BinaryObjectFile didn't clash with BinaryFile. In fact I think BinaryFile should be the one to change because it is the most vague and has relatively few references. Binary Artefact sounds better because it is an output that can be used as input. RawBinary is acceptable for this but if all possible I would vote for a name without Raw in it because it aliases with a word from nature, although I used it once in the if statement so that it is unambiguous there but I think once is enough. Your call.
if you beat me to it, I will close this PR. I've added a Signed-off-by to the commit and reinstated the blank line. the main point of the PR is so that the search engine picks up the search terms so that next version of me can pull a patch shared as a PR.
thanks, michael.
| if (InputArgs.getLastArg(OBJDUMP_binary)) { | ||
| BinaryFile = true; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no need for braces.
27d5de3 to
9022bc9
Compare
in regards to my earlier comment, I will omit the short option |
$ echo '49 0f c7 0f' | xxd -r -p - > test.bin
$ llvm-objdump -d -Mintel --binary --triple x86_64 test.bin
test.bin: file format binary
Disassembly of section :
0000000000000000 <.data>:
0: 49 0f c7 0f cmpxchg16b xmmword ptr [r15]
Signed-off-by: Michael Clark <[email protected]>
9022bc9 to
ca82401
Compare
|
Note: the parent commit is from Aug, which is probably too old. I know that building a large repository like llvm-project is painful, but a branch older than one month easily leads to various test issues (others have changed something that would require you to adapt). We can support In GNU objdump, we can specify In llvm-project, we emulate the bfdname in at least two places:
|
| WindowsMachineFlag.cpp | ||
| WindowsResource.cpp | ||
| XCOFFObjectFile.cpp | ||
| BinaryObjectFile.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alphabetical order
|
BinaryObjectFile.cpp:40 has an unchecked |
| return Rel.Offset; | ||
| } | ||
|
|
||
| symbol_iterator BinaryObjectFile::getRelocationSymbol(DataRefImpl Ref) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no relocations, so functions like getRelocationSymbol/section_rel_begin can be simplified to return an empty/dummy value. Some functions like moveNext can use llvm_unreachable.
this approach doesn't seem useful to me unless you implement I call this a "pile on" where you gate addition of some feature due to a bunch of tangentially related issues to prevent incremental progress. I agree in concept to all of the other trival formatting, diagnostic and stylistic issues but until you either decide to stay clear of a clash or faithfully implement compatibility, then I don't see some intermediate point as being useful because the use cases will still fail. you sort of have to decide about in any case it is useful to leave the PR here because someone might want the function with the slightly different way of invoking it and pull the patch from the PR to their local tree, i.e. in a way that matches |
this pull request occurred because I frequently use
-bbinary -mi386:x86-64 -Mintelwith the binutils objdump to disassemble binary files. in this patch, the code adds one section and one symbol named.data. it uses-bor--binaryalong with the pre-existing--tripleoption which makes it very close but slightly different from binutils objdump. this needs testing with other targets and currently it returns true tobool is64Bit()which I guess is mostly just for formatting. it doesn't know what triple it has been invoked with. I guess it should work fine on 32-bit targets.objdump -D -bbinary -mi386:x86-64 -Mintel test.binllvm-objdump -d -b --triple x86_64 -Mintel test.bin