[lldb] Add VirtualDataExtractor for virtual address translation #168802
@llvm/pr-subscribers-lldb
Author: Jonas Devlieghere (JDevlieghere)
Changes
Introduce VirtualDataExtractor, a DataExtractor subclass that enables reading data at virtual addresses by translating them to physical buffer offsets using a lookup table. The lookup table maps virtual address ranges to physical offsets and enforces boundaries to prevent reads from crossing entry limits. The new class inherits from DataExtractor, overriding GetData and PeekData to provide transparent virtual address translation for most of the DataExtractor methods. The exceptions are the unchecked methods, which bypass GetData and PeekData and are therefore overloaded as well.
Patch is 39.00 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168802.diff
6 Files Affected:
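To make the translation step concrete, here is a minimal standalone sketch of a virtual-address lookup. It does not use LLDB's `RangeDataVector`; the `Entry` struct and the `Translate` function are hypothetical names that only mirror the base/size/data fields described above. Unlike the patch, which asserts when a read crosses an entry boundary, this sketch reports such a read as a failed translation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one lookup table entry (not LLDB's type).
struct Entry {
  uint64_t base; // Starting virtual address of the entry.
  uint64_t size; // Size of the entry in bytes.
  uint64_t data; // Physical offset in the underlying buffer.
};

// Translate a virtual address into a physical offset.
// Assumes `table` is sorted by `base` and its entries do not overlap.
// Returns std::nullopt if the address is unmapped or if the read would
// run past the end of the entry that contains it.
std::optional<uint64_t> Translate(const std::vector<Entry> &table,
                                  uint64_t vaddr, uint64_t length) {
  // Binary search: first entry whose base is greater than vaddr.
  auto it = std::upper_bound(
      table.begin(), table.end(), vaddr,
      [](uint64_t addr, const Entry &e) { return addr < e.base; });
  if (it == table.begin())
    return std::nullopt; // vaddr lies below every entry.
  --it;                  // Candidate entry: the last one with base <= vaddr.

  uint64_t delta = vaddr - it->base;
  if (delta >= it->size || length > it->size - delta)
    return std::nullopt; // Unmapped, or the read crosses the entry boundary.
  return it->data + delta;
}
```

With the two entries `{0x1000, 4, 0}` and `{0x2000, 4, 4}`, a 2-byte read at `0x2002` translates to physical offset 6, while a 4-byte read at `0x1002` is rejected because it would leave the first entry.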
diff --git a/lldb/include/lldb/Utility/DataExtractor.h b/lldb/include/lldb/Utility/DataExtractor.h
index b4960f5e87c85..fe217795ff3b1 100644
--- a/lldb/include/lldb/Utility/DataExtractor.h
+++ b/lldb/include/lldb/Utility/DataExtractor.h
@@ -334,7 +334,8 @@ class DataExtractor {
/// \return
/// A pointer to the bytes in this object's data if the offset
/// and length are valid, or nullptr otherwise.
- const void *GetData(lldb::offset_t *offset_ptr, lldb::offset_t length) const {
+ virtual const void *GetData(lldb::offset_t *offset_ptr,
+ lldb::offset_t length) const {
const uint8_t *ptr = PeekData(*offset_ptr, length);
if (ptr)
*offset_ptr += length;
@@ -829,7 +830,8 @@ class DataExtractor {
/// A non-nullptr data pointer if \a offset is a valid offset and
/// there are \a length bytes available at that offset, nullptr
/// otherwise.
- const uint8_t *PeekData(lldb::offset_t offset, lldb::offset_t length) const {
+ virtual const uint8_t *PeekData(lldb::offset_t offset,
+ lldb::offset_t length) const {
if (ValidOffsetForDataOfSize(offset, length))
return m_start + offset;
return nullptr;
diff --git a/lldb/include/lldb/Utility/VirtualDataExtractor.h b/lldb/include/lldb/Utility/VirtualDataExtractor.h
new file mode 100644
index 0000000000000..a57d83dde21be
--- /dev/null
+++ b/lldb/include/lldb/Utility/VirtualDataExtractor.h
@@ -0,0 +1,82 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_UTILITY_VIRTUALDATAEXTRACTOR_H
+#define LLDB_UTILITY_VIRTUALDATAEXTRACTOR_H
+
+#include "lldb/Utility/DataExtractor.h"
+#include "lldb/Utility/RangeMap.h"
+#include "lldb/lldb-types.h"
+
+namespace lldb_private {
+
+/// A DataExtractor subclass that allows reading data at virtual addresses
+/// using a lookup table that maps virtual address ranges to physical offsets.
+///
+/// This class maintains a lookup table where each entry contains:
+/// - base: starting virtual address for this entry
+/// - size: size of this entry in bytes
+/// - data: physical offset in the underlying data buffer
+///
+/// Reads are translated from virtual addresses to physical offsets using
+/// this lookup table. Reads cannot cross entry boundaries and this is
+/// enforced with assertions.
+class VirtualDataExtractor : public DataExtractor {
+public:
+ /// Type alias for the range map used internally.
+ /// Maps virtual addresses (base) to physical offsets (data).
+ using LookupTable =
+ RangeDataVector<lldb::offset_t, lldb::offset_t, lldb::offset_t>;
+
+ VirtualDataExtractor() = default;
+
+ VirtualDataExtractor(const void *data, lldb::offset_t data_length,
+ lldb::ByteOrder byte_order, uint32_t addr_size,
+ LookupTable lookup_table);
+
+ VirtualDataExtractor(const lldb::DataBufferSP &data_sp,
+ lldb::ByteOrder byte_order, uint32_t addr_size,
+ LookupTable lookup_table);
+
+ const void *GetData(lldb::offset_t *offset_ptr,
+ lldb::offset_t length) const override;
+
+ const uint8_t *PeekData(lldb::offset_t offset,
+ lldb::offset_t length) const override;
+
+ uint8_t GetU8_unchecked(lldb::offset_t *offset_ptr) const;
+
+ uint16_t GetU16_unchecked(lldb::offset_t *offset_ptr) const;
+
+ uint32_t GetU32_unchecked(lldb::offset_t *offset_ptr) const;
+
+ uint64_t GetU64_unchecked(lldb::offset_t *offset_ptr) const;
+
+ uint64_t GetMaxU64_unchecked(lldb::offset_t *offset_ptr,
+ size_t byte_size) const;
+
+ uint64_t GetAddress_unchecked(lldb::offset_t *offset_ptr) const;
+
+ const LookupTable &GetLookupTable() const { return m_lookup_table; }
+
+protected:
+ /// Find the lookup entry that contains the given virtual address.
+ const LookupTable::Entry *FindEntry(lldb::offset_t virtual_addr) const;
+
+ /// Validate that a read at a virtual address is within bounds and
+ /// does not cross entry boundaries.
+ bool ValidateVirtualRead(lldb::offset_t virtual_addr,
+ lldb::offset_t length) const;
+
+private:
+ LookupTable m_lookup_table;
+};
+
+} // namespace lldb_private
+
+#endif // LLDB_UTILITY_VIRTUALDATAEXTRACTOR_H
diff --git a/lldb/source/Utility/CMakeLists.txt b/lldb/source/Utility/CMakeLists.txt
index 1dd4d63f7016f..4696ed4690d37 100644
--- a/lldb/source/Utility/CMakeLists.txt
+++ b/lldb/source/Utility/CMakeLists.txt
@@ -78,6 +78,7 @@ add_lldb_library(lldbUtility NO_INTERNAL_DEPENDENCIES
UserIDResolver.cpp
VASprintf.cpp
VMRange.cpp
+ VirtualDataExtractor.cpp
XcodeSDK.cpp
ZipFile.cpp
diff --git a/lldb/source/Utility/VirtualDataExtractor.cpp b/lldb/source/Utility/VirtualDataExtractor.cpp
new file mode 100644
index 0000000000000..537ba3930a91a
--- /dev/null
+++ b/lldb/source/Utility/VirtualDataExtractor.cpp
@@ -0,0 +1,164 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "lldb/Utility/VirtualDataExtractor.h"
+#include <cassert>
+
+using namespace lldb;
+using namespace lldb_private;
+
+VirtualDataExtractor::VirtualDataExtractor(const void *data,
+ offset_t data_length,
+ ByteOrder byte_order,
+ uint32_t addr_size,
+ LookupTable lookup_table)
+ : DataExtractor(data, data_length, byte_order, addr_size),
+ m_lookup_table(std::move(lookup_table)) {
+ m_lookup_table.Sort();
+}
+
+VirtualDataExtractor::VirtualDataExtractor(const DataBufferSP &data_sp,
+ ByteOrder byte_order,
+ uint32_t addr_size,
+ LookupTable lookup_table)
+ : DataExtractor(data_sp, byte_order, addr_size),
+ m_lookup_table(std::move(lookup_table)) {
+ m_lookup_table.Sort();
+}
+
+const VirtualDataExtractor::LookupTable::Entry *
+VirtualDataExtractor::FindEntry(offset_t virtual_addr) const {
+ // Use RangeDataVector's binary search instead of linear search.
+ return m_lookup_table.FindEntryThatContains(virtual_addr);
+}
+
+bool VirtualDataExtractor::ValidateVirtualRead(offset_t virtual_addr,
+ offset_t length) const {
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ if (!entry)
+ return false;
+
+ // Assert that the read does not cross entry boundaries.
+ // RangeData.Contains() checks if a range is fully contained.
+ assert(entry->Contains(LookupTable::Range(virtual_addr, length)) &&
+ "Read crosses lookup table entry boundary");
+
+ // Also validate that the physical offset is within the data buffer.
+ // RangeData.data contains the physical offset.
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ return ValidOffsetForDataOfSize(physical_offset, length);
+}
+
+const void *VirtualDataExtractor::GetData(offset_t *offset_ptr,
+ offset_t length) const {
+ // Override to treat offset as virtual address.
+ if (!offset_ptr)
+ return nullptr;
+
+ offset_t virtual_addr = *offset_ptr;
+
+ if (!ValidateVirtualRead(virtual_addr, length))
+ return nullptr;
+
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ assert(entry && "ValidateVirtualRead should have found an entry");
+
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ // Use base class PeekData directly to avoid recursion.
+ const void *result = DataExtractor::PeekData(physical_offset, length);
+
+ if (result) {
+ // Advance the virtual offset pointer.
+ *offset_ptr += length;
+ }
+
+ return result;
+}
+
+const uint8_t *VirtualDataExtractor::PeekData(offset_t offset,
+ offset_t length) const {
+ // Override to treat offset as virtual address.
+ if (!ValidateVirtualRead(offset, length))
+ return nullptr;
+
+ const LookupTable::Entry *entry = FindEntry(offset);
+ assert(entry && "ValidateVirtualRead should have found an entry");
+
+ offset_t physical_offset = entry->data + (offset - entry->base);
+ // Use the base class PeekData with the physical offset.
+ return DataExtractor::PeekData(physical_offset, length);
+}
+
+uint8_t VirtualDataExtractor::GetU8_unchecked(offset_t *offset_ptr) const {
+ offset_t virtual_addr = *offset_ptr;
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ assert(entry && "Unchecked methods require valid virtual address");
+
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ uint8_t result = DataExtractor::GetU8_unchecked(&physical_offset);
+ *offset_ptr += 1;
+ return result;
+}
+
+uint16_t VirtualDataExtractor::GetU16_unchecked(offset_t *offset_ptr) const {
+ offset_t virtual_addr = *offset_ptr;
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ assert(entry && "Unchecked methods require valid virtual address");
+
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ uint16_t result = DataExtractor::GetU16_unchecked(&physical_offset);
+ *offset_ptr += 2;
+ return result;
+}
+
+uint32_t VirtualDataExtractor::GetU32_unchecked(offset_t *offset_ptr) const {
+ offset_t virtual_addr = *offset_ptr;
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ assert(entry && "Unchecked methods require valid virtual address");
+
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ uint32_t result = DataExtractor::GetU32_unchecked(&physical_offset);
+ *offset_ptr += 4;
+ return result;
+}
+
+uint64_t VirtualDataExtractor::GetU64_unchecked(offset_t *offset_ptr) const {
+ offset_t virtual_addr = *offset_ptr;
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ assert(entry && "Unchecked methods require valid virtual address");
+
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ uint64_t result = DataExtractor::GetU64_unchecked(&physical_offset);
+ *offset_ptr += 8;
+ return result;
+}
+
+uint64_t VirtualDataExtractor::GetMaxU64_unchecked(offset_t *offset_ptr,
+ size_t byte_size) const {
+ offset_t virtual_addr = *offset_ptr;
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ assert(entry && "Unchecked methods require valid virtual address");
+
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ uint64_t result =
+ DataExtractor::GetMaxU64_unchecked(&physical_offset, byte_size);
+ *offset_ptr += byte_size;
+ return result;
+}
+
+uint64_t
+VirtualDataExtractor::GetAddress_unchecked(offset_t *offset_ptr) const {
+ offset_t virtual_addr = *offset_ptr;
+ const LookupTable::Entry *entry = FindEntry(virtual_addr);
+ assert(entry && "Unchecked methods require valid virtual address");
+
+ offset_t physical_offset = entry->data + (virtual_addr - entry->base);
+ uint64_t result = DataExtractor::GetAddress_unchecked(&physical_offset);
+ *offset_ptr += m_addr_size;
+ return result;
+}
diff --git a/lldb/unittests/Utility/CMakeLists.txt b/lldb/unittests/Utility/CMakeLists.txt
index aed4177f5edee..77b52079cf32b 100644
--- a/lldb/unittests/Utility/CMakeLists.txt
+++ b/lldb/unittests/Utility/CMakeLists.txt
@@ -48,6 +48,7 @@ add_lldb_unittest(UtilityTests
UserIDResolverTest.cpp
UUIDTest.cpp
VASprintfTest.cpp
+ VirtualDataExtractorTest.cpp
VMRangeTest.cpp
XcodeSDKTest.cpp
diff --git a/lldb/unittests/Utility/VirtualDataExtractorTest.cpp b/lldb/unittests/Utility/VirtualDataExtractorTest.cpp
new file mode 100644
index 0000000000000..cb9edbc8950d9
--- /dev/null
+++ b/lldb/unittests/Utility/VirtualDataExtractorTest.cpp
@@ -0,0 +1,708 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "lldb/Utility/VirtualDataExtractor.h"
+#include "lldb/Utility/DataBufferHeap.h"
+#include "gtest/gtest.h"
+
+using namespace lldb_private;
+using namespace lldb;
+
+TEST(VirtualDataExtractorTest, BasicConstruction) {
+ // Create a simple data buffer.
+ uint8_t buffer[] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
+
+ // Create a lookup table that maps virtual addresses to physical offsets.
+ VirtualDataExtractor::LookupTable lookup_table;
+ // Virtual address 0x1000-0x1008 maps to physical offset 0-8.
+ // Entry(base=virtual_offset, size, data=physical_offset).
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 8, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ EXPECT_EQ(extractor.GetByteSize(), 8U);
+}
+
+TEST(VirtualDataExtractorTest, GetDataAtVirtualOffset) {
+ uint8_t buffer[] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 8, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ offset_t virtual_offset = 0x1000;
+ const void *data = extractor.GetData(&virtual_offset, 4);
+
+ ASSERT_NE(data, nullptr);
+ EXPECT_EQ(virtual_offset, 0x1004U);
+ EXPECT_EQ(memcmp(data, buffer, 4), 0);
+}
+
+TEST(VirtualDataExtractorTest, GetDataAtVirtualOffsetInvalid) {
+ uint8_t buffer[] = {0x01, 0x02, 0x03, 0x04};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 4, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ // Try to read from an invalid virtual address.
+ offset_t virtual_offset = 0x2000;
+ const void *data = extractor.GetData(&virtual_offset, 4);
+
+ EXPECT_EQ(data, nullptr);
+}
+
+TEST(VirtualDataExtractorTest, GetU8AtVirtualOffset) {
+ uint8_t buffer[] = {0x12, 0x34, 0x56, 0x78};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 4, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ offset_t virtual_offset = 0x1000;
+ EXPECT_EQ(extractor.GetU8(&virtual_offset), 0x12U);
+ EXPECT_EQ(virtual_offset, 0x1001U);
+
+ EXPECT_EQ(extractor.GetU8(&virtual_offset), 0x34U);
+ EXPECT_EQ(virtual_offset, 0x1002U);
+}
+
+TEST(VirtualDataExtractorTest, GetU16AtVirtualOffset) {
+ uint8_t buffer[] = {0x12, 0x34, 0x56, 0x78};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 4, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ offset_t virtual_offset = 0x1000;
+ EXPECT_EQ(extractor.GetU16(&virtual_offset), 0x3412U);
+ EXPECT_EQ(virtual_offset, 0x1002U);
+
+ EXPECT_EQ(extractor.GetU16(&virtual_offset), 0x7856U);
+ EXPECT_EQ(virtual_offset, 0x1004U);
+}
+
+TEST(VirtualDataExtractorTest, GetU32AtVirtualOffset) {
+ uint8_t buffer[] = {0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 8, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ offset_t virtual_offset = 0x1000;
+ EXPECT_EQ(extractor.GetU32(&virtual_offset), 0x78563412U);
+ EXPECT_EQ(virtual_offset, 0x1004U);
+
+ EXPECT_EQ(extractor.GetU32(&virtual_offset), 0xF0DEBC9AU);
+ EXPECT_EQ(virtual_offset, 0x1008U);
+}
+
+TEST(VirtualDataExtractorTest, GetU64AtVirtualOffset) {
+ uint8_t buffer[] = {0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 8, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 8,
+ std::move(lookup_table));
+
+ offset_t virtual_offset = 0x1000;
+ EXPECT_EQ(extractor.GetU64(&virtual_offset), 0xF0DEBC9A78563412ULL);
+ EXPECT_EQ(virtual_offset, 0x1008U);
+}
+
+TEST(VirtualDataExtractorTest, GetAddressAtVirtualOffset) {
+ uint8_t buffer[] = {0x12, 0x34, 0x56, 0x78};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 4, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ offset_t virtual_offset = 0x1000;
+ EXPECT_EQ(extractor.GetAddress(&virtual_offset), 0x78563412U);
+ EXPECT_EQ(virtual_offset, 0x1004U);
+}
+
+TEST(VirtualDataExtractorTest, BigEndian) {
+ uint8_t buffer[] = {0x12, 0x34, 0x56, 0x78};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(0x1000, 4, 0));
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderBig, 4,
+ std::move(lookup_table));
+
+ offset_t virtual_offset = 0x1000;
+ EXPECT_EQ(extractor.GetU16(&virtual_offset), 0x1234U);
+ EXPECT_EQ(virtual_offset, 0x1002U);
+
+ EXPECT_EQ(extractor.GetU16(&virtual_offset), 0x5678U);
+ EXPECT_EQ(virtual_offset, 0x1004U);
+}
+
+TEST(VirtualDataExtractorTest, MultipleEntries) {
+ // Create a buffer with distinct patterns for each section.
+ uint8_t buffer[] = {
+ 0x01, 0x02, 0x03, 0x04, // Physical offset 0-3.
+ 0x11, 0x12, 0x13, 0x14, // Physical offset 4-7.
+ 0x21, 0x22, 0x23, 0x24 // Physical offset 8-11.
+ };
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ // Map different virtual address ranges to different physical offsets.
+ // Entry(base=virtual_offset, size, data=physical_offset).
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(
+ 0x1000, 4, 0)); // Virt 0x1000-0x1004 -> phys 0-4.
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(
+ 0x2000, 4, 4)); // Virt 0x2000-0x2004 -> phys 4-8.
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(
+ 0x3000, 4, 8)); // Virt 0x3000-0x3004 -> phys 8-12.
+
+ VirtualDataExtractor extractor(buffer, sizeof(buffer), eByteOrderLittle, 4,
+ std::move(lookup_table));
+
+ // Test reading from first virtual range.
+ offset_t virtual_offset = 0x1000;
+ EXPECT_EQ(extractor.GetU8(&virtual_offset), 0x01U);
+
+ // Test reading from second virtual range.
+ virtual_offset = 0x2000;
+ EXPECT_EQ(extractor.GetU8(&virtual_offset), 0x11U);
+
+ // Test reading from third virtual range.
+ virtual_offset = 0x3000;
+ EXPECT_EQ(extractor.GetU8(&virtual_offset), 0x21U);
+}
+
+TEST(VirtualDataExtractorTest, NonContiguousVirtualAddresses) {
+ uint8_t buffer[] = {0xAA, 0xBB, 0xCC, 0xDD};
+
+ VirtualDataExtractor::LookupTable lookup_table;
+ // Create non-contiguous virtual address mapping.
+ // Entry(base=virtual_offset, size, data=physical_offset).
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(
+ 0x1000, 2, 0)); // Virt 0x1000-0x1002 -> phys 0-2.
+ lookup_table.Append(VirtualDataExtractor::LookupTable::Entry(
+ 0x5000, 2, 2)); // Virt 0x5000-0x5002 -> phys 2-4.
+
+ VirtualDataExtractor extractor...
[truncated]
The motivation for this is the shared cache. A new API will allow us to get our hands on segments that are not laid out the same way they are when mapped into memory. By using the VirtualDataExtractor we can make it look like they are and avoid having to change ObjectFileMachO.
jasonmolenda
left a comment
Looks good to me, this should do what we'll need for the shared cache segment reordering. Test coverage looks good.
I wasn't happy with the repetition in the tests so I started with a helper and then realized that I could eliminate the helper by adding a constructor overload to |
/// this lookup table. Reads cannot cross entry boundaries and this is
/// enforced with assertions.
I presume you would do partial reads of some form, but neither DataExtractor nor the users of this class would be set up to handle that.
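To illustrate what a partial or stitched read would involve, here is a hypothetical sketch; nothing like it is part of the patch. `GetData` and `PeekData` return a single pointer into the buffer, so a read spanning two entries whose physical ranges are not adjacent could only be served by copying. `Entry` and `ReadAcrossEntries` are made-up names for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical lookup entry: virtual base, size, physical offset.
struct Entry {
  uint64_t base, size, data;
};

// Copy up to `length` bytes starting at virtual address `vaddr` into `out`,
// stitching together pieces from as many entries as needed.
// Returns the number of bytes copied, which is smaller than `length` when
// the read runs into an unmapped address (a partial read).
size_t ReadAcrossEntries(const std::vector<uint8_t> &buffer,
                         const std::vector<Entry> &table, uint64_t vaddr,
                         size_t length, uint8_t *out) {
  size_t copied = 0;
  while (copied < length) {
    uint64_t addr = vaddr + copied;

    // Linear search keeps the sketch short; a sorted table would allow a
    // binary search instead.
    const Entry *hit = nullptr;
    for (const Entry &e : table) {
      if (addr >= e.base && addr - e.base < e.size) {
        hit = &e;
        break;
      }
    }
    if (!hit)
      break; // Unmapped address: stop with a partial read.

    uint64_t delta = addr - hit->base;
    uint64_t chunk = std::min<uint64_t>(hit->size - delta, length - copied);
    uint64_t phys = hit->data + delta;
    if (phys > buffer.size() || chunk > buffer.size() - phys)
      break; // Entry points outside the physical buffer.

    std::memcpy(out + copied, buffer.data() + phys, chunk);
    copied += chunk;
  }
  return copied;
}
```

Callers of such an API would have to check the returned byte count on every read, which is the handling that the comment above says neither DataExtractor nor its users currently have.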
VirtualDataExtractor(const void *data, lldb::offset_t data_length,
                     lldb::ByteOrder byte_order, uint32_t addr_size,
                     LookupTable lookup_table);
`const LookupTable &`? I always forget whether this makes a difference; sometimes it seems to make things worse.
I see you do std::move it later.
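The by-value parameter plus `std::move` is the usual "sink parameter" idiom. A minimal sketch, using a hypothetical `Holder` class and a `std::vector` in place of the real `LookupTable`:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the lookup table type.
using Table = std::vector<uint64_t>;

class Holder {
public:
  // By-value sink parameter: an lvalue argument is copied into `table`,
  // an rvalue argument is moved into it. Either way the member is then
  // move-constructed, so no second copy is made.
  explicit Holder(Table table) : m_table(std::move(table)) {}

  const Table &table() const { return m_table; }

private:
  Table m_table;
};

// Usage sketch covering both call styles.
inline void Example() {
  Table t = {1, 2, 3};
  Holder copied(t);            // Copies; `t` is still usable afterwards.
  Holder moved(std::move(t));  // Moves; `t` should not be relied on anymore.
  assert(copied.table().size() == 3);
  assert(moved.table().size() == 3);
}
```

A `const Table &` parameter would force a copy into the member even when the caller passes a temporary, which is why taking the argument by value tends to be preferred when the constructor keeps it.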
DavidSpickett
left a comment
LGTM.
ObjectFile has an m_data DataExtractor ivar which may be default constructed initially, or initialized with a DataBuffer passed in to a ctor. Subclasses will provide the DataExtractor with a Buffer source if not. When a DataBuffer is passed in to the base class ctor, the DataExtractor only has its buffer initialized; we don't yet know the address size and endianness to fully initialize the DataExtractor.
This patch changes ObjectFile to instead have a DataExtractorSP ivar which is always initialized with at least a default-constructed DataExtractor object in the base class ctor. The next patch I will be writing is to change the ObjectFile ctor which accepts a DataBuffer to instead accept a DataExtractorSP, so the caller can initialize it with a DataExtractor subclass -- the VirtualDataExtractor being added in llvm#168802.
The change is otherwise mechanical; all `m_data.` changed to `m_data_up->` and all the places where `m_data` was passed in for a by-ref call were changed to `*m_data_up.get()`. The unique pointer is always initialized to contain an object.
I can't remember offhand if I'm making a mistake using a unique_ptr here, given that the ctor may take a DataExtractor as an argument. The caller will have to do std::move(extractor_up) when it calls the ObjectFile ctor for correct behavior. Even though a unique_ptr makes sense internal to ObjectFile, given that it can be passed as an argument, should I use the more straightforward shared_ptr? An ObjectFile only has one of them, so the extra storage for the refcount isn't important.
I built & ran the testsuite on macOS and on aarch64-Ubuntu (thanks for getting the Linux testsuite to run on SME-only systems David). All of the ObjectFile subclasses I modified compile cleanly, but I haven't tested them beyond any unit tests they may have (prob breakpad).
rdar://148939795
…ptr (#170066)
ObjectFile has an m_data DataExtractor ivar which may be default constructed initially, or initialized with a DataBuffer passed in to its ctor. If the DataExtractor does not get a DataBuffer source passed in, the subclass will initialize it with access to the object file's data. When a DataBuffer is passed in to the base class ctor, the DataExtractor only has its buffer initialized; ObjectFile doesn't yet know the address size and endianness to fully initialize the DataExtractor.
This patch changes ObjectFile to instead have a DataExtractorSP ivar which is always initialized with at least a default-constructed DataExtractor object in the base class ctor. The next patch I will be writing is to change the ObjectFile ctor to take an optional DataExtractorSP, so the caller can pass a DataExtractor subclass -- the VirtualDataExtractor being added via #168802 -- instead of a DataBuffer which is trivially saved into the DataExtractor.
The change is otherwise mechanical; all `m_data.` changed to `m_data_sp->` and all the places where `m_data` was passed in for a by-ref call were changed to `*m_data_sp.get()`. The shared pointer is always initialized to contain an object.
I built & ran the testsuite on macOS and on aarch64-Ubuntu (thanks for getting the Linux testsuite to run on SME-only systems David). All of the ObjectFile subclasses I modified compile cleanly, but I haven't tested them beyond any unit tests they may have (prob breakpad).
rdar://148939795
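The reason a by-value member cannot hold a subclass is object slicing. The sketch below shows this with hypothetical stand-in types (`Extractor`, `VirtualExtractor`, `FileByValue`, `FileBySP`); none of them are LLDB classes, and the fallback to a default-constructed extractor only mirrors the "always initialized" behaviour described above.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical base extractor: offsets are used unchanged.
struct Extractor {
  virtual ~Extractor() = default;
  virtual uint64_t Translate(uint64_t offset) const { return offset; }
};

// Hypothetical subclass: treats offsets as virtual addresses.
struct VirtualExtractor : Extractor {
  uint64_t base = 0x1000;
  uint64_t Translate(uint64_t offset) const override { return offset - base; }
};

// By-value member: copying a subclass object into it slices away the
// derived part, so the override is never called.
struct FileByValue {
  explicit FileByValue(const Extractor &e) : m_data(e) {}
  Extractor m_data;
};

// Shared-pointer member: the dynamic type survives, and the pointer always
// holds an object, falling back to a default-constructed Extractor.
struct FileBySP {
  explicit FileBySP(std::shared_ptr<Extractor> sp)
      : m_data_sp(sp ? std::move(sp) : std::make_shared<Extractor>()) {}
  std::shared_ptr<Extractor> m_data_sp;
};
```

Constructing `FileByValue` from a `VirtualExtractor` leaves `m_data.Translate(0x1004)` returning `0x1004`, whereas `FileBySP` built from `std::make_shared<VirtualExtractor>()` returns 4 for the same call.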
The ObjectFile plugin interface accepts an optional DataBufferSP argument. If the caller has the contents of the binary, it can provide them in that DataBufferSP. The ObjectFile subclasses, in their CreateInstance methods, will fill in the DataBufferSP with the actual binary contents if it is not set. The ObjectFile base class creates an ivar DataExtractor from the DataBufferSP passed in. My next patch will be a caller that creates a VirtualDataExtractor with the binary data, and needs to pass that in to the ObjectFile plugin, instead of the bag-of-bytes DataBufferSP. It builds on the previous patch changing ObjectFile's ivar from DataExtractor to DataExtractorSP so I could pass in a subclass in the shared ptr. And it will be using the VirtualDataExtractor that Jonas added in #168802. No behavior is changed by the patch; we're simply moving the creation of the DataExtractor to the caller, instead of passing a DataBuffer that is immediately used to set up the ObjectFile DataExtractor. The patch is a bit complicated because all of the ObjectFile subclasses have to initialize their DataExtractor to pass in to the base class. I ran the testsuite on macOS and on AArch64 Ubuntu. (btw David, I ran it under qemu on my M4 mac with SME-no-SVE again, Ubuntu 25.10, checked lshw(1) cpu capabilities, and qemu doesn't seem to be virtualizing the SME, which explains why the testsuite passes) rdar://148939795 --------- Co-authored-by: Jonas Devlieghere <[email protected]>
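The ownership change in these two commit messages can be pictured with a small standalone sketch. Everything here is hypothetical (`Extractor`, `VirtualExtractor`, `ObjectFileSketch`, and `Kind()` are stand-ins, not LLDB's classes or signatures); it only shows the shape of the idea: the object file holds a shared pointer to the base extractor type, the caller may hand in a subclass, and the member is never left null.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base type standing in for a plain extractor.
class Extractor {
public:
  virtual ~Extractor() = default;
  virtual std::string Kind() const { return "plain"; }
};

// Hypothetical subclass standing in for an address-translating extractor.
class VirtualExtractor : public Extractor {
public:
  std::string Kind() const override { return "virtual"; }
};

using ExtractorSP = std::shared_ptr<Extractor>;

// Illustrative object-file type: the caller may supply the extractor.
class ObjectFileSketch {
public:
  // If no extractor is supplied, fall back to a default-constructed one so
  // that m_data_sp always points at an object.
  explicit ObjectFileSketch(ExtractorSP extractor_sp = nullptr)
      : m_data_sp(extractor_sp ? std::move(extractor_sp)
                               : std::make_shared<Extractor>()) {}

  // Callers that used to take the member by reference dereference the
  // shared pointer instead.
  const Extractor &GetData() const { return *m_data_sp; }

private:
  ExtractorSP m_data_sp;
};
```

Because the member is a pointer to the base type, code inside the object file keeps calling the same virtual methods whether the caller passed a plain extractor, a subclass, or nothing at all.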