Skip to content

Commit f7b3ff2

Browse files
[lldb] Implement Process::ReadMemoryRanges (llvm#163651)
This commit introduces a base-class implementation for a method that reads memory from multiple ranges at once. This implementation simply calls the underlying `ReadMemoryFromInferior` method on each requested range, intentionally bypassing the memory caching mechanism (though this may be easily changed in the future). `Process` implementations that can be perform this operation more efficiently - e.g. with the MultiMemPacket described in [1] - are expected to override this method. As an example, this commit changes AppleObjCClassDescriptorV2 to use the new API. Note about the API ------------------ In the RFC, we discussed having the API return some kind of class `ReadMemoryRangesResult`. However, while writing such a class, it became clear that it was merely wrapping a vector, without providing anything useful. For example, this class: ``` struct ReadMemoryRangesResult { ReadMemoryRangesResult( llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> ranges) : ranges(std::move(ranges)) {} llvm::ArrayRef<llvm::MutableArrayRef<uint8_t>> getRanges() const { return ranges; } private: llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> ranges; }; ``` As can be seen in the added test and in the added use-case (AppleObjCClassDescriptorV2), users of this API will just iterate over the vector of memory buffers. So they want a return type that can be iterated over, and the vector seems more natural than creating a new class and defining iterators for it. Likewise, in the RFC, we discussed wrapping the result into an `Expected`. Upon experimenting with the code, this feels like it limits what the API is able to do as the base class implementation never needs to fail the entire result, it's the individual reads that may fail and this is expressed through a zero-length result. Any derived classes overriding `ReadMemoryRanges` should also never produce a top level failure: if they did, they can just fall back to the base class implementation, which would produce a better result. The choice of having the caller allocate a buffer and pass it to `Process::ReadMemoryRanges` is done mostly to follow conventions already done in the Process class. [1]: https://discourse.llvm.org/t/rfc-a-new-vectorized-memory-read-packet/ (cherry picked from commit f2cb997) (cherry picked from commit bccb899)
1 parent 9d05187 commit f7b3ff2

File tree

4 files changed

+220
-11
lines changed

4 files changed

+220
-11
lines changed

lldb/include/lldb/Target/Process.h

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1574,6 +1574,28 @@ class Process : public std::enable_shared_from_this<Process>,
15741574
virtual size_t ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
15751575
Status &error);
15761576

1577+
/// Read from multiple memory ranges and write the results into buffer.
1578+
/// This calls ReadMemoryFromInferior multiple times, once per range,
1579+
/// bypassing the read cache. Process implementations that can perform this
1580+
/// operation more efficiently should override this.
1581+
///
1582+
/// \param[in] ranges
1583+
/// A collection of ranges (base address + size) to read from.
1584+
///
1585+
/// \param[out] buffer
1586+
/// A buffer where the read memory will be written to. It must be at least
1587+
/// as long as the sum of the sizes of each range.
1588+
///
1589+
/// \return
1590+
/// A vector of MutableArrayRef, where each MutableArrayRef is a slice of
1591+
/// the input buffer into which the memory contents were copied. The size
1592+
/// of the slice indicates how many bytes were read successfully. Partial
1593+
/// reads are always performed from the start of the requested range,
1594+
/// never from the middle or end.
1595+
virtual llvm::SmallVector<llvm::MutableArrayRef<uint8_t>>
1596+
ReadMemoryRanges(llvm::ArrayRef<Range<lldb::addr_t, size_t>> ranges,
1597+
llvm::MutableArrayRef<uint8_t> buffer);
1598+
15771599
/// Read of memory from a process.
15781600
///
15791601
/// This function has the same semantics of ReadMemory except that it

lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCClassDescriptorV2.cpp

Lines changed: 12 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -279,22 +279,23 @@ ClassDescriptorV2::ReadMethods(llvm::ArrayRef<lldb::addr_t> addresses,
279279
const size_t num_methods = addresses.size();
280280

281281
llvm::SmallVector<uint8_t, 0> buffer(num_methods * size, 0);
282-
llvm::DenseSet<uint32_t> failed_indices;
283282

284-
for (auto [idx, addr] : llvm::enumerate(addresses)) {
285-
Status error;
286-
process->ReadMemory(addr, buffer.data() + idx * size, size, error);
287-
if (error.Fail())
288-
failed_indices.insert(idx);
289-
}
283+
llvm::SmallVector<Range<addr_t, size_t>> mem_ranges =
284+
llvm::to_vector(llvm::map_range(llvm::seq(num_methods), [&](size_t idx) {
285+
return Range<addr_t, size_t>(addresses[idx], size);
286+
}));
287+
288+
llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> read_results =
289+
process->ReadMemoryRanges(mem_ranges, buffer);
290290

291291
llvm::SmallVector<method_t, 0> methods;
292292
methods.reserve(num_methods);
293-
for (auto [idx, addr] : llvm::enumerate(addresses)) {
294-
if (failed_indices.contains(idx))
293+
for (auto [addr, memory] : llvm::zip(addresses, read_results)) {
294+
// Ignore partial reads.
295+
if (memory.size() != size)
295296
continue;
296-
DataExtractor extractor(buffer.data() + idx * size, size,
297-
process->GetByteOrder(),
297+
298+
DataExtractor extractor(memory.data(), size, process->GetByteOrder(),
298299
process->GetAddressByteSize());
299300
methods.push_back(method_t());
300301
methods.back().Read(extractor, process, addr, relative_selector_base_addr,

lldb/source/Target/Process.cpp

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2119,6 +2119,50 @@ size_t Process::ReadMemory(addr_t addr, void *buf, size_t size, Status &error) {
21192119
}
21202120
}
21212121

2122+
llvm::SmallVector<llvm::MutableArrayRef<uint8_t>>
2123+
Process::ReadMemoryRanges(llvm::ArrayRef<Range<lldb::addr_t, size_t>> ranges,
2124+
llvm::MutableArrayRef<uint8_t> buffer) {
2125+
uint64_t total_ranges_len = 0;
2126+
for (auto range : ranges)
2127+
total_ranges_len += range.size;
2128+
// If the buffer is not large enough, this is a programmer error.
2129+
// In production builds, gracefully fail by returning a length of 0 for all
2130+
// ranges.
2131+
assert(buffer.size() >= total_ranges_len && "provided buffer is too short");
2132+
if (buffer.size() < total_ranges_len) {
2133+
llvm::MutableArrayRef<uint8_t> empty;
2134+
return {ranges.size(), empty};
2135+
}
2136+
2137+
llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> results;
2138+
2139+
// While `buffer` has space, take the next requested range and read
2140+
// memory into a `buffer` piece, then slice it to remove the used memory.
2141+
for (auto [addr, range_len] : ranges) {
2142+
Status status;
2143+
size_t num_bytes_read =
2144+
ReadMemoryFromInferior(addr, buffer.data(), range_len, status);
2145+
// FIXME: ReadMemoryFromInferior promises to return 0 in case of errors, but
2146+
// it doesn't; it never checks for errors.
2147+
if (status.Fail())
2148+
num_bytes_read = 0;
2149+
2150+
assert(num_bytes_read <= range_len && "read more than requested bytes");
2151+
if (num_bytes_read > range_len) {
2152+
// In production builds, gracefully fail by returning length zero for this
2153+
// range.
2154+
results.emplace_back();
2155+
continue;
2156+
}
2157+
2158+
results.push_back(buffer.take_front(num_bytes_read));
2159+
// Slice buffer to remove the used memory.
2160+
buffer = buffer.drop_front(num_bytes_read);
2161+
}
2162+
2163+
return results;
2164+
}
2165+
21222166
void Process::DoFindInMemory(lldb::addr_t start_addr, lldb::addr_t end_addr,
21232167
const uint8_t *buf, size_t size,
21242168
AddressRanges &matches, size_t alignment,

lldb/unittests/Target/MemoryTest.cpp

Lines changed: 142 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@
1717
#include "lldb/Utility/ArchSpec.h"
1818
#include "lldb/Utility/DataBufferHeap.h"
1919
#include "gtest/gtest.h"
20+
#include <cstdint>
2021

2122
using namespace lldb_private;
2223
using namespace lldb_private::repro;
@@ -226,3 +227,144 @@ TEST_F(MemoryTest, TesetMemoryCacheRead) {
226227
// instead of using an
227228
// old cache
228229
}
230+
231+
/// A process class that, when asked to read memory from some address X, returns
232+
/// the least significant byte of X.
233+
class DummyReaderProcess : public Process {
234+
public:
235+
// If true, `DoReadMemory` will not return all requested bytes.
236+
// It's not possible to control exactly how many bytes will be read, because
237+
// Process::ReadMemoryFromInferior tries to fulfill the entire request by
238+
// reading smaller chunks until it gets nothing back.
239+
bool read_less_than_requested = false;
240+
bool read_more_than_requested = false;
241+
242+
size_t DoReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
243+
Status &error) override {
244+
if (read_less_than_requested && size > 0)
245+
size--;
246+
if (read_more_than_requested)
247+
size *= 2;
248+
uint8_t *buffer = static_cast<uint8_t *>(buf);
249+
for (size_t addr = vm_addr; addr < vm_addr + size; addr++)
250+
buffer[addr - vm_addr] = static_cast<uint8_t>(addr); // LSB of addr.
251+
return size;
252+
}
253+
// Boilerplate, nothing interesting below.
254+
DummyReaderProcess(lldb::TargetSP target_sp, lldb::ListenerSP listener_sp)
255+
: Process(target_sp, listener_sp) {}
256+
bool CanDebug(lldb::TargetSP, bool) override { return true; }
257+
Status DoDestroy() override { return {}; }
258+
void RefreshStateAfterStop() override {}
259+
bool DoUpdateThreadList(ThreadList &, ThreadList &) override { return false; }
260+
llvm::StringRef GetPluginName() override { return "Dummy"; }
261+
};
262+
263+
TEST_F(MemoryTest, TestReadMemoryRanges) {
264+
ArchSpec arch("x86_64-apple-macosx-");
265+
266+
Platform::SetHostPlatform(PlatformRemoteMacOSX::CreateInstance(true, &arch));
267+
268+
DebuggerSP debugger_sp = Debugger::CreateInstance();
269+
ASSERT_TRUE(debugger_sp);
270+
271+
TargetSP target_sp = CreateTarget(debugger_sp, arch);
272+
ASSERT_TRUE(target_sp);
273+
274+
ListenerSP listener_sp(Listener::MakeListener("dummy"));
275+
ProcessSP process_sp =
276+
std::make_shared<DummyReaderProcess>(target_sp, listener_sp);
277+
ASSERT_TRUE(process_sp);
278+
279+
{
280+
llvm::SmallVector<uint8_t, 0> buffer(1024, 0);
281+
// Read 8 ranges of 128 bytes with arbitrary base addresses.
282+
llvm::SmallVector<Range<addr_t, size_t>> ranges = {
283+
{0x12345, 128}, {0x11112222, 128}, {0x77777777, 128},
284+
{0xffaabbccdd, 128}, {0x0, 128}, {0x4242424242, 128},
285+
{0x17171717, 128}, {0x99999, 128}};
286+
287+
llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> read_results =
288+
process_sp->ReadMemoryRanges(ranges, buffer);
289+
290+
for (auto [range, memory] : llvm::zip(ranges, read_results)) {
291+
ASSERT_EQ(memory.size(), 128u);
292+
addr_t range_base = range.GetRangeBase();
293+
for (auto [idx, byte] : llvm::enumerate(memory))
294+
ASSERT_EQ(byte, static_cast<uint8_t>(range_base + idx));
295+
}
296+
}
297+
298+
auto &dummy_process = static_cast<DummyReaderProcess &>(*process_sp);
299+
dummy_process.read_less_than_requested = true;
300+
{
301+
llvm::SmallVector<uint8_t, 0> buffer(1024, 0);
302+
llvm::SmallVector<Range<addr_t, size_t>> ranges = {
303+
{0x12345, 128}, {0x11112222, 128}, {0x77777777, 128}};
304+
llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> read_results =
305+
dummy_process.ReadMemoryRanges(ranges, buffer);
306+
for (auto [range, memory] : llvm::zip(ranges, read_results)) {
307+
ASSERT_LT(memory.size(), 128u);
308+
addr_t range_base = range.GetRangeBase();
309+
for (auto [idx, byte] : llvm::enumerate(memory))
310+
ASSERT_EQ(byte, static_cast<uint8_t>(range_base + idx));
311+
}
312+
}
313+
}
314+
315+
using MemoryDeathTest = MemoryTest;
316+
317+
TEST_F(MemoryDeathTest, TestReadMemoryRangesReturnsTooMuch) {
318+
ArchSpec arch("x86_64-apple-macosx-");
319+
Platform::SetHostPlatform(PlatformRemoteMacOSX::CreateInstance(true, &arch));
320+
DebuggerSP debugger_sp = Debugger::CreateInstance();
321+
ASSERT_TRUE(debugger_sp);
322+
TargetSP target_sp = CreateTarget(debugger_sp, arch);
323+
ASSERT_TRUE(target_sp);
324+
ListenerSP listener_sp(Listener::MakeListener("dummy"));
325+
ProcessSP process_sp =
326+
std::make_shared<DummyReaderProcess>(target_sp, listener_sp);
327+
ASSERT_TRUE(process_sp);
328+
329+
auto &dummy_process = static_cast<DummyReaderProcess &>(*process_sp);
330+
dummy_process.read_more_than_requested = true;
331+
llvm::SmallVector<uint8_t, 0> buffer(1024, 0);
332+
llvm::SmallVector<Range<addr_t, size_t>> ranges = {{0x12345, 128}};
333+
334+
llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> read_results;
335+
ASSERT_DEBUG_DEATH(
336+
{ read_results = process_sp->ReadMemoryRanges(ranges, buffer); },
337+
"read more than requested bytes");
338+
#ifdef NDEBUG
339+
// With asserts off, the read should return empty ranges.
340+
ASSERT_EQ(read_results.size(), 1u);
341+
ASSERT_TRUE(read_results[0].empty());
342+
#endif
343+
}
344+
345+
TEST_F(MemoryDeathTest, TestReadMemoryRangesWithShortBuffer) {
346+
ArchSpec arch("x86_64-apple-macosx-");
347+
Platform::SetHostPlatform(PlatformRemoteMacOSX::CreateInstance(true, &arch));
348+
DebuggerSP debugger_sp = Debugger::CreateInstance();
349+
ASSERT_TRUE(debugger_sp);
350+
TargetSP target_sp = CreateTarget(debugger_sp, arch);
351+
ASSERT_TRUE(target_sp);
352+
ListenerSP listener_sp(Listener::MakeListener("dummy"));
353+
ProcessSP process_sp =
354+
std::make_shared<DummyReaderProcess>(target_sp, listener_sp);
355+
ASSERT_TRUE(process_sp);
356+
357+
llvm::SmallVector<uint8_t, 0> short_buffer(10, 0);
358+
llvm::SmallVector<Range<addr_t, size_t>> ranges = {{0x12345, 128},
359+
{0x11, 128}};
360+
llvm::SmallVector<llvm::MutableArrayRef<uint8_t>> read_results;
361+
ASSERT_DEBUG_DEATH(
362+
{ read_results = process_sp->ReadMemoryRanges(ranges, short_buffer); },
363+
"provided buffer is too short");
364+
#ifdef NDEBUG
365+
// With asserts off, the read should return empty ranges.
366+
ASSERT_EQ(read_results.size(), ranges.size());
367+
for (llvm::MutableArrayRef<uint8_t> result : read_results)
368+
ASSERT_TRUE(result.empty());
369+
#endif
370+
}

0 commit comments

Comments
 (0)