
Conversation

@clayborg
Collaborator

This patch cleans up some code and allows users to omit the size and get a more realistic value when reading modules from memory. It is in preparation for having the dynamic loader load files from memory when we have core files that contain enough memory for the ELF or Mach-O files.

@llvmbot
Member

llvmbot commented Jan 16, 2025

@llvm/pr-subscribers-lldb

Author: Greg Clayton (clayborg)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/123148.diff

3 Files Affected:

  • (modified) lldb/include/lldb/Target/Process.h (+22-1)
  • (modified) lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp (+1-10)
  • (modified) lldb/source/Target/Process.cpp (+20)
diff --git a/lldb/include/lldb/Target/Process.h b/lldb/include/lldb/Target/Process.h
index a184e6dd891aff..c43ddd938cd988 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -1923,9 +1923,30 @@ class Process : public std::enable_shared_from_this<Process>,
   ///     the instruction has completed executing.
   bool GetWatchpointReportedAfter();
 
+  /// Load a module image from memory.
+  ///
+  /// \param[in] file_spec
+  ///     The path to use to represent this module. This will show up in the
+  ///     image list.
+  ///
+  /// \param[in] header_addr
+  ///     The address of the object file header.
+  ///
+  /// \param[in] size_to_read
+  ///     The number of contiguous bytes to read that can be used to provide
+  ///     all of the data for this module in memory. If this value is set to
+  ///     zero, the memory region that contains \a header_addr will be found
+  ///     and the size will be to the end address of this memory region. If
+  ///     there is no memory region info that contains \a header_addr, then
+  ///     default to 512 bytes.
+  ///
+  /// \return
+  ///     A valid module shared pointer if this succeeds, or an empty shared
+  ///     pointer if no ObjectFile plug-ins recognize the object file header or
+  ///     memory can not be read from \a header_addr.
   lldb::ModuleSP ReadModuleFromMemory(const FileSpec &file_spec,
                                       lldb::addr_t header_addr,
-                                      size_t size_to_read = 512);
+                                      size_t size_to_read = 0);
 
   /// Attempt to get the attributes for a region of memory in the process.
   ///
diff --git a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
index 5614bc32468d5c..4e3ad44ddcd691 100644
--- a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
+++ b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
@@ -538,16 +538,7 @@ void DynamicLoaderPOSIXDYLD::LoadVDSO() {
 
   FileSpec file("[vdso]");
 
-  MemoryRegionInfo info;
-  Status status = m_process->GetMemoryRegionInfo(m_vdso_base, info);
-  if (status.Fail()) {
-    Log *log = GetLog(LLDBLog::DynamicLoader);
-    LLDB_LOG(log, "Failed to get vdso region info: {0}", status);
-    return;
-  }
-
-  if (ModuleSP module_sp = m_process->ReadModuleFromMemory(
-          file, m_vdso_base, info.GetRange().GetByteSize())) {
+  if (ModuleSP module_sp = m_process->ReadModuleFromMemory(file, m_vdso_base)) {
     UpdateLoadedSections(module_sp, LLDB_INVALID_ADDRESS, m_vdso_base, false);
     m_process->GetTarget().GetImages().AppendIfNeeded(module_sp);
   }
diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp
index 68485a40a3fcce..9ce32ff6f306e9 100644
--- a/lldb/source/Target/Process.cpp
+++ b/lldb/source/Target/Process.cpp
@@ -2544,6 +2544,26 @@ ModuleSP Process::ReadModuleFromMemory(const FileSpec &file_spec,
       progress_up = std::make_unique<Progress>(
           "Reading binary from memory", file_spec.GetFilename().GetString());
 
+    if (size_to_read == 0) {
+      // No size was provided, figure out the size of the memory region that
+      // contains "header_addr"
+      MemoryRegionInfo header_region_info;
+      Status error(GetMemoryRegionInfo(header_addr, header_region_info));
+      // Default to 512 in case we can't find a memory region.
+      size_to_read = 512;
+      if (error.Success()) {
+        // We found a memory region, set the range of bytes ro read to read to
+        // the end of the memory region. This should be enough to contain the
+        // file header and important bits.
+        const auto &range = header_region_info.GetRange();
+        const addr_t end = range.GetRangeBase() + range.GetByteSize();
+        if (end > header_addr) {
+          const addr_t new_size = end - header_addr;
+          if (new_size > size_to_read)
+            size_to_read = new_size;
+        }
+      }
+    }
     ObjectFile *objfile = module_sp->GetMemoryObjectFile(
         shared_from_this(), header_addr, error, size_to_read);
     if (objfile)

@labath
Collaborator

labath commented Jan 16, 2025

The reason this code looks the way it does is that we've had crashes when trying to read corrupted elf files from memory, where we load the size of the elf file from the process memory, find out that its size is 935872395 GB, try to allocate a host buffer of that size, and then explode when the allocation fails.

Since the elf code wasn't really ready to work with files loaded from memory (it assumed section headers would always be present -- something which you're changing now), we removed the code which tries to dynamically size the buffer for reading the file. The only case we needed to support was reading the vdso image from memory (which worked because the vdso is a complete elf image with section headers and all, and it fits into a single page of memory), which is why we've added this workaround to determine the region size from the outside.

This patch institutionalizes that workaround, but I'm unsure if that's the right direction for this. I'd like to understand where you want to take this afterwards. If the next step is to teach ObjectFileELF to read from the remaining memory regions (which I think you'll have to do, because most ELF files will not consist of a single memory region), then I think this change is unnecessary -- and maybe even harmful. Unnecessary, because if the code will know how to handle the additional memory regions, then it should be easy to teach it to extend the first region as well. Harmful, because if anything goes wrong (there is no ELF file at the place where we think it should be), then we could end up reading a huge block of memory for nothing (since the memory region data comes from a more trusted source, we shouldn't end up with the ridiculous values as we did before, but we could still end up reading several gigabytes of data only to find out that it doesn't actually represent an ELF file).

So, I think it would be better to keep the current algorithm, which is to read a small chunk of memory -- just enough to check whether we're truly dealing with an object file, and leave the rest of the work to the object file plugin (if I'm not mistaken, that's the logic we use when reading from a file as well). Once the elf memory reading code gets sufficiently robust, we can remove the vdso GetMemoryRegion call as well. Or we can keep it as a performance optimization. We can see how it goes, but either way, I don't think that a GetMemoryRegion call should be necessary for loading of other object files.
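
A minimal sketch of that probe-first idea, assuming an ELF target (the helper name is made up; the real dispatch would ask every ObjectFile plugin to recognize the header, not just check the ELF magic):

#include <cstdint>
#include <cstring>

// Sketch of the "small probe first" approach: read a few hundred bytes and
// only proceed if they actually look like an object file header. The magic
// check here is ELF-only for brevity.
bool LooksLikeElf(const uint8_t *probe, size_t size) {
  static const uint8_t kElfMagic[4] = {0x7f, 'E', 'L', 'F'};
  return size >= sizeof(kElfMagic) &&
         std::memcmp(probe, kElfMagic, sizeof(kElfMagic)) == 0;
}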

What do you make of all this? How were you planning to approach things?

// We found a memory region, set the range of bytes ro read to read to
// the end of the memory region. This should be enough to contain the
// file header and important bits.
const auto &range = header_region_info.GetRange();
Contributor

Echoing what Pavel said a little bit, I do think an upper bound makes sense. I think we can be pretty heavy-handed with that upper bound. I'm not an expert on ELF files by any means, but I suspect something like 10 MB or even 100 MB will be a sane expectation for even the largest set of ELF headers.
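
To make that concrete, here is a minimal sketch (not part of the patch) of how the region-derived size could be clamped before reading; the 100 MB limit and the helper name are illustrative assumptions only.

#include <algorithm>
#include <cstdint>

// Illustrative upper bound; the exact value would be a judgment call.
constexpr uint64_t kMaxModuleHeaderReadSize = 100ull * 1024 * 1024;

// Hypothetical helper: derive the number of bytes to read from the memory
// region containing the header, never exceed the upper bound, and fall back
// to the old 512-byte default if the region info is unusable.
uint64_t ClampSizeToRead(uint64_t region_end, uint64_t header_addr,
                         uint64_t fallback_size = 512) {
  if (region_end <= header_addr)
    return fallback_size;
  return std::min(region_end - header_addr, kMaxModuleHeaderReadSize);
}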

// Default to 512 in case we can't find a memory region.
size_to_read = 512;
if (error.Success()) {
// We found a memory region, set the range of bytes ro read to read to
Contributor

set the range of bytes ro read to read to the end of the memory region.

Typo plus maybe rephrase to

set the range to read to the end of the memory region.

// the end of the memory region. This should be enough to contain the
// file header and important bits.
const auto &range = header_region_info.GetRange();
const addr_t end = range.GetRangeBase() + range.GetByteSize();
Contributor

Should this be an error?

The end should always be greater than the addr, correct?

@clayborg
Collaborator Author

The reason this code looks the way it does is that we've had crashes when trying to read corrupted elf files from memory, where we load the size of the elf file from the process memory, find out that its size is 935872395 GB, try to allocate a host buffer of that size, and then explode when the allocation fails.

How do you load the size of the ELF file from process memory? Are you saying the memory region information was corrupted in such cases?

Since the elf code wasn't really ready to work with files loaded from memory (it assumed section headers would always be present -- something which you're changing now),

They are not present in memory and never have been. So I changed the ELF code to not require section info when the PT_DYNAMIC or PT_NOTE info points to the same data.

we removed the code which tries to dynamically size the buffer for reading the file. The only case we needed to support was reading the vdso image from memory (which worked because the vdso is a complete elf image with section headers and all, and it fits into a single page of memory), which is why we've added this workaround to determine the region size from the outside.

This patch institutionalizes that workaround, but I'm unsure if that's the right direction for this. I'd like to understand where you want to take this afterwards. If the next step is to teach ObjectFileELF to read from the remaining memory regions (which I think you'll have to do, because most ELF files will not consist of a single memory region), then I think this change is unnecessary -- and maybe even harmful. Unnecessary, because if the code will know how to handle the additional memory regions, then it should be easy to teach it to extend the first region as well. Harmful, because if anything goes wrong (there is no ELF file at the place where we think it should be), then we could end up reading a huge block of memory for nothing (since the memory region data comes from a more trusted source, we shouldn't end up with the ridiculous values as we did before, but we could still end up reading several gigabytes of data only to find out that it doesn't actually represent an ELF file).

Right now, reading only 512 bytes means that if we ask for program headers, some or all of them might be returned as PT_NULL program headers due to truncation and the way the parsing code currently works.

This patch is really to get ELF core files to be able to load what it needs effectively from memory like the PT_DYNAMIC and entries in the dynamic section like:

  • dynamic symbol table
  • hash table for the symbol table
  • the DT_DEBUG entry for the shared library list

And so we can access the PT_NOTE data so we can get to the GNU build ID information and anything else that is encoded into PT_LOAD[0].

If we stick to 512 bytes, it often isn't enough to get all of the program header data, and this was causing issues where we couldn't locate the shared library list in the dynamic loader using this ELF file, or get to the note data.
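
For a sense of scale (assuming a 64-bit ELF and the layout from a Linux <elf.h>): the ELF header is 64 bytes and each program header is 56 bytes, so a 512-byte read covers at most eight program headers and nothing that PT_DYNAMIC or PT_NOTE point at. A sketch of the bytes actually needed just to cover the program header table:

#include <cstdint>
#include <elf.h>

// Sketch only: minimum bytes, counted from the image base, needed to cover
// the ELF header plus the entire program header table. With typical values
// (e_phoff == 64, e_phentsize == 56, e_phnum in the low teens) this already
// exceeds 512.
uint64_t BytesToCoverProgramHeaders(const Elf64_Ehdr &ehdr) {
  return ehdr.e_phoff +
         static_cast<uint64_t>(ehdr.e_phnum) * ehdr.e_phentsize;
}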

So, I think it would be better to keep the current algorithm, which is to read a small chunk of memory -- just enough to check whether we're truly dealing with an object file, and leave the rest of the work to the object file plugin (if I'm not mistaken, that's the logic we use when reading from a file as well). Once the elf memory reading code gets sufficiently robust, we can remove the vdso GetMemoryRegion call as well. Or we can keep it as a performance optimization. We can see how it goes, but either way, I don't think that a GetMemoryRegion call should be necessary for loading of other object files.

What do you make of all this? How were you planning to approach things?

This patch is needed before we can start reading all ELF files from memory when loading a core file. Right now, if you don't specify an executable when loading a core file, you get nothing: no shared libraries or anything. If a fix can be made to correctly load ELF files from memory, we can suddenly have a full image list with just a core file and no other executables, because we will be able to correctly load them. We will also get the GNU build IDs from this in-memory ELF to ensure we don't allow someone to load the incorrect /usr/lib/libc.so from the current machine. If we can't figure out what the UUID was supposed to be, loading a core file often just loads what is on the current host, and backtraces don't work at all due to using the wrong files.

So with a new patch I really just want to ensure we can load all program header data and the things that the program headers point to for any PT_DYNAMIC and PT_NOTE stuff. The current object file stuff assumes we have the entire file mapped into memory if it is read from disk, so some accesses start failing if there are lots of program headers that exceed 512 bytes and also if the static data (not memory regions, but static data from the ELF file on disk in the first PT_LOAD[0] segment) they point to isn't available.
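
For illustration, here is a hedged sketch of the kind of PT_NOTE consumer this enables: scanning an already-read note stream for the GNU build ID (assumes a 64-bit ELF and 4-byte note alignment; not LLDB's actual parser).

#include <cstdint>
#include <cstring>
#include <elf.h>
#include <vector>

// Sketch: walk the notes from a PT_NOTE segment and return the payload of an
// NT_GNU_BUILD_ID note, which is the UUID used to match on-disk files against
// the in-memory image.
std::vector<uint8_t> FindGnuBuildId(const uint8_t *notes, size_t size) {
  size_t offset = 0;
  while (offset + sizeof(Elf64_Nhdr) <= size) {
    const auto *nhdr = reinterpret_cast<const Elf64_Nhdr *>(notes + offset);
    const size_t name_sz = (nhdr->n_namesz + 3) & ~size_t(3); // 4-byte align
    const size_t desc_sz = (nhdr->n_descsz + 3) & ~size_t(3);
    const size_t desc_off = offset + sizeof(Elf64_Nhdr) + name_sz;
    if (desc_off + nhdr->n_descsz > size)
      break; // Truncated or corrupt note stream.
    const uint8_t *name = notes + offset + sizeof(Elf64_Nhdr);
    if (nhdr->n_type == NT_GNU_BUILD_ID && nhdr->n_namesz == 4 &&
        std::memcmp(name, "GNU", 4) == 0)
      return std::vector<uint8_t>(notes + desc_off,
                                  notes + desc_off + nhdr->n_descsz);
    offset = desc_off + desc_sz;
  }
  return {};
}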

Looking up the first memory region typically covers the contents of the PT_LOAD[0] segment, so it gathers all of the data. I don't get contiguous segments (I don't ask for PT_LOAD[0] + PT_LOAD[1] ...). So we can either do this in the ELF file by manually increasing the amount (grabbing the program headers and making sure we can load all program headers and all of the data they point to that we care about and need, like the PT_DYNAMIC contents and PT_NOTE stuff), or we can just go with the first memory region. The first memory region is typically a page or two and contains all that we want. I am fine with either way, but the current methodology goes for the easy way and falls back to the previous 512-byte default if no memory region info is available.

We can modify ObjectFileELF, when reading from memory, to read subsequent regions correctly by special-casing the situation where we have a process. This already doesn't work when 512 bytes is all we get, so this patch will only make things better.

If we parse all program headers and look for the PT_DYNAMIC and PT_NOTE segments, they can point to very large offsets within the ELF image, so if we try to do things intelligently, we might end up reading way too much. If the ELF file is corrupted in memory, same thing. The memory region seemed like the safest thing to do.

But I am currently working with some Android folks and we do need to worry about loading ELF headers from the middle of an APK file. I am not sure how the memory regions look for shared libraries that are uncompressed inside of the APK files. If the memory regions are ok, then this patch would work, but if there is one memory region for the entire APK it wouldn't. Though I don't see how this can work if there are not valid memory regions for each shared library within the APK because they need different permissions.

Let me know if you think I should try to intelligently resize within ObjectFileELF, or if anything I have said above made you change your mind.

@labath
Collaborator

labath commented Jan 23, 2025

The reason this code looks the way it does is that we've had crashes when trying to read corrupted elf files from memory, where we load the size of the elf file from the process memory, find out that its size is 935872395 GB, try to allocate a host buffer of that size, and then explode when the allocation fails.

How do you load the size of the ELF file from process memory? Are you saying the memory region information was corrupted in such cases?

You can't load the ELF file size from process memory -- but that didn't stop the code from trying. :P (I.e., it was wrong/buggy)

IIRC (this was ~ten years ago), what happened was that the code looked at section header members in the ELF header (e_shoff, e_shentsize, e_shnum, which are present and filled in even when the headers themselves are not mapped in), and then tried to allocate enough memory to hold everything from the beginning of the file to the end. It didn't use the memory region information to validate this data (if it did, it would have found out that the memory is not there), it just took the data at face value. If those fields contained garbage, it would crash when attempting to allocate exabytes of memory.
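
In other words, the missing step was a cross-check against the memory map before trusting those fields. A minimal sketch of that check, assuming a 64-bit ELF and a caller-supplied end address for the region containing the header (names are illustrative, not the old LLDB code):

#include <cstdint>
#include <elf.h>
#include <optional>

// Sketch: before allocating a buffer sized from e_shoff/e_shentsize/e_shnum,
// confirm that the implied end of the file lies inside mapped memory. If it
// does not, treat the header fields as garbage and refuse the large read.
std::optional<uint64_t> ValidatedImageSize(const Elf64_Ehdr &ehdr,
                                           uint64_t header_addr,
                                           uint64_t region_end) {
  const uint64_t implied_size =
      ehdr.e_shoff + static_cast<uint64_t>(ehdr.e_shnum) * ehdr.e_shentsize;
  if (implied_size < ehdr.e_shoff)
    return std::nullopt; // Overflow: definitely corrupt.
  if (header_addr + implied_size > region_end)
    return std::nullopt; // Points past mapped memory: do not trust it.
  return implied_size;
}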

They are not present in memory and never have been.

Not entirely true (but I get the idea). Section headers don't have to be present in memory. They are also usually not present there. The vdso pseudo-module, however, always has section headers (technically I guess it also doesn't need them, but every vdso I've seen has them).

(lldb) image dump objfile [vdso]
Dumping headers for 1 module(s).
0x5592a46ea680:   ObjectFileELF, file = '', arch = x86_64
ELF Header
e_ident[EI_MAG0   ] = 0x7f
e_ident[EI_MAG1   ] = 0x45 'E'
e_ident[EI_MAG2   ] = 0x4c 'L'
e_ident[EI_MAG3   ] = 0x46 'F'

<snip>

Program Headers
IDX  p_type          p_offset p_vaddr  p_paddr  p_filesz p_memsz  p_flags                   p_align
==== --------------- -------- -------- -------- -------- -------- ------------------------- --------
[ 0] PT_LOAD         00000000 00000000 00000000 000008df 000008df 00000005 (PF_X      PF_R) 00001000
[ 1] PT_DYNAMIC      000003a0 000003a0 000003a0 00000120 00000120 00000004 (          PF_R) 00000008
[ 2] PT_NOTE         000004c0 000004c0 000004c0 00000054 00000054 00000004 (          PF_R) 00000004
[ 3] PT_GNU_EH_FRAME 00000514 00000514 00000514 00000034 00000034 00000004 (          PF_R) 00000004

Section Headers
IDX  name     type         flags                            addr     offset   size     link     info     addralgn entsize  Name
==== -------- ------------ -------------------------------- -------- -------- -------- -------- -------- -------- -------- ====================
[ 0] 00000000 SHT_NULL     00000000 (                     ) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
[ 1] 0000000f SHT_HASH     00000002 (      ALLOC          ) 00000120 00000120 00000044 00000003 00000000 00000008 00000004 .hash
[ 2] 0000000b 0x6ffffff6   00000002 (      ALLOC          ) 00000168 00000168 00000050 00000003 00000000 00000008 00000000 .gnu.hash
[ 3] 00000015 SHT_DYNSYM   00000002 (      ALLOC          ) 000001b8 000001b8 00000120 00000004 00000001 00000008 00000018 .dynsym

etc.

So I changed the ELF code to not require section info when the PT_DYNAMIC or PT_NOTE info points to the same data.

👍

The first memory region is typically a page or two and contains all that we want.

That depends on how the file is linked. By changing the linker flags, I can easily make it so that the entire code section of the binary ends up in the first memory region. That means the region can be arbitrarily (~gigabytes) large.

Also, there's really no reason why the program headers have to come directly after the elf header, or even be present in the first memory region. Producing this is harder, but here's a perfectly valid elf file (it just prints "Hello world" when executed), whose program headers are in the second memory region. (That said, you might not always be able to load this kind of a file from memory, because you can't always find the program headers from the elf headers -- this is why the dynamic linker also stores a pointer to the dynamic section.)

But I am currently working with some Android folks and we do need to worry about loading ELF headers from the middle of an APK file. I am not sure how the memory regions look for shared libraries that are uncompressed inside of the APK files. If the memory regions are ok, then this patch would work, but if there is one memory region for the entire APK it wouldn't. Though I don't see how this can work if there are not valid memory regions for each shared library within the APK because they need different permissions.

They will have multiple memory regions, exactly as described by PT_LOAD segments. Basically, the only difference is in the offset argument to the mmap call that the loader makes. Everything I said before is still true though -- the first of those PT_LOAD segments could still be extremely large.
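
As a rough illustration (hypothetical values, not Bionic's actual loader code), mapping a library that sits uncompressed inside an APK differs from mapping a standalone .so only in where the file offset starts:

#include <sys/mman.h>

// Hypothetical sketch: map PT_LOAD[0] of a shared library stored uncompressed
// inside an APK. The only difference from mapping a standalone .so is that the
// file offset is the library's offset within the APK plus the segment's
// p_offset. (A real loader also page-aligns the offset and length.)
void *MapFirstLoadSegment(int apk_fd, off_t so_offset_in_apk, off_t p_offset,
                          size_t p_filesz) {
  return mmap(/*addr=*/nullptr, p_filesz, PROT_READ | PROT_EXEC, MAP_PRIVATE,
              apk_fd, so_offset_in_apk + p_offset);
}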

Let me know if you think I should try to intelligently resize within ObjectFileELF, or if anything I have said above made you change your mind.

Not really (but thanks for the background/explanation). I still believe the best way to achieve what you want is inside ObjectFileELF. Basically, to increase the size of the loaded memory if the program headers don't fit into the original, just like you described. The reason for that is that there we know the expected size of the program headers, so we can read just that piece of memory instead of a potentially very large chunk. (Using the memory region info to confirm the data from the elf header is still a good idea though, so that we avoid that problem with corrupted headers).
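
A minimal sketch of that direction, assuming a 64-bit ELF and a generic memory-read callback (the names are hypothetical, not LLDB's API): grow the buffer only as far as the program headers require, capped by the containing memory region.

#include <algorithm>
#include <cstdint>
#include <elf.h>
#include <functional>
#include <vector>

using ReadMemoryFn =
    std::function<bool(uint64_t addr, void *buf, uint64_t size)>;

// Sketch: starting from a small probe that already holds the ELF header,
// compute how many bytes the program header table needs, cap that by the end
// of the containing memory region so corrupt header fields cannot trigger an
// absurd read, and re-read from process memory only if the probe is too small.
bool GrowToCoverProgramHeaders(uint64_t header_addr, uint64_t region_end,
                               const ReadMemoryFn &read_memory,
                               std::vector<uint8_t> &buffer) {
  if (buffer.size() < sizeof(Elf64_Ehdr))
    return false;
  const auto *ehdr = reinterpret_cast<const Elf64_Ehdr *>(buffer.data());
  uint64_t needed =
      ehdr->e_phoff + static_cast<uint64_t>(ehdr->e_phnum) * ehdr->e_phentsize;
  if (region_end > header_addr)
    needed = std::min(needed, region_end - header_addr);
  if (needed <= buffer.size())
    return true; // The probe already covers the program header table.
  buffer.resize(needed);
  return read_memory(header_addr, buffer.data(), buffer.size());
}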

I'd also be fine with increasing the initial chunk of memory being loaded to say 4kb, so that we don't need to perform this second load in the most common case.
