[lldb] Account for registers being host endian when casting values #150011

DavidSpickett · 2025-07-22T12:50:46Z

Follow up to #148836
which fixed some of this issue but not all of it.

Our Value/ValueObject system does not store the endian directly
in the values. Instead it assumes that the endian of the result
of a cast can be assumed to be the target's endian, or the host
but only as a fallback. It assumes the place it is copying from
is also that endian.

This breaks down when you have register values. These are always
host endian and continue to be when cast. Casting them to big
endian when on a little endian host breaks certain calls like
GetValueAsUnsigned.

To fix this, check the context of the value. If it has a register
context, always treat it as host endian and make the result host
endian.

I had an alternative where I passed an "is_register" flag into
all calls to this, but it felt like a layering violation and changed
many more lines.

This solution isn't much more robust, but it works for all the test
cases I know of. Perhaps you can create a register value without
a RegisterInfo backing it, but I don't know of a way myself.

For testing, I had to add a minimal program file for each arch
so that there is a type system to support the casting. This is
generated from YAML since we only need the machine and endian
to be set.

DavidSpickett · 2025-07-22T12:51:16Z

First commit is #148836, please review the second onward.

llvmbot · 2025-07-22T12:51:20Z

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Changes

Fixes #135707

Follow up to #148836
which fixed some of this issue but not all of it.

Our Value/ValueObject system does not store the endian directly
in the values. Instead it assumes that the endian of the result
of a cast can be assumed to be the target's endian, or the host
but only as a fallback. It assumes the place it is copying from
is also that endian.

This breaks down when you have register values. These are always
host endian and continue to be when cast. Casting them to big
endian when on a little endian host breaks certain calls like
GetValueAsUnsigned.

To fix this, check the context of the value. If it has a register
context, always treat it as host endian and make the result host
endian.

I had an alternative where I passed an "is_register" flag into
all calls to this, but it felt like a layering violation and changed
many more lines.

This solution isn't much more robust, but it works for all the test
cases I know of. Perhaps you can create a register value without
a RegisterInfo backing it, but I don't know of a way myself.

For testing, I had to add a minimal program file for each arch
so that there is a type system to support the casting. This is
generated from YAML since we only need the machine and endian
to be set.

Full diff: https://github.com/llvm/llvm-project/pull/150011.diff

3 Files Affected:

(modified) lldb/source/Core/Value.cpp (+4-1)
(modified) lldb/source/Expression/Materializer.cpp (+11-14)
(added) lldb/test/API/commands/expression/TestRegisterExpressionEndian.py (+128)

diff --git a/lldb/source/Core/Value.cpp b/lldb/source/Core/Value.cpp
index c91b3f852f986..ff37c87f5dd6a 100644
--- a/lldb/source/Core/Value.cpp
+++ b/lldb/source/Core/Value.cpp
@@ -496,7 +496,10 @@ Status Value::GetValueAsData(ExecutionContext *exe_ctx, DataExtractor &data,
     if (exe_ctx) {
       Target *target = exe_ctx->GetTargetPtr();
       if (target) {
-        data.SetByteOrder(target->GetArchitecture().GetByteOrder());
+        // Registers are always stored in host endian.
+        data.SetByteOrder(m_context_type == ContextType::RegisterInfo
+                              ? endian::InlHostByteOrder()
+                              : target->GetArchitecture().GetByteOrder());
         data.SetAddressByteSize(target->GetArchitecture().GetAddressByteSize());
         break;
       }
diff --git a/lldb/source/Expression/Materializer.cpp b/lldb/source/Expression/Materializer.cpp
index 17ea1596806d0..8fc3df1360824 100644
--- a/lldb/source/Expression/Materializer.cpp
+++ b/lldb/source/Expression/Materializer.cpp
@@ -1376,29 +1376,26 @@ class EntityRegister : public Materializer::Entity {
       return;
     }
 
-    DataExtractor register_data;
-
-    if (!reg_value.GetData(register_data)) {
-      err = Status::FromErrorStringWithFormat(
-          "couldn't get the data for register %s", m_register_info.name);
-      return;
-    }
-
-    if (register_data.GetByteSize() != m_register_info.byte_size) {
+    if (reg_value.GetByteSize() != m_register_info.byte_size) {
       err = Status::FromErrorStringWithFormat(
           "data for register %s had size %llu but we expected %llu",
-          m_register_info.name, (unsigned long long)register_data.GetByteSize(),
+          m_register_info.name, (unsigned long long)reg_value.GetByteSize(),
           (unsigned long long)m_register_info.byte_size);
       return;
     }
 
-    m_register_contents = std::make_shared<DataBufferHeap>(
-        register_data.GetDataStart(), register_data.GetByteSize());
+    lldb_private::DataBufferHeap buf(reg_value.GetByteSize(), 0);
+    reg_value.GetAsMemoryData(m_register_info, buf.GetBytes(),
+                              buf.GetByteSize(), map.GetByteOrder(), err);
+    if (!err.Success())
+      return;
+
+    m_register_contents = std::make_shared<DataBufferHeap>(buf);
 
     Status write_error;
 
-    map.WriteMemory(load_addr, register_data.GetDataStart(),
-                    register_data.GetByteSize(), write_error);
+    map.WriteMemory(load_addr, buf.GetBytes(), reg_value.GetByteSize(),
+                    write_error);
 
     if (!write_error.Success()) {
       err = Status::FromErrorStringWithFormat(
diff --git a/lldb/test/API/commands/expression/TestRegisterExpressionEndian.py b/lldb/test/API/commands/expression/TestRegisterExpressionEndian.py
new file mode 100644
index 0000000000000..d6de8731385b6
--- /dev/null
+++ b/lldb/test/API/commands/expression/TestRegisterExpressionEndian.py
@@ -0,0 +1,128 @@
+""" Check that registers written to memory for expression evaluation are
+    written using the target's endian not the host's.
+"""
+
+from enum import Enum
+from textwrap import dedent
+import lldb
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test.decorators import *
+from lldbsuite.test.gdbclientutils import *
+from lldbsuite.test.lldbgdbclient import GDBRemoteTestBase
+
+
+class Endian(Enum):
+    BIG = 0
+    LITTLE = 1
+
+
+class Responder(MockGDBServerResponder):
+    def __init__(self, doc, endian):
+        super().__init__()
+        self.target_xml = doc
+        self.endian = endian
+
+    def qXferRead(self, obj, annex, offset, length):
+        if annex == "target.xml":
+            return self.target_xml, False
+        return (None,)
+
+    def readRegister(self, regnum):
+        return "E01"
+
+    def readRegisters(self):
+        # 64 bit pc value.
+        data = ["00", "00", "00", "00", "00", "00", "12", "34"]
+        if self.endian == Endian.LITTLE:
+            data.reverse()
+        return "".join(data)
+
+
+class TestXMLRegisterFlags(GDBRemoteTestBase):
+    def do_endian_test(self, endian):
+        architecture, pc_reg_name, yaml_file, data, machine = {
+            Endian.BIG: ("s390x", "pswa", "s390x.yaml", "ELFDATA2MSB", "EM_S390"),
+            Endian.LITTLE: (
+                "aarch64",
+                "pc",
+                "aarch64.yaml",
+                "ELFDATA2LSB",
+                "EM_AARCH64",
+            ),
+        }[endian]
+
+        self.server.responder = Responder(
+            dedent(
+                f"""\
+            <?xml version="1.0"?>
+              <target version="1.0">
+                <architecture>{architecture}</architecture>
+                <feature>
+                  <reg name="{pc_reg_name}" bitsize="64"/>
+                </feature>
+            </target>"""
+            ),
+            endian,
+        )
+
+        # We need to have a program file, so that we have a full type system,
+        # so that we can do the casts later.
+        obj_path = self.getBuildArtifact("main.o")
+        yaml_path = self.getBuildArtifact(yaml_file)
+        with open(yaml_path, "w") as f:
+            f.write(
+                dedent(
+                    f"""\
+                --- !ELF
+                FileHeader:
+                  Class:    ELFCLASS64
+                  Data:     {data}
+                  Type:     ET_REL
+                  Machine:  {machine}
+                ...
+                """
+                )
+            )
+        self.yaml2obj(yaml_path, obj_path)
+        target = self.dbg.CreateTarget(obj_path)
+
+        process = self.connect(target)
+        lldbutil.expect_state_changes(
+            self, self.dbg.GetListener(), process, [lldb.eStateStopped]
+        )
+
+        # If expressions convert register values into target endian, the
+        # result of register read, expr and casts should be the same.
+        pc_value = "0x0000000000001234"
+        self.expect(
+            "register read pc",
+            substrs=[pc_value],
+        )
+        self.expect("expr --format hex -- $pc", substrs=[pc_value])
+
+        pc = (
+            process.thread[0]
+            .frame[0]
+            .GetRegisters()
+            .GetValueAtIndex(0)
+            .GetChildMemberWithName("pc")
+        )
+        ull = target.FindTypes("unsigned long long").GetTypeAtIndex(0)
+        pc_ull = pc.Cast(ull)
+
+        self.assertEqual(pc.GetValue(), pc_ull.GetValue())
+        self.assertEqual(pc.GetValueAsAddress(), pc_ull.GetValueAsAddress())
+        self.assertEqual(pc.GetValueAsSigned(), pc_ull.GetValueAsSigned())
+        self.assertEqual(pc.GetValueAsUnsigned(), pc_ull.GetValueAsUnsigned())
+
+    @skipIfXmlSupportMissing
+    @skipIfRemote
+    @skipIfLLVMTargetMissing("AArch64")
+    def test_little_endian_target(self):
+        self.do_endian_test(Endian.LITTLE)
+
+    @skipIfXmlSupportMissing
+    @skipIfRemote
+    @skipIfLLVMTargetMissing("SystemZ")
+    def test_big_endian_target(self):
+        self.do_endian_test(Endian.BIG)

JDevlieghere

LGTM

lldb/source/Core/Value.cpp

Fixes llvm#135707 Follow up to llvm#148836 which fixed some of this issue but not all of it. Our Value/ValueObject system does not store the endian directly in the values. Instead it assumes that the endian of the result of a cast can be assumed to be the target's endian, or the host but only as a fallback. It assumes the place it is copying from is also that endian. This breaks down when you have register values. These are always host endian and continue to be when cast. Casting them to big endian when on a little endian host breaks certain calls like GetValueAsUnsigned (oddly, others did still work, but I couldn't figure out why). To fix this, check the context of the value. If it has a register context, always treat it as host endian and make the result host endian. I had an alternative where I passed an "is_register" flag into all calls to this, but it felt like a layering violation and changed many more lines. This solution isn't much more robust, but it works for all the test cases I know of. Perhaps you can create a register value without a RegisterInfo backing it, but I don't know of a way myself. For testing, I had to add a minimal program file for each arch so that there is a type system to support the casting. This is generated from YAML since we only need the machine and endian to be set.

llvm-ci · 2025-08-13T14:12:12Z

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building lldb at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/22549

Here is the relevant piece of the build log for the reference

Step 6 (test) failure: build (failure)
...
PASS: lldb-shell :: Subprocess/clone-follow-child-softbp.test (1685 of 2307)
PASS: lldb-shell :: Settings/TestStopCommandSourceOnError.test (1686 of 2307)
PASS: lldb-shell :: Subprocess/clone-follow-child.test (1687 of 2307)
PASS: lldb-shell :: Subprocess/clone-follow-parent-softbp.test (1688 of 2307)
PASS: lldb-shell :: Settings/TestFrameFormatName.test (1689 of 2307)
PASS: lldb-shell :: Subprocess/clone-follow-parent.test (1690 of 2307)
PASS: lldb-shell :: Subprocess/fork-follow-child-softbp.test (1691 of 2307)
PASS: lldb-shell :: Subprocess/fork-follow-child.test (1692 of 2307)
PASS: lldb-shell :: Subprocess/fork-follow-parent-softbp.test (1693 of 2307)
UNRESOLVED: lldb-api :: functionalities/statusline/TestStatusline.py (1694 of 2307)
******************** TEST 'lldb-api :: functionalities/statusline/TestStatusline.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --cmake-build-type Release /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/statusline -p TestStatusline.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision b3396c5e96f366ba85daf5f6404f18eb8a467aea)
  clang revision b3396c5e96f366ba85daf5f6404f18eb8a467aea
  llvm revision b3396c5e96f366ba85daf5f6404f18eb8a467aea
Skipping the following test categories: ['libc++', 'msvcstl', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test (TestStatusline.TestStatusline)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_deadlock (TestStatusline.TestStatusline)
lldb-server exiting...
FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_modulelist_deadlock (TestStatusline.TestStatusline)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_no_color (TestStatusline.TestStatusline)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_no_target (TestStatusline.TestStatusline)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_resize (TestStatusline.TestStatusline)
======================================================================
ERROR: test_modulelist_deadlock (TestStatusline.TestStatusline)
   Regression test for a deadlock that occurs when the status line is enabled before connecting
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/statusline/TestStatusline.py", line 199, in test_modulelist_deadlock
    self.expect(
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbpexpect.py", line 94, in expect
    self.child.expect_exact(s)
  File "/usr/local/lib/python3.10/dist-packages/pexpect/spawnbase.py", line 432, in expect_exact
    return exp.expect_loop(timeout)
  File "/usr/local/lib/python3.10/dist-packages/pexpect/expect.py", line 181, in expect_loop
    return self.timeout(e)
  File "/usr/local/lib/python3.10/dist-packages/pexpect/expect.py", line 144, in timeout
    raise exc

DavidSpickett requested a review from JDevlieghere as a code owner July 22, 2025 12:50

llvmbot added the lldb label Jul 22, 2025

This was referenced Jul 22, 2025

[LLDB] s390x, incorrect byte order issues with Cast and p/x $pc #135707

Closed

[LLDB] MIPS, incorrect byte order issues with Cast and p/x #135840

Closed

DavidSpickett requested review from Michael137 and labath August 5, 2025 13:10

JDevlieghere approved these changes Aug 12, 2025

View reviewed changes

lldb/source/Core/Value.cpp Outdated Show resolved Hide resolved

DavidSpickett mentioned this pull request Aug 12, 2025

[lldb] Convert registers values into target endian for expressions #148836

Merged

DavidSpickett added 2 commits August 13, 2025 13:52

address comment

18b61cf

DavidSpickett force-pushed the lldb-s390x-cast branch from 680949f to 18b61cf Compare August 13, 2025 13:57

DavidSpickett changed the title ~~[lldb] account for registers being host endian when casting values~~ [lldb] Account for registers being host endian when casting values Aug 13, 2025

DavidSpickett merged commit b3396c5 into llvm:main Aug 13, 2025
7 of 9 checks passed

DavidSpickett deleted the lldb-s390x-cast branch August 13, 2025 13:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[lldb] Account for registers being host endian when casting values #150011

[lldb] Account for registers being host endian when casting values #150011

Uh oh!

DavidSpickett commented Jul 22, 2025

Uh oh!

DavidSpickett commented Jul 22, 2025

Uh oh!

llvmbot commented Jul 22, 2025

Uh oh!

JDevlieghere left a comment

Uh oh!

Uh oh!

Uh oh!

llvm-ci commented Aug 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[lldb] Account for registers being host endian when casting values #150011

[lldb] Account for registers being host endian when casting values #150011

Uh oh!

Conversation

DavidSpickett commented Jul 22, 2025

Uh oh!

DavidSpickett commented Jul 22, 2025

Uh oh!

llvmbot commented Jul 22, 2025

Uh oh!

JDevlieghere left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

llvm-ci commented Aug 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants