Skip to content

Conversation

@n2h9
Copy link
Contributor

@n2h9 n2h9 commented Nov 24, 2025

Description

Contribution to this topic Rich Disassembler for LLDB, this part.

The rich disassembler output should be exposed as structured data and made available through LLDB’s scripting API so more tooling could be built on top of this

This pr introduces new method AnnotateStructured in VariableAnnotator class, which returns the result as a vector of VariableAnnotation structured data, compared to original Annotate.

Additionally structured data is enhanced with information inferred from DWARFExpressionEntry and variable declaration data.

I have moved this part of functionality form a bigger pr #165163 to make it easier to review, deliver smaller chunk faster in an incremental way.

Testing

Run test with

./build/bin/lldb-dotest -v -p TestVariableAnnotationsDisassembler.py lldb/test/API/functionalities/disassembler-variables

all tests (9 existing) are passing.

screenshot 2025-11-24 screenshot
screenshot 2025-11-26 image

n2h9 added 2 commits November 24, 2025 21:02
…ctured data: introduce VariableAnnotator::AnnotateStructured method

Signed-off-by: Nikita B <[email protected]>
…ctured data: introduce VariableAnnotator::AnnotateStructured method: fix typo

Signed-off-by: Nikita B <[email protected]>
@n2h9 n2h9 requested a review from JDevlieghere as a code owner November 24, 2025 21:26
@llvmbot llvmbot added the lldb label Nov 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 24, 2025

@llvm/pr-subscribers-lldb

Author: None (n2h9)

Changes

Description

This pr introduces new method AnnotateStructured in VariableAnnotator class, which returns the result as a vector of VariableAnnotation structured data, compared to original Annotate.

Additionally structured data is enhanced with information inferred from DWARFExpressionEntry and variable declaration data.

I have moved this part of functionality form a bigger pr #165163 to make it easier to review, deliver smaller chunk faster in an incremental way.

Testing

Run test with

./build/bin/lldb-dotest -v -p TestVariableAnnotationsDisassembler.py lldb/test/API/functionalities/disassembler-variables

all tests (9 existing) are passing.

<details>
<summary>screenshot 2025-11-24</summary>

<img width="1344" height="875" alt="screenshot" src="https://github.com/user-attachments/assets/863e0fca-1e3e-43dc-bfa3-4b78ce287ae6" />

</details>


Full diff: https://github.com/llvm/llvm-project/pull/169408.diff

2 Files Affected:

  • (modified) lldb/include/lldb/Core/Disassembler.h (+23-7)
  • (modified) lldb/source/Core/Disassembler.cpp (+79-21)
diff --git a/lldb/include/lldb/Core/Disassembler.h b/lldb/include/lldb/Core/Disassembler.h
index 5de314109b0cc..a7922d4a4f703 100644
--- a/lldb/include/lldb/Core/Disassembler.h
+++ b/lldb/include/lldb/Core/Disassembler.h
@@ -570,24 +570,40 @@ class Disassembler : public std::enable_shared_from_this<Disassembler>,
   const Disassembler &operator=(const Disassembler &) = delete;
 };
 
+/// Structured data for a single variable annotation.
+struct VariableAnnotation {
+  std::string variable_name;
+  /// Location description (e.g., "r15", "undef", "const_0").
+  std::string location_description;
+  /// Whether variable is live at this instruction.
+  bool is_live;
+  /// Register numbering scheme for location interpretation.
+  lldb::RegisterKind register_kind;
+  /// Where this annotation is valid.
+  std::optional<lldb_private::AddressRange> address_range;
+  /// Source file where variable was declared.
+  std::optional<std::string> decl_file;
+  /// Line number where variable was declared.
+  std::optional<uint32_t> decl_line;
+  /// Variable's type name.
+  std::optional<std::string> type_name;
+};
+
 /// Tracks live variable annotations across instructions and produces
 /// per-instruction "events" like `name = RDI` or `name = <undef>`.
 class VariableAnnotator {
-  struct VarState {
-    /// Display name.
-    std::string name;
-    /// Last printed location (empty means <undef>).
-    std::string last_loc;
-  };
 
   // Live state from the previous instruction, keyed by Variable::GetID().
-  llvm::DenseMap<lldb::user_id_t, VarState> m_live_vars;
+  llvm::DenseMap<lldb::user_id_t, VariableAnnotation> m_live_vars;
 
 public:
   /// Compute annotation strings for a single instruction and update
   /// `m_live_vars`. Returns only the events that should be printed *at this
   /// instruction*.
   std::vector<std::string> Annotate(Instruction &inst);
+
+  /// Returns structured data for all variables relevant at this instruction.
+  std::vector<VariableAnnotation> AnnotateStructured(Instruction &inst);
 };
 
 } // namespace lldb_private
diff --git a/lldb/source/Core/Disassembler.cpp b/lldb/source/Core/Disassembler.cpp
index f2eb887986bfb..c695aa5d696b2 100644
--- a/lldb/source/Core/Disassembler.cpp
+++ b/lldb/source/Core/Disassembler.cpp
@@ -286,6 +286,10 @@ bool Disassembler::ElideMixedSourceAndDisassemblyLine(
   return false;
 }
 
+static constexpr const char *UndefLocation = "undef";
+static const std::string UndefLocationFormatted =
+    llvm::formatv("<{0}>", UndefLocation).str();
+
 // For each instruction, this block attempts to resolve in-scope variables
 // and determine if the current PC falls within their
 // DWARF location entry. If so, it prints a simplified annotation using the
@@ -300,16 +304,41 @@ bool Disassembler::ElideMixedSourceAndDisassemblyLine(
 // disassembled instruction stream, similar to how debug information
 // enhances source-level debugging.
 std::vector<std::string> VariableAnnotator::Annotate(Instruction &inst) {
+  auto structured_annotations = AnnotateStructured(inst);
+
   std::vector<std::string> events;
+  events.reserve(structured_annotations.size());
+
+  for (const auto &annotation : structured_annotations) {
+    auto location = (annotation.location_description == UndefLocation
+                         ? UndefLocationFormatted
+                         : annotation.location_description);
+
+    auto display_string =
+        llvm::formatv("{0} = {1}", annotation.variable_name, location).str();
+
+    events.push_back(std::move(display_string));
+  }
+
+  return events;
+}
+
+std::vector<VariableAnnotation>
+VariableAnnotator::AnnotateStructured(Instruction &inst) {
+  std::vector<VariableAnnotation> annotations;
 
   auto module_sp = inst.GetAddress().GetModule();
 
-  // If we lost module context, everything becomes <undef>.
+  // If we lost module context, mark all live variables as UndefLocation.
   if (!module_sp) {
-    for (const auto &KV : m_live_vars)
-      events.emplace_back(llvm::formatv("{0} = <undef>", KV.second.name).str());
+    for (const auto &KV : m_live_vars) {
+      auto annotation_entity = KV.second;
+      annotation_entity.is_live = false;
+      annotation_entity.location_description = UndefLocation;
+      annotations.push_back(annotation_entity);
+    }
     m_live_vars.clear();
-    return events;
+    return annotations;
   }
 
   // Resolve function/block at this *file* address.
@@ -319,10 +348,14 @@ std::vector<std::string> VariableAnnotator::Annotate(Instruction &inst) {
   if (!module_sp->ResolveSymbolContextForAddress(iaddr, mask, sc) ||
       !sc.function) {
     // No function context: everything dies here.
-    for (const auto &KV : m_live_vars)
-      events.emplace_back(llvm::formatv("{0} = <undef>", KV.second.name).str());
+    for (const auto &KV : m_live_vars) {
+      auto annotation_entity = KV.second;
+      annotation_entity.is_live = false;
+      annotation_entity.location_description = UndefLocation;
+      annotations.push_back(annotation_entity);
+    }
     m_live_vars.clear();
-    return events;
+    return annotations;
   }
 
   // Collect in-scope variables for this instruction into current_vars.
@@ -349,7 +382,7 @@ std::vector<std::string> VariableAnnotator::Annotate(Instruction &inst) {
   // Prefer "register-only" output when we have an ABI.
   opts.PrintRegisterOnly = static_cast<bool>(abi_sp);
 
-  llvm::DenseMap<lldb::user_id_t, VarState> current_vars;
+  llvm::DenseMap<lldb::user_id_t, VariableAnnotation> current_vars;
 
   for (size_t i = 0, e = var_list.GetSize(); i != e; ++i) {
     lldb::VariableSP v = var_list.GetVariableAtIndex(i);
@@ -376,8 +409,26 @@ std::vector<std::string> VariableAnnotator::Annotate(Instruction &inst) {
     if (loc.empty())
       continue;
 
-    current_vars.try_emplace(v->GetID(),
-                             VarState{std::string(name), std::string(loc)});
+    std::optional<std::string> decl_file;
+    std::optional<uint32_t> decl_line;
+    std::optional<std::string> type_name;
+
+    const Declaration &decl = v->GetDeclaration();
+    if (decl.GetFile()) {
+      decl_file = decl.GetFile().GetFilename().AsCString();
+      if (decl.GetLine() > 0)
+        decl_line = decl.GetLine();
+    }
+
+    if (Type *type = v->GetType())
+      if (const char *type_str = type->GetName().AsCString())
+        type_name = type_str;
+
+    current_vars.try_emplace(
+        v->GetID(),
+        VariableAnnotation{std::string(name), std::string(loc), true,
+                           entry.expr->GetRegisterKind(), entry.file_range,
+                           decl_file, decl_line, type_name});
   }
 
   // Diff m_live_vars → current_vars.
@@ -387,24 +438,31 @@ std::vector<std::string> VariableAnnotator::Annotate(Instruction &inst) {
     auto it = m_live_vars.find(KV.first);
     if (it == m_live_vars.end()) {
       // Newly live.
-      events.emplace_back(
-          llvm::formatv("{0} = {1}", KV.second.name, KV.second.last_loc).str());
-    } else if (it->second.last_loc != KV.second.last_loc) {
+      auto annotation_entity = KV.second;
+      annotation_entity.is_live = true;
+      annotations.push_back(annotation_entity);
+    } else if (it->second.location_description !=
+               KV.second.location_description) {
       // Location changed.
-      events.emplace_back(
-          llvm::formatv("{0} = {1}", KV.second.name, KV.second.last_loc).str());
+      auto annotation_entity = KV.second;
+      annotation_entity.is_live = true;
+      annotations.push_back(annotation_entity);
     }
   }
 
-  // 2) Ends: anything that was live but is not in current_vars becomes <undef>.
-  for (const auto &KV : m_live_vars) {
-    if (!current_vars.count(KV.first))
-      events.emplace_back(llvm::formatv("{0} = <undef>", KV.second.name).str());
-  }
+  // 2) Ends: anything that was live but is not in current_vars becomes
+  // UndefLocation.
+  for (const auto &KV : m_live_vars)
+    if (!current_vars.count(KV.first)) {
+      auto annotation_entity = KV.second;
+      annotation_entity.is_live = false;
+      annotation_entity.location_description = UndefLocation;
+      annotations.push_back(annotation_entity);
+    }
 
   // Commit new state.
   m_live_vars = std::move(current_vars);
-  return events;
+  return annotations;
 }
 
 void Disassembler::PrintInstructions(Debugger &debugger, const ArchSpec &arch,

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks good, just a few nits.

// disassembled instruction stream, similar to how debug information
// enhances source-level debugging.
std::vector<std::string> VariableAnnotator::Annotate(Instruction &inst) {
auto structured_annotations = AnnotateStructured(inst);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use auto when the type isn't obvious from the right-hand side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated ✅

std::vector<std::string> events;
events.reserve(structured_annotations.size());

for (const auto &annotation : structured_annotations) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I can't tell what type auto is here by reading the code. Applies to other parts of this PR as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated ✅

Here and in most other places within this pr.

@github-actions
Copy link

github-actions bot commented Nov 26, 2025

🐧 Linux x64 Test Results

  • 33176 tests passed
  • 491 tests skipped

n2h9 added 2 commits November 26, 2025 22:34
…ctured data: introduce VariableAnnotator::AnnotateStructured method: replace auto with actual type where type is not obvious; add const modifier to local variables which are read only

Signed-off-by: Nikita B <[email protected]>
…ctured data: introduce VariableAnnotator::AnnotateStructured method: add std::move when push back annotation entity into vector

Signed-off-by: Nikita B <[email protected]>
@n2h9
Copy link
Contributor Author

n2h9 commented Nov 26, 2025

Also were updated since last review (while doing auto replacement):

  • add const statement for local variables where they are supposed to be read only;
  • add move statement where push_back annotation_entity to annotations vector.

tests on the latest revision are passing (screenshot in the pr description);

@n2h9 n2h9 requested a review from JDevlieghere November 26, 2025 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants