Stateful variable-location annotations in Disassembler::PrintInstructions() (follow-up to #147460) #152887

Open
wants to merge 43 commits into
base: main
Changes from 42 commits
8ed8c54
[lldb] Add DWARFExpressionEntry and GetExpressionEntryAtAddress() to …
UltimateForce21 Jun 11, 2025
1db5002
Update lldb/include/lldb/Expression/DWARFExpressionList.h
UltimateForce21 Jun 19, 2025
a26010b
Update lldb/include/lldb/Expression/DWARFExpressionList.h
UltimateForce21 Jun 19, 2025
72237b7
Update lldb/source/Expression/DWARFExpressionList.cpp
UltimateForce21 Jun 19, 2025
94e4951
Update DWARFExpressionList.h
UltimateForce21 Jun 24, 2025
e8142da
Update DWARFExpressionList.cpp
UltimateForce21 Jun 24, 2025
7e8741e
Update DWARFExpressionList.h
UltimateForce21 Jun 28, 2025
c4cd77f
Update DWARFExpressionList.cpp
UltimateForce21 Jun 28, 2025
62c02a9
Change GetExpressionEntryAtAddress to return std::optional instead of…
UltimateForce21 Jun 29, 2025
d015971
Update DWARFExpressionList.cpp
UltimateForce21 Jul 2, 2025
60898ea
Add underflow/overflow checks to GetExpressionEntryAtAddressi
UltimateForce21 Jul 3, 2025
3462165
Make file_range optional in DWARFExpressionEntry for always-valid expr
UltimateForce21 Jul 8, 2025
2ed8443
Annotate Instruction::Dump() with DWARF variable locations
UltimateForce21 Jul 3, 2025
8c6b22d
Added Initial Basic API test for rich variable annotation in disassem…
UltimateForce21 Jul 5, 2025
842a9e5
Improved DWARF variable annotation printing and alignment
UltimateForce21 Jul 6, 2025
2fa6d24
Filter out partial DWARF decoding errors from disassembly annotations
UltimateForce21 Jul 6, 2025
6bbc8aa
Ignore annotations with only decoding errors
UltimateForce21 Jul 6, 2025
cbbc924
Add tests for disassembly variable annotations and decoding edge cases
UltimateForce21 Jul 6, 2025
b887db2
Rebase disassembler annotations branch onto updated DWARFExpressionEn…
UltimateForce21 Jul 8, 2025
912ba6d
Add `PrintRegisterOnly` flag in `struct DIDumpOptions` and created ne…
UltimateForce21 Jul 9, 2025
09c4d04
Add high-level comment explaining rich disassembly annotation logic i…
UltimateForce21 Jul 20, 2025
6e17f77
Add comment clarifying annotation column length check in Instruction:…
UltimateForce21 Jul 20, 2025
31431c0
Refactor variable annotation logic in `Instruction::Dump` using `anno…
UltimateForce21 Jul 20, 2025
9c5cb8f
Use range-based for loop for variable list iteration in Instruction::…
UltimateForce21 Jul 20, 2025
ca8510c
Consolidated DumpLocation and DumpLocationWithOptions using default D…
UltimateForce21 Jul 20, 2025
ffefe5f
Use `llvm::join` to simplify annotation output formatting
UltimateForce21 Jul 20, 2025
fae745a
Merge branch 'main' into add-disassembler-annotations
UltimateForce21 Aug 4, 2025
dcddf16
Fix formatting to match LLVM style
UltimateForce21 Aug 4, 2025
7bac074
More formatting fixes
UltimateForce21 Aug 4, 2025
79c0a9e
Fix formatting for code and tests
UltimateForce21 Aug 5, 2025
c7f1b30
Ported annotations from Instruction::Dump to Disassembler::PrintInstr…
UltimateForce21 Aug 7, 2025
3d19b02
Added `--rich` option for disassembler annotations and updated SBFram…
UltimateForce21 Aug 8, 2025
6ca4bb6
Formatting changes.
UltimateForce21 Aug 8, 2025
4bf584e
Merge branch 'main' into add-disassembler-annotations
UltimateForce21 Aug 8, 2025
b1f13e7
Redo Workflow tests
UltimateForce21 Aug 9, 2025
7b526fc
Updated from add-disassembler-annotations to include enable_annotatio…
UltimateForce21 Aug 9, 2025
10fddc4
Added basic stateful variable location annotations to disassembly output
UltimateForce21 Aug 9, 2025
b784868
Formatting changes.
UltimateForce21 Aug 10, 2025
cb0cd3a
Moved rich annotations flag into Disassembler options
UltimateForce21 Aug 11, 2025
fbd4e65
Switched to llvm::SmallDenseMap for live_vars in PrintInstructions
UltimateForce21 Aug 11, 2025
77fa1ed
Fixed code style to match LLVM convention
UltimateForce21 Aug 12, 2025
7069b6a
Formatting changes.
UltimateForce21 Aug 12, 2025
212a401
Switched rich annotations CLI flag from -R (--rich) to -v (--variable)
UltimateForce21 Aug 15, 2025
18 changes: 9 additions & 9 deletions lldb/include/lldb/Core/Disassembler.h
Expand Up @@ -169,7 +169,7 @@ class Instruction {

virtual bool IsAuthenticated() = 0;

bool CanSetBreakpoint ();
bool CanSetBreakpoint();

virtual size_t Decode(const Disassembler &disassembler,
const DataExtractor &data,
Expand Down Expand Up @@ -282,7 +282,7 @@ std::function<bool(const Instruction::Operand &)> FetchImmOp(int64_t &imm);

std::function<bool(const Instruction::Operand &)>
MatchOpType(Instruction::Operand::Type type);
}
} // namespace OperandMatchers

class InstructionList {
public:
Expand Down Expand Up @@ -316,20 +316,19 @@ class InstructionList {
/// @param[in] ignore_calls
/// If true, then find the first branch instruction that isn't
/// a function call (a branch that calls and returns to the next
/// instruction). If false, find the instruction index of any
/// instruction). If false, find the instruction index of any
/// branch in the list.
///
///
/// @param[out] found_calls
/// If non-null, this will be set to true if any calls were found in
/// If non-null, this will be set to true if any calls were found in
/// extending the range.
///
///
/// @return
/// The instruction index of the first branch that is at or past
/// \a start. Returns UINT32_MAX if no matching branches are
/// \a start. Returns UINT32_MAX if no matching branches are
/// found.
//------------------------------------------------------------------
uint32_t GetIndexOfNextBranchInstruction(uint32_t start,
bool ignore_calls,
uint32_t GetIndexOfNextBranchInstruction(uint32_t start, bool ignore_calls,
bool *found_calls) const;

uint32_t GetIndexOfInstructionAtLoadAddress(lldb::addr_t load_addr,
Expand Down Expand Up @@ -399,6 +398,7 @@ class Disassembler : public std::enable_shared_from_this<Disassembler>,
eOptionMarkPCAddress =
(1u << 3), // Mark the disassembly line that contains the PC
eOptionShowControlFlowKind = (1u << 4),
eOptionRichAnnotations = (1u << 5),
Review comment (Collaborator): eOptionVariableAnnotations

};

enum HexImmediateStyle {
Expand Down
3 changes: 2 additions & 1 deletion lldb/include/lldb/Expression/DWARFExpression.h
Expand Up @@ -159,7 +159,8 @@ class DWARFExpression {
return data.GetByteSize() > 0;
}

void DumpLocation(Stream *s, lldb::DescriptionLevel level, ABI *abi) const;
void DumpLocation(Stream *s, lldb::DescriptionLevel level, ABI *abi,
llvm::DIDumpOptions options = {}) const;

bool MatchesOperand(StackFrame &frame, const Instruction::Operand &op) const;

Expand Down
8 changes: 8 additions & 0 deletions lldb/source/Commands/CommandObjectDisassemble.cpp
Expand Up @@ -154,6 +154,10 @@ Status CommandObjectDisassemble::CommandOptions::SetOptionValue(
}
} break;

case 'R': //< --rich
enable_rich_annotations = true;
break;

case '\x01':
force = true;
break;
Expand All @@ -180,6 +184,7 @@ void CommandObjectDisassemble::CommandOptions::OptionParsingStarting(
end_addr = LLDB_INVALID_ADDRESS;
symbol_containing_addr = LLDB_INVALID_ADDRESS;
raw = false;
enable_rich_annotations = false;
plugin_name.clear();

Target *target =
Expand Down Expand Up @@ -528,6 +533,9 @@ void CommandObjectDisassemble::DoExecute(Args &command,
if (m_options.raw)
options |= Disassembler::eOptionRawOuput;

if (m_options.enable_rich_annotations)
options |= Disassembler::eOptionRichAnnotations;

llvm::Expected<std::vector<AddressRange>> ranges =
GetRangesForSelectedMode(result);
if (!ranges) {
Expand Down
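The flag plumbing in this file (a boolean set while parsing, then OR-ed into the disassembler option mask) can be sketched in isolation. This is a minimal illustration, not LLDB code: the `CommandOptions` struct, `BuildOptions`, and `HasOption` are hypothetical names, and only the two enumerator names and their bit positions are taken from the diff.

```cpp
#include <cstdint>

// Option bits. The two enumerators and their values appear in the diff;
// everything else in this sketch is a hypothetical stand-in.
enum Option : std::uint32_t {
  eOptionNone = 0u,
  eOptionShowControlFlowKind = (1u << 4),
  eOptionRichAnnotations = (1u << 5),
};

// Hypothetical, trimmed-down version of the parsed command-line state.
struct CommandOptions {
  bool show_control_flow_kind = false;
  bool enable_rich_annotations = false;
};

// Translate parsed booleans into a single bit mask.
std::uint32_t BuildOptions(const CommandOptions &opts) {
  std::uint32_t options = eOptionNone;
  if (opts.show_control_flow_kind)
    options |= eOptionShowControlFlowKind;
  if (opts.enable_rich_annotations)
    options |= eOptionRichAnnotations;
  return options;
}

// Test a single bit, as the printing code does before annotating.
bool HasOption(std::uint32_t options, Option bit) {
  return (options & bit) != 0;
}
```

Keeping each feature on its own bit means a new flag such as this one can be added without touching callers that pass an existing mask.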
1 change: 1 addition & 0 deletions lldb/source/Commands/CommandObjectDisassemble.h
Expand Up @@ -78,6 +78,7 @@ class CommandObjectDisassemble : public CommandObjectParsed {
// in SetOptionValue if anything that selects a location is set.
lldb::addr_t symbol_containing_addr = 0;
bool force = false;
bool enable_rich_annotations = false;
};

CommandObjectDisassemble(CommandInterpreter &interpreter);
Expand Down
2 changes: 2 additions & 0 deletions lldb/source/Commands/Options.td
Expand Up @@ -361,6 +361,8 @@ let Command = "disassemble" in {
Desc<"Disassemble function containing this address.">;
def disassemble_options_force : Option<"force", "\\x01">, Groups<[2,3,4,5,7]>,
Desc<"Force disassembly of large functions.">;
def disassemble_options_rich : Option<"rich", "R">,
Review comment (Collaborator), suggested change:
- def disassemble_options_rich : Option<"rich", "R">,
+ def disassemble_options_rich : Option<"variables", "v">,

Desc<"Enable rich disassembly annotations for this invocation.">;
}

let Command = "diagnostics dump" in {
Expand Down
173 changes: 165 additions & 8 deletions lldb/source/Core/Disassembler.cpp
Expand Up @@ -26,7 +26,10 @@
#include "lldb/Symbol/Function.h"
#include "lldb/Symbol/Symbol.h"
#include "lldb/Symbol/SymbolContext.h"
#include "lldb/Symbol/Variable.h"
#include "lldb/Symbol/VariableList.h"
#include "lldb/Target/ExecutionContext.h"
#include "lldb/Target/Process.h"
#include "lldb/Target/SectionLoadList.h"
#include "lldb/Target/StackFrame.h"
#include "lldb/Target/Target.h"
Expand All @@ -41,6 +44,7 @@
#include "lldb/lldb-private-enumerations.h"
#include "lldb/lldb-private-interfaces.h"
#include "lldb/lldb-private-types.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/Support/Compiler.h"
#include "llvm/TargetParser/Triple.h"

Expand Down Expand Up @@ -376,6 +380,147 @@ void Disassembler::PrintInstructions(Debugger &debugger, const ArchSpec &arch,
}
}

// Add rich variable location annotations to the disassembly output.
//
// For each instruction, this block attempts to resolve in-scope variables
// and determine if the current PC falls within their
// DWARF location entry. If so, it prints a simplified annotation using the
// variable name and its resolved location (e.g., "var = reg; " ).
//
// Annotations are only included if the variable has a valid DWARF location
// entry, and the location string is non-empty after filtering. Decoding
// errors and DWARF opcodes are intentionally omitted to keep the output
// concise and user-friendly.
//
// The goal is to give users helpful live variable hints alongside the
// disassembled instruction stream, similar to how debug information
// enhances source-level debugging.

struct VarState {
std::string name; //< Display name.
Review comment (Collaborator), suggested change:
- std::string name; //< Display name.
+ std::string name; ///< Display name.

std::string last_loc; //< Last printed location (empty means <undef>).
bool seen_this_inst = false;
};

// Track live variables across instructions (keyed by stable LLDB user_id_t).
// 8 is a good small-buffer guess.
llvm::SmallDenseMap<lldb::user_id_t, VarState, 8> live_vars;

// Stateful annotator: updates live_vars and returns only what should be
// printed for THIS instruction.
auto annotate_variables = [&](Instruction &inst) -> std::vector<std::string> {
std::vector<std::string> events;

StackFrame *frame = exe_ctx.GetFramePtr();
TargetSP target_sp = exe_ctx.GetTargetSP();
ProcessSP process_sp = exe_ctx.GetProcessSP();
if (!frame || !target_sp || !process_sp)
return events;

// Reset "seen" flags for this instruction.
for (auto &kv : live_vars)
kv.second.seen_this_inst = false;

addr_t current_pc = inst.GetAddress().GetLoadAddress(target_sp.get());
addr_t original_pc =
frame->GetFrameCodeAddress().GetLoadAddress(target_sp.get());

// We temporarily move the frame PC so variable locations resolve at this
// instruction.
if (!frame->ChangePC(current_pc))
return events;

VariableListSP var_list_sp = frame->GetInScopeVariableList(true);
if (!var_list_sp) {
// No variables in scope: everything previously live becomes <undef>.
for (auto I = live_vars.begin(), E = live_vars.end(); I != E;) {
auto Cur = I++;
events.push_back(
llvm::formatv("{0} = <undef>", Cur->second.name).str());
live_vars.erase(Cur);
}
frame->ChangePC(original_pc);
return events;
}

SymbolContext sc = frame->GetSymbolContext(eSymbolContextFunction);
addr_t func_load_addr =
sc.function ? sc.function->GetAddress().GetLoadAddress(target_sp.get())
: LLDB_INVALID_ADDRESS;

// Walk all in-scope variables and try to resolve a location.
for (const VariableSP &var_sp : *var_list_sp) {
if (!var_sp)
continue;

// The var_id is a lldb::user_id_t – stable key.
const auto var_id = var_sp->GetID();
const char *name_cstr = var_sp->GetName().AsCString();
llvm::StringRef name = name_cstr ? name_cstr : "<anon>";

auto &expr_list = var_sp->LocationExpressionList();
if (!expr_list.IsValid())
continue;

auto entry_or_err =
expr_list.GetExpressionEntryAtAddress(func_load_addr, current_pc);
if (!entry_or_err)
continue;

auto entry = *entry_or_err;

// Check range if present.
if (entry.file_range &&
!entry.file_range->ContainsFileAddress(
(current_pc - func_load_addr) + expr_list.GetFuncFileAddress()))
continue;

// Render a compact location string.
ABI *abi = process_sp->GetABI().get();
llvm::DIDumpOptions opts;
opts.ShowAddresses = false;
opts.PrintRegisterOnly = true;

StreamString loc_str;
entry.expr->DumpLocation(&loc_str, eDescriptionLevelBrief, abi, opts);
llvm::StringRef loc_clean = llvm::StringRef(loc_str.GetString()).trim();
if (loc_clean.empty())
continue;

auto insert_res =
live_vars.insert({var_id, VarState{std::string(name), loc_clean.str(),
/*seen_this_inst*/ true}});
if (insert_res.second) {
// Newly inserted → print.
events.push_back(llvm::formatv("{0} = {1}", name, loc_clean).str());
} else {
// Already present.
VarState &vs = insert_res.first->second;
vs.seen_this_inst = true;
if (vs.last_loc != loc_clean) {
vs.last_loc = loc_clean.str();
events.push_back(
llvm::formatv("{0} = {1}", vs.name, loc_clean).str());
}
}
}

// Anything previously live that we didn't see a location for at this inst
// is now <undef>.
for (auto I = live_vars.begin(), E = live_vars.end(); I != E;) {
auto Cur = I++;
if (!Cur->second.seen_this_inst) {
events.push_back(
llvm::formatv("{0} = <undef>", Cur->second.name).str());
live_vars.erase(Cur);
}
}

// Restore PC.
frame->ChangePC(original_pc);
return events;
};

previous_symbol = nullptr;
SourceLine previous_line;
for (size_t i = 0; i < num_instructions_found; ++i) {
Expand Down Expand Up @@ -540,10 +685,26 @@ void Disassembler::PrintInstructions(Debugger &debugger, const ArchSpec &arch,
const bool show_bytes = (options & eOptionShowBytes) != 0;
const bool show_control_flow_kind =
(options & eOptionShowControlFlowKind) != 0;
inst->Dump(&strm, max_opcode_byte_size, true, show_bytes,

StreamString inst_line;

inst->Dump(&inst_line, max_opcode_byte_size, true, show_bytes,
show_control_flow_kind, &exe_ctx, &sc, &prev_sc, nullptr,
address_text_size);

if (options & eOptionRichAnnotations) {
std::vector<std::string> annotations = annotate_variables(*inst);
if (!annotations.empty()) {
const size_t annotation_column = 100;
inst_line.FillLastLineToColumn(annotation_column, ' ');
inst_line.PutCString("; ");
inst_line.PutCString(llvm::join(annotations, ", "));
}
}

strm.PutCString(inst_line.GetString());
strm.EOL();

} else {
break;
}
Expand Down Expand Up @@ -724,9 +885,7 @@ bool Instruction::DumpEmulation(const ArchSpec &arch) {
return false;
}

bool Instruction::CanSetBreakpoint () {
return !HasDelaySlot();
}
bool Instruction::CanSetBreakpoint() { return !HasDelaySlot(); }

bool Instruction::HasDelaySlot() {
// Default is false.
Expand Down Expand Up @@ -1073,10 +1232,8 @@ void InstructionList::Append(lldb::InstructionSP &inst_sp) {
m_instructions.push_back(inst_sp);
}

uint32_t
InstructionList::GetIndexOfNextBranchInstruction(uint32_t start,
bool ignore_calls,
bool *found_calls) const {
uint32_t InstructionList::GetIndexOfNextBranchInstruction(
uint32_t start, bool ignore_calls, bool *found_calls) const {
size_t num_instructions = m_instructions.size();

uint32_t next_branch = UINT32_MAX;
Expand Down
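The stateful part of the Disassembler.cpp change (print a variable only when it first becomes live, when its location changes, or when it stops being live) can be tried outside LLDB. The following is a rough sketch under stated assumptions: `ResolvedVar`, `LiveVarTracker`, and `AnnotateLine` are hypothetical names, `std::map` stands in for `llvm::SmallDenseMap`, and the DWARF lookup is replaced by a pre-resolved list of variables per instruction. The column value 100 mirrors `annotation_column` in the diff.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a variable whose location resolved at the current
// instruction. In the PR this comes from LLDB's Variable and
// DWARFExpressionList, which are not modelled here.
struct ResolvedVar {
  std::uint64_t id;     // plays the role of lldb::user_id_t
  std::string name;     // display name
  std::string location; // compact location string, e.g. "RDI"
};

struct VarState {
  std::string name;
  std::string last_loc;
  bool seen_this_inst;
};

class LiveVarTracker {
public:
  // Feed the variables resolved at one instruction; returns only the events
  // worth printing for that instruction.
  std::vector<std::string> Step(const std::vector<ResolvedVar> &resolved) {
    std::vector<std::string> events;
    for (auto &kv : m_live)
      kv.second.seen_this_inst = false;

    for (const ResolvedVar &var : resolved) {
      auto res =
          m_live.emplace(var.id, VarState{var.name, var.location, true});
      VarState &state = res.first->second;
      state.seen_this_inst = true;
      // Print on first sight, or when the location differs from the last one.
      if (res.second || state.last_loc != var.location) {
        state.last_loc = var.location;
        events.push_back(state.name + " = " + var.location);
      }
    }

    // Anything that was live but was not seen now is reported as <undef>.
    for (auto it = m_live.begin(); it != m_live.end();) {
      if (!it->second.seen_this_inst) {
        events.push_back(it->second.name + " = <undef>");
        it = m_live.erase(it);
      } else {
        ++it;
      }
    }
    return events;
  }

private:
  // Assumption: std::map replaces llvm::SmallDenseMap to stay dependency-free.
  std::map<std::uint64_t, VarState> m_live;
};

// Pad the instruction text to a fixed column, then append "; a, b".
std::string AnnotateLine(std::string inst_text,
                         const std::vector<std::string> &events,
                         std::size_t column = 100) {
  if (events.empty())
    return inst_text;
  if (inst_text.size() < column)
    inst_text.append(column - inst_text.size(), ' ');
  inst_text += "; ";
  for (std::size_t i = 0; i < events.size(); ++i) {
    if (i != 0)
      inst_text += ", ";
    inst_text += events[i];
  }
  return inst_text;
}
```

Calling `Step` once per instruction and passing the result to `AnnotateLine` yields an annotation only on lines where something changed, which keeps repeated locations from cluttering every row of the listing.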
8 changes: 4 additions & 4 deletions lldb/source/Expression/DWARFExpression.cpp
Expand Up @@ -67,7 +67,8 @@ void DWARFExpression::UpdateValue(uint64_t const_value,
}

void DWARFExpression::DumpLocation(Stream *s, lldb::DescriptionLevel level,
ABI *abi) const {
ABI *abi,
llvm::DIDumpOptions options) const {
auto *MCRegInfo = abi ? &abi->GetMCRegisterInfo() : nullptr;
auto GetRegName = [&MCRegInfo](uint64_t DwarfRegNum,
bool IsEH) -> llvm::StringRef {
Expand All @@ -79,10 +80,9 @@ void DWARFExpression::DumpLocation(Stream *s, lldb::DescriptionLevel level,
return llvm::StringRef(RegName);
return {};
};
llvm::DIDumpOptions DumpOpts;
DumpOpts.GetNameForDWARFReg = GetRegName;
options.GetNameForDWARFReg = GetRegName;
llvm::DWARFExpression E(m_data.GetAsLLVM(), m_data.GetAddressByteSize());
llvm::printDwarfExpression(&E, s->AsRawOstream(), DumpOpts, nullptr);
llvm::printDwarfExpression(&E, s->AsRawOstream(), options, nullptr);
}

RegisterKind DWARFExpression::GetRegisterKind() const { return m_reg_kind; }
Expand Down
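The `DumpLocation` change folds two entry points into one by taking the options struct by value with a default of `{}` and then overwriting the register-name callback on that private copy. A minimal sketch of the same pattern follows. `DumpOptions` here is a hypothetical stand-in for `llvm::DIDumpOptions`, and the output format is invented purely for illustration, not what `llvm::printDwarfExpression` emits.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for llvm::DIDumpOptions.
struct DumpOptions {
  bool show_addresses = true;
  bool register_only = false;
  std::function<std::string(unsigned)> reg_name; // assumed callback shape
};

// Single entry point: callers that do not care pass nothing and get defaults.
std::string DumpLocation(unsigned dwarf_reg, DumpOptions options = {}) {
  // The callee installs its own resolver on its by-value copy, so the
  // caller's struct is never modified.
  options.reg_name = [](unsigned reg) { return "r" + std::to_string(reg); };

  std::string out = options.reg_name(dwarf_reg);
  if (!options.register_only)
    out = "DW_OP_reg" + std::to_string(dwarf_reg) + " " + out; // made-up format
  return out;
}
```

Passing by value costs one copy per call but lets existing call sites compile unchanged while new callers opt in to a compact form.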