Commit 8ec4db5

UltimateForce21, JDevlieghere, and adrian-prantl authored
Stateful variable-location annotations in Disassembler::PrintInstructions() (follow-up to llvm#147460) (llvm#152887)
**Context**

Follow-up to [llvm#147460](llvm#147460), which added the ability to surface register-resident variable locations. This PR moves the annotation logic out of `Instruction::Dump()` and into `Disassembler::PrintInstructions()`, and adds lightweight state tracking so we only print changes at range starts and when variables go out of scope.

---

## What this does

While iterating the instructions for a function, we maintain a “live variable map” keyed by `lldb::user_id_t` (the `Variable`’s ID) to remember each variable’s last emitted location string. For each instruction:

- **New (or newly visible) variable** → print `name = <location>` once at the start of its DWARF location range, cache it.
- **Location changed** (e.g., DWARF range switched to a different register/const) → print the updated mapping.
- **Out of scope** (was tracked previously but not found for the current PC) → print `name = <undef>` and drop it.

This produces **concise, stateful annotations** that highlight variable lifetime transitions without spamming every line.

---

## Why in `PrintInstructions()`?

- Keeps `Instruction` stateless and avoids changing the `Instruction::Dump()` virtual API.
- Makes it straightforward to diff state across instructions (`prev → current`) inside the single driver loop.

---

## How it works (high-level)

1. For the current PC, get in-scope variables via `StackFrame::GetInScopeVariableList(/*get_parent=*/true)`.
2. For each `Variable`, query `DWARFExpressionList::GetExpressionEntryAtAddress(func_load_addr, current_pc)` (added in llvm#144238).
3. If the entry exists, call `DumpLocation(..., eDescriptionLevelBrief, abi)` to get a short, ABI-aware location string (e.g., `DW_OP_reg3 RBX → RBX`).
4. Compare against the last emitted location in the live map:
   - If not present → emit `name = <location>` and record it.
   - If different → emit updated mapping and record it.
5. After processing current in-scope variables, compute the set difference vs. the previous map and emit `name = <undef>` for any that disappeared.

Internally:

- We respect file↔load address translation already provided by `DWARFExpressionList`.
- We reuse the ABI to map LLVM register numbers to arch register names.

---

## Example output (x86_64, simplified)

```
->  0x55c6f5f6a140 <+0>:  cmpl   $0x2, %edi          ; argc = RDI, argv = RSI
    0x55c6f5f6a143 <+3>:  jl     0x55c6f5f6a176      ; <+54> at d_original_example.c:6:3
    0x55c6f5f6a145 <+5>:  pushq  %r15
    0x55c6f5f6a147 <+7>:  pushq  %r14
    0x55c6f5f6a149 <+9>:  pushq  %rbx
    0x55c6f5f6a14a <+10>: movq   %rsi, %rbx
    0x55c6f5f6a14d <+13>: movl   %edi, %r14d
    0x55c6f5f6a150 <+16>: movl   $0x1, %r15d         ; argc = R14
    0x55c6f5f6a156 <+22>: nopw   %cs:(%rax,%rax)     ; i = R15, argv = RBX
    0x55c6f5f6a160 <+32>: movq   (%rbx,%r15,8), %rdi
    0x55c6f5f6a164 <+36>: callq  0x55c6f5f6a030      ; symbol stub for: puts
    0x55c6f5f6a169 <+41>: incq   %r15
    0x55c6f5f6a16c <+44>: cmpq   %r15, %r14
    0x55c6f5f6a16f <+47>: jne    0x55c6f5f6a160      ; <+32> at d_original_example.c:5:10
    0x55c6f5f6a171 <+49>: popq   %rbx                ; i = <undef>
    0x55c6f5f6a172 <+50>: popq   %r14                ; argv = RSI
    0x55c6f5f6a174 <+52>: popq   %r15                ; argc = RDI
    0x55c6f5f6a176 <+54>: xorl   %eax, %eax
    0x55c6f5f6a178 <+56>: retq
```

Only transitions are shown: the start of a location, changes, and end-of-lifetime.

---

## Scope & limitations (by design)

- Handles **simple locations** first (registers, const-in-register cases surfaced by `DumpLocation`).
- **Memory/composite locations** are out of scope for this PR.
- Annotations appear **only at range boundaries** (start/change/end) to minimize noise.
- Output is **target-independent**; register names come from the target ABI.

## Implementation notes

- All annotation printing now happens in `Disassembler::PrintInstructions()`.
- Uses `std::unordered_map<lldb::user_id_t, std::string>` as the live map.
- No persistent state across calls; the map is rebuilt while walking instruction by instruction.
- **No changes** to the `Instruction` interface.

---

## Requested feedback

- Placement and wording of the `<undef>` marker.
- Whether we should optionally gate this behind a setting (currently always on when disassembling with an `ExecutionContext`).
- Preference for immediate inclusion of tests vs. follow-up patch.

---

Thanks for reviewing! Happy to adjust behavior/format based on feedback.

---------

Co-authored-by: Jonas Devlieghere <[email protected]>
Co-authored-by: Adrian Prantl <[email protected]>
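
The state tracking described under "How it works" can be sketched outside of LLDB. The following is a minimal, standalone illustration and not the code from this commit: `VarId`, `Visible`, and `AnnotateStep` are hypothetical names, and `std::map` stands in for the live map so that the output order is deterministic.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for lldb::user_id_t.
using VarId = unsigned long long;

// Per-variable state remembered between instructions.
struct VarState {
  std::string name;     // Display name.
  std::string last_loc; // Last location string that was printed.
};

// One variable that has a valid location at the current PC.
struct Visible {
  VarId id;
  std::string name;
  std::string loc;
};

// Updates `live` for one instruction and returns only the transitions:
// newly live variables, changed locations, and variables that disappeared.
std::vector<std::string> AnnotateStep(std::map<VarId, VarState> &live,
                                      const std::vector<Visible> &now) {
  std::vector<std::string> events;
  std::set<VarId> seen;

  for (const Visible &v : now) {
    seen.insert(v.id);
    auto it = live.find(v.id);
    if (it == live.end()) {
      // Newly live: print once and cache.
      live.emplace(v.id, VarState{v.name, v.loc});
      events.push_back(v.name + " = " + v.loc);
    } else if (it->second.last_loc != v.loc) {
      // Location changed: print the updated mapping.
      it->second.last_loc = v.loc;
      events.push_back(it->second.name + " = " + v.loc);
    }
  }

  // Anything tracked before but not seen now has gone out of scope.
  for (auto it = live.begin(); it != live.end();) {
    if (seen.count(it->first) == 0) {
      events.push_back(it->second.name + " = <undef>");
      it = live.erase(it); // erase() returns the next valid iterator.
    } else {
      ++it;
    }
  }
  return events;
}
```

Calling `AnnotateStep` once per instruction with the same `live` map yields an empty vector whenever nothing changed, which is what keeps the annotations limited to range boundaries.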
1 parent 6f2840d commit 8ec4db5

19 files changed: +3119 -95 lines changed

lldb/include/lldb/Core/Disassembler.h

Lines changed: 9 additions & 9 deletions
@@ -169,7 +169,7 @@ class Instruction {
 
   virtual bool IsAuthenticated() = 0;
 
-  bool CanSetBreakpoint ();
+  bool CanSetBreakpoint();
 
   virtual size_t Decode(const Disassembler &disassembler,
                         const DataExtractor &data,
@@ -282,7 +282,7 @@ std::function<bool(const Instruction::Operand &)> FetchImmOp(int64_t &imm);
 
 std::function<bool(const Instruction::Operand &)>
 MatchOpType(Instruction::Operand::Type type);
-}
+} // namespace OperandMatchers
 
 class InstructionList {
 public:
@@ -316,20 +316,19 @@ class InstructionList {
   /// @param[in] ignore_calls
   /// It true, then fine the first branch instruction that isn't
   /// a function call (a branch that calls and returns to the next
-  /// instruction). If false, find the instruction index of any
+  /// instruction). If false, find the instruction index of any
   /// branch in the list.
-  ///
+  ///
   /// @param[out] found_calls
-  /// If non-null, this will be set to true if any calls were found in
+  /// If non-null, this will be set to true if any calls were found in
   /// extending the range.
-  ///
+  ///
   /// @return
   /// The instruction index of the first branch that is at or past
-  /// \a start. Returns UINT32_MAX if no matching branches are
+  /// \a start. Returns UINT32_MAX if no matching branches are
   /// found.
   //------------------------------------------------------------------
-  uint32_t GetIndexOfNextBranchInstruction(uint32_t start,
-                                           bool ignore_calls,
+  uint32_t GetIndexOfNextBranchInstruction(uint32_t start, bool ignore_calls,
                                            bool *found_calls) const;
 
   uint32_t GetIndexOfInstructionAtLoadAddress(lldb::addr_t load_addr,
@@ -399,6 +398,7 @@ class Disassembler : public std::enable_shared_from_this<Disassembler>,
     eOptionMarkPCAddress =
         (1u << 3), // Mark the disassembly line the contains the PC
     eOptionShowControlFlowKind = (1u << 4),
+    eOptionVariableAnnotations = (1u << 5),
   };
 
   enum HexImmediateStyle {

lldb/include/lldb/Expression/DWARFExpression.h

Lines changed: 2 additions & 1 deletion
@@ -159,7 +159,8 @@ class DWARFExpression {
     return data.GetByteSize() > 0;
   }
 
-  void DumpLocation(Stream *s, lldb::DescriptionLevel level, ABI *abi) const;
+  void DumpLocation(Stream *s, lldb::DescriptionLevel level, ABI *abi,
+                    llvm::DIDumpOptions options = {}) const;
 
   bool MatchesOperand(StackFrame &frame, const Instruction::Operand &op) const;
 

lldb/source/Commands/CommandObjectDisassemble.cpp

Lines changed: 8 additions & 0 deletions
@@ -154,6 +154,10 @@ Status CommandObjectDisassemble::CommandOptions::SetOptionValue(
     }
   } break;
 
+  case 'v':
+    enable_variable_annotations = true;
+    break;
+
   case '\x01':
     force = true;
     break;
@@ -180,6 +184,7 @@ void CommandObjectDisassemble::CommandOptions::OptionParsingStarting(
   end_addr = LLDB_INVALID_ADDRESS;
   symbol_containing_addr = LLDB_INVALID_ADDRESS;
   raw = false;
+  enable_variable_annotations = false;
   plugin_name.clear();
 
   Target *target =
@@ -529,6 +534,9 @@ void CommandObjectDisassemble::DoExecute(Args &command,
   if (m_options.raw)
     options |= Disassembler::eOptionRawOuput;
 
+  if (m_options.enable_variable_annotations)
+    options |= Disassembler::eOptionVariableAnnotations;
+
   llvm::Expected<std::vector<AddressRange>> ranges =
       GetRangesForSelectedMode(result);
   if (!ranges) {
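
The path from the `-v` flag to the annotation code is a plain bitmask round trip: the command sets one bit, and the printer tests it. A minimal standalone sketch, assuming only the two bit values visible in this diff; `BuildOptions` and `ShouldAnnotate` are hypothetical helper names that do not exist in LLDB.

```cpp
#include <cstdint>

// Bit values copied from the Disassembler.h hunk in this commit.
enum Option : uint32_t {
  eOptionShowControlFlowKind = (1u << 4),
  eOptionVariableAnnotations = (1u << 5),
};

// Hypothetical helper: translate parsed command flags into an option mask.
uint32_t BuildOptions(bool show_control_flow_kind,
                      bool enable_variable_annotations) {
  uint32_t options = 0;
  if (show_control_flow_kind)
    options |= eOptionShowControlFlowKind;
  if (enable_variable_annotations)
    options |= eOptionVariableAnnotations;
  return options;
}

// Hypothetical helper: annotations need both the option bit and a target,
// mirroring the `(options & eOptionVariableAnnotations) && target_sp` check.
bool ShouldAnnotate(uint32_t options, bool has_target) {
  return (options & eOptionVariableAnnotations) != 0 && has_target;
}
```

Because `OptionParsingStarting()` resets `enable_variable_annotations` to `false`, the bit is set only for invocations that pass the flag.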

lldb/source/Commands/CommandObjectDisassemble.h

Lines changed: 1 addition & 0 deletions
@@ -78,6 +78,7 @@ class CommandObjectDisassemble : public CommandObjectParsed {
     // in SetOptionValue if anything the selects a location is set.
     lldb::addr_t symbol_containing_addr = 0;
     bool force = false;
+    bool enable_variable_annotations = false;
   };
 
   CommandObjectDisassemble(CommandInterpreter &interpreter);

lldb/source/Commands/Options.td

Lines changed: 2 additions & 0 deletions
@@ -366,6 +366,8 @@ let Command = "disassemble" in {
     Desc<"Disassemble function containing this address.">;
   def disassemble_options_force : Option<"force", "\\x01">, Groups<[2,3,4,5,7]>,
     Desc<"Force disassembly of large functions.">;
+  def disassemble_options_variable : Option<"variable", "v">,
+    Desc<"Enable variable disassembly annotations for this invocation.">;
 }
 
 let Command = "diagnostics dump" in {

lldb/source/Core/Disassembler.cpp

Lines changed: 167 additions & 8 deletions
@@ -26,7 +26,11 @@
 #include "lldb/Symbol/Function.h"
 #include "lldb/Symbol/Symbol.h"
 #include "lldb/Symbol/SymbolContext.h"
+#include "lldb/Symbol/Variable.h"
+#include "lldb/Symbol/VariableList.h"
+#include "lldb/Target/ABI.h"
 #include "lldb/Target/ExecutionContext.h"
+#include "lldb/Target/Process.h"
 #include "lldb/Target/SectionLoadList.h"
 #include "lldb/Target/StackFrame.h"
 #include "lldb/Target/Target.h"
@@ -41,6 +45,8 @@
 #include "lldb/lldb-private-enumerations.h"
 #include "lldb/lldb-private-interfaces.h"
 #include "lldb/lldb-private-types.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/TargetParser/Triple.h"
 
@@ -376,6 +382,147 @@ void Disassembler::PrintInstructions(Debugger &debugger, const ArchSpec &arch,
     }
   }
 
+  // Add variable location annotations to the disassembly output.
+  //
+  // For each instruction, this block attempts to resolve in-scope variables
+  // and determine if the current PC falls within their
+  // DWARF location entry. If so, it prints a simplified annotation using the
+  // variable name and its resolved location (e.g., "var = reg; " ).
+  //
+  // Annotations are only included if the variable has a valid DWARF location
+  // entry, and the location string is non-empty after filtering. Decoding
+  // errors and DWARF opcodes are intentionally omitted to keep the output
+  // concise and user-friendly.
+  //
+  // The goal is to give users helpful live variable hints alongside the
+  // disassembled instruction stream, similar to how debug information
+  // enhances source-level debugging.
+
+  struct VarState {
+    std::string name;     ///< Display name.
+    std::string last_loc; ///< Last printed location (empty means <undef>).
+    bool seen_this_inst = false;
+  };
+
+  // Track live variables across instructions.
+  llvm::DenseMap<lldb::user_id_t, VarState> live_vars;
+
+  // Stateful annotator: updates live_vars and returns only what should be
+  // printed for THIS instruction.
+  auto annotate_static = [&](Instruction &inst, Target &target,
+                             ModuleSP module_sp) -> std::vector<std::string> {
+    std::vector<std::string> events;
+
+    // Reset per-instruction seen flags.
+    for (auto &kv : live_vars)
+      kv.second.seen_this_inst = false;
+
+    const Address &iaddr = inst.GetAddress();
+    if (!module_sp) {
+      // Everything previously live becomes <undef>.
+      for (auto I = live_vars.begin(), E = live_vars.end(); I != E;) {
+        auto Cur = I++;
+        events.push_back(
+            llvm::formatv("{0} = <undef>", Cur->second.name).str());
+        live_vars.erase(Cur);
+      }
+      return events;
+    }
+
+    // Resolve innermost block at this *file* address.
+    SymbolContext sc;
+    const lldb::SymbolContextItem mask =
+        eSymbolContextFunction | eSymbolContextBlock;
+    if (!module_sp->ResolveSymbolContextForAddress(iaddr, mask, sc) ||
+        !sc.function) {
+      // No function context: everything dies here.
+      for (auto I = live_vars.begin(), E = live_vars.end(); I != E;) {
+        auto Cur = I++;
+        events.push_back(
+            llvm::formatv("{0} = <undef>", Cur->second.name).str());
+        live_vars.erase(Cur);
+      }
+      return events;
+    }
+
+    Block *B = sc.block; ///< Innermost block containing iaddr.
+    VariableList var_list;
+    if (B) {
+      auto filter = [](Variable *v) -> bool { return v && !v->IsArtificial(); };
+
+      B->AppendVariables(/*can_create*/ true,
+                         /*get_parent_variables*/ true,
+                         /*stop_if_block_is_inlined_function*/ false,
+                         /*filter*/ filter,
+                         /*variable_list*/ &var_list);
+    }
+
+    const lldb::addr_t pc_file = iaddr.GetFileAddress();
+    const lldb::addr_t func_file = sc.function->GetAddress().GetFileAddress();
+
+    // ABI from Target (pretty reg names if plugin exists). Safe to be null.
+    lldb::ProcessSP no_process;
+    lldb::ABISP abi_sp = ABI::FindPlugin(no_process, target.GetArchitecture());
+    ABI *abi = abi_sp.get();
+
+    llvm::DIDumpOptions opts;
+    opts.ShowAddresses = false;
+    if (abi)
+      opts.PrintRegisterOnly = true;
+
+    for (size_t i = 0, e = var_list.GetSize(); i != e; ++i) {
+      lldb::VariableSP v = var_list.GetVariableAtIndex(i);
+      if (!v || v->IsArtificial())
+        continue;
+
+      const char *nm = v->GetName().AsCString();
+      llvm::StringRef name = nm ? nm : "<anon>";
+
+      lldb_private::DWARFExpressionList &exprs = v->LocationExpressionList();
+      if (!exprs.IsValid())
+        continue;
+
+      auto entry_or_err = exprs.GetExpressionEntryAtAddress(func_file, pc_file);
+      if (!entry_or_err)
+        continue;
+
+      auto entry = *entry_or_err;
+
+      StreamString loc_ss;
+      entry.expr->DumpLocation(&loc_ss, eDescriptionLevelBrief, abi, opts);
+      llvm::StringRef loc = llvm::StringRef(loc_ss.GetString()).trim();
+      if (loc.empty())
+        continue;
+
+      auto ins = live_vars.insert(
+          {v->GetID(), VarState{name.str(), loc.str(), /*seen*/ true}});
+      if (ins.second) {
+        // Newly live.
+        events.push_back(llvm::formatv("{0} = {1}", name, loc).str());
+      } else {
+        VarState &vs = ins.first->second;
+        vs.seen_this_inst = true;
+        if (vs.last_loc != loc) {
+          vs.last_loc = loc.str();
+          events.push_back(llvm::formatv("{0} = {1}", vs.name, loc).str());
+        }
+      }
+    }
+
+    // Anything previously live that we didn't see a location for at this inst
+    // is now <undef>.
+    for (auto I = live_vars.begin(), E = live_vars.end(); I != E;) {
+      auto Cur = I++;
+      if (!Cur->second.seen_this_inst) {
+        events.push_back(
+            llvm::formatv("{0} = <undef>", Cur->second.name).str());
+        live_vars.erase(Cur);
+      }
+    }
+
+    return events;
+  };
+
   previous_symbol = nullptr;
   SourceLine previous_line;
   for (size_t i = 0; i < num_instructions_found; ++i) {
@@ -540,10 +687,26 @@ void Disassembler::PrintInstructions(Debugger &debugger, const ArchSpec &arch,
       const bool show_bytes = (options & eOptionShowBytes) != 0;
       const bool show_control_flow_kind =
           (options & eOptionShowControlFlowKind) != 0;
-      inst->Dump(&strm, max_opcode_byte_size, true, show_bytes,
+
+      StreamString inst_line;
+
+      inst->Dump(&inst_line, max_opcode_byte_size, true, show_bytes,
                  show_control_flow_kind, &exe_ctx, &sc, &prev_sc, nullptr,
                  address_text_size);
+
+      if ((options & eOptionVariableAnnotations) && target_sp) {
+        auto annotations = annotate_static(*inst, *target_sp, module_sp);
+        if (!annotations.empty()) {
+          const size_t annotation_column = 100;
+          inst_line.FillLastLineToColumn(annotation_column, ' ');
+          inst_line.PutCString("; ");
+          inst_line.PutCString(llvm::join(annotations, ", "));
+        }
+      }
+
+      strm.PutCString(inst_line.GetString());
       strm.EOL();
+
     } else {
       break;
     }
@@ -724,9 +887,7 @@ bool Instruction::DumpEmulation(const ArchSpec &arch) {
   return false;
 }
 
-bool Instruction::CanSetBreakpoint () {
-  return !HasDelaySlot();
-}
+bool Instruction::CanSetBreakpoint() { return !HasDelaySlot(); }
 
 bool Instruction::HasDelaySlot() {
   // Default is false.
@@ -1073,10 +1234,8 @@ void InstructionList::Append(lldb::InstructionSP &inst_sp) {
   m_instructions.push_back(inst_sp);
 }
 
-uint32_t
-InstructionList::GetIndexOfNextBranchInstruction(uint32_t start,
-                                                 bool ignore_calls,
-                                                 bool *found_calls) const {
+uint32_t InstructionList::GetIndexOfNextBranchInstruction(
+    uint32_t start, bool ignore_calls, bool *found_calls) const {
   size_t num_instructions = m_instructions.size();
 
   uint32_t next_branch = UINT32_MAX;
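
The second `PrintInstructions()` hunk buffers each instruction line, pads it to a fixed column, and appends the joined annotations after `"; "`. A minimal standalone sketch of that formatting step, assuming a single-line buffer; `AppendAnnotations` is a hypothetical name, and plain `std::string` operations stand in for `StreamString::FillLastLineToColumn()` and `llvm::join()`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: pad `line` with spaces up to `column` (if it is
// shorter), then append "; " followed by the comma-separated annotations.
// Lines without annotations are returned unchanged.
std::string AppendAnnotations(std::string line,
                              const std::vector<std::string> &annotations,
                              std::size_t column) {
  if (annotations.empty())
    return line;

  // Pad only when the line is shorter than the target column; longer lines
  // are left as they are and the annotation follows immediately.
  if (line.size() < column)
    line.append(column - line.size(), ' ');

  line += "; ";
  for (std::size_t i = 0; i < annotations.size(); ++i) {
    if (i != 0)
      line += ", ";
    line += annotations[i];
  }
  return line;
}
```

With the column of 100 used in the commit, most instruction lines are shorter than the column, so the annotations of successive lines start at the same position.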

lldb/source/Expression/DWARFExpression.cpp

Lines changed: 4 additions & 4 deletions
@@ -67,7 +67,8 @@ void DWARFExpression::UpdateValue(uint64_t const_value,
 }
 
 void DWARFExpression::DumpLocation(Stream *s, lldb::DescriptionLevel level,
-                                   ABI *abi) const {
+                                   ABI *abi,
+                                   llvm::DIDumpOptions options) const {
   auto *MCRegInfo = abi ? &abi->GetMCRegisterInfo() : nullptr;
   auto GetRegName = [&MCRegInfo](uint64_t DwarfRegNum,
                                  bool IsEH) -> llvm::StringRef {
@@ -79,10 +80,9 @@ void DWARFExpression::DumpLocation(Stream *s, lldb::DescriptionLevel level,
       return llvm::StringRef(RegName);
     return {};
   };
-  llvm::DIDumpOptions DumpOpts;
-  DumpOpts.GetNameForDWARFReg = GetRegName;
+  options.GetNameForDWARFReg = GetRegName;
   llvm::DWARFExpression E(m_data.GetAsLLVM(), m_data.GetAddressByteSize());
-  llvm::printDwarfExpression(&E, s->AsRawOstream(), options, nullptr);
+  llvm::printDwarfExpression(&E, s->AsRawOstream(), options, nullptr);
 }
 
 RegisterKind DWARFExpression::GetRegisterKind() const { return m_reg_kind; }
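
The `DumpLocation()` change takes the options struct by value with a `{}` default, so existing callers compile unchanged while new callers can preset fields, and the callee still installs its own register-name callback on its private copy. A minimal standalone sketch of that pattern; `DumpOptions` and `DumpRegister` are hypothetical stand-ins rather than the LLVM types, and the output format is invented for illustration.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for an options struct such as llvm::DIDumpOptions.
struct DumpOptions {
  bool print_register_only = false;
  std::function<std::string(unsigned)> reg_name; // Filled in by the callee.
};

// Takes the options by value: callers may omit them or preset fields, and
// the callee can overwrite the callback without touching the caller's copy.
std::string DumpRegister(unsigned dwarf_reg, DumpOptions options = {}) {
  // The callee always installs its own lookup (invented naming scheme).
  options.reg_name = [](unsigned r) { return "r" + std::to_string(r); };

  std::string name = options.reg_name(dwarf_reg);
  if (options.print_register_only)
    return name;
  return "DW_OP_reg" + std::to_string(dwarf_reg) + " " + name;
}
```

The by-value copy is what lets the annotator pass preset options while the callee keeps ownership of the callback field.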
