Description
Background
Without debug information, developers cannot use `gdb`, `lldb`, or crash reporters to understand what their compiled code is doing. DWARF is the standard debug format on Linux (ELF) and macOS (Mach-O/dSYM). Emitting DWARF is a prerequisite for production compiler status.
Scope
Phase 1 — Line number tables (.debug_line)
The minimum useful debug info: map each machine instruction back to a source file and line number. Enables:
- `gdb` step-by-step debugging
- Crash stack traces with file:line
The LLVM IR already carries source location metadata (!dbg nodes) when provided by the frontend. This phase threads those locations through the codegen pipeline into DWARF line number program opcodes.
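As a rough illustration of what "line number program opcodes" means here, the sketch below encodes a single row advance (address delta plus line delta) using a DWARF special opcode when it fits, and falls back to standard opcodes otherwise. The header parameters (`LINE_BASE`, `LINE_RANGE`, `OPCODE_BASE`) are assumed values, a minimum instruction length of 1 is assumed, and the function names are hypothetical; none of this is taken from the project's codebase.

```python
# Standard line-number opcodes defined by the DWARF specification.
DW_LNS_copy = 0x01
DW_LNS_advance_pc = 0x02
DW_LNS_advance_line = 0x03

# Line program header parameters. These specific values are an assumption;
# the emitter is free to choose others as long as the header declares them.
LINE_BASE = -5
LINE_RANGE = 14
OPCODE_BASE = 13


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 requires a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's >> is an arithmetic shift for negative ints
        sign_bit = bool(byte & 0x40)
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def encode_row(addr_delta: int, line_delta: int) -> bytes:
    """Advance address and line by the given deltas, then append a row."""
    if LINE_BASE <= line_delta < LINE_BASE + LINE_RANGE:
        # Special opcode formula from the DWARF specification.
        opcode = (line_delta - LINE_BASE) + LINE_RANGE * addr_delta + OPCODE_BASE
        if opcode <= 255:
            return bytes([opcode])
    # Fallback: explicit advance_line, advance_pc, then copy.
    return (
        bytes([DW_LNS_advance_line]) + sleb128(line_delta)
        + bytes([DW_LNS_advance_pc]) + uleb128(addr_delta)
        + bytes([DW_LNS_copy])
    )
```

In this sketch, codegen would call `encode_row` once per machine instruction whose `!dbg` location differs from the previous one, passing the byte distance and line difference since the last row.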
Phase 2 — Type and variable information (.debug_info + .debug_abbrev)
Emit DIEs (Debug Information Entries) for:
- Compilation unit (`DW_TAG_compile_unit`)
- Subprograms/functions (`DW_TAG_subprogram`) with parameter and return type
- Basic types (`DW_TAG_base_type`): i8/i16/i32/i64/f32/f64
- Local variables (`DW_TAG_variable`) with location expressions
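To make the split between `.debug_abbrev` and `.debug_info` concrete, here is a minimal sketch of one abbreviation declaration and the matching DIE bytes for a base type. The `DW_*` constants are the values defined by the DWARF specification; the abbreviation code, the choice of forms (`DW_FORM_string`, `DW_FORM_data1`), the function names, and the mapping of `iN` to `DW_ATE_signed` are assumptions made for illustration, since signedness is really a frontend decision.

```python
# Constants from the DWARF specification.
DW_TAG_base_type = 0x24
DW_CHILDREN_no = 0x00
DW_AT_name = 0x03
DW_AT_byte_size = 0x0B
DW_AT_encoding = 0x3E
DW_FORM_string = 0x08   # inline NUL-terminated string
DW_FORM_data1 = 0x0B    # one-byte constant
DW_ATE_float = 0x04
DW_ATE_signed = 0x05

# Hypothetical abbreviation code reserved for base types in this sketch.
BASE_TYPE_ABBREV_CODE = 1

# Assumed mapping from IR type names to (encoding, byte size).
BASE_TYPES = {
    "i8": (DW_ATE_signed, 1),
    "i16": (DW_ATE_signed, 2),
    "i32": (DW_ATE_signed, 4),
    "i64": (DW_ATE_signed, 8),
    "f32": (DW_ATE_float, 4),
    "f64": (DW_ATE_float, 8),
}


def base_type_abbrev() -> bytes:
    """One .debug_abbrev declaration describing the shape of a base-type DIE.

    Every value here is below 128, so each ULEB128 field is a single byte.
    The full abbreviation table still needs a terminating 0 byte after the
    last declaration; that is left to the caller.
    """
    return bytes([
        BASE_TYPE_ABBREV_CODE, DW_TAG_base_type, DW_CHILDREN_no,
        DW_AT_name, DW_FORM_string,
        DW_AT_encoding, DW_FORM_data1,
        DW_AT_byte_size, DW_FORM_data1,
        0, 0,  # end of attribute list
    ])


def base_type_die(name: str) -> bytes:
    """The .debug_info bytes for one base-type DIE using the abbrev above."""
    encoding, byte_size = BASE_TYPES[name]
    return (
        bytes([BASE_TYPE_ABBREV_CODE])
        + name.encode("ascii") + b"\x00"
        + bytes([encoding, byte_size])
    )
```

The DIE itself carries only the abbreviation code and the attribute values; the tag and the attribute/form pairs live in the abbreviation table, which is why the two sections have to be emitted together.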
Phase 3 — Frame information (.debug_frame / .eh_frame)
Required for unwinding (exception handling, stack traces):
- CFI directives (Call Frame Information) describing how to unwind the stack at each instruction
- On Linux: `.eh_frame` section for libgcc/libunwind
- On macOS: Compact Unwind information in `__TEXT,__unwind_info`
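For a sense of what the CFI bytes look like, the sketch below encodes the call frame instructions for a conventional x86-64 prologue (`push rbp` followed by `mov rbp, rsp`). The target, the prologue shape, the instruction lengths (1 and 3 bytes), the data alignment factor of -8, and all function names are assumptions for illustration; the opcode values and the System V x86-64 DWARF register numbers are the standard ones. Operands are restricted to values below 128 so each ULEB128 fits in one byte.

```python
# Call frame instruction opcodes from the DWARF specification.
DW_CFA_advance_loc = 0x40       # high two bits; low six bits hold the delta
DW_CFA_offset = 0x80            # high two bits; low six bits hold the register
DW_CFA_def_cfa = 0x0C
DW_CFA_def_cfa_register = 0x0D
DW_CFA_def_cfa_offset = 0x0E

# DWARF register numbers for x86-64 (System V ABI).
RBP, RSP, RETURN_ADDRESS = 6, 7, 16

# Data alignment factor assumed to be declared in the CIE.
DATA_ALIGN = -8


def _small(value: int) -> int:
    """Guard: this sketch only handles single-byte ULEB128 operands."""
    if not 0 <= value < 0x80:
        raise ValueError(f"operand {value} needs multi-byte ULEB128")
    return value


def advance_loc(delta: int) -> bytes:
    if not 0 <= delta < 0x40:
        raise ValueError("delta does not fit DW_CFA_advance_loc")
    return bytes([DW_CFA_advance_loc | delta])


def def_cfa(reg: int, offset: int) -> bytes:
    return bytes([DW_CFA_def_cfa, _small(reg), _small(offset)])


def def_cfa_offset(offset: int) -> bytes:
    return bytes([DW_CFA_def_cfa_offset, _small(offset)])


def def_cfa_register(reg: int) -> bytes:
    return bytes([DW_CFA_def_cfa_register, _small(reg)])


def offset(reg: int, cfa_offset: int) -> bytes:
    """Register `reg` is saved at CFA + cfa_offset (factored by DATA_ALIGN)."""
    factored, remainder = divmod(cfa_offset, DATA_ALIGN)
    if remainder != 0 or not 0 <= reg < 0x40:
        raise ValueError("offset or register not encodable")
    return bytes([DW_CFA_offset | reg, _small(factored)])


def cie_initial_instructions() -> bytes:
    # On function entry: CFA = rsp + 8, return address stored at CFA - 8.
    return def_cfa(RSP, 8) + offset(RETURN_ADDRESS, -8)


def fde_prologue_instructions() -> bytes:
    return (
        advance_loc(1)            # after `push rbp` (1 byte)
        + def_cfa_offset(16)      # CFA = rsp + 16
        + offset(RBP, -16)        # caller's rbp saved at CFA - 16
        + advance_loc(3)          # after `mov rbp, rsp` (3 bytes)
        + def_cfa_register(RBP)   # CFA = rbp + 16
    )
```

A real emitter would wrap these byte strings in CIE and FDE records with lengths, augmentation data, and padding, which this sketch leaves out.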
IR changes needed
The parser must accept and preserve LLVM debug metadata:
`!0 = !DILocation(line: 10, column: 5, scope: !1)`
The printer must emit it back. The codegen must attach `!dbg` locations to machine instructions.
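The parse-and-print round trip can be sketched as follows. This handles only the exact form shown above (`line`, `column`, `scope` in that order); real LLVM IR also allows other fields such as `inlinedAt`, which a full parser would need to accept. The `DILocation` dataclass and both function names are hypothetical and not the project's actual API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DILocation:
    node_id: int   # the N in "!N = ..."
    line: int
    column: int
    scope: int     # metadata node id of the enclosing scope

_DILOCATION_RE = re.compile(
    r"^!(\d+)\s*=\s*!DILocation\(\s*"
    r"line:\s*(\d+)\s*,\s*"
    r"column:\s*(\d+)\s*,\s*"
    r"scope:\s*!(\d+)\s*\)$"
)


def parse_dilocation(text: str) -> DILocation:
    """Parse one metadata definition line into a DILocation."""
    match = _DILOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a supported DILocation line: {text!r}")
    node_id, line, column, scope = (int(group) for group in match.groups())
    return DILocation(node_id, line, column, scope)


def print_dilocation(loc: DILocation) -> str:
    """Print a DILocation in the same textual form the parser accepts."""
    return (
        f"!{loc.node_id} = !DILocation("
        f"line: {loc.line}, column: {loc.column}, scope: !{loc.scope})"
    )
```

A round-trip check such as `print_dilocation(parse_dilocation(src)) == src` is a cheap regression test for the "parser must preserve, printer must emit" requirement.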
Acceptance criteria
- `.debug_line` section emitted for any module that carries `!DILocation` metadata
- `dwarfdump our.o` shows correct file/line mappings
- `gdb` can set a breakpoint by function name on a compiled object
- Phase 2 (type DIEs) implemented for base integer and float types
- All DWARF output validated with `llvm-dwarfdump --verify`