Commit 9409a84

Cranelift: add debug tag infrastructure.
This PR adds *debug tags*, a kind of metadata that can attach to CLIF instructions and be lowered to VCode instructions and as metadata on the produced compiled code. It also adds opaque descriptor blobs carried with stackslots. Together, these two features allow decorating IR with first-class debug instrumentation that is properly preserved by the compiler, including across optimizations and inlining. (Wasmtime's use of these features will come in followup PRs.)

The key idea of a "debug tag" is to allow the Cranelift embedder to express whatever information it needs to, in a format that is opaque to Cranelift itself, except for the parts that need translation during lowering. In particular, the `DebugTag::StackSlot` variant gets translated to a physical offset into the stackframe in the compiled metadata output. So, for example, the embedder can emit a tag referring to a stackslot, and another describing an offset in that stackslot.

The debug tags exist as a *sequence* on any given instruction; the meaning of the sequence is known only to the embedder, *except* that during inlining, the tags for the inlining call instruction are prepended to the tags of inlined instructions. In this way, a canonical use-case of tags as describing original source-language frames can preserve the source-language view even when multiple functions are inlined into one.

The descriptor on a stackslot may look a little odd at first, but its purpose is to allow serializing some description of stackslot-contained runtime user-program data, in a way that is firmly attached to the stackslot. In particular, in the face of inlining, this descriptor is copied into the inlining (parent) function from the inlined function when the stackslot entity is copied; no other metadata outside Cranelift needs to track the identity of stackslots and know about that motion. This fits nicely with the ability of tags to refer to stackslots; together, the embedder can annotate instructions as having certain state in stackslots, and describe the format of that state per stackslot.

This infrastructure is tested with some compile-tests now; testing of the interpretation of the metadata output will come with end-to-end debug instrumentation tests in a followup PR.
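
The prepend rule described above composes across nested inlining. The following is a minimal sketch of that composition, not Cranelift code: tags are modeled as plain `u32` values (standing in for `DebugTag::User` payloads), and the "function id, then program counter" frame layout is a hypothetical embedder convention chosen only for illustration.

```rust
/// Toy stand-in for a tag; in Cranelift this would be a `DebugTag`.
type Tag = u32;

/// Tags an inlined instruction ends up with: the call site's tags first,
/// followed by the instruction's own tags.
fn tags_after_inlining(call_site: &[Tag], inlined: &[Tag]) -> Vec<Tag> {
    call_site.iter().chain(inlined.iter()).copied().collect()
}

fn main() {
    // Hypothetical convention: each virtual frame is `[func_id, pc]`.
    let inst_in_h = [3, 7]; // an instruction inside `h`
    let call_h_in_g = [2, 40]; // the call to `h` inside `g`
    let call_g_in_f = [1, 12]; // the call to `g` inside `f`

    // Inline `h` into `g`, then the result into `f`.
    let in_g = tags_after_inlining(&call_h_in_g, &inst_in_h);
    let in_f = tags_after_inlining(&call_g_in_f, &in_g);

    // Outermost frame first, innermost frame last.
    println!("{in_f:?}");
}
```

Under this assumed convention the embedder can recover the source-level call stack by reading the sequence in pairs, outermost frame first.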
1 parent fde5b75 commit 9409a84

33 files changed

+910
-124
lines changed

33 files changed

+910
-124
lines changed

cranelift/codegen/src/cursor.rs

Lines changed: 41 additions & 1 deletion
@@ -2,7 +2,11 @@
 //!
 //! This module defines cursor data types that can be used for inserting instructions.
 
-use crate::ir;
+use crate::{
+    inst_predicates::has_lowering_side_effect,
+    ir::{self},
+};
+use alloc::vec::Vec;
 
 /// The possible positions of a cursor.
 #[derive(Clone, Copy, PartialEq, Eq, Debug)]
@@ -34,6 +38,9 @@ pub trait Cursor {
     /// Set the source location that should be assigned to new instructions.
     fn set_srcloc(&mut self, srcloc: ir::SourceLoc);
 
+    /// Set the debug tags that should be assigned to new side-effecting instructions.
+    fn set_debug_tags(&mut self, tags: Vec<ir::DebugTag>);
+
     /// Borrow a reference to the function layout that this cursor is navigating.
     fn layout(&self) -> &ir::Layout;
 
@@ -61,6 +68,30 @@ pub trait Cursor {
         self
     }
 
+    /// Exchange this cursor for one with a set debug tag list.
+    ///
+    /// These tags will be attached to all newly inserted
+    /// side-effecting instructions.
+    ///
+    /// This is intended to be used as a builder method:
+    ///
+    /// ```
+    /// # use cranelift_codegen::ir::{Function, Block, DebugTag};
+    /// # use cranelift_codegen::cursor::{Cursor, FuncCursor};
+    /// fn edit_func(func: &mut Function, tags: Vec<DebugTag>) {
+    ///     let mut pos = FuncCursor::new(func).with_debug_tags(tags);
+    ///
+    ///     // Use `pos`...
+    /// }
+    /// ```
+    fn with_debug_tags(mut self, tags: Vec<ir::DebugTag>) -> Self
+    where
+        Self: Sized,
+    {
+        self.set_debug_tags(tags);
+        self
+    }
+
     /// Rebuild this cursor positioned at `pos`.
     fn at_position(mut self, pos: CursorPosition) -> Self
     where
@@ -617,6 +648,7 @@ pub trait Cursor {
 pub struct FuncCursor<'f> {
     pos: CursorPosition,
     srcloc: ir::SourceLoc,
+    debug_tags: Vec<ir::DebugTag>,
 
     /// The referenced function.
     pub func: &'f mut ir::Function,
@@ -628,6 +660,7 @@ impl<'f> FuncCursor<'f> {
         Self {
             pos: CursorPosition::Nowhere,
             srcloc: Default::default(),
+            debug_tags: vec![],
             func,
         }
     }
@@ -661,6 +694,10 @@ impl<'f> Cursor for FuncCursor<'f> {
         self.srcloc = srcloc;
     }
 
+    fn set_debug_tags(&mut self, tags: Vec<ir::DebugTag>) {
+        self.debug_tags = tags;
+    }
+
     fn layout(&self) -> &ir::Layout {
         &self.func.layout
     }
@@ -684,6 +721,9 @@ impl<'c, 'f> ir::InstInserterBase<'c> for &'c mut FuncCursor<'f> {
         if !self.srcloc.is_default() {
             self.func.set_srcloc(inst, self.srcloc);
         }
+        if has_lowering_side_effect(self.func, inst) && !self.debug_tags.is_empty() {
+            self.func.debug_tags.set(inst, self.debug_tags.clone());
+        }
         &mut self.func.dfg
     }
 }
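
The last hunk in this file attaches the cursor's tag list only to instructions for which `has_lowering_side_effect` returns true. The sketch below models that rule in isolation. `ToyCursor`, `ToyInst`, and the explicit `side_effect` flag are hypothetical stand-ins (the real check inspects the instruction itself), and tags are plain `u32` values.

```rust
/// Toy instruction record used only for this illustration.
struct ToyInst {
    name: &'static str,
    tags: Vec<u32>,
}

/// Toy cursor holding the tag list to apply to newly inserted instructions.
struct ToyCursor {
    debug_tags: Vec<u32>,
    insts: Vec<ToyInst>,
}

impl ToyCursor {
    fn new() -> Self {
        Self { debug_tags: Vec::new(), insts: Vec::new() }
    }

    /// Builder-style setter, mirroring the shape of `with_debug_tags`.
    fn with_debug_tags(mut self, tags: Vec<u32>) -> Self {
        self.debug_tags = tags;
        self
    }

    /// Insert an instruction; `side_effect` stands in for the result of
    /// `has_lowering_side_effect` (an assumption of this sketch).
    fn insert(&mut self, name: &'static str, side_effect: bool) {
        let tags = if side_effect { self.debug_tags.clone() } else { Vec::new() };
        self.insts.push(ToyInst { name, tags });
    }
}

fn main() {
    let mut pos = ToyCursor::new().with_debug_tags(vec![1, 12]);
    pos.insert("iadd", false); // pure: left untagged
    pos.insert("store", true); // side-effecting: receives the cursor's tags
    for inst in &pos.insts {
        println!("{} -> {:?}", inst.name, inst.tags);
    }
}
```

Restricting tags to side-effecting instructions matches the module documentation's point that tags are only meaningful on instructions that are guaranteed to be preserved.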

cranelift/codegen/src/inline.rs

Lines changed: 28 additions & 1 deletion
@@ -20,7 +20,7 @@
 //! Cranelift the body of the callee that is to be inlined.
 
 use crate::cursor::{Cursor as _, FuncCursor};
-use crate::ir::{self, ExceptionTableData, ExceptionTableItem, InstBuilder as _};
+use crate::ir::{self, DebugTag, ExceptionTableData, ExceptionTableItem, InstBuilder as _};
 use crate::result::CodegenResult;
 use crate::trace;
 use crate::traversals::Dfs;
@@ -366,6 +366,10 @@ fn inline_one(
     // callee.
     let mut last_inlined_block = inline_block_layout(func, call_block, callee, &entity_map);
 
+    // Get a copy of debug tags on the call instruction; these are
+    // prepended to debug tags on inlined instructions.
+    let call_debug_tags = func.debug_tags.get(call_inst).to_vec();
+
     // Translate each instruction from the callee into the caller,
     // appending them to their associated block in the caller.
     //
@@ -403,6 +407,29 @@ fn inline_one(
             let inlined_inst = func.dfg.make_inst(inlined_inst_data);
             func.layout.append_inst(inlined_inst, inlined_block);
 
+            // Copy over debug tags, translating referenced entities
+            // as appropriate.
+            let debug_tags = callee.debug_tags.get(callee_inst);
+            // If there are tags on the inlined instruction, we always
+            // add tags, and we prepend any tags from the call
+            // instruction; but we don't add tags if only the callsite
+            // had them (this would otherwise mean that every single
+            // instruction in an inlined function body would get
+            // tags).
+            if !debug_tags.is_empty() {
+                let tags = call_debug_tags
+                    .iter()
+                    .cloned()
+                    .chain(debug_tags.iter().map(|tag| match *tag {
+                        DebugTag::User(value) => DebugTag::User(value),
+                        DebugTag::StackSlot(slot) => {
+                            DebugTag::StackSlot(entity_map.inlined_stack_slot(slot))
+                        }
+                    }))
+                    .collect::<SmallVec<[_; 4]>>();
+                func.debug_tags.set(inlined_inst, tags);
+            }
+
             let opcode = callee.dfg.insts[callee_inst].opcode();
             if opcode.is_return() {
                 // Instructions that return do not define any values, so we
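
The per-instruction logic in the hunk above has two parts: untagged callee instructions stay untagged even when the call site carries tags, and `StackSlot` tags are renumbered into the caller's slot space. A standalone sketch of both follows. The `Tag` enum, the `u32` slot indices, and the `HashMap` used in place of `entity_map.inlined_stack_slot` are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Toy stand-in for `DebugTag`, with stack slots as plain indices.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tag {
    User(u32),
    StackSlot(u32),
}

/// Compute the tags for one inlined instruction, or `None` if it gets none.
///
/// `slot_map` maps callee slot indices to caller slot indices; it is a
/// hypothetical replacement for the inliner's entity map. Indexing panics
/// if a referenced slot is missing from the map.
fn inlined_tags(
    call_site: &[Tag],
    callee_inst: &[Tag],
    slot_map: &HashMap<u32, u32>,
) -> Option<Vec<Tag>> {
    // Tags on the call site alone do not tag every inlined instruction.
    if callee_inst.is_empty() {
        return None;
    }
    let translated = callee_inst.iter().map(|tag| match *tag {
        Tag::User(value) => Tag::User(value),
        Tag::StackSlot(slot) => Tag::StackSlot(slot_map[&slot]),
    });
    Some(call_site.iter().copied().chain(translated).collect())
}

fn main() {
    // Callee slot 0 was copied into the caller as slot 5.
    let slot_map = HashMap::from([(0, 5)]);
    let call_site = [Tag::User(1)];

    println!("{:?}", inlined_tags(&call_site, &[], &slot_map));
    println!(
        "{:?}",
        inlined_tags(&call_site, &[Tag::User(9), Tag::StackSlot(0)], &slot_map)
    );
}
```

Only the callee's tags are remapped; the call site's tags already refer to entities in the caller, so they are copied through unchanged.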
Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+//! Debug tag storage.
+//!
+//! Cranelift permits the embedder to place "debug tags" on
+//! instructions in CLIF. These tags are sequences of items of various
+//! kinds, with no other meaning imposed by Cranelift. They are passed
+//! through to metadata provided alongside the compilation result.
+//!
+//! When Cranelift inlines a function, it will prepend any tags from
+//! the call instruction at the inlining callsite to tags on all
+//! inlined instructions.
+//!
+//! These tags can be used, for example, to identify stackslots that
+//! store user state, or to denote positions in user source. In
+//! general, the intent is to allow perfect reconstruction of original
+//! (source-level) program state in an instrumentation-based
+//! debug-info scheme, as long as the instruction(s) on which these
+//! tags are attached are preserved. This will be the case for any
+//! instructions with side-effects.
+//!
+//! A few answers to design questions that lead to this design:
+//!
+//! - Why not use the SourceLoc mechanism? Debug tags are richer than
+//!   that infrastructure because they preserve inlining location and
+//!   are interleaved properly with any other tags describing the
+//!   frame.
+//! - Why not attach debug tags only to special sequence-point
+//!   instructions? This is driven by inlining: we should have the
+//!   semantic information about a callsite attached directly to the
+//!   call and observe it there, not have a magic "look backward to
+//!   find a sequence point" behavior in the inliner.
+//!
+//! In other words, the needs of preserving "virtual" frames across an
+//! inlining transform drive this design.
+
+use crate::ir::{Inst, StackSlot};
+use alloc::collections::BTreeMap;
+use alloc::vec::Vec;
+use core::ops::Range;
+
+/// Debug tags for instructions.
+#[derive(Clone, PartialEq, Hash, Default)]
+#[cfg_attr(
+    feature = "enable-serde",
+    derive(serde_derive::Serialize, serde_derive::Deserialize)
+)]
+pub struct DebugTags {
+    /// Pool of tags, referred to by `insts` below.
+    tags: Vec<DebugTag>,
+
+    /// Per-instruction range for its list of tags in the tag pool (if
+    /// any).
+    ///
+    /// Note: we don't use `PackedOption` and `EntityList` here
+    /// because the values that we are storing are not entities.
+    insts: BTreeMap<Inst, Range<u32>>,
+}
+
+/// One debug tag.
+#[derive(Clone, Debug, PartialEq, Hash)]
+#[cfg_attr(
+    feature = "enable-serde",
+    derive(serde_derive::Serialize, serde_derive::Deserialize)
+)]
+pub enum DebugTag {
+    /// User-specified `u32` value, opaque to Cranelift.
+    User(u32),
+
+    /// A stack slot reference.
+    StackSlot(StackSlot),
+}
+
+impl DebugTags {
+    /// Set the tags on an instruction, overwriting existing tag list.
+    pub fn set(&mut self, inst: Inst, tags: impl IntoIterator<Item = DebugTag>) {
+        let start = u32::try_from(self.tags.len()).unwrap();
+        self.tags.extend(tags);
+        let end = u32::try_from(self.tags.len()).unwrap();
+        if end > start {
+            self.insts.insert(inst, start..end);
+        } else {
+            self.insts.remove(&inst);
+        }
+    }
+
+    /// Get the tags associated with an instruction.
+    pub fn get(&self, inst: Inst) -> &[DebugTag] {
+        if let Some(range) = self.insts.get(&inst) {
+            let start = usize::try_from(range.start).unwrap();
+            let end = usize::try_from(range.end).unwrap();
+            &self.tags[start..end]
+        } else {
+            &[]
+        }
+    }
+
+    /// Clone the tags from one instruction to another.
+    ///
+    /// This clone is cheap (references the same underlying storage)
+    /// because the tag lists are immutable.
+    pub fn clone_tags(&mut self, from: Inst, to: Inst) {
+        if let Some(range) = self.insts.get(&from).cloned() {
+            self.insts.insert(to, range);
+        }
+    }
+
+    /// Are any debug tags present?
+    ///
+    /// This is used for adjusting margins when pretty-printing CLIF.
+    pub fn is_empty(&self) -> bool {
+        self.insts.is_empty()
+    }
+
+    /// Clear all tags.
+    pub fn clear(&mut self) {
+        self.insts.clear();
+        self.tags.clear();
+    }
+}
+
+impl core::fmt::Display for DebugTag {
+    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
+        match self {
+            DebugTag::User(value) => write!(f, "{value}"),
+            DebugTag::StackSlot(slot) => write!(f, "{slot}"),
+        }
+    }
+}
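
The storage scheme in this file is an append-only pool plus a per-instruction range into it. The sketch below re-creates that layout with plain `u32` keys and values so it can run outside Cranelift. `TagPool` and `pool_len` are hypothetical names, and the `as usize` casts replace the original's `usize::try_from(..).unwrap()` for brevity.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Toy version of the pool-plus-ranges layout; instructions and tags are `u32`.
#[derive(Default)]
struct TagPool {
    tags: Vec<u32>,
    insts: BTreeMap<u32, Range<u32>>,
}

impl TagPool {
    /// Append `tags` to the pool and point `inst` at the new range.
    /// An empty list removes the instruction's entry instead.
    fn set(&mut self, inst: u32, tags: impl IntoIterator<Item = u32>) {
        let start = u32::try_from(self.tags.len()).unwrap();
        self.tags.extend(tags);
        let end = u32::try_from(self.tags.len()).unwrap();
        if end > start {
            self.insts.insert(inst, start..end);
        } else {
            self.insts.remove(&inst);
        }
    }

    /// Tags for `inst`, or an empty slice if it has none.
    fn get(&self, inst: u32) -> &[u32] {
        match self.insts.get(&inst) {
            Some(r) => &self.tags[r.start as usize..r.end as usize],
            None => &[],
        }
    }

    /// Share `from`'s range with `to` without copying any tags.
    fn clone_tags(&mut self, from: u32, to: u32) {
        if let Some(range) = self.insts.get(&from).cloned() {
            self.insts.insert(to, range);
        }
    }

    /// Number of tags held in the pool, including unreferenced ones.
    fn pool_len(&self) -> usize {
        self.tags.len()
    }
}

fn main() {
    let mut pool = TagPool::default();
    pool.set(0, [10, 11]);
    pool.clone_tags(0, 1); // inst 1 now shares inst 0's range
    pool.set(0, [20]); // inst 0 gets a fresh range; the old one stays in the pool
    println!("{:?} {:?} {}", pool.get(0), pool.get(1), pool.pool_len());
}
```

Two properties of this layout are worth noting, assuming the sketch mirrors the original faithfully: overwriting an instruction's tags never mutates an existing range, which is what makes the shared ranges from `clone_tags` safe, and superseded tags remain in the pool until `clear` is called.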

cranelift/codegen/src/ir/function.rs

Lines changed: 12 additions & 0 deletions
@@ -5,6 +5,7 @@
 
 use crate::HashMap;
 use crate::entity::{PrimaryMap, SecondaryMap};
+use crate::ir::DebugTags;
 use crate::ir::{
     self, Block, DataFlowGraph, DynamicStackSlot, DynamicStackSlotData, DynamicStackSlots,
     DynamicType, ExtFuncData, FuncRef, GlobalValue, GlobalValueData, Inst, JumpTable,
@@ -190,6 +191,15 @@ pub struct FunctionStencil {
     /// interpreted by Cranelift, only preserved.
     pub srclocs: SourceLocs,
 
+    /// Opaque debug-info tags on instructions.
+    ///
+    /// These tags are not interpreted by Cranelift, and are passed
+    /// through to compilation-result metadata. The only semantic
+    /// structure that Cranelift imposes is that when inlining, it
+    /// prepends the callsite call instruction's tags to the tags on
+    /// inlined instructions.
+    pub debug_tags: DebugTags,
+
     /// An optional global value which represents an expression evaluating to
     /// the stack limit for this function. This `GlobalValue` will be
     /// interpreted in the prologue, if necessary, to insert a stack check to
@@ -209,6 +219,7 @@ impl FunctionStencil {
         self.dfg.clear();
         self.layout.clear();
         self.srclocs.clear();
+        self.debug_tags.clear();
         self.stack_limit = None;
     }
 
@@ -408,6 +419,7 @@ impl Function {
                 layout: Layout::new(),
                 srclocs: SecondaryMap::new(),
                 stack_limit: None,
+                debug_tags: DebugTags::default(),
             },
             params: FunctionParameters::new(),
         }

cranelift/codegen/src/ir/layout.rs

Lines changed: 5 additions & 1 deletion
@@ -756,7 +756,7 @@ mod tests {
     use super::*;
     use crate::cursor::{Cursor, CursorPosition};
     use crate::entity::EntityRef;
-    use crate::ir::{Block, Inst, SourceLoc};
+    use crate::ir::{Block, DebugTag, Inst, SourceLoc};
     use alloc::vec::Vec;
     use core::cmp::Ordering;
 
@@ -795,6 +795,10 @@ mod tests {
             unimplemented!()
         }
 
+        fn set_debug_tags(&mut self, _tags: Vec<DebugTag>) {
+            unimplemented!()
+        }
+
         fn layout(&self) -> &Layout {
             self.layout
         }

cranelift/codegen/src/ir/mod.rs

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,7 @@ mod atomic_rmw_op;
 mod builder;
 pub mod condcodes;
 pub mod constant;
+mod debug_tags;
 pub mod dfg;
 pub mod dynamic_type;
 pub mod entities;
@@ -36,6 +37,7 @@ pub use crate::ir::builder::{
     InsertBuilder, InstBuilder, InstBuilderBase, InstInserterBase, ReplaceBuilder,
 };
 pub use crate::ir::constant::{ConstantData, ConstantPool};
+pub use crate::ir::debug_tags::{DebugTag, DebugTags};
 pub use crate::ir::dfg::{BlockData, DataFlowGraph, ValueDef};
 pub use crate::ir::dynamic_type::{DynamicTypeData, DynamicTypes, dynamic_to_fixed};
 pub use crate::ir::entities::{
