Wasmtime: implement debug instrumentation and basic host API to examine runtime state. #11769

cfallin · 2025-10-01T03:37:01Z

This PR implements ideas from the recent RFC to serve as the basis
for Wasm (guest) debugging: it adds a stackslot to each function
translated from Wasm, stores to replicate Wasm VM state in the
stackslot as the program runs, and metadata to describe the format of
that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view"
API that, from host code that has been called from Wasm, can examine
Wasm frames currently on the stack and read out all of their locals
and stack slots.
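To make the idea concrete, here is a minimal, self-contained sketch of the instrumentation model: a per-frame byte buffer stands in for the stackslot, a layout table stands in for the metadata, and `store`/`read` stand in for the instrumentation stores and the host-side read. All names here (`FrameState`, `SlotEntry`, `ValType`, `Val`) and the fixed 8-byte cell size are assumptions for illustration, not the types or layout this PR actually uses.

```rust
/// Hypothetical value types; the real Wasm type set is larger.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ValType {
    I32,
    I64,
}

/// Hypothetical runtime value read back out of a frame.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Val {
    I32(i32),
    I64(i64),
}

/// Hypothetical metadata entry: where one value lives in the slot, and its type.
pub struct SlotEntry {
    pub offset: usize,
    pub ty: ValType,
}

/// Stand-in for one function's stackslot plus the metadata describing it.
pub struct FrameState {
    bytes: Vec<u8>,
    layout: Vec<SlotEntry>,
}

impl FrameState {
    /// Lay out one 8-byte cell per value (an assumption that keeps the sketch simple).
    pub fn new(types: &[ValType]) -> Self {
        let mut layout = Vec::new();
        let mut offset = 0;
        for &ty in types {
            layout.push(SlotEntry { offset, ty });
            offset += 8;
        }
        FrameState { bytes: vec![0; offset], layout }
    }

    /// Models an instrumentation store: mirror a value into the slot.
    pub fn store(&mut self, index: usize, val: Val) {
        let entry = &self.layout[index];
        let raw: u64 = match (entry.ty, val) {
            (ValType::I32, Val::I32(v)) => v as u32 as u64,
            (ValType::I64, Val::I64(v)) => v as u64,
            _ => panic!("type mismatch for value {}", index),
        };
        let off = entry.offset;
        self.bytes[off..off + 8].copy_from_slice(&raw.to_le_bytes());
    }

    /// Models the host-side read: use the metadata to decode raw bytes.
    pub fn read(&self, index: usize) -> Val {
        let entry = &self.layout[index];
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&self.bytes[entry.offset..entry.offset + 8]);
        let raw = u64::from_le_bytes(buf);
        match entry.ty {
            ValType::I32 => Val::I32(raw as u32 as i32),
            ValType::I64 => Val::I64(raw as i64),
        }
    }
}
```

The point of the split is that the bytes alone are meaningless; only the layout metadata lets a reader turn them back into typed values.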

Note in particular that this PR does not include breakpoints,
watchpoints, stepped execution, or any sort of user interface for any
of this; it is only a foundation.

This PR renames the existing `debug` option to `native_debug`, to
distinguish it from the new approach.

(Stacked on #11768, and depends on #11784.)

cfallin · 2025-10-01T03:38:16Z

(Marking this as a draft until I resolve the two issues above; posting to show how the Cranelift side is used in practice, in case that's useful.)

cfallin · 2025-10-03T05:06:27Z

Alright, I've rebased this on the latest #11768, and updated it to take an iterator-based approach instead. The borrowing is a little tricky but does work out: we implement a custom iterator up the stack frames (StackView), which owns the store while alive, and a mutable borrow of that iterator must be passed into methods on the objects it returns (FrameView) to actually read values. All of that gives a safe, lazy (O(1)) way of walking the stack.
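The borrowing shape described above can be sketched roughly as follows. Only the names `StackView` and `FrameView` come from this PR; the `Store` stand-in, the fields, and the `num_locals`/`local` methods are hypothetical and much simpler than the real API. The key point is that a `FrameView` holds no borrow itself, so it can outlive a call to `next()`, but it cannot read anything without being handed `&mut StackView`.

```rust
/// Stand-in for the store; here it only holds fake per-frame locals.
pub struct Store {
    /// Index 0 is the innermost frame.
    pub frames: Vec<Vec<i64>>,
}

/// Iterator up the stack. It holds the store exclusively while alive.
pub struct StackView<'a> {
    store: &'a mut Store,
    next: usize,
}

/// Handle to one frame. It carries no borrow, only a position.
#[derive(Clone, Copy)]
pub struct FrameView {
    index: usize,
}

impl<'a> StackView<'a> {
    pub fn new(store: &'a mut Store) -> Self {
        StackView { store, next: 0 }
    }
}

impl<'a> Iterator for StackView<'a> {
    type Item = FrameView;

    /// Each step does a constant amount of work; nothing is precomputed.
    fn next(&mut self) -> Option<FrameView> {
        if self.next < self.store.frames.len() {
            let frame = FrameView { index: self.next };
            self.next += 1;
            Some(frame)
        } else {
            None
        }
    }
}

impl FrameView {
    /// Reading requires a mutable borrow of the view that produced this frame.
    pub fn num_locals(&self, view: &mut StackView<'_>) -> usize {
        view.store.frames[self.index].len()
    }

    pub fn local(&self, view: &mut StackView<'_>, i: usize) -> i64 {
        view.store.frames[self.index][i]
    }
}
```

With this shape, a caller can take one frame with `view.next()`, read from it via `frame.local(&mut view, 0)`, and then keep walking, all without the frame handle pinning the iterator.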

In #11783 (the underlying iterator that this refactor uses) I chose not to replicate all of the complex logic to walk all activations and continuation stack-chains; for now, this API walks only the last activation that entered the host. That seems sufficient for a basic debugger application to me, but I can update it as desired.

This is now stacked on #11768 and #11783, and depends on the just-filed #11784 to be resolved somehow.

This PR adds *debug tags*, a kind of metadata that can attach to CLIF instructions and be lowered to VCode instructions and as metadata on the produced compiled code. It also adds opaque descriptor blobs carried with stackslots. Together, these two features allow decorating IR with first-class debug instrumentation that is properly preserved by the compiler, including across optimizations and inlining. (Wasmtime's use of these features will come in followup PRs.)

The key idea of a "debug tag" is to allow the Cranelift embedder to express whatever information it needs to, in a format that is opaque to Cranelift itself, except for the parts that need translation during lowering. In particular, the `DebugTag::StackSlot` variant gets translated to a physical offset into the stackframe in the compiled metadata output. So, for example, the embedder can emit a tag referring to a stackslot, and another describing an offset in that stackslot.

The debug tags exist as a *sequence* on any given instruction; the meaning of the sequence is known only to the embedder, *except* that during inlining, the tags for the inlining call instruction are prepended to the tags of inlined instructions. In this way, a canonical use-case of tags as describing original source-language frames can preserve the source-language view even when multiple functions are inlined into one.

The descriptor on a stackslot may look a little odd at first, but its purpose is to allow serializing some description of stackslot-contained runtime user-program data, in a way that is firmly attached to the stackslot. In particular, in the face of inlining, this descriptor is copied into the inlining (parent) function from the inlined function when the stackslot entity is copied; no other metadata outside Cranelift needs to track the identity of stackslots and know about that motion.

This fits nicely with the ability of tags to refer to stackslots; together, the embedder can annotate instructions as having certain state in stackslots, and describe the format of that state per stackslot.

This infrastructure is tested with some compile-tests now; testing of the interpretation of the metadata output will come with end-to-end debug instrumentation tests in a followup PR.
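As a rough illustration of the prepend-on-inline rule, here is a small standalone sketch. Only `DebugTag::StackSlot` is named in the description; the `User` variant, the `Inst` struct, and both functions are hypothetical, and the sketch ignores the renumbering of stackslot entities that real inlining would also perform.

```rust
/// Simplified tag type. `User` is a hypothetical opaque embedder value.
#[derive(Clone, Debug, PartialEq)]
pub enum DebugTag {
    User(u32),
    /// Reference to a stackslot, by plain index in this sketch.
    StackSlot(u32),
}

/// Hypothetical instruction: a name plus its tag sequence.
pub struct Inst {
    pub name: &'static str,
    pub tags: Vec<DebugTag>,
}

/// Tags for one inlined instruction: the call site's tags first, then its own.
pub fn inline_tags(call_site: &[DebugTag], callee_inst: &[DebugTag]) -> Vec<DebugTag> {
    let mut out = call_site.to_vec();
    out.extend_from_slice(callee_inst);
    out
}

/// Apply the rule to every instruction of an inlined callee body.
pub fn inline_body(call: &Inst, callee: &[Inst]) -> Vec<Inst> {
    callee
        .iter()
        .map(|inst| Inst {
            name: inst.name,
            tags: inline_tags(&call.tags, &inst.tags),
        })
        .collect()
}
```

If the embedder uses, say, one (stackslot, position) pair per source-level frame, then after inlining each instruction's sequence reads outermost frame first, which is what lets a consumer reconstruct the original frames.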

…ne runtime state.

This PR implements ideas from the [recent RFC] to serve as the basis for Wasm (guest) debugging: it adds a stackslot to each function translated from Wasm, stores to replicate Wasm VM state in the stackslot as the program runs, and metadata to describe the format of that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view" API that, from host code that has been called from Wasm, can examine Wasm frames currently on the stack and read out all of their locals and stack slots.

Note in particular that this PR does not include breakpoints, watchpoints, stepped execution, or any sort of user interface for any of this; it is only a foundation.

This PR still has a few unsatisfying bits that I intend to address:

- The "stack view" performs some O(n) work when the view is initially taken, computing some internal data per frame. This is forced by the current design of `Backtrace`, which takes a closure and walks that closure over stack frames eagerly (rather than work as an iterator). It's got some impressive iterator-chain stuff going on internally, so refactoring it to the latter approach might not be *too* bad, but I haven't tackled it yet. An O(1) stack view, that is, one that does work only for frames as the host API is used to walk up the stack, is desirable because some use-cases may want to quickly examine e.g. only the deepest frame (say, running with a breakpoint condition that needs to read a particular local's value after each step).

- It includes a new `Config::compiler_force_inlining()` option that is used only for testing that we get the correct frames after inlining. I couldn't get the existing flags to work on a Wasmtime config level and suspect there may be an existing bug there; I will try to split out a fix for it.

This PR renames the existing `debug` option to `native_debug`, to distinguish it from the new approach.

[recent RFC]: bytecodealliance/rfcs#44
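The cost difference between the eager closure-based walk and a lazy iterator can be seen in a toy model like the one below. Everything here is hypothetical (`decode`, `walk`, and the two `deepest_*` functions are not Wasmtime APIs); frames are modeled as plain `u32` PCs and the second tuple field counts how many frames had per-frame work done.

```rust
/// Hypothetical per-frame work, e.g. looking up metadata for a PC.
fn decode(pc: u32, work: &mut usize) -> u32 {
    *work += 1;
    pc * 2
}

/// Eager walker in the closure style: the closure runs on every frame.
fn walk(pcs: &[u32], mut f: impl FnMut(u32)) {
    for &pc in pcs {
        f(pc);
    }
}

/// Eager approach: all frames are decoded up front, even if only one is needed.
pub fn deepest_eager(pcs: &[u32]) -> (Option<u32>, usize) {
    let mut work = 0;
    let mut all = Vec::new();
    walk(pcs, |pc| all.push(decode(pc, &mut work)));
    (all.first().copied(), work)
}

/// Lazy approach: frames are decoded only as the iterator is advanced.
pub fn deepest_lazy(pcs: &[u32]) -> (Option<u32>, usize) {
    let mut work = 0;
    let first = pcs.iter().map(|&pc| decode(pc, &mut work)).next();
    (first, work)
}
```

Both functions return the same deepest frame, but for a stack of n frames the eager version reports n units of work and the lazy version reports one, which matters when the check runs after every step.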

cfallin · 2025-10-03T20:17:42Z

Rebased on #11783 now that it's landed.

Also a few more refined thoughts on

> I chose not to replicate all of the complex logic to walk all activations and continuation stack-chains; for now, this API walks only the last activation that entered the host. That seems sufficient for a basic debugger application to me, but I can update it as desired.

I am realizing in building the next step that we probably do want a "debug session" to apply only to one Wasm activation -- if for no other reason than that the API is much more semantically clear when exits back to the host are atomic events rather than somehow "translucent" (host code invisible but nested Wasm invocations visible). And there are weird corner cases where, for example, the recursively invoked host code itself tries to set up a (nested) debug context that we'd want to rule out anyway. So: the limit to introspect only the most recent activation is a feature, not a bug!

cfallin requested review from a team as code owners October 1, 2025 03:37

cfallin requested review from fitzgen and removed request for a team October 1, 2025 03:37

cfallin marked this pull request as draft October 1, 2025 03:37

cfallin force-pushed the wasmtime-debug-instrumentation branch 12 times, most recently from 67c060e to b5c5804 Compare October 1, 2025 06:01

cfallin mentioned this pull request Oct 1, 2025

Cranelift: add debug tag infrastructure. #11768

Open

cfallin force-pushed the wasmtime-debug-instrumentation branch from b5c5804 to 2c75c5d Compare October 2, 2025 22:39

github-actions bot added the cranelift:meta Everything related to the meta-language. label Oct 3, 2025

cfallin force-pushed the wasmtime-debug-instrumentation branch from 2b2419a to 7426eda Compare October 3, 2025 04:37

cfallin mentioned this pull request Oct 3, 2025

Unclear how to set Wasmtime inlining options from Config API #11784

Open

cfallin marked this pull request as ready for review October 3, 2025 05:03

cfallin added 12 commits October 3, 2025 13:11

Review feedback: add back sequence points and enforce tags only on se…

cd03b2a

…quence points or calls.

Use Vecs for debug metadata in MachBuffer to avoid SmallVec size pena…

3bd3149

…lty in not-used case.

Review feedback: switch from inlined stackslot descriptor blobs to u6…

f212b59

…4 keys.

Update to new APIs on Cranelift side.

4aeef39

Test update.

0dbd5fc

Adjust objdump printing of InstPos on frame progpoints; and adjust pr…

c952a1b

…ogpoint collapsing.

Convert to iterator form.

1b5bbec

Fix path in native-debug tests (debug -> native_debug rename).

75d5149

Enforce that debug_instrumentation can only be enabled when feature…

e074ad2

… is enabled.

Add missing assert.

bdc71e0

cfallin force-pushed the wasmtime-debug-instrumentation branch from a81db1a to bdc71e0 Compare October 3, 2025 20:12
