Wasmtime: implement debug instrumentation and basic host API to examine runtime state. #11769
(Marking this as a draft until I resolve the two issues above; posting to show how the Cranelift side is used in practice, in case that's useful.)
Alright, I've rebased this on the latest #11768 and updated it to take an iterator-based approach instead. The borrowing is a little tricky but does work out: we implement a custom iterator up the stack frames ( In #11783 (underlying iterator that this refactor used) I chose not to replicate all of the complex logic to walk all activations and continuation stack-chains; for now, this API walks only the last activation that entered the host. That seems sufficient for a basic debugger application to me, but I can update it as desired.

This is now stacked on #11768 and #11783, and depends on the just-filed #11784 being resolved somehow.
This PR adds *debug tags*, a kind of metadata that can attach to CLIF instructions and be lowered to VCode instructions and as metadata on the produced compiled code. It also adds opaque descriptor blobs carried with stackslots. Together, these two features allow decorating IR with first-class debug instrumentation that is properly preserved by the compiler, including across optimizations and inlining. (Wasmtime's use of these features will come in followup PRs.)

The key idea of a "debug tag" is to allow the Cranelift embedder to express whatever information it needs to, in a format that is opaque to Cranelift itself, except for the parts that need translation during lowering. In particular, the `DebugTag::StackSlot` variant gets translated to a physical offset into the stackframe in the compiled metadata output. So, for example, the embedder can emit a tag referring to a stackslot, and another describing an offset in that stackslot.

The debug tags exist as a *sequence* on any given instruction; the meaning of the sequence is known only to the embedder, *except* that during inlining, the tags for the inlining call instruction are prepended to the tags of inlined instructions. In this way, a canonical use-case of tags as describing original source-language frames can preserve the source-language view even when multiple functions are inlined into one.

The descriptor on a stackslot may look a little odd at first, but its purpose is to allow serializing some description of stackslot-contained runtime user-program data, in a way that is firmly attached to the stackslot. In particular, in the face of inlining, this descriptor is copied into the inlining (parent) function from the inlined function when the stackslot entity is copied; no other metadata outside Cranelift needs to track the identity of stackslots and know about that motion.

This fits nicely with the ability of tags to refer to stackslots; together, the embedder can annotate instructions as having certain state in stackslots, and describe the format of that state per stackslot.

This infrastructure is tested with some compile-tests now; testing of the interpretation of the metadata output will come with end-to-end debug instrumentation tests in a followup PR.
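The prepending rule for inlined instructions can be illustrated with a toy model. This is only a sketch: the `DebugTag` enum below is a hypothetical stand-in (the `User` variant and the `tags_after_inlining` helper are assumptions made for illustration, not Cranelift's actual definitions), and stackslots are referred to by a plain index.

```rust
/// Toy stand-in for a debug tag; NOT Cranelift's real type.
#[derive(Clone, Debug, PartialEq)]
enum DebugTag {
    /// Opaque embedder-defined value (hypothetical variant), e.g. a
    /// function index or a source-level program counter.
    User(u32),
    /// Reference to a stackslot by index. In the real compiler this is
    /// the part that would be rewritten to a frame offset during lowering.
    StackSlot(u32),
}

/// Compute the tag sequence of an inlined instruction: the tags of the
/// call instruction that was inlined come first, followed by the
/// instruction's own tags.
fn tags_after_inlining(call_site: &[DebugTag], inlined: &[DebugTag]) -> Vec<DebugTag> {
    let mut out = Vec::with_capacity(call_site.len() + inlined.len());
    out.extend_from_slice(call_site);
    out.extend_from_slice(inlined);
    out
}
```

If an embedder uses one (stackslot, position) pair per source-level frame, the merged sequence reads outermost frame first, which is how the source-language view could survive inlining in this model.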
…quence points or calls.
…lty in not-used case.
…ne runtime state.

This PR implements ideas from the [recent RFC] to serve as the basis for Wasm (guest) debugging: it adds a stackslot to each function translated from Wasm, stores to replicate Wasm VM state in the stackslot as the program runs, and metadata to describe the format of that state and allow reading it out at runtime.

As an initial user of this state, this PR adds a basic "stack view" API that, from host code that has been called from Wasm, can examine Wasm frames currently on the stack and read out all of their locals and stack slots.

Note in particular that this PR does not include breakpoints, watchpoints, stepped execution, or any sort of user interface for any of this; it is only a foundation.

This PR still has a few unsatisfying bits that I intend to address:

- The "stack view" performs some O(n) work when the view is initially taken, computing some internal data per frame. This is forced by the current design of `Backtrace`, which takes a closure and walks that closure over stack frames eagerly (rather than working as an iterator). It's got some impressive iterator-chain stuff going on internally, so refactoring it to the latter approach might not be *too* bad, but I haven't tackled it yet. An O(1) stack view, that is, one that does work only for frames as the host API is used to walk up the stack, is desirable because some use-cases may want to quickly examine e.g. only the deepest frame (say, running with a breakpoint condition that needs to read a particular local's value after each step).
- It includes a new `Config::compiler_force_inlining()` option that is used only for testing that we get the correct frames after inlining. I couldn't get the existing flags to work on a Wasmtime config level and suspect there may be an existing bug there; I will try to split out a fix for it.

This PR renames the existing `debug` option to `native_debug`, to distinguish it from the new approach.

[recent RFC]: bytecodealliance/rfcs#44
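The difference between the eager closure-based walk and a lazy iterator can be sketched with a toy frame chain. Everything here is hypothetical (`Frame`, `walk_eager`, `FrameIter`, and the `visited` counter are illustrative names, not Wasmtime's `Backtrace` or its stack-view API); the point is only that the lazy form touches one frame per `next()` call.

```rust
use std::cell::Cell;

/// Hypothetical frame record; `parent` is the index of the caller's frame.
struct Frame {
    func_index: u32,
    parent: Option<usize>,
}

/// Eager style: the closure is run over every frame before returning,
/// so taking the view costs O(n) in the stack depth.
fn walk_eager(frames: &[Frame], start: usize, mut visit: impl FnMut(&Frame)) {
    let mut cur = Some(start);
    while let Some(i) = cur {
        visit(&frames[i]);
        cur = frames[i].parent;
    }
}

/// Lazy style: each `next()` call advances by exactly one frame, so a
/// caller that only needs the deepest frame does O(1) work.
struct FrameIter<'a> {
    frames: &'a [Frame],
    cur: Option<usize>,
    /// Counts frames actually touched, for demonstration only.
    visited: &'a Cell<usize>,
}

impl<'a> Iterator for FrameIter<'a> {
    type Item = &'a Frame;

    fn next(&mut self) -> Option<&'a Frame> {
        let i = self.cur?;
        let frames: &'a [Frame] = self.frames;
        self.visited.set(self.visited.get() + 1);
        self.cur = frames[i].parent;
        Some(&frames[i])
    }
}
```

With a three-frame chain, `walk_eager` invokes its closure three times, whereas calling `next()` once on a `FrameIter` started at the deepest frame touches a single frame.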
…ogpoint collapsing.
Rebased on #11783 now that it's landed. Also a few more refined thoughts on
I am realizing in building the next step that we probably do want a "debug session" to apply only to one Wasm activation -- if for no other reason than that the API is much more semantically clear when exits back to the host are atomic events rather than somehow "translucent" (host code invisible but nested Wasm invocations visible). And there are weird corner cases where, for example, the recursively invoked host code itself tries to set up a (nested) debug context that we'd want to rule out anyway. So: the limit to introspect only the most recent activation is a feature, not a bug!
This PR implements ideas from the recent RFC to serve as the basis
for Wasm (guest) debugging: it adds a stackslot to each function
translated from Wasm, stores to replicate Wasm VM state in the
stackslot as the program runs, and metadata to describe the format of
that state and allow reading it out at runtime.
As an initial user of this state, this PR adds a basic "stack view"
API that, from host code that has been called from Wasm, can examine
Wasm frames currently on the stack and read out all of their locals
and stack slots.
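As a rough illustration of how layout metadata lets a host read values back out of a per-function state slot, here is a toy model. It is a sketch under stated assumptions: `ValType`, `Val`, `SlotLayout`, and `read_local` are hypothetical names, the slot is modeled as a plain byte slice, and values are assumed to be stored little-endian; none of this is the actual Wasmtime API or metadata format.

```rust
/// Hypothetical value types stored in the state slot.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ValType {
    I32,
    I64,
}

/// Hypothetical decoded value.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Val {
    I32(i32),
    I64(i64),
}

/// Hypothetical layout metadata: byte offset and type of each local
/// within the function's state slot.
struct SlotLayout {
    locals: Vec<(usize, ValType)>,
}

/// Read local `index` out of the raw slot bytes, using the layout to
/// find its offset and type. Returns `None` if the index or the byte
/// range is out of bounds. Assumes little-endian storage.
fn read_local(slot: &[u8], layout: &SlotLayout, index: usize) -> Option<Val> {
    let (offset, ty) = *layout.locals.get(index)?;
    match ty {
        ValType::I32 => {
            let mut bytes = [0u8; 4];
            bytes.copy_from_slice(slot.get(offset..offset + 4)?);
            Some(Val::I32(i32::from_le_bytes(bytes)))
        }
        ValType::I64 => {
            let mut bytes = [0u8; 8];
            bytes.copy_from_slice(slot.get(offset..offset + 8)?);
            Some(Val::I64(i64::from_le_bytes(bytes)))
        }
    }
}
```

In this model the instrumented code would store each local into the slot as it changes, and the host only needs the slot bytes plus the layout to decode them.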
Note in particular that this PR does not include breakpoints,
watchpoints, stepped execution, or any sort of user interface for any
of this; it is only a foundation.
This PR renames the existing `debug` option to `native_debug`, to distinguish it from the new approach.
(Stacked on #11768, and depends on #11784.)