Internally within Wasmtime synchronous functions are often a lie

In the spirit of making Wasmtime more sound internally over time this is a major thorn in our side right now. Many functions in Wasmtime are synchronous but they're actually asynchronous in nature due to the possibility that something could be context-switched away at any moment. These context switches are required to happen on a fiber and while I'm not aware of any issues internally within Wasmtime, this is, in my opinion, unnecessarily surprising. I would ideally like to move towards a world where Wasmtime actually uses `async` internally, not just for the `async` feature of wasm itself, and then in synchronous mode it asserts the future is always ready since there aren't any sources of blocking async.

Concretely sources of surprising async-ness are:

* Async resource limiters
* Async GC
* Async call hooks 
* ... maybe more? this isn't intended to be exhaustive

Concretely one location of unsoundness has to do with growth of tables and memories. For example memory growth requires a [split borrow](https://github.com/bytecodealliance/wasmtime/blob/c3ac18b3e4fc3dacd83bbf9e1b0fefb0837a4607/crates/wasmtime/src/runtime/memory.rs#L602-L615) both of and within the store. The same is [true for tables](https://github.com/bytecodealliance/wasmtime/blob/c3ac18b3e4fc3dacd83bbf9e1b0fefb0837a4607/crates/wasmtime/src/runtime/externals/table.rs#L275-L288). This is due to the fact that the methods of growth require a `&mut self` on the runtime data structure but also require, effectively, `&mut StoreOpaque` to suspend the world. Given that memories/tables live within `&mut StoreOpaque`, this is not a sound way to model this problem.

Another concrete location of unsoundness is that [growth of a shared memory](https://github.com/bytecodealliance/wasmtime/blob/c3ac18b3e4fc3dacd83bbf9e1b0fefb0837a4607/crates/wasmtime/src/runtime/vm/memory/shared_memory.rs#L77) suspends the stack with an rwlock write lock on the stack. This means that `unsafe impl Send for FiberFuture {}` is a lie because that's foundationally built on the premise that Wasmtime's own stack-local variables are all `Send`. This is effectively something that's impossible to guarantee and this hypothetically can result in locking an rwlock on one thread and unlocking it on another (technically UB, tested on glibc and other platforms to at least not behave badly for now). I was able to trivially detect this by migrating to `async fn`, however, as the compiler checks the send-ness of local variables.

If hypothetically growth were to use an `async` fn I believe this would be sound. For example growth still requires a limiter, but there's no need to borrow the full store. This is morally a borrow of separate fields in the store, for example (a) the data/limiter and (b) the memory itself. That's sound to model in Wasmtime (albeit clunky).

There are additional downsides to the way this is all implemented today such as async table/memory growth requiring a fiber. This doesn't actually execute any WebAssembly code yet we nonetheless require a fiber to satisfy how suspension works in an async context. This is, in my opinion, unnecessarily unsafe where we're suspending Rust frames on a stack when we don't really need to if things were actually modeled as an `async` function.

---

I'll note that this issue is not going to be easy to solve. We can't "simply go sprinkle some `async`" as there are some fundamental blockers to that. The main one I ran into while attempting this is that `VMStore` is not `Send`. This is a trait object backed by any `Store<T>` for any `T` meaning that we don't know it's `Send` ahead of time. While everything in an async context is `Send` we don't know that for a sync context. I know of no fix for this other than adding `Store<T: Send>` everywhere, but that's a very heavy hammer and is not one we should reach for lightly.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Internally within Wasmtime synchronous functions are often a lie #11262

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Internally within Wasmtime synchronous functions are often a lie #11262

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions