Description
Currently, Wasmi uses stack slots (a.k.a. Slot) to load and store values from the stack when executing the operations of a function. Each function on the call stack has its own region of stack slots that it can access.
For example, there is the i32.add_ssi operation that has 3 operands:
- result: Slot
- lhs: Slot
- rhs: i32
In this example, result and lhs are both Slot and thus lhs is loaded from the stack and result is stored into the associated stack slot, whereas rhs is an immediate value.
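To make the cost of the slot-based design concrete, here is a minimal sketch of what executing i32.add_ssi involves. The `Slot` and `I32AddSsi` types, the `u64` frame cells, and the `exec_i32_add_ssi` function are all hypothetical stand-ins for illustration and are not Wasmi's actual definitions.

```rust
/// Hypothetical stand-in for a stack slot: an index into the frame's region.
#[derive(Clone, Copy)]
struct Slot(u16);

/// Hypothetical operand layout of `i32.add_ssi`.
struct I32AddSsi {
    result: Slot,
    lhs: Slot,
    rhs: i32,
}

/// Executes the operation against the current function's region of stack slots.
/// Assumption: every slot is an untyped 64-bit cell.
fn exec_i32_add_ssi(frame: &mut [u64], op: &I32AddSsi) {
    // `lhs` is a slot, so it costs one load from the stack.
    let lhs = frame[op.lhs.0 as usize] as u32 as i32;
    // `rhs` is an immediate encoded in the operation, so no load is needed.
    let sum = lhs.wrapping_add(op.rhs);
    // `result` is a slot, so it costs one store to the stack.
    frame[op.result.0 as usize] = sum as u32 as u64;
}
```

Even with an immediate operand, the handler still performs one stack load and one stack store per operation.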
This is more efficient than a stack-based interpreter architecture but we can do better.
A more efficient interpreter architecture is used by both Wasm3 and Makepad's Stitch interpreter. During translation of Wasm to their internal bytecodes they always put the top-most value of the stack into a so-called "register" value. Execution handlers can keep this value in a machine register of its own during the interpreter execution loop, so it can be accessed (loaded and stored) much faster than values that live in stack slots. This interpreter architecture is nothing new, and such registers are simply called accumulators. For the sake of simplicity it is usually enough to have a single accumulator, and thus a single such register value.
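The accumulator idea can be sketched with a toy interpreter. The `Op` enum, its variants, and the `run` function below are hypothetical and only illustrate the technique; they are not the bytecode of Wasm3, Stitch, or Wasmi.

```rust
/// Hypothetical accumulator-style operations over `i32` stack slots.
#[derive(Clone, Copy)]
enum Op {
    /// acc = frame[slot]
    LoadSlot(usize),
    /// acc = acc + imm
    AddImm(i32),
    /// acc = acc + frame[slot]
    AddSlot(usize),
    /// frame[slot] = acc
    StoreSlot(usize),
}

/// Runs `ops` and returns the final accumulator value.
/// `acc` is a local variable, so the compiler is free to keep it in a
/// machine register for the whole loop.
fn run(ops: &[Op], frame: &mut [i32]) -> i32 {
    let mut acc: i32 = 0;
    for op in ops {
        match *op {
            Op::LoadSlot(slot) => acc = frame[slot],
            Op::AddImm(imm) => acc = acc.wrapping_add(imm),
            Op::AddSlot(slot) => acc = acc.wrapping_add(frame[slot]),
            Op::StoreSlot(slot) => frame[slot] = acc,
        }
    }
    acc
}
```

For a sequence such as `[LoadSlot(0), AddImm(2), AddSlot(1), StoreSlot(2)]`, the intermediate sum never touches the stack. Only the final result is written back to a slot.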
Right now, Wasmi's execution handlers already are using 6 integer registers:
- vmstate: pointer to the execution context, which stores auxiliary data and gives access to the store
- ip: the instruction pointer
- sp: the stack pointer
- mem0_ptr: pointer to the start of memory 0
- mem0_len: length of memory 0
- inst: pointer to the currently used instance
If we add another integer register for the accumulator we end up with 7 integer registers which puts further pressure on machine codegen of the compiler and likely results in less efficient code to dispatch to execution handlers. Ideally, we'd remove one of the 6 old register uses or move them under vmstate. There are 2 solutions:
- Move inst under vmstate. All instance related bytecode will perform worse, likely way worse.
- Wait until "Move Wasm function details into the Store" (#1492) is merged, which allows removing the inst value altogether without requiring a replacement.
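As a rough illustration of the first option, the sketch below shows a handler that still takes six arguments, with an accumulator occupying the position that inst used to have and the instance reached through vmstate instead. The `Instance` and `VmState` layouts, the `exec_global_get` handler, and the use of `&u32` for ip are hypothetical simplifications and not Wasmi's actual types or calling convention.

```rust
/// Hypothetical instance data; only globals are modeled here.
struct Instance {
    globals: Vec<i64>,
}

/// Hypothetical execution context that, under the first option, also holds
/// the reference to the currently used instance.
struct VmState<'a> {
    inst: &'a Instance,
}

/// Hypothetical handler for a `global.get`-like operation.
/// Assumption: `ip` points at an operand word holding the global index.
/// Returns the new accumulator value.
fn exec_global_get(
    vmstate: &VmState,
    ip: &u32,
    _sp: usize,
    _mem0_ptr: *const u8,
    _mem0_len: usize,
    _acc: i64,
) -> i64 {
    // Two dependent loads (vmstate -> inst -> globals) instead of reading
    // the instance from an argument that is already in a register.
    vmstate.inst.globals[*ip as usize]
}
```

The extra pointer hop on every instance access is the reason instance-related bytecode is expected to perform worse under this option.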
This issue tracks progress and evaluation of this optimization and the research behind it.