
A fine-grained JIT Interface: func.new bytecode #1573

@titzer


Problem

We consider the problem of dynamically generating Wasm code while running on a Wasm engine.

Harvard architecture, no JIT

Core WebAssembly disallows using runtime data as code or generating new code at runtime. This important property of Wasm allows efficient validation of all code in a module and establishes other important security properties. Further, this restriction allows static analysis against the state declared in a module but not exported, admitting sound closed-world optimizations, which are common in tooling today.

Guest runtimes want to JIT new code

However, guest runtimes running on top of Wasm, such as language runtimes (e.g. Python or Lua) and hardware emulators (e.g. QEMU), can benefit tremendously from generating specialized code at runtime. The performance benefit of dynamic code generation can be extreme: as much as 10 to 100 times faster than running in interpreted mode. Today, these runtimes can run only in interpreted mode and are prohibitively slow.

Inconsistent host support for new modules

To date, generating new Wasm code has been an optional host capability provided by the embedder. On the web, this is done through JavaScript APIs for creating new modules directly from bytes. The wasm-c-api allows embedding a Wasm engine in a C/C++ program and creating new modules dynamically. Other host platforms may offer the ability to create new modules from bytes. Generally, the granularity of generation is Wasm modules. Some experiments have been done with function-at-a-time mechanisms, but no standard mechanism has been proposed and nothing is portable across platforms.

Today, guest runtimes that generate new code are few, and they make a number of compromises, such as batching generated functions, in order to work around the limitations and cost of creating new modules on host platforms.

Proposing a lightweight core-Wasm mechanism

To address this problem, we propose to add a new mechanism to core Wasm. Doing so exposes the mechanism to security features/mitigations, makes behavior explicit and documented, allows sound toolchain analysis and transformations, and exposes it to engine optimizations.

A new instruction: func.new

The main component of this idea is a new core Wasm bytecode, func.new, which creates a new function at runtime from bytecode stored in Wasm memory.

  func.new $mt $ft $scope: [at at] -> [(ref $ft)]
  where:
  - $mt = memory code at limits
  - $ft = func [t1*] -> [t2*]
  - $scope = export*
  - at = the address type of $mt

This bytecode has immediates:

  • a memory index, indicating the memory which contains the code for the new function
  • a function type, indicating the signature of the new function
  • a scope, indicating a list of functions, tables, globals, tags, and memories that new code may legally access

At runtime, this instruction takes operands:

  • start, an integer indicating the start offset within the memory of the code
  • length, an integer indicating the length of the code

The memory index and function type are straightforward. The $scope immediate is an explicit enumeration of the module contents that the new code may reference. In particular, $scope produces new index spaces for types, globals, functions, memories, etc. For compactness of the bytecode, and so that the same scope can be reused by multiple different func.new instructions, $scope will be factored out into its own section rather than written inline as immediates.

To execute the instruction, the engine first copies the code bytes out of the Wasm memory, then validates the bytecode under a module context corresponding to $scope and a function context corresponding to the expected function signature $ft. Upon an out-of-bounds memory access or a validation error, the instruction traps. If validation succeeds, the engine creates an internal representation of the new function and pushes a non-null reference to it onto the operand stack. The function's store is derived from the store under which the instruction was executed; i.e. the new function's instance is (a subset of) the caller's instance.
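As an illustration of how the result might be consumed, here is a minimal sketch; the names $codemem, $sig, and $jitscope are hypothetical declarations assumed by this sketch, and the typed table relies on the function-references proposal. Because the result is a non-null (ref $sig) and all failure modes trap, the caller can cache the reference, e.g. in a table, and invoke it later without re-validating the code.

(table $cache 16 (ref null $sig))          ;; cache of generated functions

(func $jit_and_cache (param $slot i32) (param $start i32) (param $len i32)
  (table.set $cache
    (local.get $slot)
    (func.new $codemem $sig $jitscope      ;; traps on out-of-bounds access or invalid code
      (local.get $start)
      (local.get $len))))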

A new section: JIT scope

A scope defines which declarations are accessible to new code passed to a func.new instruction. We introduce a new "scope" section to factor out the list of accessible declarations so that func.new may refer to a scope by index rather than listing declarations individually. The scope section may declare multiple scopes, so that different func.new instructions can be given different accessibility. A scope consists of a list of declarations (similar to export declarations, but without names). The order of declarations in a scope creates new index spaces, i.e. it renumbers the declarations from the surrounding module, starting over from 0.
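As an illustrative sketch (the concrete text-format syntax for scopes below is an assumption of this sketch, not something the proposal fixes), a module could declare several scopes, each of which renumbers the module's declarations from zero for the code validated against it:

(func $f ...)
(func $g ...)
(global $flag (mut i32) (i32.const 0))

(scope $narrow           ;; new code validated against $narrow sees only $g,
  (func $g))             ;; renumbered as function index 0

(scope $wide             ;; new code validated against $wide sees
  (func $f $g)           ;;   function 0 -> $f, function 1 -> $g
  (global $flag))        ;;   global   0 -> $flag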

A new memory flag: code

Generating new code into a module is a powerful feature, so it is important that modules have mechanisms to limit and control access to it. With the proposed scoping mechanism, a module can limit the interface available to new code by explicitly enumerating the allowable declarations. This matters for modules with security requirements, e.g. where a threat model considers dynamically-generated code potentially compromised or even malicious. Another possible threat is that dynamically-generated code could be corrupted, intentionally or unintentionally, while it sits in Wasm memory before being supplied to a func.new invocation.

Supporting dynamically-generated code also requires significant runtime support, including a validator and at least one execution tier. Some host platforms may simply be unable to support dynamically-generated code because of memory constraints, lack of execution tiers, or security controls.

For these reasons, we might want to make the feature opt-in. We propose a new flag for memories, the code flag, which is part of a memory declaration. Like the shared flag, it explicitly marks a memory as having the capability to be used with func.new. For many of the same reasons as shared memory, this capability requires that the declarer of the memory and the user of the memory both agree it can be used for code.

An example

Putting the pieces together, we can write an example that uses a code memory, a scope, and creates new functions at runtime.

(module
   (type $t1 (func))  ;; the type for new functions
   (func $f1 ...)
   (func $f2 ...)
   (func $f3 ...)
   (memory $m1 code 1 1)  ;; the memory used to temporarily store code for func.new
   (memory $m2 1 1)       ;; a memory accessible to new code

   (scope $s1             ;; the scope a new function may use
     (func $f1 $f2)       ;; expose $f1 and $f2 to new code
     (memory $m2))        ;; expose only $m2 to new code

   (func $gen
     (local $n (ref $t1)) ;; a variable to hold the new funcref
     ...
     (local.set $n 
       (func.new $m1 $t1 $s1                ;; code lives in $m1, result sig is $t1, scope is $s1
          (i32.const 1024) (i32.const 10))) ;; code is stored at address 1024 and is 10 bytes long
     ...
     (call_ref $t1 (local.get $n))     ;; call the new function!!
   )
)
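As a complementary sketch, here is one way $gen might place code bytes into $m1 before executing func.new. This assumes the copied code uses the standard binary encoding of a function body (a locals vector followed by an expression) and that $f1 takes no parameters and returns no results; the proposal text above does not pin down the exact encoding. The four bytes below would encode "no locals; call function 0; end", where function index 0 resolves to $f1 under scope $s1, so the matching func.new operands would be offset 1024 and length 4.

(func $emit_body                                    ;; writes into $m1 (memory index 0)
   (i32.store8 (i32.const 1024) (i32.const 0x00))   ;; locals vector: 0 entries
   (i32.store8 (i32.const 1025) (i32.const 0x10))   ;; opcode: call
   (i32.store8 (i32.const 1026) (i32.const 0x00))   ;; callee index 0 (= $f1 under $s1)
   (i32.store8 (i32.const 1027) (i32.const 0x0b)))  ;; opcode: end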

Sound static analysis

Sound static analysis of Wasm code is vital for module-level optimizations. This proposal preserves reasoning about a module's internal behavior by making its interactions with runtime-generated code explicit. Since runtime-generated code can only access the declarations explicitly provided by its scope, analysis can soundly treat declarations mentioned in scopes similarly to exported functions. However, unlike exports, scopes cannot be accessed from outside the module, so a module's public interface is not polluted by its internal use of dynamically-generated code.

Toolchain transformations

Sound static analysis implies that toolchains making closed-world assumptions for optimization (e.g. dead-code elimination) can account for all potential future uses by dynamically-generated code simply by considering scopes. Reorganizing a module as a result of DCE remains sound, because scopes can be rewritten to use the updated indices. Since dynamically-generated code must use scoped indices, it is unaffected by renumbering within the containing module. Thus DCE and other aggressive transformations can be made sound without affecting dynamically-generated code.
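As a sketch of what this means for the example module above, suppose a hypothetical DCE pass finds that $f3 is unreachable from every export and every scope. It can then be removed and the remaining declarations renumbered without affecting future dynamically-generated code, because that code only ever sees the indices defined by $s1 (0 -> $f1, 1 -> $f2), which the toolchain keeps consistent when it rewrites the scope section:

(module
   (type $t1 (func))
   (func $f1 ...)          ;; kept: named by scope $s1
   (func $f2 ...)          ;; kept: named by scope $s1
   ;; $f3 removed: unreachable from exports and from every scope
   (memory $m1 code 1 1)
   (memory $m2 1 1)
   (scope $s1
     (func $f1 $f2)        ;; scoped indices unchanged: 0 -> $f1, 1 -> $f2
     (memory $m2))
   (func $gen ...)         ;; unchanged from the example above
)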

Engine optimizations

The possibility of dynamically-generated code implies that an engine has runtime capability to parse function bodies and perform code validation. However, the func.new bytecode doesn't require parsing and validating sections, so the runtime system doesn't need a fully-featured Wasm module parser.

New code implies the need for at least one execution tier, such as an interpreter or a compiler. This implies AOT scenarios need to either disable the feature (e.g. reject code memories or trap on func.new) or integrate a runtime tier. Note that since a scope defines a module context, the execution tier only needs to support runtime features that could legally be used under that context: no memories in scope implies no load/store instructions, and no GC types implies no GC instructions.

In the future, we could consider additional restrictions, such as an explicit list of allowable bytecodes for new functions. That would allow a module to limit the language features available to new functions and greatly reduce the runtime system requirements. Since this is a restriction a module imposes on itself, it need not be standardized. As an example of its usefulness, a module could limit its func.new capabilities to only i32 arithmetic and control flow; an AOT implementation could then provide a runtime execution tier that supports only that restricted set, keeping it both simple and optimized for the use case.
