This document describes the Wado compiler architecture and implementation status.
The compiler follows a multi-phase pipeline:
Source (.wado) → Lexer → Parser → Bind → Load → Analyze → Resolve → Synthesis → Effect Check → Stores Check → Erase Newtypes/Flags → Monomorphize → Lower → Optimize → WIR Build → WIR Optimize → Codegen
| Phase | Input | Output | Description |
|---|---|---|---|
| Lexer | Source | Tokens | Tokenize, extract __DATA__ section |
| Parser | Tokens | AST | Build abstract syntax tree |
| Bind | AST | AST (validated) | Local name resolution, scope/mutability checking |
| Load | AST | All modules | Load dependencies; each module: parse → bind → desugar |
| Analyze | All modules | Symbol table | Build symbol table, validate imports |
| Resolve | AST + Symbols | Project | Type resolution, produce Project |
| Synthesis | Project | Project | Eq/Ord/serde/CM binding/template/inspect synthesis |
| Effect Check | Project | Project | Validate function effect requirements |
| Stores Check | Project | Project | Validate reference storage declarations |
| Erase Types | Project | Project | Erase newtypes and flags to base types |
| Monomorphize | Project | Project | Instantiate generics with concrete types |
| Lower | Project | Project | Closure, i128 match, global init, string literal lowering |
| Optimize | Project | Project | Inlining, copy-prop, LICM, DCE, post-opt rewrite |
| WIR Build | Project | WirModule | Planning + TIR → WIR (Wasm IR) translation |
| WIR Optimize | WirModule | WirModule | Multi-value SROA, array data promotion, peephole |
| Codegen | WirModule | Wasm bytes | WIR emission to core Wasm + Component Model wrapping |
Note: The Desugar phase is integrated into the Load phase. Each module goes through the same frontend pipeline: lexer → parser → bind → desugar.
| Module | File | Description |
|---|---|---|
| Lexer | `lexer.rs` | Tokenizes source code, extracts `__DATA__` section |
| Parser | `parser.rs` | Recursive descent parser, builds AST |
| AST | `ast.rs` | AST node definitions, `Module::data_section()` API |
| Token | `token.rs` | Token types and spans |
| Syntax | `syntax.rs` | Syntax definitions (keywords, operators) |
| Comment | `comment.rs` | Comment collection and `CommentMap` for formatting |
| Bind | `bind.rs` | Local name binding, scope analysis, mutability check |
| Loader | `loader.rs` | Module loading, dependency resolution |
| Desugar | `desugar.rs` | AST transformations (compound assign, etc.) |
| EffectCheck | `effect_check.rs` | Validates effect requirements and stores declarations |
| Unparser | `unparse.rs` | Converts AST/TIR back to source code |
| Analyzer | `analyze.rs` | Semantic analysis, symbol table construction |
| Symbol | `symbol.rs` | Symbol table data structures |
| Name | `name.rs` | Name mangling utilities for methods and symbols |
| Resolver | `resolver.rs` | Type resolution, AST to TIR, produces Project (`resolver/`) |
| TIR | `tir.rs` | Typed Intermediate Representation |
| Synthesis | `synthesis.rs` | Unified synthesis phase (`synthesis/`) |
| SynthCommon | `synthesis/common.rs` | Shared TIR builders for synthesis phases |
| SynthSerde | `synthesis/serde_synth.rs` | Synthesized Serialize/Deserialize for structs |
| SynthTraits | `synthesis/traits.rs` | Auto-derived Eq/Ord/Display/Inspect for types |
| SynthTemplate | `synthesis/template.rs` | Template string expansion |
| SynthFrom | `synthesis/from_synth.rs` | From trait synthesis |
| SynthCmBinding | `synthesis/cm_binding.rs` | CM boundary adapter synthesis (TIR functions) |
| CmAbi | `cm_abi.rs` | Canonical ABI layout computation |
| Monomorphize | `monomorphize.rs` | Generic type/function instantiation (Project→Project) |
| Lower | `lower.rs` | Lowering coordinator (`lower/`) |
| LowerWideInt | `lower/wide_int.rs` | i128/u128 match pattern → if-else chains |
| LowerPattern | `lower/pattern.rs` | LetPattern/IfPattern → explicit statements + switch |
| LowerGlobals | `lower/globals.rs` | Global initializer extraction + `__initialize_modules` |
| LowerBoxing | `lower/boxing.rs` | `&primitive` → `Box<T>` struct lowering |
| LowerClosure | `lower/closure.rs` | Closure → functor struct with `__call` methods |
| LowerString | `lower/string.rs` | String/bytes literal collection for data section |
| Project | `project.rs` | Project: compilation context passed through pipeline |
| Optimize | `optimize.rs` | Optimization coordinator (`optimize/`) |
| ConstFolding | `optimize/const_folding.rs` | Constant folding for integer/float arithmetic |
| ConstProp | `optimize/const_propagation.rs` | Constant propagation for immutable globals |
| ConstGlobal | `optimize/const_global_promotion.rs` | Promote runtime globals to compile-time constants |
| ConstBranch | `optimize/const_branch_prune.rs` | Dead branch elimination for known-false conditions |
| DCE | `optimize/dce.rs` | Dead code elimination via reachability analysis |
| Inline | `optimize/inline.rs` | Function inlining for small, pure functions |
| RefElim | `optimize/ref_elim.rs` | Reference elimination after inlining |
| CopyProp | `optimize/copy_prop.rs` | Copy propagation for trivial bindings |
| SROA | `optimize/sroa.rs` | Scalar replacement of aggregates (struct/tuple elim) |
| LICM | `optimize/licm.rs` | Loop-invariant code motion |
| SelectLower | `optimize/select_lowering.rs` | if/else → Wasm `select` instruction |
| FieldScalarize | `optimize/field_scalarize.rs` | Hot field scalarization from GC structs |
| BlockFusion | `optimize/labeled_block_fusion.rs` | Labeled block fusion |
| StoreLoadFwd | `optimize/store_load_forward.rs` | Store-load forwarding for literal values |
| CondImplication | `optimize/condition_implication.rs` | Condition implication from dominating guards |
| TmplHoist | `optimize/tmpl_hoist.rs` | Template buffer hoisting out of loops |
| ComponentPlan | `wir_build/component_plan.rs` | ComponentPlan types and `build_component_plan` |
| Stdlib | `stdlib.rs` | Embedded core library sources |
| CompilerHost | `compiler_host.rs` | I/O abstraction for the compiler |
| Logger | `logger.rs` | Diagnostic logging with timestamps |
| ComponentModel | `component_model.rs` | WASI import registry and CM ABI type support |
| BuiltinRegistry | `builtin_registry.rs` | Builtin function registry from `core:builtin` |
| Doc | `doc.rs` | Documentation generation from AST |
| HashMap | `hashmap.rs` | Deterministic IndexMap/IndexSet type aliases |
| TirVisitor | `tir_visitor.rs` | Generic visitor traits for TIR tree traversal |
| WorldRegistry | `world_registry.rs` | World definitions registry for export signatures |
| WIR | `wir.rs` | Wasm IR data structures |
| WIR Unparse | `wir_unparse.rs` | WIR → pseudo-Wado source code for debugging |
| WIR Build | `wir_build.rs` | Planning + TIR→WIR translation (`wir_build/`) |
| WIR Optimize | `wir_optimize.rs` | WIR-level optimizations (multi-value SROA, etc.) |
| Codegen | `codegen.rs` | WIR→Wasm emission + Component Model wrapping (`codegen/`) |
| Bundled | `bundled.rs` | Loads pre-compiled Wasm builtins (wado-bundled-libm) |
The parser preserves source syntax literally to enable accurate formatting via the unparser. Syntactic sugar is transformed in the desugar pass, which runs during module loading (not as a separate top-level phase).
| Construct | Parser Output | Desugar Output |
|---|---|---|
| `x += y` | `CompoundAssignExpr` | `AssignExpr` with `BinaryExpr` |
| `a < b < c` | `ComparisonChainExpr` | `BinaryExpr` chain with `&&` |
| `&self` | `Param` with `SelfKind` | (preserved, handled in codegen) |
| `{ x }` (struct field) | `is_shorthand: true` | (preserved for formatting) |
This separation ensures:
- `wado format` outputs the original syntax (e.g., `x += 1`, not `x = x + 1`)
- Codegen receives simplified AST without syntactic variants
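The compound-assignment row of the table can be illustrated with a minimal Rust sketch. The `Expr` enum and the `desugar` function below are hypothetical stand-ins invented for this example; the real node definitions live in `ast.rs` and the real pass in `desugar.rs`, and both are likely richer than this.

```rust
// Hypothetical, simplified AST used only for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Int(i64),
    Binary { op: char, lhs: Box<Expr>, rhs: Box<Expr> },
    Assign { target: Box<Expr>, value: Box<Expr> },
    CompoundAssign { op: char, target: Box<Expr>, value: Box<Expr> },
}

/// Rewrites `target op= value` into `target = target op value`.
/// Any other expression is returned unchanged; a real pass would
/// also recurse into sub-expressions.
fn desugar(expr: Expr) -> Expr {
    match expr {
        Expr::CompoundAssign { op, target, value } => Expr::Assign {
            target: target.clone(),
            value: Box::new(Expr::Binary { op, lhs: target, rhs: value }),
        },
        other => other,
    }
}
```

Because the parser keeps `CompoundAssign` as its own node, a formatter working on the pre-desugar tree can still print `x += 1`, while later phases only ever see `Assign`.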
The unparse.rs module also provides a TIR unparser that converts Typed IR back to pseudo-Wado source code. This is useful for debugging the monomorphization and lowering phases.
Usage:
wado dump --tir-resolved file.wado # Show TIR before monomorphization
wado dump --tir-lowered file.wado # Show TIR after monomorphization and lowering
wado dump --tir file.wado # Show final TIR (after optimization)

Output Characteristics:
- `--tir-resolved`: Shows generic types as-is (e.g., `Box<T>`)
- `--tir-lowered`: Shows monomorphized type names (e.g., `Box$i32` instead of `Box<T>`)
- Includes fully qualified function calls (e.g., `core::cli::println`)
- Preserves the `__DATA__` section if present
- Output is pseudo-Wado (not compilable due to mangled names)
Example Output:
struct Box$i32 {
value: i32,
}
fn run() with Stdout {
let b: Box$i32 = Box$i32 { value: 42 };
core::cli::println(core::internal::string_concat("value: ", b.value.to_string()));
}
The wado-bundled-libm crate provides pre-compiled Wasm math functions (deterministic libm). These are statically linked into the generated component.
Location: wado-bundled-libm/ (compiles to wasm32-unknown-unknown)
Float-to-string formatting was previously a bundled Wasm module but is now implemented in pure Wado (core:prelude/fpfmt.wado).
The monomorphize.rs module is a dedicated compilation phase that instantiates generic structs and functions with concrete types. It runs after type resolution and before the lower phase.
Process:
- Collect generic definitions: Gather all generic struct and function definitions from all modules
- Find instantiation sites: Scan for `GenericInstance` types and generic function calls
- Instantiate structs: Create concrete struct definitions with substituted field types
- Instantiate functions: Create concrete function definitions with substituted types
- Rewrite types: Replace all `GenericInstance` type references with concrete struct types
- Rewrite calls: Replace generic function calls with calls to monomorphized functions
- Transitive instantiation: Iteratively process new instantiations created during monomorphization
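The transitive step is essentially a worklist loop: instantiating one generic struct can reveal further instantiations inside its substituted field types. The Rust sketch below shows that idea only for structs. `Ty`, `GenericStruct`, `substitute`, and `collect_instantiations` are hypothetical names invented for this example and do not reflect the actual data structures in `monomorphize.rs`.

```rust
use std::collections::{BTreeSet, VecDeque};

// Hypothetical, simplified type representation (not the compiler's TIR types).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Ty {
    Prim(&'static str),
    Param(&'static str),
    Generic(&'static str, Vec<Ty>),
}

// A generic struct definition: name, type parameters, field types.
struct GenericStruct {
    name: &'static str,
    params: Vec<&'static str>,
    fields: Vec<Ty>,
}

/// Replaces each type parameter in `ty` with the matching concrete argument.
fn substitute(ty: &Ty, params: &[&'static str], args: &[Ty]) -> Ty {
    match ty {
        Ty::Prim(_) => ty.clone(),
        Ty::Param(p) => {
            let idx = params.iter().position(|q| q == p).expect("unknown type parameter");
            args[idx].clone()
        }
        Ty::Generic(name, inner) => {
            Ty::Generic(*name, inner.iter().map(|t| substitute(t, params, args)).collect())
        }
    }
}

/// Collects every concrete generic instance reachable from `roots`,
/// including ones that only appear inside substituted field types.
fn collect_instantiations(defs: &[GenericStruct], roots: Vec<Ty>) -> BTreeSet<Ty> {
    let mut done = BTreeSet::new();
    let mut queue: VecDeque<Ty> = roots.into();
    while let Some(ty) = queue.pop_front() {
        if let Ty::Generic(name, args) = &ty {
            if !done.insert(ty.clone()) {
                continue; // already instantiated
            }
            // Type arguments may themselves be generic instances.
            queue.extend(args.iter().cloned());
            if let Some(def) = defs.iter().find(|d| d.name == *name) {
                for field in &def.fields {
                    queue.push_back(substitute(field, &def.params, args));
                }
            }
        }
    }
    done
}
```

With `Wrapper<T> { inner: Box<T> }` as a definition, requesting `Wrapper<i32>` also yields `Box<i32>`, even though no source-level use of `Box<i32>` exists. The sketch does not guard against unbounded polymorphic recursion.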
Cross-Module Support:
The monomorphizer supports cross-module generic function instantiation. Generic functions defined in one module (e.g., Array methods from prelude) can be instantiated when used in another module. This is achieved by collecting all generic functions from all modules before processing.
Name Mangling:
// Struct types
Box<i32> → Box$i32
Pair<i32, String> → Pair$i32$String
Box<Box<i32>> → Box$Box$i32
// Generic functions (suffix is unique instantiation ID)
identity::<i32> → identity$1
identity::<i64> → identity$2
// Generic methods
Container::transform::<i32, i64> → Container::transform$1
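The struct-name scheme above can be reproduced with a short recursive function. This is a sketch under assumptions: `TypeExpr` and `mangle_struct_name` are hypothetical names chosen for this example, and the numeric instantiation IDs used for generic functions are not modeled.

```rust
// Hypothetical type tree; the compiler's own representation may differ.
enum TypeExpr {
    Named(String),
    Generic(String, Vec<TypeExpr>),
}

/// Joins the base name and each recursively mangled type argument with `$`.
fn mangle_struct_name(ty: &TypeExpr) -> String {
    match ty {
        TypeExpr::Named(name) => name.clone(),
        TypeExpr::Generic(base, args) => {
            let mut out = base.clone();
            for arg in args {
                out.push('$');
                out.push_str(&mangle_struct_name(arg));
            }
            out
        }
    }
}
```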
Variadic Type Packs:
During monomorphization, TirExprKind::TupleSpread nodes (generated by the resolver for [..expr] inside variadic functions) are expanded into individual FieldAccess elements. The tuple literal's type is rebuilt from the expanded element types. Non-variadic spread expressions (concrete tuples) are expanded at resolve time; the resolver introduces temporary let-bindings for non-trivial expressions to ensure single evaluation.
See optimizer.md.
Embedded .wado files in wado-compiler/lib/:
Core Library (core/):
| Module | File | Description |
|---|---|---|
| `core:prelude` | `prelude.wado` | Auto-imported re-exports from prelude sub-modules |
| `core:prelude/traits.wado` | `prelude/traits.wado` | Trait definitions (Eq, Ord, Iterator, etc.) |
| `core:prelude/types.wado` | `prelude/types.wado` | Core types (Option, Result, Stream, Future) |
| `core:prelude/string.wado` | `prelude/string.wado` | String type and string iterators |
| `core:prelude/array.wado` | `prelude/array.wado` | Array type and array iterators |
| `core:prelude/int128.wado` | `prelude/int128.wado` | u128/i128 types (re-exported from prelude) |
| `core:prelude/primitive.wado` | `prelude/primitive.wado` | Primitive type trait implementations |
| `core:prelude/format.wado` | `prelude/format.wado` | Format traits (Display, Formatter) |
| `core:prelude/fpfmt.wado` | `prelude/fpfmt.wado` | Float-to-string formatting (pure Wado) |
| `core:prelude/tuple.wado` | `prelude/tuple.wado` | Tuple trait implementations |
| `core:cli` | `cli.wado` | CLI output (println, eprintln, etc.) |
| `core:collections` | `collections.wado` | TreeMap and other collections |
| `core:serde` | `serde.wado` | Serialization/deserialization traits |
| `core:json` | `json.wado` | JSON serialization/deserialization |
| `core:json_nsd` | `json_nsd.wado` | JSON non-self-describing deserializer |
| `core:json_value` | `json_value.wado` | JSON value type representation |
| `core:simd` | `simd.wado` | SIMD v128 operations |
| `core:zlib` | `zlib.wado` | Compression (zlib/deflate) |
| `core:base64` | `base64.wado` | Base64 encoding/decoding (RFC 4648) |
| `core:internal` | `internal.wado` | Compiler-generated code support, panic/unreachable |
| `core:builtin` | `builtin.wado` | Compiler intrinsics with `#[canonical(...)]` attrs |
WASI Library (wasi/):
| Module | File | Description |
|---|---|---|
| `wasi:cli` | `cli.wado` | CLI interfaces |
| `wasi:clocks` | `clocks.wado` | Clock interfaces |
| `wasi:filesystem` | `filesystem.wado` | FS interfaces |
| `wasi:http` | `http.wado` | HTTP interfaces |
| `wasi:random` | `random.wado` | Random interfaces |
| `wasi:sockets` | `sockets.wado` | Socket interfaces |
Stdlib tests are co-located with their source as *_test.wado files (e.g., lib/core/zlib_test.wado). They use Wado's test declaration syntax and run via wado test:
mise run test-wado # runs all *_test.wado files
cargo run --bin wado -- test wado-compiler/lib/core/zlib_test.wado # run one file

Test names can contain any characters (parentheses, dashes, etc.); the compiler sanitizes them into valid kebab-case CM export names.
The #[expect_trap] and #[TODO] attributes mark tests as expected to trap. The compiler encodes this in the export name prefix:
test-0-simple # normal test export
test-trap-1-panics # #[expect_trap]: passes when body traps
test-todo-2-wip # #[TODO]: passes when body traps; distinct failure message when it doesn't
test-tm5000-0-slow # #[timeout_ms(5000)]: custom timeout in ms
test-trap-tm500-1-panic # combined: expect_trap + custom timeout
The tm{N} segment encodes a custom timeout in milliseconds (default: 1000ms). It appears after the test-/test-trap-/test-todo- prefix.
Both wado test and the e2e test runner detect these prefixes and handle pass/fail accordingly.
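A test runner has to decode these export names before it can decide pass or fail. The Rust sketch below parses the layout shown above (`test-`, an optional `trap-` or `todo-` marker, an optional `tm{N}-` segment, then `{index}-{name}`). `TestKind`, `TestExport`, and `parse_test_export` are hypothetical names for this example; the real detection logic in `wado test` and the e2e runner may be organized differently.

```rust
#[derive(Debug, PartialEq)]
enum TestKind {
    Normal,
    ExpectTrap,
    Todo,
}

#[derive(Debug, PartialEq)]
struct TestExport {
    kind: TestKind,
    timeout_ms: u64,
    index: u32,
    name: String,
}

// Default timeout stated in the text above.
const DEFAULT_TIMEOUT_MS: u64 = 1000;

/// Parses an export name such as `test-trap-tm500-1-panic`.
/// Returns `None` when the name does not follow the test layout.
fn parse_test_export(export: &str) -> Option<TestExport> {
    let mut rest = export.strip_prefix("test-")?;
    let kind = if let Some(r) = rest.strip_prefix("trap-") {
        rest = r;
        TestKind::ExpectTrap
    } else if let Some(r) = rest.strip_prefix("todo-") {
        rest = r;
        TestKind::Todo
    } else {
        TestKind::Normal
    };
    // Optional `tm{N}-` segment; the index that follows always starts with
    // a digit, so the `tm` prefix is unambiguous.
    let mut timeout_ms = DEFAULT_TIMEOUT_MS;
    if let Some(r) = rest.strip_prefix("tm") {
        let (digits, tail) = r.split_once('-')?;
        timeout_ms = digits.parse().ok()?;
        rest = tail;
    }
    let (index, name) = rest.split_once('-')?;
    Some(TestExport {
        kind,
        timeout_ms,
        index: index.parse().ok()?,
        name: name.to_string(),
    })
}
```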
Test functions use the same async wrapper as run(), ensuring compatibility with WASI P3's async model. Each test properly completes its async task before reporting results.
The WasiRegistry module (component_model.rs) collects WASI import information from lib/wasi/*.wado files and provides it to the code generator for dynamic Component Model generation.
Purpose:
- Extract WASI version strings from `#[wasi(...)]` attributes (e.g., `0.3.0-rc-2025-09-16`)
- Map effect methods to function names using a unified naming scheme
- Track which WASI interfaces are used for conditional import generation
Naming Convention:
The registry uses a unified naming scheme across both component-level and core module-level code:
| Format | Example |
|---|---|
| `wasi:{package}/{EffectName}::{method_name}` | `wasi:cli/Stdout::write_via_stream` |
This naming scheme:
- Uses `wasi:` prefix for clarity
- Includes package for uniqueness across packages (e.g., `cli`, `clocks`)
- Uses Wado effect/method names (not WIT interface/function names)
- Uses `::` as method separator (Wado convention)
The registry provides the `build_local_alias_name()` utility function and the `resolve()` method for name resolution.
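As an illustration of the naming format, the following Rust sketch builds and splits an alias string. The three-argument shape of `build_local_alias_name` follows the call shown elsewhere in this document, but the body here is a stand-in written for this example, and `parse_local_alias_name` is a hypothetical helper that is not claimed to exist in the registry.

```rust
/// Formats `wasi:{package}/{EffectName}::{method_name}`.
/// Stand-in body for illustration; the real function may differ.
fn build_local_alias_name(package: &str, effect: &str, method: &str) -> String {
    format!("wasi:{}/{}::{}", package, effect, method)
}

/// Hypothetical inverse: splits an alias into (package, effect, method).
/// Returns `None` if the string does not match the format.
fn parse_local_alias_name(alias: &str) -> Option<(&str, &str, &str)> {
    let rest = alias.strip_prefix("wasi:")?;
    let (package, rest) = rest.split_once('/')?;
    let (effect, method) = rest.split_once("::")?;
    Some((package, effect, method))
}
```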
What's Dynamic (from registry):
| Item | Example |
|---|---|
| Version strings | wasi:cli/stdout@0.3.0-rc-2025-09-16 |
| Import paths | Built via format!("wasi:cli/stdout@{}", cli_version) |
| Function async flag | is_async from effect method definition |
| Interface presence | has_interface("monotonic-clock") for conditional codegen |
| Local alias names | build_local_alias_name("cli", "Stdout", "write_via_stream") |
| WASI type resolution | Instant → u64, Duration → u64 resolved from wasi/*.wado |
| Function signatures | Params and return types parsed from effect methods |
| Supported interfaces | Dynamically filtered based on type support |
Dynamic Interface Filtering:
Instead of a hardcoded whitelist, interfaces are included based on type support:
- Only interfaces where ALL functions have supported types are imported
- Supported param types: primitives (`i32`, `u64`, `bool`, `char`, `String`, etc.), `Stream<T>`
- Supported return types: same as params plus `Result<T, E>`
- WASI newtypes are resolved to base types before filtering (e.g., `Instant` → `u64`)
- The "run" interface is skipped (it defines exports, not imports; needed for Command world)
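The filtering rules above amount to an all-functions-supported predicate per interface. The Rust sketch below is a simplified model: `WasiType`, `Function`, `Interface`, and `importable` are hypothetical names, the `Resource` variant stands in for any type the filter rejects, and `Stream<T>` and `Result<T, E>` are treated as supported regardless of their type arguments, which is an assumption made to keep the example short.

```rust
// Hypothetical model of types appearing in effect method signatures.
#[derive(Debug, Clone, PartialEq)]
enum WasiType {
    Primitive(&'static str),                // i32, u64, bool, char, String, ...
    Stream(Box<WasiType>),                  // Stream<T>
    Result(Box<WasiType>, Box<WasiType>),   // Result<T, E>
    Newtype(&'static str, Box<WasiType>),   // e.g. Instant over u64
    Resource(&'static str),                 // stand-in for an unsupported type
}

struct Function {
    params: Vec<WasiType>,
    ret: Option<WasiType>,
}

struct Interface {
    name: &'static str,
    functions: Vec<Function>,
}

/// Resolves newtypes to their base type before any support check.
fn resolve(ty: &WasiType) -> &WasiType {
    match ty {
        WasiType::Newtype(_, base) => resolve(base),
        other => other,
    }
}

fn param_supported(ty: &WasiType) -> bool {
    matches!(resolve(ty), WasiType::Primitive(_) | WasiType::Stream(_))
}

fn return_supported(ty: &WasiType) -> bool {
    param_supported(ty) || matches!(resolve(ty), WasiType::Result(_, _))
}

/// Keeps interfaces whose functions all use supported types; skips "run".
fn importable(interfaces: &[Interface]) -> Vec<&'static str> {
    interfaces
        .iter()
        .filter(|iface| iface.name != "run")
        .filter(|iface| {
            iface.functions.iter().all(|f| {
                f.params.iter().all(param_supported)
                    && f.ret.as_ref().map_or(true, return_supported)
            })
        })
        .map(|iface| iface.name)
        .collect()
}
```

A single unsupported parameter anywhere in an interface drops the whole interface, which is what makes a separate whitelist unnecessary.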
What's Still Hardcoded (TODO):
| Item | Location | Reason |
|---|---|---|
| `error-code` enum variants | `["io", "illegal-byte-sequence", "pipe"]` | Registry only tracks effect functions |
Future Work:
To fully eliminate hardcoded CM structures, the registry would need to:
- Track WASI types (enums, resources) in addition to effect functions
- Parse enum variants from `#[wasi(...)]` annotated enums in `wasi/*.wado`
- Generate CM type definitions dynamically from parsed definitions
Wado HTTP handlers use export async fn to opt into the Component Model async calling convention. The async modifier is significant — it changes the entire adapter generation strategy.
Without async, the compiler generates a synchronous CM export adapter: it calls the user function, receives the return value, lowers it to flat CM ABI values, and returns them to the CM runtime. The function lifetime is tied to the return value.
For HTTP handlers, the return type is Result<Response, ErrorCode>. A Response contains a FutureWritable<Result<Option<Trailers>, ErrorCode>> — a writable future handle that the caller must fulfill after the response headers are sent. With a sync adapter, the function would return before the trailers future is resolved, and there would be no opportunity to write to it.
With async, the CM runtime allows the function to remain alive after delivering its result. The adapter generated for export async fn has two key differences:
- The Wasm-level function signature uses the async calling convention: flat params with no outptr, and the function returns nothing (result delivery is via `task.return`).
- The adapter only lifts the incoming parameters, then calls the user function directly; it does not handle the return value. The user's `task return` statement inside the function body drives result delivery.
// Synchronous (sync adapter wraps return):
export fn get_version() -> String { return "1.0"; }
// Async (task return drives delivery; function can continue after):
export async fn handle(request: Request) -> Result<Response, ErrorCode> {
// ...build response with trailers future...
task return Result::<Response, ErrorCode>::Ok(response);
// function continues here; fulfills trailers future
trailers_tx.write(Ok(null));
}
task return expr; is a statement that calls the CM task.return instruction. It delivers the function's result to the CM runtime without ending the function.
Rationale: Regular return terminates the Wasm function. If an HTTP handler used return response, the function would exit before it could fulfill any outstanding futures (e.g., trailers). task return separates result delivery from function termination, keeping the function alive so it can perform cleanup and fulfill futures.
Type checking: The task return expression is type-checked against the declared return type of the surrounding export async fn. Regular return is forbidden in async function bodies — using it would terminate the Wasm function without notifying the CM runtime.
CM Binding expansion: During the CM Binding phase, task return expr is expanded in-place to a sequence of TIR that:
- Lowers the Wado value to flat CM ABI values (using `synthesize_lower_to_flat`)
- Calls `builtin::task_return(0, flat0, flat1, ...)`, where the `0` is the Ok discriminant
This expansion is performed by expand_task_returns_in_func in cm_binding_gen.rs, which walks the function body and replaces each TirStmtKind::TaskReturn with the expanded sequence.
The BuiltinRegistry module (builtin_registry.rs) collects function signatures from lib/core/builtin.wado and provides type information for code generation.
The #[canonical("...")] Attribute:
Builtins in builtin.wado are divided into two categories:
- Canonical builtins - Functions with the `#[canonical("namespace", "name")]` attribute are imported as Component Model canonical built-ins
- Instruction builtins - Functions without the attribute compile directly to Wasm instructions
// Canonical builtin - imported as CM function "stream-new"
#[canonical("wasi", "stream-new")]
fn stream_new() -> i64;
// Instruction builtin - compiles to Wasm i32.and instruction
fn i32_and(a: i32, b: i32) -> i32;
Canonical Builtins:
Builtins with #[canonical("namespace", "name")] are imported as CM canonical built-ins. The namespace determines the import source: "wasi" for CM canonical builtins, "mem" for memory operations, "bundled" for wado-bundled-libm.
| Wado Name | Namespace | Canonical Name | Category |
|---|---|---|---|
| `stream_new` | `wasi` | `stream-new` | Stream |
| `stream_read` | `wasi` | `stream-read` | Stream |
| `stream_write` | `wasi` | `stream-write` | Stream |
| `stream_drop_writable` | `wasi` | `stream-drop-writable` | Stream |
| `stream_drop_readable` | `wasi` | `stream-drop-readable` | Stream |
| `future_new` | `wasi` | `future-new` | Future |
| `future_write` | `wasi` | `future-write` | Future |
| `future_drop_writable` | `wasi` | `future-drop-writable` | Future |
| `future_drop_readable` | `wasi` | `future-drop-readable` | Future |
| `task_return` | `wasi` | `task-return` | Async task |
| `waitable_set_new` | `wasi` | `waitable-set-new` | Async task |
| `waitable_join` | `wasi` | `waitable-join` | Async task |
| `waitable_set_wait` | `wasi` | `waitable-set-wait` | Async task |
| `subtask_drop` | `wasi` | `subtask-drop` | Async task |
| `realloc` | `mem` | `realloc` | Memory |
| `libm_sin`, etc. | `bundled` | `libm_sin`, etc. | Math (libm) |
Instruction Builtins:
Functions without the #[canonical] attribute compile directly to Wasm instructions (e.g., i32_and → i32.and, f64_sqrt → f64.sqrt). See lib/core/builtin.wado for the full list. Categories include: i32/i64 ops, array ops, linear memory ops, float math, wide arithmetic (i128), reinterpret casts, and control flow.
Registry Usage:
The BuiltinRegistry is used by both codegen and resolver:
- Codegen: Uses the registry to look up canonical names for imported builtins
- Resolver: Uses the registry to look up return types for builtin function calls, eliminating the need for hardcoded type mappings
The WorldRegistry module (world_registry.rs) collects world definitions from lib/wasi/*.wado and provides export signature information for code generation.
Purpose:
- Extract world definitions (e.g., `Command` world from `wasi/cli.wado`)
- Provide export function signatures for component generation
- Derive the `run` function signature from world exports instead of hardcoding
Usage:
// Get the run export signature from Command world
if let Some(run_export) = world_registry.get_export("Command", "run") {
let params = world_export_to_core_params(run_export);
let results = world_export_to_core_results(run_export);
}

The name.rs module centralizes all naming and mangling logic for the compiler. It provides utilities for building and parsing mangled names for methods, effect operations, and module-qualified symbols.
Naming Conventions:
| Name Type | Format | Example |
|---|---|---|
| Simple method | `{struct_name}::{method_name}` | `Point::sum` |
| Full method | `{filename}/{struct_name}::{method_name}` | `./geometry.wado/Point::sum` |
| Trait method | `{filename}/{struct_name}^{trait_name}::{method_name}` | `./geometry.wado/Point^Display::fmt` |
| Effect operation | `{effect_name}::{operation_name}` | `Stdout::write_via_stream` |
| WASI qualified | `wasi:{package}/{interface}::{function}` | `wasi:cli/stdout::write-via-stream` |
| Module-qualified struct | `{module_path}::{struct_name}` | `./geometry.wado::Point` |
| Core internal | `core::internal::{name}` | `core::internal::log_stdout` |
Utility Functions:
| Function | Description | Example |
|---|---|---|
| `mangle_generic_name` | Build monomorphized type name | `("Box", ["i32"])` → `"Box<i32>"` |
| `strip_type_params` | Extract base name from generic | `"IndexValue<i32>"` → `"IndexValue"` |
| `extract_local_name` | Strip module path prefix | `"./main.wado/Point"` → `"Point"` |
The ModuleSource enum in name.rs provides a structured representation of where a module comes from.
pub enum ModuleSource {
Core { name: String }, // core:prelude, core:cli, etc.
Wasi { interface: String }, // wasi:cli, wasi:io, etc.
Local { path: String }, // ./geometry.wado, ../lib.wado
EntryPoint, // The main entry module
}

The name.rs module also provides path canonicalization utilities to ensure the same file imported via different paths resolves to the same module identity.
Design:
- Uses URI path normalization (RFC 3986)
- Always uses `/` separator (platform-agnostic, even on Windows)
- Canonical paths are project-root-relative (prefixed with `./`)
- Special prefixes (`core:`, `wasi:`, `http://`, `https://`) pass through unchanged
Examples:
| Input Path | Canonical Output |
|---|---|
| `./geometry.wado` | `./geometry.wado` |
| `./sub/../geometry.wado` | `./geometry.wado` |
| `./sub/./file.wado` | `./sub/file.wado` |
| `core:cli` | `core:cli` |
| `http://localhost:8080/lib.wado` | `http://localhost:8080/lib.wado` |
Relative Import Resolution:
When resolving relative imports, the path is resolved against the importing module's path:
| From Module | Import Source | Resolved Path |
|---|---|---|
| `./main.wado` | `./geometry.wado` | `./geometry.wado` |
| `./sub/main.wado` | `./utils.wado` | `./sub/utils.wado` |
| `./sub/main.wado` | `../lib.wado` | `./lib.wado` |
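Both tables can be reproduced with a small dot-segment normalizer. The Rust sketch below is an approximation for illustration, not the code in `name.rs`: `canonicalize` and `resolve_import` are hypothetical names, and a `..` that would climb above the project root is silently clamped here, whereas the real implementation may report an error instead.

```rust
// Prefixes that pass through unchanged, as listed in the design notes.
const PASSTHROUGH_PREFIXES: [&str; 4] = ["core:", "wasi:", "http://", "https://"];

fn is_passthrough(path: &str) -> bool {
    PASSTHROUGH_PREFIXES.iter().any(|p| path.starts_with(*p))
}

/// Removes `.` and `..` segments and returns a `./`-prefixed path.
fn canonicalize(path: &str) -> String {
    if is_passthrough(path) {
        return path.to_string();
    }
    let mut segments: Vec<&str> = Vec::new();
    for segment in path.split('/') {
        match segment {
            "" | "." => {}
            ".." => {
                // Clamp at the project root (assumption made for this sketch).
                segments.pop();
            }
            other => segments.push(other),
        }
    }
    format!("./{}", segments.join("/"))
}

/// Resolves `import` against the directory of the importing module.
fn resolve_import(from_module: &str, import: &str) -> String {
    if is_passthrough(import) {
        return import.to_string();
    }
    let dir = match from_module.rfind('/') {
        Some(idx) => &from_module[..idx],
        None => ".",
    };
    canonicalize(&format!("{}/{}", dir, import))
}
```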
Validation:
The analyzer validates module paths before loading to provide better error messages for invalid paths. Paths must be valid URI references per RFC 3986.
The module loader loads all modules and applies the frontend pipeline to each:
Frontend Pipeline (per module):
- Lexer: Source → Tokens
- Parser: Tokens → AST
- Bind: Validate local scopes, detect use-before-define and duplicate definitions
- Desugar: Transform syntactic sugar (compound assignment, comparison chains, loops)
Resolution Rules (based on ModuleSource, see ModuleSource):
- `core:*` → `ModuleSource::Core` → embedded stdlib
- `wasi:*` → `ModuleSource::Wasi` → embedded stdlib
- `http://` or `https://` → `ModuleSource::Remote` → `host.load_remote()`
- `./` or `../` → `ModuleSource::Local` → `host.load_source()`
- Unknown `xxx:` → Error: `unknown module namespace`
- Other → Error: `invalid module path`
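The resolution rules form a prefix dispatch that is checked in order. The Rust sketch below mirrors that order with a payload-free `SourceKind` enum, which is a hypothetical simplification of `ModuleSource` made for this example; the `classify` function and the rule that any remaining `:` marks an unknown namespace are likewise assumptions.

```rust
// Simplified stand-in for ModuleSource, without payloads.
#[derive(Debug, PartialEq)]
enum SourceKind {
    Core,
    Wasi,
    Remote,
    Local,
}

/// Classifies an import specifier, checking prefixes in the documented order.
fn classify(spec: &str) -> Result<SourceKind, &'static str> {
    if spec.starts_with("core:") {
        Ok(SourceKind::Core)
    } else if spec.starts_with("wasi:") {
        Ok(SourceKind::Wasi)
    } else if spec.starts_with("http://") || spec.starts_with("https://") {
        Ok(SourceKind::Remote)
    } else if spec.starts_with("./") || spec.starts_with("../") {
        Ok(SourceKind::Local)
    } else if spec.contains(':') {
        // Assumption: any other `xxx:` form counts as an unknown namespace.
        Err("unknown module namespace")
    } else {
        Err("invalid module path")
    }
}
```

The remote check has to come before the generic `:` check, since `http://` also contains a colon.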
Wado traits use static dispatch (also known as "static resolution" or "monomorphization"). All trait method calls are resolved at compile time to concrete implementations. There is no runtime vtable or dynamic dispatch.
How It Works:
- When a trait method is called (e.g., `person.greet()`), the resolver looks up the concrete type of the receiver
- The resolver finds the matching `impl Trait for Type` block
- The method call is lowered to a static function call with a mangled name: `Type^Trait::method`
Example Lowering:
// Source code
trait Greet {
fn greet(&self) -> String;
}
impl Greet for Person {
fn greet(&self) -> String {
return `Hello, {self.name}!`;
}
}
let p = Person { name: "Alice" };
println(p.greet());
// Lowered TIR (pseudo-Wado)
fn "Person^Greet::greet"(self: Person) -> String {
return core::internal::string_concat("Hello, ", self.name, "!");
}
let p = Person { name: "Alice" };
println("Person^Greet::greet"(p));
Static Trait Method Calls (no &self):
Traits can define static methods (no self parameter). These are called using Type::method() syntax:
trait Deserialize {
fn deserialize<D: Deserializer>(d: &mut D) -> Result<Self, Error>;
}
impl Deserialize for i32 {
fn deserialize<D: Deserializer>(d: &mut D) -> Result<i32, Error> { ... }
}
// Call site: Type::method::<TypeArg>(args)
let result = i32::deserialize::<JsonDeserializer>(&mut d);
The resolver uses find_static_method_trait to detect when a static call targets a trait method and produces the mangled name i32^Deserialize::deserialize. Method-level type arguments (e.g., <JsonDeserializer>) generate monomorph_info for the monomorphizer to create a concrete instantiation.
Method Resolution Priority:
- Inherent methods (methods in `impl Type { }`) take priority over trait methods
- Trait methods (methods in `impl Trait for Type { }`) are used when no inherent method matches
- If multiple traits define the same method name, it's currently a compile error (disambiguation syntax not yet implemented)
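The priority order can be sketched as a two-stage lookup. In the Rust example below, `TypeMethods`, `Resolution`, and `resolve_method` are hypothetical names created for illustration (the resolver's actual tables and function signatures are not shown in this document); the mangled strings follow the simple-method and `Type^Trait::method` formats described earlier.

```rust
#[derive(Debug, PartialEq)]
enum Resolution {
    Inherent(String),
    Trait(String),
    Ambiguous(Vec<String>),
    NotFound,
}

// Hypothetical per-type method table.
struct TypeMethods<'a> {
    type_name: &'a str,
    inherent: Vec<&'a str>,
    trait_impls: Vec<(&'a str, Vec<&'a str>)>, // (trait name, method names)
}

/// Inherent methods win; otherwise exactly one trait must provide the method.
fn resolve_method(ty: &TypeMethods, method: &str) -> Resolution {
    if ty.inherent.iter().any(|m| *m == method) {
        return Resolution::Inherent(format!("{}::{}", ty.type_name, method));
    }
    let candidates: Vec<&str> = ty
        .trait_impls
        .iter()
        .filter(|(_, methods)| methods.iter().any(|m| *m == method))
        .map(|(name, _)| *name)
        .collect();
    match candidates.as_slice() {
        [] => Resolution::NotFound,
        [one] => Resolution::Trait(format!("{}^{}::{}", ty.type_name, one, method)),
        many => Resolution::Ambiguous(many.iter().map(|s| s.to_string()).collect()),
    }
}
```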
Advantages of Static Dispatch:
- Zero runtime overhead: No vtable lookup
- Inlining possible: Optimizer can inline trait methods
- Dead code elimination: Unused trait implementations are removed
Orphan rule checking runs inside TraitEnv::build() in resolver/trait_env.rs, immediately before per-module resolution begins. At that point all modules (stdlib + user files) are loaded, so the full set of trait and type declarations is available.
Phase placement:
LoadModules → TraitEnv::build() ← orphan check here
│
├── build impl_index / decl_index / blanket_impl_index
├── build type_decl_index (struct/variant/enum/flags/newtype → ModuleSource)
└── check_all_orphan_rules() → Vec<TypeError::OrphanViolation>
│
└── emitted via logger before per-module resolution starts
Implementation:
TraitEnv::build() returns (Arc<TraitEnv>, Vec<TypeError>). The caller (resolve_all_modules in orchestration.rs) emits each OrphanViolation through the logger immediately after the call.
The check skips any impl block whose containing module is Core, Wasi, or Remote — the standard library and remote packages are trusted to write any impl they need. Only EntryPoint and Local modules are checked.
For each local impl block with a foreign trait, check_orphan_rfc2451 walks the sequence [self_type, trait_arg_1, …] left-to-right and classifies each position via classify_position:
| `PositionKind` | Meaning |
|---|---|
| `LocalType` | Outermost type constructor is defined in a local module |
| `ForeignType` | Outermost type constructor is foreign (or a tuple / function type) |
| `UncoveredTypeParam` | The position is a bare `impl<T>` type parameter |
&T and &mut T are fundamental: classify_position recurses into the inner type, so &LocalType yields LocalType.
The sequence walk returns true (allowed) as soon as a LocalType is found with no UncoveredTypeParam seen at any earlier position. If an UncoveredTypeParam is reached before any LocalType, or the sequence is exhausted without finding a LocalType, the check returns false and an OrphanViolation error is produced.
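The walk described above fits in a few lines. The Rust sketch below reuses the `PositionKind` and `classify_position` names from the text, but the `Ty` enum, the function bodies, and the `orphan_check` name are assumptions made for this example rather than the code in `resolver/trait_env.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum PositionKind {
    LocalType,
    ForeignType,
    UncoveredTypeParam,
}

// Hypothetical type shape, just enough to show the classification.
enum Ty {
    Local(&'static str),
    Foreign(&'static str),
    Param(&'static str),
    Ref(Box<Ty>), // &T and &mut T are fundamental
    Tuple(Vec<Ty>),
}

fn classify_position(ty: &Ty) -> PositionKind {
    match ty {
        Ty::Local(_) => PositionKind::LocalType,
        Ty::Foreign(_) | Ty::Tuple(_) => PositionKind::ForeignType,
        Ty::Param(_) => PositionKind::UncoveredTypeParam,
        Ty::Ref(inner) => classify_position(inner), // recurse through references
    }
}

/// Walks `[self_type, trait_arg_1, ...]` left to right.
/// Returns true (impl allowed) if a local type appears before any
/// uncovered type parameter.
fn orphan_check(positions: &[Ty]) -> bool {
    for ty in positions {
        match classify_position(ty) {
            PositionKind::LocalType => return true,
            PositionKind::UncoveredTypeParam => return false,
            PositionKind::ForeignType => continue,
        }
    }
    false // sequence exhausted without a local type
}
```

For instance, a sequence of a foreign type followed by a local type passes, while a bare `T` followed by a local type does not.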
Error code: Code::OrphanRule → diagnostic string "ORPHAN_RULE".
Trait methods can have default implementations (a body in the trait declaration). When a type implements the trait but omits a method with a default body, the compiler synthesizes the method in the impl block using the default body.
Resolution:
- During impl block processing, the resolver collects explicitly provided method names
- For each default method in the trait not provided by the impl, the resolver calls `resolve_method` with the default method's AST, treating it as if it were written in the impl block
- `Self` resolves to the implementing type, so `self.method()` calls in default bodies dispatch to the concrete type's methods
Method Call Lookup:
When find_trait_method_for_type searches for a method:
- First checks methods explicitly in the impl block
- If not found, checks the trait declaration for a default method with that name
Traits can declare associated types using type Name; syntax. Implementors bind these types using type Name = ConcreteType;.
AST Representation:
// In trait declarations
struct AssociatedTypeDecl {
name: String,
span: Span,
}
// In impl blocks
struct AssociatedTypeBinding {
name: String,
ty: Type,
span: Span,
}

Resolution:
When resolving `Self::TypeName` in trait methods:
- The resolver maintains `current_associated_type_bindings: HashMap<String, TypeId>`
- Before resolving methods in a trait impl, bindings are collected from the impl block
- `Self::TypeName` is parsed as `Type::NamespacedGeneric { namespace: "Self", name: "TypeName" }`
- Resolution looks up the type name in the current bindings
Example:
trait Container {
type Item;
fn get(&self) -> Self::Item;
}
impl Container for IntBox {
type Item = i32; // Binding: "Item" -> i32
fn get(&self) -> Self::Item { // Self::Item resolves to i32
return self.value;
}
}
type T = U creates a newtype - a distinct type that shares representation with its base type but is not interchangeable.
Key Properties:
- T and U are distinct types (no implicit conversion)
- T inherits methods, operators, and traits from U
- Explicit as cast required to convert between T and U
- Zero runtime cost (same Wasm representation)
Method Signature Substitution:
When calling an inherited method on a newtype, the method signature is substituted:
type Location = Point;
impl Point {
fn distance(&self, other: &Point) -> f64 { ... }
}
let loc1: Location = ...;
let loc2: Location = ...;
loc1.distance(&loc2); // Parameters expect &Location, not &Point
The resolver substitutes all occurrences of the base type with the newtype in:
- Parameter types (including &BaseType → &Newtype)
- Return type
Static Methods and Traits:
Newtypes inherit static methods and trait implementations from their base type:
Location::origin() // Calls Point::origin()
loc.describe() // Calls Point's Describable::describe()
Chained Newtypes:
Newtypes can chain: type C = B; type B = A; - the resolver traces back to the ultimate base type for method lookup.
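Tracing a chain back to its ultimate base type amounts to following a map of newtype declarations until a name has no entry. The sketch below assumes a hypothetical `newtypes` map from newtype name to declared base name and assumes the declarations are acyclic. The resolver's real data structures are not shown in this document.

```rust
use std::collections::HashMap;

/// Follows newtype declarations (newtype name -> base name) until a name
/// with no entry is reached. Assumes the declarations contain no cycle.
pub fn ultimate_base<'a>(newtypes: &HashMap<&'a str, &'a str>, mut name: &'a str) -> &'a str {
    while let Some(&base) = newtypes.get(name) {
        name = base;
    }
    name
}
```

With `type C = B; type B = A;` recorded as `C -> B` and `B -> A`, looking up `C` yields `A`, which is where method lookup would then proceed.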
See WEP: Newtype Semantics for full specification.
The compiler resolves iterator traits (Iterator, IntoIterator, FromIterator) using the same static dispatch mechanism as other traits.
For-Of Loop Compilation:
For-of loops over tuples are expanded at compile time in the resolver (one copy of the body per element, each typed independently). This enables heterogeneous iteration with per-element trait dispatch. break, continue, and return are compile errors inside tuple for-of.
For-of loops over non-tuple types are desugared to use IntoIterator and Iterator traits:
// Source
for let item of collection {
body(item);
}
// Desugars to (conceptually)
{
let mut __iter = IntoIterator::into_iter(&collection);
loop {
match Iterator::next(&mut __iter) {
Some(__item) => {
let item = __item;
body(item);
}
None => break,
}
}
}
Resolution Process:
- Type lookup: Get the type of collection
- IntoIterator lookup: Find impl IntoIterator for CollectionType
- Iter type extraction: Get Self::Iter associated type
- Iterator lookup: Find impl Iterator for IterType
- Item type extraction: Get Self::Item associated type for the loop binding
Known Limitations:
- Cross-module monomorphization: Generic stdlib methods (like ArrayIter::collect calling Array::append) may encounter type table ID mismatches when called from user code. Workaround: Use direct builtin calls in stdlib generic functions instead of method calls.
The compiler desugars comparison operators to trait method calls:
Eq Trait (Equality):
// a == b desugars to:
Eq::eq(&a, &b)
// a != b desugars to:
!Eq::eq(&a, &b)
Ord Trait (Ordering):
// a < b desugars to:
Ord::cmp(&a, &b) == Ordering::Less
// a > b desugars to:
Ord::cmp(&a, &b) == Ordering::Greater
// a <= b desugars to:
Ord::cmp(&a, &b) != Ordering::Greater
// a >= b desugars to:
Ord::cmp(&a, &b) != Ordering::Less
Resolution:
- For primitive types (i32, f64, etc.), the compiler generates direct Wasm comparison instructions
- For String and Array<T>, the resolver looks up the trait implementation in prelude
- For user-defined types, the resolver finds impl Eq for Type or impl Ord for Type
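The four ordering desugarings listed above can be checked against Rust's own `Ord` trait, which has the same shape (`cmp` returning an `Ordering`). The `CmpOp` enum and `eval_cmp` function are illustrative names only. The Wado compiler performs this rewrite on its IR rather than at runtime.

```rust
use std::cmp::Ordering;

/// The four ordering operators (hypothetical enum for this sketch).
#[derive(Clone, Copy, Debug)]
pub enum CmpOp {
    Lt,
    Gt,
    Le,
    Ge,
}

/// Evaluates `a <op> b` using only `cmp`, mirroring the desugaring table.
pub fn eval_cmp<T: Ord>(op: CmpOp, a: &T, b: &T) -> bool {
    let ordering = a.cmp(b);
    match op {
        CmpOp::Lt => ordering == Ordering::Less,
        CmpOp::Gt => ordering == Ordering::Greater,
        CmpOp::Le => ordering != Ordering::Greater,
        CmpOp::Ge => ordering != Ordering::Less,
    }
}
```

Expressing `<=` and `>=` as negated comparisons means a single `cmp` call is enough for every operator.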
Index expressions desugar to trait method calls:
// arr[i] (read) desugars to:
IndexValue::index_value(&arr, i)
// or Index::index(&arr, i) for reference-type elements
// arr[i] = value (write) desugars to:
IndexAssign::index_assign(&mut arr, i, value)
Design Note: IndexValue returns by value because Wasm GC's array.get copies elements. For primitive arrays, you cannot get a reference to an element. Index is only used for containers of reference-type elements.
Primitive Layer (builtin::):
The builtin namespace provides direct access to Wasm primitives. These types and functions map 1:1 to Wasm instructions with no abstraction. The namespace is always available without import, but is intended primarily for standard library implementation.
Wasm GC Types:
builtin::array<T> // Wasm GC array (no methods)
builtin::i31ref // Wasm GC i31ref (31-bit integer reference)
Intrinsic Functions:
// Array operations
builtin::array_new<T>(len: i32) -> builtin::array<T>
builtin::array_len<T>(arr: builtin::array<T>) -> i32
builtin::array_get<T>(arr: builtin::array<T>, idx: i32) -> T
builtin::array_set<T>(arr: builtin::array<T>, idx: i32, value: T)
builtin::array_get_u8(arr: builtin::array<u8>, idx: i32) -> i32 // Unsigned byte read
// i31ref operations
builtin::i31ref_new(value: i32) -> builtin::i31ref
builtin::i31ref_get_s(ref: builtin::i31ref) -> i32 // Signed extraction
builtin::i31ref_get_u(ref: builtin::i31ref) -> u32 // Unsigned extraction
// Reference comparison (Wasm ref.eq)
builtin::eqref<T, U>(a: T, b: U) -> bool // Compare any GC references
// Control
builtin::unreachable() -> ! // Wasm trap instruction
// i32 operations
builtin::i32_and(a: i32, b: i32) -> i32 // Bitwise AND
builtin::i32_eqz(a: i32) -> i32 // Check if zero (returns 0 or 1)
// Linear memory operations
builtin::memory_store8(addr: i32, value: i32) // Store byte to memory
builtin::memory_load8_u(addr: i32) -> i32 // Load unsigned byte from memory
builtin::realloc(oldptr: i32, oldsize: i32, align: i32, newsize: i32) -> i32
// Stream/Future intrinsics (Component Model)
// These are low-level i32 handle operations used internally by the resolver.
// User code accesses Stream<T>/Future<T> resource types from core:prelude/types.wado.
// NOTE: Migration from builtin-based to resource-based is incomplete.
// Resource declarations exist in types.wado but method resolution (.new(), .read(),
// .write(), .close(), .drop()) is still hardcoded in the resolver (method_call.rs)
// rather than being driven by the resource declarations.
builtin::stream_new() -> i64 // Create stream, returns rx|tx packed
// Extract: rx = handles as i32, tx = (handles >> 32) as i32
builtin::stream_read(rx: i32, ptr: i32, len: i32) -> i32
builtin::stream_write(tx: i32, ptr: i32, len: i32) -> i32
builtin::stream_drop_writable(tx: i32)
builtin::stream_drop_readable(rx: i32)
builtin::future_new() -> i64 // Create future, returns rx|tx packed
builtin::future_write(tx: i32, ptr: i32) -> i32
builtin::future_drop_writable(tx: i32)
builtin::future_drop_readable(rx: i32)
// Async task intrinsics (Component Model)
builtin::waitable_set_new() -> i32
builtin::waitable_join(set: i32, subtask: i32)
builtin::waitable_set_wait(set: i32, outptr: i32) -> i32
builtin::subtask_drop(subtask: i32)
// Branch hinting (Wasm branch hinting proposal)
builtin::likely(cond: bool) -> bool // Hint: branch is usually taken
builtin::unlikely(cond: bool) -> bool // Hint: branch is rarely taken
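As an aside on the stream and future intrinsics listed above, the packed i64 returned by builtin::stream_new and builtin::future_new can be split exactly as the comment in the listing describes. The Rust sketch below shows that extraction. `unpack_handles` follows the documented formula, while `pack_handles` is a hypothetical inverse added only to demonstrate the round trip.

```rust
/// Splits a packed handle pair: rx is the low 32 bits, tx the high 32 bits.
pub fn unpack_handles(handles: i64) -> (i32, i32) {
    let rx = handles as i32; // truncating cast keeps the low 32 bits
    let tx = (handles >> 32) as i32; // shift brings the high 32 bits down
    (rx, tx)
}

/// Hypothetical inverse, for illustration: packs rx into the low half and
/// tx into the high half. `rx as u32` avoids sign-extending into tx's bits.
pub fn pack_handles(rx: i32, tx: i32) -> i64 {
    ((tx as i64) << 32) | (rx as u32 as i64)
}
```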
Branch Hinting:
builtin::likely() and builtin::unlikely() generate WebAssembly branch hints via the metadata.code.branch_hint custom section. These hints help the Wasm runtime optimize branch prediction.
// Hint that this condition is usually true
if builtin::likely(x > 0) {
// fast path
}
// Hint that this condition is rarely true (error path)
if builtin::unlikely(x < 0) {
// error handling
}
To inspect generated branch hints:
cargo run --bin wado -- compile --wat-to-stdout file.wado | grep branch_hint
Usage in Standard Library:
// Standard library uses builtin primitives internally
// In core/string.wado
pub struct String {
buf: builtin::array<u8>,
pub fn length(&self) -> i32 {
return builtin::array_len(self.buf);
}
}
// In core/prelude.wado
pub fn unreachable() -> ! {
builtin::unreachable()
}
Standard Library Types:
Standard library types wrap builtins with methods:
- String - Struct wrapping builtin::array<u8> (maps to CM string)
- Array<T> - Struct wrapping builtin::array<T> (maps to CM list<T>)
Struct Implementation:
- Internally: Wasm-GC struct type with GC-managed memory
- At CM boundary: Automatically converted to/from record
- Enables recursive types, self-referential structures, and efficient field access
Single-Field Optimization:
If a struct contains exactly one GC object field (a builtin::array or another struct), the compiler skips generating the outer Wasm GC struct. This means wrapper types like String and Array<T> have zero runtime overhead:
// String wraps builtin::array<u8>
struct String {
buf: builtin::array<u8>,
// ... methods
}
// At Wasm level: compiles to just (ref (array u8)), no wrapper struct
// Array<T> wraps builtin::array<T>
struct Array<T> {
repr: builtin::array<T>,
// ... methods
}
// At Wasm level: compiles to just (ref (array T)), no wrapper struct
This optimization enables ergonomic APIs with methods while maintaining direct Wasm GC representation.
See WEP: 128-bit Integer Types.
Template strings use backticks with {expr} syntax and Python-like format specifiers (e.g., {pi:.2f}). The compilation pipeline:
- Lexer: Tokenizes backtick strings with brace depth tracking, nested template support, and escape sequences (\{, \})
- Parser: Builds TemplateStringExpr with TemplatePart::String and TemplatePart::Interpolation nodes. Distinguishes : (format spec) from :: (scope resolution) via lookahead
- Synthesis (synthesis/template.rs): Expands each template string into a __tmpl labeled block that allocates a String buffer, appends literal parts, and calls Display::fmt or Inspect::inspect for interpolated expressions. Emits generic trait calls that the monomorphizer resolves to concrete implementations
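The parser's lookahead for telling a format-spec colon apart from a scope-resolution `::` can be approximated with a one-character peek. The function below is a hypothetical sketch over the raw text of one interpolation. It ignores nested braces, string literals, and nested templates, all of which the real lexer and parser must handle.

```rust
/// Returns the byte index of the first single `:` in an interpolation body,
/// treating `::` as scope resolution and skipping it.
pub fn format_spec_start(body: &str) -> Option<usize> {
    let bytes = body.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b':' {
            // One byte of lookahead decides between `:` and `::`.
            if bytes.get(i + 1) == Some(&b':') {
                i += 2;
                continue;
            }
            return Some(i);
        }
        i += 1;
    }
    None
}
```

Everything before the returned index would be the expression and everything after it the format specifier.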
The synthesis phase generates Serialize and Deserialize trait implementations for structs that use impl Serialize for Type; or impl Deserialize for Type; (semicolon instead of block body).
How it works:
- The resolver detects impl Trait for Type; declarations and records them as SynthesisRequest entries in the TIR module.
- serde_synth.rs processes each request, inspects the struct's fields, and generates complete TIR method bodies.
- For Serialize: generates a serialize method that calls begin_struct, then field for each field (with snake_case → camelCase name conversion), then end.
- For Deserialize: generates a deserialize static method with a field lookup function, golden-mask bitmask tracking for required fields, and a loop that processes fields in any order.
- A lookup function (_typename_field_lookup) is generated alongside each Deserialize impl to map camelCase JSON keys to field indices.
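The snake_case → camelCase name conversion mentioned above could look roughly like the following Rust function. This is an assumption about the conversion rule (drop each underscore and uppercase the character that follows it). The exact handling of edge cases such as leading or doubled underscores in serde_synth.rs is not described in this document.

```rust
/// Converts a snake_case field name to camelCase by dropping each `_`
/// and uppercasing the character that follows it.
pub fn snake_to_camel(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut upper_next = false;
    for ch in name.chars() {
        if ch == '_' {
            upper_next = true;
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }
    out
}
```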
Generated Deserialize pattern (golden mask):
- Each field gets one bit in a u32 bitmask (supports up to 32 fields).
- Unknown fields are skipped via skip().
- After the loop, a single seen & mask != mask check verifies all required fields are present.
- Missing fields produce a DeserializeError with MissingField kind.
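The golden-mask pattern can be sketched in Rust for a hypothetical three-field struct. The field names, the `field_lookup` helper, and the shape of `DeserializeError` are illustrative assumptions. Only the bit-per-field tracking and the final `seen & mask != mask` check come from the description above. The sketch treats every field as required and processes keys only, not values.

```rust
#[derive(Debug, PartialEq)]
pub enum DeserializeError {
    MissingField(&'static str),
}

/// camelCase keys of a hypothetical struct; the index is the bit position.
pub const FIELDS: [&str; 3] = ["id", "userName", "email"];

/// Maps a JSON key to its field index, like the generated lookup function.
pub fn field_lookup(key: &str) -> Option<usize> {
    FIELDS.iter().position(|field| *field == key)
}

/// Processes keys in any order and verifies that all fields were seen.
pub fn check_fields(keys: &[&str]) -> Result<(), DeserializeError> {
    // Golden mask: one bit set per required field (here 0b111).
    let mask: u32 = (1u32 << FIELDS.len()) - 1;
    let mut seen: u32 = 0;
    for key in keys {
        match field_lookup(key) {
            Some(index) => seen |= 1 << index,
            None => continue, // unknown field: skipped
        }
    }
    if (seen & mask) != mask {
        // Report the lowest-indexed missing field.
        let missing = (!seen & mask).trailing_zeros() as usize;
        return Err(DeserializeError::MissingField(FIELDS[missing]));
    }
    Ok(())
}
```

A single comparison after the loop replaces one presence check per field, which is why the mask fits the "fields in any order" loop well.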
The synthesis phase auto-generates Inspect and Display trait implementations for all types that need them. Inspect is always generated; Display is generated as a fallback (delegating to Inspect) only for types without a user-provided Display impl.
How it works:
- Template expansion (synthesis/template.rs) encounters {expr:?} or {expr} and emits calls to Inspect::inspect or Display::fmt.
- synthesis/traits.rs scans all types in the project and generates Inspect trait impls: field access for structs, match arms for variants/enums, loops for arrays, etc.
- For types without a user-provided Display impl, a fallback Display::fmt is generated that delegates to Inspect::inspect.
- The monomorphizer resolves all generic trait calls to these concrete implementations.
- The generated TIR flows through the rest of the pipeline (lower → optimize → codegen).
Each distinct type gets a dedicated __inspect$TypeName function generated once and called from all use sites. The InspectRegistry deduplicates these across the module.
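The "generated once, called from all use sites" behaviour is a memoization pattern. The Rust sketch below reuses the name InspectRegistry from the text, but its fields and the `get_or_generate` method are hypothetical. The real registry's API is not shown in this document.

```rust
use std::collections::HashMap;

/// Hypothetical registry: maps a type name to its generated function name.
#[derive(Default)]
pub struct InspectRegistry {
    generated: HashMap<String, String>,
    /// Counts how many functions were actually generated.
    pub generation_count: usize,
}

impl InspectRegistry {
    /// Returns the inspect function for a type, generating it on first use.
    pub fn get_or_generate(&mut self, type_name: &str) -> String {
        if let Some(existing) = self.generated.get(type_name) {
            return existing.clone();
        }
        let function_name = format!("__inspect${}", type_name);
        self.generation_count += 1;
        self.generated
            .insert(type_name.to_string(), function_name.clone());
        function_name
    }
}
```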
String literals are stored in Wasm passive data segments. This allows direct initialization of GC arrays using array.init_data, which is more efficient than loading from linear memory.
assert behaves like power-assert: when an assertion fails, it prints the source condition together with the intermediate values collected while evaluating it.
Basic Assert:
assert x > 0; is compiled into:
if builtin::unlikely(!condition) {
panic(`Assertion failed:\ncondition: x > 0\nx: {x}`);
}
Assert with Custom Message:
assert x > 0, "x must be checked elsewhere"; is compiled into:
if builtin::unlikely(!condition) {
panic(`Assertion failed: x must be checked elsewhere\ncondition: x > 0\nx: {x}`);
}
Intermediate Values:
Each intermediate value is collected and printed if the assertion fails.
assert x + y > 0; is compiled into:
if builtin::unlikely(!condition) {
panic(`Assertion failed:\ncondition: x + y > 0\nx: {x}\ny: {y}\nx + y: {x + y}`);
}
Value Caching for Side-Effect Safety:
When the condition contains function calls with side effects, values are cached in Wasm locals to ensure each function is called exactly once:
assert get_value() > 10;
The compiler:
- Extracts all "interesting" sub-expressions (identifiers, function calls, binary expressions)
- Evaluates each sub-expression once and stores the result in a local variable
- Evaluates the condition using cached local values
- On failure, builds the error message using cached values (no re-evaluation)
This ensures that get_value() is called exactly once, rather than twice (once to capture the value for the message and once to evaluate the condition).
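The caching strategy for `assert get_value() > 10;` can be imitated in Rust as follows. `checked_assert` is a hypothetical stand-in for the compiler-generated code, and the message layout simply mirrors the format used in the examples above. The point of the sketch is that the callee runs once and both the condition and the message read the cached local.

```rust
/// Hand-written equivalent of the generated code for
/// `assert get_value() > 10;`, returning the failure message instead of
/// panicking so that the behaviour is easy to observe.
pub fn checked_assert(get_value: impl Fn() -> i32) -> Result<(), String> {
    // Step 1: evaluate the sub-expression once and cache it in a local.
    let cached_call = get_value();
    // Step 2: evaluate the condition from the cached value.
    let condition = cached_call > 10;
    // Step 3: on failure, build the message from the cache (no re-evaluation).
    if !condition {
        return Err(format!(
            "Assertion failed:\ncondition: get_value() > 10\nget_value(): {}",
            cached_call
        ));
    }
    Ok(())
}
```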
Wado uses value semantics for composite types: assignment creates a copy. Structs and tuples are copied field-by-field via struct.get/struct.new. Arrays and strings are copied element-by-element. Option and variant types conditionally copy their payloads. Reference types (&T, &mut T) do not have value semantics — they share the underlying value.
When calling a method on a reference type, the compiler automatically inserts dereference operations to reach the underlying value type.
How It Works:
let p = Point { x: 10, y: 20 };
let p_ref = &p;
let sum = p_ref.sum(); // Auto-derefs: (*p_ref).sum()
let p_ref2 = &p_ref;
let sum2 = p_ref2.sum(); // Double auto-deref: (**p_ref2).sum()
The resolver handles auto-deref in resolve_method_call(): it repeatedly inserts TirUnaryOp::Deref expressions until the receiver is not a reference type, then proceeds with normal method resolution. Works for &T, &mut T, and multi-level references (&&T, &&&T).
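The deref-insertion loop can be modelled in Rust with a tiny type representation. The `Ty` enum and `auto_deref` function are hypothetical. The real resolver works on TIR types and emits TirUnaryOp::Deref nodes, whereas this sketch only counts how many would be inserted.

```rust
/// Minimal stand-in for resolved types (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Named(String),
    Ref(Box<Ty>),
    RefMut(Box<Ty>),
}

/// Peels reference layers until a non-reference type remains.
/// Returns the number of derefs to insert and the underlying value type.
pub fn auto_deref(mut ty: &Ty) -> (usize, &Ty) {
    let mut derefs = 0;
    while let Ty::Ref(inner) | Ty::RefMut(inner) = ty {
        ty = &**inner;
        derefs += 1;
    }
    (derefs, ty)
}
```

For a receiver of type `&&Point` the function reports two derefs, matching the `(**p_ref2).sum()` expansion shown above.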
Global variables compile to WebAssembly globals with two initialization strategies:
| Category | Condition | Strategy |
|---|---|---|
| Constant | Primitive type with Wasm constant expression | Direct initialization in Wasm global section |
| Lazy | Object types or non-constant expressions | Null/zero default, initialized in __initialize_module() |
Module Initialization:
- Each module with lazy globals generates pub fn __initialize_module()
- Entry module generates fn __initialize_modules() which calls all modules' initializers
- Initialization order: topologically sorted by dependencies (within module and across modules)
- Re-initialization prevented via flag check
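A dependency-ordered initialization with a re-initialization guard can be sketched as a depth-first traversal. The `deps` map (module name to the modules it depends on) and both function names are assumptions for illustration. The compiler's actual ordering algorithm is not described here beyond "topologically sorted".

```rust
use std::collections::{HashMap, HashSet};

/// Visits dependencies first, then records the module itself.
fn visit<'a>(
    module: &'a str,
    deps: &HashMap<&'a str, Vec<&'a str>>,
    initialized: &mut HashSet<&'a str>,
    order: &mut Vec<&'a str>,
) {
    // Flag check: `insert` returns false if the module was already seen,
    // so each module is initialized at most once.
    if !initialized.insert(module) {
        return;
    }
    if let Some(children) = deps.get(module) {
        for &child in children {
            visit(child, deps, initialized, order);
        }
    }
    order.push(module);
}

/// Returns an order in which every dependency precedes its dependents.
pub fn init_order<'a>(deps: &HashMap<&'a str, Vec<&'a str>>, entry: &'a str) -> Vec<&'a str> {
    let mut initialized = HashSet::new();
    let mut order = Vec::new();
    visit(entry, deps, &mut initialized, &mut order);
    order
}
```

Because the flag is set before recursing, a shared dependency such as a core module is initialized once even when several modules import it.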
Match expressions are lowered to a series of pattern checks with branching:
Lowering Strategy:
| Pattern Type | Lowering |
|---|---|
| Variant | br_on_cast_fail to test discriminant, extract payload |
| Literal | Equality check with br_if |
| Wildcard _ | No check (always matches) |
| Or pattern | Chain of checks with shared arm body |
| Guard && | Pattern check followed by guard expression check |
Codegen to Wasm:
For dense integer patterns (e.g., enum discriminants), the codegen emits br_table for O(1) dispatch:
;; match color { Red => 0, Green => 1, Blue => 2 }
(block $end (result i32)
  (block $arm2
    (block $arm1
      (block $arm0
        (br_table $arm0 $arm1 $arm2 (local.get $color)))
      (i32.const 0) ;; Red
      (br $end))
    (i32.const 1) ;; Green
    (br $end))
  (i32.const 2)) ;; Blue
For variant patterns, br_on_cast_fail tests the discriminant and extracts the payload in one instruction.
Exhaustiveness:
Checked during analysis phase. Non-exhaustive patterns are compile errors.
The Component Model requires each core module to export a realloc function. The CM runtime calls realloc whenever it needs guest-side linear memory — for example, stream.read copies bytes from the host into a guest buffer allocated via realloc, and string lifting/lowering also goes through it. Because CM operations can allocate significant amounts of memory (e.g., reading a large HTTP response body in a loop), the realloc implementation must be robust.
The allocator is implemented in Wado itself in lib/core/allocator.wado, using the #![wasm_module("mem")] attribute to compile it into a separate core module. The module exports a realloc function (via #[export_name("realloc")]) and a mutable global for the heap pointer.
The compiler extracts #![wasm_module("mem")] items during WIR construction and emits them as a standalone core module. This "mem" module is instantiated as part of the component, and its realloc and linear memory exports are shared with the main GC core module. See the spec section on the "mem" core module for the full component structure.
The main core module accesses realloc and linear memory through imports, declared via #[canonical("mem", "realloc")] in core:builtin. Internal functions like memory_to_gc_array and gc_array_to_memory in core:internal use these builtins to copy data between GC arrays and linear memory.
The compiler supports multiple allocator implementations, each tagged with #[allocator("name")] in lib/core/allocator.wado. The compiler selects one by setting its export_name to "realloc" and clearing the others.
Available allocators:
- bump (default): Simple bump allocator. Never frees memory. Used for production builds.
- debug: Never reuses freed memory and poisons freed regions with 0xFF bytes. Useful for detecting use-after-free bugs.
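To make the bump strategy concrete, here is a minimal Rust model of the address arithmetic only. It is not the code in lib/core/allocator.wado. The `BumpAllocator` struct is hypothetical, the parameter order follows the builtin::realloc signature, copying of old contents is omitted, alignments are assumed to be powers of two, and the 64-page limit is assumed to apply to the whole address range.

```rust
const PAGE_SIZE: usize = 64 * 1024; // Wasm page size
const HEAP_PAGES: usize = 64; // 64 pages = 4 MiB of backing memory

/// Hypothetical model of a bump allocator: a heap pointer that only grows.
pub struct BumpAllocator {
    heap_ptr: usize,
    limit: usize,
}

impl BumpAllocator {
    pub fn new(heap_base: usize) -> Self {
        BumpAllocator {
            heap_ptr: heap_base,
            limit: HEAP_PAGES * PAGE_SIZE,
        }
    }

    /// Returns the address of a fresh block, or None when memory is exhausted.
    /// The old pointer and size are ignored because nothing is ever freed.
    pub fn realloc(
        &mut self,
        _oldptr: usize,
        _oldsize: usize,
        align: usize,
        newsize: usize,
    ) -> Option<usize> {
        assert!(align.is_power_of_two());
        // Round the heap pointer up to the requested alignment.
        let aligned = (self.heap_ptr + align - 1) & !(align - 1);
        let end = aligned.checked_add(newsize)?;
        if end > self.limit {
            return None;
        }
        self.heap_ptr = end;
        Some(aligned)
    }
}
```

Since the pointer never moves backwards, repeated large requests eventually return `None`, which is the limitation a fixed-size bump allocator has.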
Selection rules:
- CLI --allocator <name> overrides everything.
- Test world (--world test / wado test) defaults to "debug".
- E2E tests (cargo test) default to "debug" unless the fixture specifies "allocator" in its __DATA__ JSON.
- Otherwise, defaults to "bump".
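The selection rules read naturally as a priority chain. The sketch below encodes them in Rust. The `BuildContext` struct and its field names are hypothetical, and the relative order of the test-world and E2E rules is assumed to follow the list order above.

```rust
/// Hypothetical inputs to allocator selection.
pub struct BuildContext<'a> {
    /// Value of `--allocator <name>`, if given.
    pub cli_allocator: Option<&'a str>,
    /// True for the test world.
    pub is_test_world: bool,
    /// True when running as an E2E test fixture.
    pub is_e2e_test: bool,
    /// "allocator" entry from the fixture's __DATA__ JSON, if present.
    pub fixture_allocator: Option<&'a str>,
}

/// Applies the rules in priority order and returns the allocator name.
pub fn select_allocator<'a>(ctx: &BuildContext<'a>) -> &'a str {
    if let Some(name) = ctx.cli_allocator {
        return name; // CLI overrides everything
    }
    if ctx.is_test_world {
        return "debug";
    }
    if ctx.is_e2e_test {
        return ctx.fixture_allocator.unwrap_or("debug");
    }
    "bump"
}
```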
TODO: The bump allocator never frees memory and has a fixed 64-page (4 MB) backing memory. Implement a proper allocator that supports freeing and growing.
- Variant pattern matching: Single-payload and tuple-payload cases work (if let Circle(r) = shape, if let Rect([w, h]) = shape). Struct payloads not yet supported. See WEP: Variant Payload Design.
- Function types: Parser supports fn(T) -> U syntax, closure codegen works (both pure and capturing), but full function type support is incomplete.
- Stream/Future resource migration: Resource declarations (resource Stream<T>, resource Future<T>, etc.) exist in core:prelude/types.wado, but method resolution (.new(), .read(), .write(), .close(), .drop()) is still hardcoded in the resolver (method_call.rs) rather than being driven by the resource declarations. The low-level canonical builtins in builtin.wado (stream_new, stream_read, future_write, etc.) remain the actual backing implementation.
- Implicit struct literals don't work with generic structs: let b: Box<i32> = { value }; fails. Use explicit form: let b: Box<i32> = Box { value };
- GC arrays cannot be passed directly to streams: stream<u8> operations require linear memory. GC arrays must be copied to linear memory before writing to streams. See component-model#525
- ? operator (error propagation)
- Effect handlers
- Reactive signals (source values, derived values, effect blocks)
- JSX
- Generic function/method call type inference