06 - WebAssembly Integration Analysis

Agent: 6 (WASM Integration Specialist)
Date: 2026-02-20
Scope: ruvector codebase WASM capabilities, build pipeline, SIMD acceleration, memory management, deployment strategies, module loading, and benchmarking framework


Table of Contents

  1. Existing WASM Usage in ruvector
  2. WASM Build Pipeline Compatibility
  3. SIMD Acceleration Opportunities
  4. Memory Management Patterns
  5. Browser vs Node.js Deployment Strategies
  6. WASM Module Loading and Initialization Patterns
  7. Performance Benchmarking Framework for WASM
  8. Recommendations for the Sublinear-Time Solver

1. Existing WASM Usage in ruvector

1.1 Scale of WASM Infrastructure

The ruvector project has an extensive WASM infrastructure. The Cargo workspace defines 27 dedicated WASM crates spanning vector database operations, attention mechanisms, graph algorithms, ML inference, and self-learning solvers. WASM is a first-class deployment target here, not an experimental feature.

WASM Crate Inventory (27 crates)

| Crate | Description | Target | Size |
|---|---|---|---|
| ruvector-wasm | Core vector DB bindings (HNSW, insert, search, delete) | wasm32-unknown-unknown (wasm-bindgen) | ~28 KB src |
| rvf-solver-wasm | Self-learning temporal solver (Thompson Sampling, PolicyKernel) | wasm32-unknown-unknown (no_std + alloc, extern "C") | ~160 KB compiled |
| rvf-wasm | RVF format microkernel for browser/edge vector ops | wasm32-unknown-unknown | - |
| micro-hnsw-wasm | Neuromorphic HNSW with spiking neural nets | wasm32-unknown-unknown | 11.8 KB compiled |
| ruvector-attention-wasm | 18+ attention mechanisms (Flash, MoE, Hyperbolic) | wasm32-unknown-unknown (wasm-bindgen) | - |
| ruvector-attention-unified-wasm | Unified attention API | wasm32-unknown-unknown | 339 KB compiled |
| ruvector-learning-wasm | MicroLoRA adaptation (<100us latency) | wasm32-unknown-unknown | 39 KB compiled |
| ruvector-nervous-system-wasm | Bio-inspired neural simulation | wasm32-unknown-unknown | 178 KB compiled |
| ruvector-economy-wasm | Compute credit management | wasm32-unknown-unknown | 181 KB compiled |
| ruvector-exotic-wasm | Quantum, hyperbolic, topological | wasm32-unknown-unknown | 149 KB compiled |
| ruvector-sparse-inference-wasm | Sparse matrix inference with WASM SIMD | wasm32-unknown-unknown | - |
| ruvector-delta-wasm | Delta operations with SIMD | wasm32-unknown-unknown | - |
| ruvector-mincut-wasm | Subpolynomial-time dynamic min-cut | wasm32-unknown-unknown | - |
| ruvector-mincut-gated-transformer-wasm | Gated transformer min-cut | wasm32-unknown-unknown | - |
| ruvector-graph-wasm | Graph operations | wasm32-unknown-unknown | - |
| ruvector-gnn-wasm | Graph neural networks | wasm32-unknown-unknown | - |
| ruvector-dag-wasm | Minimal DAG for browser/embedded | wasm32-unknown-unknown | - |
| ruvector-math-wasm | Math operations (Wasserstein, manifolds, spherical) | wasm32-unknown-unknown | - |
| ruvector-router-wasm | Query routing | wasm32-unknown-unknown | - |
| ruvector-fpga-transformer-wasm | FPGA transformer simulation | wasm32-unknown-unknown | - |
| ruvector-temporal-tensor-wasm | Temporal tensor operations | wasm32-unknown-unknown | - |
| ruvector-tiny-dancer-wasm | Lightweight operations | wasm32-unknown-unknown | - |
| ruvector-hyperbolic-hnsw-wasm | Hyperbolic HNSW | wasm32-unknown-unknown | - |
| ruvector-domain-expansion-wasm | Cross-domain transfer learning | wasm32-unknown-unknown | - |
| ruvllm-wasm | LLM inference | wasm32-unknown-unknown | - |
| ruqu-wasm | Quantum operations | wasm32-unknown-unknown | - |
| exo-wasm (example) | Exo AI experiment | wasm32-unknown-unknown | - |

1.2 Two Distinct WASM Binding Strategies

The codebase employs two fundamentally different WASM integration patterns:

Pattern A: wasm-bindgen + wasm-pack (High-Level, Browser-First)

Used by: ruvector-wasm, ruvector-attention-wasm, ruvector-math-wasm, most -wasm crates.

// crates/ruvector-wasm/src/lib.rs
use wasm_bindgen::prelude::*;
use js_sys::{Float32Array, Object, Promise};
use web_sys::{console, IdbDatabase, IdbFactory};

#[wasm_bindgen(start)]
pub fn init() {
    console_error_panic_hook::set_once();
    tracing_wasm::set_as_global_default();
}

#[wasm_bindgen]
pub struct VectorDB { /* ... */ }

#[wasm_bindgen]
impl VectorDB {
    #[wasm_bindgen(constructor)]
    pub fn new(dimensions: usize, metric: Option<String>, use_hnsw: Option<bool>)
        -> Result<VectorDB, JsValue> { /* ... */ }
}

Key dependencies: wasm-bindgen, wasm-bindgen-futures, js-sys, web-sys, serde-wasm-bindgen, console_error_panic_hook.

Advantages: Rich JS interop, automatic TypeScript type generation, Promise support, access to Web APIs (IndexedDB, Workers, console).

Pattern B: no_std + extern "C" ABI (Low-Level, Minimal)

Used by: rvf-solver-wasm, rvf-wasm, micro-hnsw-wasm.

// crates/rvf/rvf-solver-wasm/src/lib.rs
#![no_std]
extern crate alloc;

#[no_mangle]
pub extern "C" fn rvf_solver_create() -> i32 {
    registry().create()
}

#[no_mangle]
pub extern "C" fn rvf_solver_train(handle: i32, count: i32, /* ... */) -> i32 { /* ... */ }

Key dependencies: dlmalloc (global allocator), libm, serde (no_std + alloc). No wasm-bindgen.

Advantages: Minimal binary size (~160 KB for rvf-solver-wasm, 11.8 KB for micro-hnsw-wasm), no JS runtime dependency, runs on bare wasm32-unknown-unknown, suitable for self-bootstrapping RVF files.

1.3 Kernel Pack System (ADR-005)

The ruvector-wasm crate includes a sophisticated Kernel Pack System (/crates/ruvector-wasm/src/kernel/) for secure, sandboxed execution of ML compute kernels via Wasmtime:

  • Manifest parsing (manifest.rs): Declares kernel categories (Positional/RoPE, Normalization/RMSNorm, Activation/SwiGLU, KV-Cache, Adapter/LoRA), tensor specs, resource limits
  • Ed25519 signature verification (signature.rs): Supply chain security for kernel packs
  • SHA256 hash verification (hash.rs): Content integrity
  • Epoch-based execution budgets (epoch.rs): Coarse-grained interruption with configurable tick intervals (10ms server, 1ms embedded)
  • Shared memory protocol (memory.rs): 16-byte aligned allocation, region overlap validation, tensor layout management
  • Kernel runtime (runtime.rs): KernelRuntime trait with compile/instantiate/execute lifecycle, mock runtime for testing
  • Trusted allowlist (allowlist.rs): Restricts which kernel IDs may execute

This kernel pack system is directly relevant to the sublinear-time solver because it provides a ready-made infrastructure for sandboxed execution of solver kernels with resource limits.

1.4 Self-Bootstrapping WASM (RVF Format)

The rvf-types crate defines a WasmHeader (/crates/rvf/rvf-types/src/wasm_bootstrap.rs) for embedding WASM modules directly inside .rvf data files:

.rvf file
  +-- WASM_SEG (role=Interpreter, ~50 KB)
  +-- WASM_SEG (role=Microkernel, ~5.5 KB)
  +-- VEC_SEG (data)

Roles: Microkernel, Interpreter, Combined, Extension, ControlPlane. Targets: Wasm32, WasiP1, WasiP2, Browser, BareTile. Feature flags: WASM_FEAT_SIMD, WASM_FEAT_BULK_MEMORY, WASM_FEAT_MULTI_VALUE, WASM_FEAT_REFERENCE_TYPES, WASM_FEAT_THREADS, WASM_FEAT_TAIL_CALL, WASM_FEAT_GC, WASM_FEAT_EXCEPTION_HANDLING.
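
A host deciding whether it can run an embedded module would typically compare the module's required feature flags against what the engine supports. The following is a minimal sketch of that check. The bit positions and the function names `missing_features` and `can_load` are assumptions made for illustration; the actual flag values are defined in wasm_bootstrap.rs and are not reproduced here.

```rust
// Hypothetical bit assignments for illustration only; the real values
// are defined in rvf-types/src/wasm_bootstrap.rs.
pub const WASM_FEAT_SIMD: u16 = 1 << 0;
pub const WASM_FEAT_BULK_MEMORY: u16 = 1 << 1;
pub const WASM_FEAT_THREADS: u16 = 1 << 4;

/// Returns the subset of `required` flags that the host does not support.
/// A result of 0 means every required feature is available.
pub fn missing_features(required: u16, host_supported: u16) -> u16 {
    required & !host_supported
}

/// True when the host supports every feature the module requires.
pub fn can_load(required: u16, host_supported: u16) -> bool {
    missing_features(required, host_supported) == 0
}
```

Returning the missing bits rather than a plain boolean lets the host report which feature blocked loading.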

1.5 Unified WASM TypeScript API

The @ruvector/wasm-unified npm package (/npm/packages/ruvector-wasm-unified/src/index.ts) provides a high-level TypeScript surface combining all WASM modules:

export interface UnifiedEngine {
  attention: AttentionEngine;  // 14+ mechanisms
  learning: LearningEngine;    // MicroLoRA, SONA, BTSP, RL
  nervous: NervousEngine;      // Bio-inspired neural simulation
  economy: EconomyEngine;      // Compute credits
  exotic: ExoticEngine;        // Quantum, hyperbolic, topological
  version(): string;
  getStats(): UnifiedStats;
  init(): Promise<void>;
  dispose(): void;
}

2. WASM Build Pipeline Compatibility

2.1 Workspace-Level Configuration

The root Cargo.toml defines workspace-level WASM dependencies:

# /Cargo.toml (workspace)
[workspace.dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["Worker", "MessagePort", "console"] }
getrandom = { version = "0.3", features = ["wasm_js"] }

There is also a getrandom compatibility patch for WASM:

# In ruvector-wasm/Cargo.toml
getrandom02 = { package = "getrandom", version = "0.2", features = ["js"] }
[target.'cfg(target_arch = "wasm32")'.dependencies]
getrandom = { workspace = true, features = ["wasm_js"] }

And a workspace-level patch for hnsw_rs to use rand 0.8 for WASM compatibility:

[patch.crates-io]
hnsw_rs = { path = "./patches/hnsw_rs" }

2.2 Build Profiles

Three build profiles are relevant: two WASM-specific profiles and the workspace default used for native builds:

Profile 1: Size-Optimized (for wasm-bindgen crates)

# crates/ruvector-wasm/Cargo.toml
[profile.release]
opt-level = "z"       # Optimize for size
lto = true            # Link-time optimization
codegen-units = 1     # Single codegen unit
panic = "abort"       # No unwind tables

[profile.release.package."*"]
opt-level = "z"

[package.metadata.wasm-pack.profile.release]
wasm-opt = false      # Disable wasm-opt (already optimized by LTO)

Profile 2: Size-Optimized + Strip (for no_std crates)

# crates/rvf/rvf-solver-wasm/Cargo.toml
[profile.release]
opt-level = "z"
lto = true
codegen-units = 1
strip = true          # Also strips debug symbols

Profile 3: Workspace Default Release (native)

# Root Cargo.toml
[profile.release]
opt-level = 3         # Optimize for speed
lto = "fat"
codegen-units = 1
strip = true
panic = "unwind"      # Keeps unwind tables (unlike WASM profile)

2.3 Build Tooling

The test script at /scripts/test/test-wasm.mjs demonstrates the build command:

wasm-pack build crates/ruvector-attention-wasm --target web --release

For no_std crates like rvf-solver-wasm, the standard cargo command with WASM target is used:

cargo build --target wasm32-unknown-unknown --release -p rvf-solver-wasm

2.4 Sublinear-Time Solver Build Compatibility

The rvf-solver-wasm crate provides the closest precedent for a sublinear-time solver WASM build:

  • Target: wasm32-unknown-unknown (no WASI dependency)
  • Allocator: dlmalloc (global allocator for alloc)
  • Math: libm (no_std-compatible math functions)
  • Serialization: serde + serde_json (no_std + alloc features)
  • Crypto: rvf-crypto (SHAKE-256 witness chain)
  • Panic handler: core::arch::wasm32::unreachable()
  • ABI: extern "C" exports (no wasm-bindgen overhead)
  • Crate type: cdylib only (no rlib)

This approach produces binaries in the ~160 KB range, which is excellent for edge deployment.


3. SIMD Acceleration Opportunities

3.1 Existing WASM SIMD Infrastructure

The codebase has extensive WASM SIMD128 support across multiple crates, all using core::arch::wasm32::* intrinsics. Every SIMD function provides dual implementations: a #[cfg(target_feature = "simd128")] version using WASM SIMD intrinsics and a #[cfg(not(target_feature = "simd128"))] scalar fallback.

WASM SIMD Operations Already Implemented

| Crate | File | Operations |
|---|---|---|
| ruvector-delta-wasm | src/simd.rs | f32x4 add, sub, scale, dot, L2 norm, diff, abs, clamp, count_nonzero |
| ruvector-sparse-inference | src/backend/wasm.rs | f32x4 dot product, ReLU, vector add, AXPY |
| ruvector-mincut | src/wasm/simd.rs | v128 popcount (table lookup method), XOR, boundary computation, batch membership |
| ruvector-core | src/simd_intrinsics.rs | x86_64 (AVX2, AVX-512, FMA), aarch64 (NEON, unrolled), INT8 quantized, batch operations |

SIMD Operations in ruvector-delta-wasm/src/simd.rs (Representative)

use core::arch::wasm32::*;

#[cfg(target_feature = "simd128")]
pub fn simd_dot(a: &[f32], b: &[f32]) -> f32 {
    let chunks = a.len() / 4;
    let mut sum_vec = f32x4_splat(0.0);
    for i in 0..chunks {
        let offset = i * 4;
        unsafe {
            let a_vec = v128_load(a.as_ptr().add(offset) as *const v128);
            let b_vec = v128_load(b.as_ptr().add(offset) as *const v128);
            let prod = f32x4_mul(a_vec, b_vec);
            sum_vec = f32x4_add(sum_vec, prod);
        }
    }
    // Horizontal sum + remainder handling
    let sum_array: [f32; 4] = unsafe { core::mem::transmute(sum_vec) };
    let mut sum = sum_array[0] + sum_array[1] + sum_array[2] + sum_array[3];
    for i in (chunks * 4)..a.len() { sum += a[i] * b[i]; }
    sum
}

SIMD Operations in ruvector-sparse-inference/src/backend/wasm.rs (Backend Trait)

pub struct WasmBackend;

impl Backend for WasmBackend {
    fn dot_product(&self, a: &[f32], b: &[f32]) -> f32 { /* SIMD dispatch */ }
    fn sparse_matmul(&self, matrix: &Array2<f32>, input: &[f32], rows: &[usize]) -> Vec<f32>;
    fn sparse_matmul_accumulate(&self, matrix: &Array2<f32>, input: &[f32], cols: &[usize], output: &mut [f32]);
    fn activation(&self, data: &mut [f32], activation_type: ActivationType); // ReLU via SIMD
    fn add(&self, a: &mut [f32], b: &[f32]);
    fn axpy(&self, a: &mut [f32], b: &[f32], scalar: f32);
    fn name(&self) -> &'static str { "WASM-SIMD" }
    fn simd_width(&self) -> usize { 4 } // 128-bit = 4 x f32
}
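
To make the sparse_matmul contract concrete, here is a scalar reference sketch that multiplies only the selected rows of a matrix by an input vector. It is an illustration under stated assumptions, not the crate's implementation: it uses a flat row-major slice with an explicit `cols` parameter instead of `Array2<f32>`, and the helper `dot` stands in for the SIMD dispatch.

```rust
/// Scalar dot product. In the WASM backend this is the step that would be
/// dispatched to f32x4 code.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Multiplies only the rows listed in `rows` by `input`.
/// `matrix` is row-major with `cols` elements per row (an assumption of
/// this sketch); `output[i]` corresponds to `rows[i]`.
pub fn sparse_matmul(matrix: &[f32], cols: usize, input: &[f32], rows: &[usize]) -> Vec<f32> {
    assert_eq!(input.len(), cols, "input length must equal the column count");
    rows.iter()
        .map(|&r| dot(&matrix[r * cols..(r + 1) * cols], input))
        .collect()
}
```

The cost scales with the number of selected rows rather than the full matrix height, which is the property a sublinear-time solver relies on.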

3.2 SIMD Acceleration Opportunities for the Sublinear-Time Solver

Based on the sublinear-time solver's core operations, the following SIMD acceleration points are identified:

| Operation | SIMD Strategy | Expected Speedup | Existing Pattern |
|---|---|---|---|
| Distance computation (dot, cosine, euclidean) | f32x4_mul + f32x4_add accumulation | 2-4x | ruvector-delta-wasm/src/simd.rs |
| Vector normalization | f32x4_mul (scale) + f32x4_add (L2 norm) | 2-4x | simd_l2_norm_squared, simd_scale |
| Bitset operations (partition tracking) | v128_xor, v128_and, popcount via lookup | 4-8x | ruvector-mincut/src/wasm/simd.rs |
| Sparse matrix-vector multiply | SIMD dot + sparse row selection | 2-4x | WasmBackend::sparse_matmul |
| Activation functions (ReLU, GELU) | f32x4_max with zero splat (ReLU only; GELU needs a separate approximation) | 2-4x | relu_wasm_simd |
| Thompson Sampling bandit updates | Scalar (branching-heavy) | 1x (no benefit) | N/A |
| Sort/selection (top-k) | Scalar (comparison-heavy) | 1x (no benefit) | N/A |
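
The bitset row in the table can be illustrated with a scalar reference: XOR two membership bitsets and count the set bits to find how many elements changed partition. This sketch assumes 64 members packed per `u64` word, and the function names are hypothetical. A SIMD build would process two such words per `v128` operation.

```rust
/// Counts elements whose membership differs between two partitions.
/// Each bitset packs 64 members per word (an assumption of this sketch).
pub fn partition_diff_count(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "bitsets must have the same word count");
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Counts elements that are members of both partitions.
pub fn intersection_count(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "bitsets must have the same word count");
    a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum()
}
```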

3.3 SIMD Feature Detection

The ruvector-wasm crate exposes SIMD detection to JS:

#[wasm_bindgen(js_name = detectSIMD)]
pub fn detect_simd() -> bool {
    #[cfg(target_feature = "simd128")]
    { true }
    #[cfg(not(target_feature = "simd128"))]
    { false }
}

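For the sublinear-time solver, SIMD should be compiled in via RUSTFLAGS="-C target-feature=+simd128" at build time. Because cfg(target_feature = "simd128") is resolved at compile time, detectSIMD reports how the module was built, not what the host supports, and a module containing SIMD instructions fails validation on an engine without SIMD support. Environments that lack SIMD therefore need a separately built scalar binary, selected by the host before instantiation (see Sections 5.1 and 8.2).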

3.4 Native SIMD Comparison

The native codebase (ruvector-core/src/simd_intrinsics.rs) supports:

  • x86_64: AVX2 (256-bit, 8 x f32), AVX-512 (512-bit, 16 x f32), FMA, INT8 quantized
  • aarch64: NEON (128-bit, 4 x f32), 4x loop unrolling, FMA via vfmaq_f32
  • WASM: SIMD128 (128-bit, 4 x f32)

WASM SIMD128 provides the same width as NEON (4 x f32) but has no fused multiply-add in the baseline instruction set (f32x4_fma is not available in stable WASM SIMD; FMA-style operations exist only in the separate relaxed-SIMD extension). As a rough, unmeasured estimate, the sublinear-time solver WASM build may be approximately 2-3x slower than a native NEON build for distance computations and 4-8x slower than an AVX-512 build, while remaining faster than the scalar fallback.


4. Memory Management Patterns

4.1 Shared Memory Protocol (Kernel Pack System)

The kernel pack system at /crates/ruvector-wasm/src/kernel/memory.rs defines a mature shared memory protocol:

pub struct SharedMemoryProtocol {
    total_size: usize,     // Total memory in bytes
    current_offset: usize, // Bump allocator position
    alignment: usize,      // Typically 16 bytes
}

impl SharedMemoryProtocol {
    pub fn default_settings() -> Self {
        Self::new(256, 16) // 256 pages = 16 MB, 16-byte alignment
    }

    pub fn allocate(&mut self, size: usize) -> Result<usize, KernelError> {
        let aligned_offset = self.align_offset(self.current_offset);
        // ...bounds check...
        self.current_offset = aligned_offset + size;
        Ok(aligned_offset)
    }
}

The KernelInvocationDescriptor manages tensor memory layout:

pub struct KernelInvocationDescriptor {
    pub descriptor: KernelDescriptor,  // input_a, input_b, output, scratch, params offsets+sizes
    protocol: SharedMemoryProtocol,
}

The MemoryLayoutValidator prevents region overlap and bounds violations.
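
The kind of check such a validator performs can be sketched as follows: every region must be aligned, must lie inside the arena, and must not intersect any other region. The types `Region` and `LayoutError` and the function `validate_layout` are hypothetical names for this illustration and are not taken from memory.rs.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Region {
    pub offset: usize,
    pub size: usize,
}

#[derive(Debug, PartialEq)]
pub enum LayoutError {
    Misaligned(usize),     // index of the offending region
    OutOfBounds(usize),    // index of the offending region
    Overlap(usize, usize), // indices of the two intersecting regions
}

/// Validates alignment, bounds, and pairwise disjointness of `regions`.
pub fn validate_layout(
    regions: &[Region],
    total_size: usize,
    alignment: usize,
) -> Result<(), LayoutError> {
    assert!(alignment > 0, "alignment must be non-zero");
    for (i, r) in regions.iter().enumerate() {
        if r.offset % alignment != 0 {
            return Err(LayoutError::Misaligned(i));
        }
        // checked_add guards against offset + size wrapping around.
        match r.offset.checked_add(r.size) {
            Some(end) if end <= total_size => {}
            _ => return Err(LayoutError::OutOfBounds(i)),
        }
    }
    for i in 0..regions.len() {
        for j in (i + 1)..regions.len() {
            let (a, b) = (regions[i], regions[j]);
            // Half-open ranges overlap iff each starts before the other ends.
            if a.offset < b.offset + b.size && b.offset < a.offset + a.size {
                return Err(LayoutError::Overlap(i, j));
            }
        }
    }
    Ok(())
}
```

The pairwise loop is quadratic, which is acceptable for the handful of regions (inputs, output, scratch, params) a kernel descriptor carries.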

4.2 Typed Arrays / Zero-Copy Transfer

The wasm-bindgen crates use Float32Array to pass vector data between JS and WASM. These conversions copy the data in both directions:

// Input: JS Float32Array -> Rust Vec<f32>
pub fn insert(&self, vector: Float32Array, ...) -> Result<String, JsValue> {
    let vector_data: Vec<f32> = vector.to_vec();  // Copy from JS typed array
    // ...
}

// Output: Rust Vec<f32> -> JS Float32Array
pub fn vector(&self) -> Float32Array {
    Float32Array::from(&self.inner.vector[..])  // Copy to JS typed array
}

Note: Float32Array::to_vec() and Float32Array::from() perform copies. Avoiding the intermediate copy requires accessing WASM linear memory directly from JS, as demonstrated in the pwa-loader:

// Write directly into WASM linear memory (one copy, no intermediate buffer)
function wasmWrite(data) {
    const ptr = wasmInstance.exports.rvf_alloc(data.length);
    const mem = new Uint8Array(wasmMemory.buffer, ptr, data.length);
    mem.set(data);  // Direct memory write
    return ptr;
}

// Read from WASM memory; .slice() copies the bytes out, so the result
// stays valid even if the memory later grows and the buffer is replaced
function wasmRead(ptr, len) {
    return new Uint8Array(wasmMemory.buffer, ptr, len).slice();
}

4.3 Memory Patterns in rvf-solver-wasm (no_std)

The no_std solver uses dlmalloc as global allocator and manages its own instance registry:

// Global mutable registry - safe in single-threaded WASM
static mut REGISTRY: Registry = Registry::new();
const MAX_INSTANCES: usize = 8;

struct SolverInstance {
    solver: AdaptiveSolver,
    last_result_json: Vec<u8>,   // Heap-allocated via dlmalloc
    policy_json: Vec<u8>,
    witness_chain: Vec<u8>,
}

Memory export for external reads uses raw pointer copies:

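#[no_mangle]
pub extern "C" fn rvf_solver_result_read(handle: i32, out_ptr: i32) -> i32 {
    // ... lookup of `inst` from `handle` elided ...
    let data = &inst.last_result_json;
    // The caller must have allocated at least data.len() bytes at out_ptr.
    unsafe {
        core::ptr::copy_nonoverlapping(data.as_ptr(), out_ptr as *mut u8, data.len());
    }
    data.len() as i32
}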

4.4 Memory Limits

| Configuration | Max Pages | Memory Limit | Context |
|---|---|---|---|
| Server runtime | 1024 | 64 MB | RuntimeConfig::server() |
| Embedded runtime | 64 | 4 MB | RuntimeConfig::embedded() |
| Default shared memory | 256 | 16 MB | SharedMemoryProtocol::default_settings() |
| Microkernel (RVF) | 2-4 | 128-256 KB | WasmHeader min/max pages |
| WASM page size | 1 | 64 KB | WASM_PAGE_SIZE = 65536 |
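
The limits in the table follow directly from the 64 KB page size. A small worked sketch of the arithmetic, with hypothetical helper names `pages_to_bytes` and `bytes_to_pages`:

```rust
/// One WebAssembly page is 64 KiB.
pub const WASM_PAGE_SIZE: usize = 65_536;

/// Converts a page count to bytes; None if the product overflows usize.
pub fn pages_to_bytes(pages: usize) -> Option<usize> {
    pages.checked_mul(WASM_PAGE_SIZE)
}

/// Smallest number of pages that can hold `bytes`.
pub fn bytes_to_pages(bytes: usize) -> usize {
    bytes / WASM_PAGE_SIZE + usize::from(bytes % WASM_PAGE_SIZE != 0)
}
```

For example, a 160 KB payload needs 3 pages, which fits within the 2-4 page microkernel budget.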

4.5 Security Boundary Validation

The ruvector-wasm crate enforces input validation at the WASM boundary:

const MAX_VECTOR_DIMENSIONS: usize = 65536;

#[wasm_bindgen(constructor)]
pub fn new(vector: Float32Array, ...) -> Result<JsVectorEntry, JsValue> {
    let vec_len = vector.length() as usize;
    if vec_len == 0 {
        return Err(JsValue::from_str("Vector cannot be empty"));
    }
    if vec_len > MAX_VECTOR_DIMENSIONS {
        return Err(JsValue::from_str(&format!(
            "Vector dimensions {} exceed maximum allowed {}", vec_len, MAX_VECTOR_DIMENSIONS
        )));
    }
    // ...
}

5. Browser vs Node.js Deployment Strategies

5.1 Browser Deployment (Primary)

The ruvector-wasm crate is browser-first, using:

  • IndexedDB persistence: web-sys features include IdbDatabase, IdbFactory, IdbObjectStore, IdbRequest, IdbTransaction, IdbOpenDbRequest (/crates/ruvector-wasm/Cargo.toml)
  • Web Workers: Embedded JavaScript worker pool (/crates/ruvector-wasm/src/worker-pool.js, /crates/ruvector-wasm/src/worker.js) for parallel operations
  • Tracing via console: tracing-wasm sends logs to browser dev tools
  • Promise-based async: wasm-bindgen-futures for async operations
  • getrandom via JS: getrandom with wasm_js feature uses crypto.getRandomValues()
  • PWA support: The pwa-loader example (/examples/pwa-loader/app.js) demonstrates offline-capable WASM loading

Browser Loading Pattern

// From examples/pwa-loader/app.js
async function loadWasm() {
    const response = await fetch(WASM_PATH);
    const bytes = await response.arrayBuffer();
    const importObject = { env: {} };
    const result = await WebAssembly.instantiate(bytes, importObject);
    wasmInstance = result.instance;
    wasmMemory = wasmInstance.exports.memory;
}

Browser SIMD Support

WASM SIMD128 is supported in Chrome 91+, Firefox 89+, Safari 16.4+, and Edge 91+, which is estimated to cover >95% of active browsers as of 2026. Feature detection can be done by validating a minimal module that contains a SIMD instruction:

const simdSupported = WebAssembly.validate(
    new Uint8Array([0,97,115,109,1,0,0,0,1,5,1,96,0,1,123,3,2,1,0,10,10,1,8,0,65,0,253,15,253,98,11])
);

5.2 Node.js Deployment

The project supports Node.js via:

  • wasm-pack build --target nodejs: Generates CommonJS bindings
  • Direct instantiation from test scripts (/scripts/test/test-wasm.mjs):
import { readFileSync } from 'fs';
import { join } from 'path';
// pkgPath and wasmPath are defined earlier in the script
const wasmBuffer = readFileSync(wasmPath);
const mathWasm = await import(join(pkgPath, 'ruvector_math_wasm.js'));
await mathWasm.default(wasmBuffer);
  • Edge-net example: /examples/edge-net/pkg/node/ provides Node-specific WASM packages

Node.js has had WASM SIMD support since v16.4 (V8 9.1+). For the sublinear-time solver, Node.js deployment enables server-side and CLI usage with the same WASM binary.

5.3 Edge / Embedded Deployment

The micro-hnsw-wasm crate (11.8 KB) and rvf-solver-wasm (~160 KB) demonstrate ultra-compact deployment:

  • iOS/Swift: /examples/wasm/ios/ includes Swift resources with embedded WASM
  • Self-bootstrapping: The WASM_SEG system embeds WASM interpreters inside data files
  • Target platforms: WasmTarget::Wasm32, WasiP1, WasiP2, Browser, BareTile

5.4 Deployment Target Matrix

| Target | WASM Format | Binding | SIMD | Size Budget | Persistence |
|---|---|---|---|---|---|
| Browser (Chrome/FF/Safari) | wasm-bindgen | JS glue + TS types | SIMD128 | <500 KB | IndexedDB |
| Node.js (>= 16.4) | wasm-bindgen (nodejs) or raw | CommonJS/ESM | SIMD128 | <1 MB | fs |
| Cloudflare Workers | wasm-bindgen (web) | ESM | SIMD128 | <1 MB | KV |
| iOS/Swift | raw wasm32 | C FFI | Optional | <200 KB | CoreData |
| Bare-metal / RVF | no_std cdylib | extern "C" | Optional | <200 KB | None |

6. WASM Module Loading and Initialization Patterns

6.1 Pattern 1: wasm-bindgen Auto-Init

Used by most WASM crates. The #[wasm_bindgen(start)] attribute runs initialization automatically:

#[wasm_bindgen(start)]
pub fn init() {
    console_error_panic_hook::set_once();
    tracing_wasm::set_as_global_default();
}

JS side (generated by wasm-pack):

import init, { VectorDB } from './ruvector_wasm.js';
await init();  // Loads + instantiates + runs start function
const db = new VectorDB(384, 'cosine', true);

6.2 Pattern 2: Manual WebAssembly.instantiate

Used by the pwa-loader and no_std modules:

const response = await fetch(WASM_PATH);
const bytes = await response.arrayBuffer();
const importObject = { env: {} };
const result = await WebAssembly.instantiate(bytes, importObject);
wasmInstance = result.instance;
wasmMemory = wasmInstance.exports.memory;

This pattern offers maximum control: the host can inspect exports before calling any function, handle errors granularly, and manage memory directly.

6.3 Pattern 3: Streaming Instantiation

For large modules, WebAssembly.instantiateStreaming should be used (not currently in the codebase but recommended):

const result = await WebAssembly.instantiateStreaming(
    fetch(WASM_PATH),
    importObject
);

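This starts compiling while bytes are still downloading, which can noticeably reduce load time for large modules. The server must deliver the file with the application/wasm MIME type; otherwise instantiateStreaming rejects.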

6.4 Pattern 4: Unified Engine Lazy Init

The @ruvector/wasm-unified uses lazy initialization:

let defaultEngine: UnifiedEngine | null = null;

export async function getDefaultEngine(): Promise<UnifiedEngine> {
    if (!defaultEngine) {
        defaultEngine = await createUnifiedEngine();
        await defaultEngine.init();
    }
    return defaultEngine;
}

6.5 Pattern 5: Instance Registry (rvf-solver-wasm)

The solver WASM uses a handle-based instance registry:

static mut REGISTRY: Registry = Registry::new();  // Max 8 concurrent solvers

// JS creates solver:
let handle = wasmInstance.exports.rvf_solver_create();
// JS uses solver:
wasmInstance.exports.rvf_solver_train(handle, 100, 1, 10, seedLo, seedHi);
// JS reads result:
let len = wasmInstance.exports.rvf_solver_result_len(handle);
let ptr = wasmInstance.exports.rvf_solver_alloc(len);
wasmInstance.exports.rvf_solver_result_read(handle, ptr);
let json = new TextDecoder().decode(new Uint8Array(wasmMemory.buffer, ptr, len));
// JS destroys:
wasmInstance.exports.rvf_solver_destroy(handle);

This is the recommended pattern for the sublinear-time solver because it:

  • Supports multiple concurrent solver instances
  • Keeps each solver's state isolated behind its handle (the registry itself is the only global)
  • Enables resource cleanup
  • Works across all deployment targets (browser, Node, bare-metal)
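
The handle-based registry can be sketched in safe Rust as a fixed array of optional slots. This is an illustration only: `Instance` is a stand-in for the real solver state, and the 1-based handles and -1 error code are assumptions, since the actual conventions of rvf-solver-wasm are not shown in this document.

```rust
const MAX_INSTANCES: usize = 8;

/// Stand-in for per-solver state (hypothetical).
#[derive(Default)]
pub struct Instance {
    pub last_result: Vec<u8>,
}

pub struct Registry {
    slots: [Option<Instance>; MAX_INSTANCES],
}

impl Registry {
    pub fn new() -> Self {
        Self { slots: Default::default() }
    }

    /// Returns a 1-based handle, or -1 when all slots are occupied.
    pub fn create(&mut self) -> i32 {
        match self.slots.iter().position(|s| s.is_none()) {
            Some(i) => {
                self.slots[i] = Some(Instance::default());
                i as i32 + 1
            }
            None => -1,
        }
    }

    /// Maps a handle to a slot index, rejecting zero, negative, and
    /// out-of-range values.
    fn index(handle: i32) -> Option<usize> {
        let i = usize::try_from(handle.checked_sub(1)?).ok()?;
        (i < MAX_INSTANCES).then_some(i)
    }

    pub fn get_mut(&mut self, handle: i32) -> Option<&mut Instance> {
        self.slots[Self::index(handle)?].as_mut()
    }

    /// Frees the slot. Returns 0 on success, -1 for an unknown handle.
    pub fn destroy(&mut self, handle: i32) -> i32 {
        match Self::index(handle).and_then(|i| self.slots[i].take()) {
            Some(_) => 0,
            None => -1,
        }
    }
}
```

Validating every handle at the boundary matters because the host passes plain integers, so a stale or corrupted handle must fail cleanly instead of indexing out of range.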

7. Performance Benchmarking Framework for WASM

7.1 Existing Benchmark Infrastructure

In-WASM Benchmark Function

The ruvector-wasm crate includes a built-in benchmark export:

#[wasm_bindgen(js_name = benchmark)]
pub fn benchmark(name: &str, iterations: usize, dimensions: usize) -> Result<f64, JsValue> {
    // ... construction of `db` elided ...
    // Note: std::time::Instant panics on wasm32-unknown-unknown, so `Instant`
    // here must come from a WASM-compatible clock.
    let start = Instant::now();
    for i in 0..iterations {
        let vector: Vec<f32> = (0..dimensions)
            .map(|_| js_sys::Math::random() as f32)
            .collect();
        let vector_arr = Float32Array::from(&vector[..]);
        db.insert(vector_arr, Some(format!("vec_{}", i)), None)?;
    }
    let duration = start.elapsed();
    // Returns throughput in inserts per second
    Ok(iterations as f64 / duration.as_secs_f64())
}

WASM Solver Benchmark Binary

The /examples/benchmarks/src/bin/wasm_solver_bench.rs provides a native vs WASM comparison framework:

WASM vs Native AGI Solver Benchmark
  Config: holdout=50, training=50, cycles=3, budget=200

  NATIVE SOLVER RESULTS
  Mode          Acc%       Cost    Noise%    Time     Pass
  A baseline   xx.x%     xxx.x    xx.x%    xxxms    PASS
  B compiler   xx.x%     xxx.x    xx.x%    xxxms    PASS
  C learned    xx.x%     xxx.x    xx.x%    xxxms    PASS

  WASM REFERENCE METRICS
  Native total time:  xxxms
  WASM expected:      ~xxxms (2-5x native)

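This sets the working expectation for WASM overhead at 2-5x slower than native for the self-learning solver workload. Note that the figure is reported as an expected value alongside the native timings, not as a measured WASM run.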

SIMD Benchmarks

The /crates/prime-radiant/benches/simd_benchmarks.rs and /crates/ruvector-sparse-inference/benches/simd_kernels.rs provide Criterion benchmarks for SIMD operations that can be adapted for WASM SIMD.

7.2 Recommended Benchmarking Framework for the Sublinear-Time Solver

sublinear-time-solver/benches/
  wasm_bench.rs          -- In-Rust Criterion benchmarks (native baseline)
  wasm_bench.mjs         -- Node.js WASM performance runner
  wasm_bench.html        -- Browser WASM performance runner
  bench_harness.rs       -- Shared benchmark harness (puzzle generation)

Metrics to Track

| Metric | Description | Measurement |
|---|---|---|
| solve_throughput | Puzzles solved per second | iterations / elapsed_secs |
| solve_latency_p50 | Median solve time | Percentile of individual solve times |
| solve_latency_p99 | 99th percentile solve time | Percentile of individual solve times |
| memory_peak_bytes | Peak WASM linear memory usage | memory.buffer.byteLength |
| module_load_ms | Time to instantiate WASM module | performance.now() around WebAssembly.instantiate |
| simd_speedup | SIMD vs scalar performance ratio | Compare SIMD build vs non-SIMD build |
| wasm_native_ratio | WASM-to-native performance overhead | Compare WASM throughput vs native Criterion results |
| binary_size_bytes | Compiled .wasm file size | wc -c *.wasm |
| accuracy_parity | Solver accuracy matches native | Bit-exact or epsilon comparison of results |
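
The throughput and latency-percentile metrics can be computed from a list of per-solve timings. The sketch below uses the nearest-rank percentile definition, which is one common choice and an assumption here; the function names are hypothetical.

```rust
/// Nearest-rank percentile over per-solve latencies.
/// `p` must be in (0, 100]; returns None for empty input or invalid `p`.
pub fn percentile(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(p > 0.0 && p <= 100.0) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank: the smallest rank covering at least p% of the samples.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}

/// solve_throughput = iterations / elapsed_secs.
pub fn throughput(iterations: usize, elapsed_secs: f64) -> f64 {
    iterations as f64 / elapsed_secs
}
```

With fewer than 100 samples, p99 under this definition equals the maximum, so a run needs enough iterations for the tail metric to be meaningful.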

Benchmark Protocol

  1. Native baseline: Run the solver natively with Criterion (3+ iterations, warm-up)
  2. WASM baseline: Load the same solver as WASM, run identical workload in Node.js
  3. WASM SIMD: Build with RUSTFLAGS="-C target-feature=+simd128", measure speedup
  4. Browser measurement: Run in Chrome with performance.now(), measure real-world latency
  5. Size budget: Track .wasm binary size across commits (regression alerts if >200 KB)
  6. Accuracy validation: Compare solver output JSON between native and WASM (must match to f64 epsilon)
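
The accuracy-validation step can be sketched as a tolerance comparison over numeric outputs extracted from the native and WASM results. The relative-tolerance rule and the names `approx_eq` and `first_mismatch` are assumptions for illustration; the appropriate tolerance depends on how much floating-point divergence the solver is expected to show.

```rust
/// True when `a` and `b` are identical or differ by at most `rel_tol`
/// relative to the larger magnitude. NaN never matches.
pub fn approx_eq(a: f64, b: f64, rel_tol: f64) -> bool {
    if a == b {
        return true; // covers exact matches, equal infinities, and +0/-0
    }
    if !a.is_finite() || !b.is_finite() {
        return false; // NaN, or an infinity against anything else
    }
    (a - b).abs() <= rel_tol * a.abs().max(b.abs())
}

/// Index of the first differing element, or None if the outputs agree.
/// A length mismatch is reported at the end of the shorter slice.
pub fn first_mismatch(native: &[f64], wasm: &[f64], rel_tol: f64) -> Option<usize> {
    if native.len() != wasm.len() {
        return Some(native.len().min(wasm.len()));
    }
    native
        .iter()
        .zip(wasm)
        .position(|(a, b)| !approx_eq(*a, *b, rel_tol))
}
```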

8. Recommendations for the Sublinear-Time Solver

8.1 Binding Strategy: Use no_std + extern "C" (Pattern B)

For the sublinear-time solver WASM module, adopt the rvf-solver-wasm pattern:

  • no_std + alloc: Minimizes binary size, avoids JS runtime dependency
  • dlmalloc global allocator: Proven in rvf-solver-wasm
  • extern "C" exports: Maximum portability (browser, Node, embedded, bare-metal)
  • Handle-based instance registry: Supports concurrent solver instances
  • Result reads via pointer+length: JSON serialization of results into WASM memory, host reads via typed array view

Do not use wasm-bindgen for the core solver. A thin wasm-bindgen wrapper can be created separately if a richer JS API is needed.

8.2 SIMD Strategy: Conditional Compilation

// In the solver crate
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
mod simd_wasm {
    use core::arch::wasm32::*;
    pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* SIMD128 */ }
}

#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
mod simd_wasm {
    pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* scalar fallback */ }
}

Build two variants:

  • solver.wasm -- scalar fallback (maximum compatibility)
  • solver-simd.wasm -- SIMD128 enabled (Chrome 91+, FF 89+, Safari 16.4+, Node 16.4+)
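
A filled-in version of the conditional-compilation pattern might look like the sketch below. It is an illustration under assumptions: the function names are hypothetical, and it calls `f32::sqrt`, which needs std (a no_std build would use libm's `sqrtf` instead). On any non-WASM target only the scalar path is compiled.

```rust
/// Scalar squared-L2 distance. Always compiled; the SIMD path reuses it
/// for the tail elements that do not fill a 4-lane vector.
fn distance_l2_sq_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
fn distance_l2_sq(a: &[f32], b: &[f32]) -> f32 {
    use core::arch::wasm32::*;
    let chunks = a.len() / 4;
    let mut acc = f32x4_splat(0.0);
    for i in 0..chunks {
        // SAFETY: i * 4 + 3 < a.len() == b.len(), and v128_load accepts
        // unaligned addresses.
        let (va, vb) = unsafe {
            (
                v128_load(a.as_ptr().add(i * 4) as *const v128),
                v128_load(b.as_ptr().add(i * 4) as *const v128),
            )
        };
        let d = f32x4_sub(va, vb);
        acc = f32x4_add(acc, f32x4_mul(d, d));
    }
    // Horizontal sum of the four lanes, then the scalar remainder.
    let lanes = f32x4_extract_lane::<0>(acc)
        + f32x4_extract_lane::<1>(acc)
        + f32x4_extract_lane::<2>(acc)
        + f32x4_extract_lane::<3>(acc);
    lanes + distance_l2_sq_scalar(&a[chunks * 4..], &b[chunks * 4..])
}

#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
fn distance_l2_sq(a: &[f32], b: &[f32]) -> f32 {
    distance_l2_sq_scalar(a, b)
}

/// Euclidean distance between two equal-length vectors.
pub fn distance_l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    distance_l2_sq(a, b).sqrt()
}
```

Checking the lengths once in the public entry point keeps the unsafe loads in the SIMD path in bounds for both inputs.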

8.3 Memory Strategy: Bump Allocator + Shared Memory Protocol

Adopt the SharedMemoryProtocol pattern from the kernel pack system:

  1. Allocate a fixed arena at solver creation (e.g., 256 pages = 16 MB)
  2. Use 16-byte aligned bump allocation for tensor data
  3. Reset the allocator between solve invocations (an O(1) pointer reset)
  4. Validate memory regions before kernel execution
  5. Export memory so the host can directly view/write typed arrays without copying
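
Steps 1 to 3 can be sketched as an offset-only bump arena. `BumpArena` is a hypothetical name; it tracks offsets into a fixed region in the spirit of SharedMemoryProtocol but is not the crate's implementation, and it returns `Option` instead of the crate's `KernelError`.

```rust
const WASM_PAGE_SIZE: usize = 65_536;

/// Offset-only bump allocator over a fixed arena (illustrative sketch).
pub struct BumpArena {
    total_size: usize,
    offset: usize,
    alignment: usize,
}

impl BumpArena {
    pub fn new(pages: usize, alignment: usize) -> Self {
        assert!(alignment.is_power_of_two(), "alignment must be a power of two");
        Self { total_size: pages * WASM_PAGE_SIZE, offset: 0, alignment }
    }

    /// Returns the aligned start offset of a new `size`-byte block,
    /// or None if the arena cannot hold it.
    pub fn allocate(&mut self, size: usize) -> Option<usize> {
        // Round the current offset up to the next multiple of `alignment`.
        let mask = self.alignment - 1;
        let aligned = self.offset.checked_add(mask)? & !mask;
        let end = aligned.checked_add(size)?;
        if end > self.total_size {
            return None;
        }
        self.offset = end;
        Some(aligned)
    }

    /// Frees every block at once by rewinding the offset: O(1).
    pub fn reset(&mut self) {
        self.offset = 0;
    }

    pub fn used(&self) -> usize {
        self.offset
    }
}
```

Because individual blocks are never freed, this fits a solver whose scratch data lives for exactly one solve call and is discarded together.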

8.4 Build Profile

[profile.release]
opt-level = "z"
lto = true
codegen-units = 1
strip = true
panic = "abort"

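Target binary size: <200 KB (consistent with existing rvf-solver-wasm at ~160 KB). Note that Cargo reads [profile.*] sections only from the workspace root manifest, so this profile takes effect only if the solver crate is built outside the main workspace or the settings are placed in the root Cargo.toml.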

8.5 Feature Detection Export

#[no_mangle]
pub extern "C" fn solver_capabilities() -> u32 {
    let mut caps = 0u32;
    #[cfg(target_feature = "simd128")]
    { caps |= 0x01; }  // SIMD compiled in (build-time flag, not a host check)
    #[cfg(feature = "thompson-sampling")]
    { caps |= 0x02; }  // Thompson Sampling enabled
    #[cfg(feature = "witness-chain")]
    { caps |= 0x04; }  // Witness chain enabled
    caps
}
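
On the consuming side, a Rust host (for example one embedding the module through a WASM runtime) could decode the returned bitmask as sketched below. The bit values match the export above; the `Capabilities` struct and `decode_capabilities` function are hypothetical names for this illustration.

```rust
pub const CAP_SIMD: u32 = 0x01;
pub const CAP_THOMPSON_SAMPLING: u32 = 0x02;
pub const CAP_WITNESS_CHAIN: u32 = 0x04;

#[derive(Debug, PartialEq, Eq)]
pub struct Capabilities {
    pub simd: bool,
    pub thompson_sampling: bool,
    pub witness_chain: bool,
    /// Bits set by the module that this host does not recognize.
    pub unknown_bits: u32,
}

/// Decodes the value returned by the module's solver_capabilities export.
pub fn decode_capabilities(caps: u32) -> Capabilities {
    let known = CAP_SIMD | CAP_THOMPSON_SAMPLING | CAP_WITNESS_CHAIN;
    Capabilities {
        simd: caps & CAP_SIMD != 0,
        thompson_sampling: caps & CAP_THOMPSON_SAMPLING != 0,
        witness_chain: caps & CAP_WITNESS_CHAIN != 0,
        unknown_bits: caps & !known,
    }
}
```

Surfacing unknown bits lets an older host notice that it is talking to a newer module rather than silently ignoring new features.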

8.6 Testing Strategy

  • Use wasm-bindgen-test with run_in_browser for browser tests (existing pattern)
  • Use the Node.js test harness at /scripts/test/test-wasm.mjs as a template
  • Validate accuracy parity with native build via wasm_solver_bench
  • Run SIMD-specific tests with RUSTFLAGS="-C target-feature=+simd128" in CI

Appendix A: File Reference

Core WASM Source Files

| File | Purpose |
|---|---|
| /crates/ruvector-wasm/src/lib.rs | Main VectorDB WASM bindings (wasm-bindgen) |
| /crates/ruvector-wasm/src/kernel/mod.rs | Kernel pack system entry point |
| /crates/ruvector-wasm/src/kernel/memory.rs | Shared memory protocol, bump allocator |
| /crates/ruvector-wasm/src/kernel/runtime.rs | Kernel runtime trait, mock runtime, manager |
| /crates/ruvector-wasm/src/kernel/epoch.rs | Epoch-based execution budgets |
| /crates/ruvector-wasm/src/kernel/signature.rs | Ed25519 kernel pack verification |
| /crates/ruvector-wasm/src/kernel/manifest.rs | Kernel manifest parsing |
| /crates/ruvector-wasm/Cargo.toml | WASM dependency configuration |

SIMD Source Files

| File | Purpose |
|---|---|
| /crates/ruvector-delta-wasm/src/simd.rs | WASM SIMD128 f32x4 operations |
| /crates/ruvector-sparse-inference/src/backend/wasm.rs | WASM SIMD backend with Backend trait |
| /crates/ruvector-mincut/src/wasm/simd.rs | WASM SIMD128 bitset operations |
| /crates/ruvector-core/src/simd_intrinsics.rs | Native SIMD (AVX2/AVX-512/NEON) reference |

Solver WASM Source Files

| File | Purpose |
|---|---|
| /crates/rvf/rvf-solver-wasm/src/lib.rs | Self-learning solver WASM exports (no_std) |
| /crates/rvf/rvf-solver-wasm/src/engine.rs | Adaptive solver engine |
| /crates/rvf/rvf-solver-wasm/src/policy.rs | PolicyKernel with Thompson Sampling |
| /crates/rvf/rvf-solver-wasm/Cargo.toml | no_std WASM build configuration |

Build and Test Files

| File | Purpose |
|---|---|
| /Cargo.toml | Workspace WASM dependencies and build profiles |
| /scripts/test/test-wasm.mjs | Node.js WASM test runner |
| /examples/benchmarks/src/bin/wasm_solver_bench.rs | Native vs WASM benchmark comparison |
| /examples/pwa-loader/app.js | Browser WASM loading and memory management |

RVF Self-Bootstrap Files

| File | Purpose |
|---|---|
| /crates/rvf/rvf-types/src/wasm_bootstrap.rs | WasmHeader, WasmRole, WasmTarget, feature flags |

TypeScript/npm Files

| File | Purpose |
|---|---|
| /npm/packages/ruvector-wasm-unified/src/index.ts | Unified WASM engine TypeScript API |

Appendix B: WASM Binary Size Inventory

| Binary | Size | Strategy |
|---|---|---|
| micro_hnsw.wasm | 11.8 KB | no_std, bare minimum |
| ruvector_learning_wasm_bg.wasm | 39 KB | wasm-bindgen |
| ruvector_exotic_wasm_bg.wasm | 149 KB | wasm-bindgen |
| ruvector_nervous_system_wasm_bg.wasm | 178 KB | wasm-bindgen |
| ruvector_economy_wasm_bg.wasm | 181 KB | wasm-bindgen |
| ruvector_attention_unified_wasm_bg.wasm | 339 KB | wasm-bindgen |
| rvf-solver-wasm (estimated) | ~160 KB | no_std + dlmalloc |

The sublinear-time solver should target the <200 KB range using the no_std approach, consistent with rvf-solver-wasm.