Agent: 6 (WASM Integration Specialist)
Date: 2026-02-20
Scope: ruvector codebase WASM capabilities, build pipeline, SIMD acceleration, memory management, deployment strategies, module loading, and benchmarking framework
- Existing WASM Usage in ruvector
- WASM Build Pipeline Compatibility
- SIMD Acceleration Opportunities
- Memory Management Patterns
- Browser vs Node.js Deployment Strategies
- WASM Module Loading and Initialization Patterns
- Performance Benchmarking Framework for WASM
- Recommendations for the Sublinear-Time Solver
The ruvector project has a massive, mature WASM infrastructure. The workspace defines 27 dedicated WASM crates in the Cargo workspace, spanning vector database operations, attention mechanisms, graph algorithms, ML inference, and self-learning solvers. This is not an experimental feature -- it is a first-class deployment target.
| Crate | Description | Target | Size |
|---|---|---|---|
| ruvector-wasm | Core vector DB bindings (HNSW, insert, search, delete) | wasm32-unknown-unknown (wasm-bindgen) | ~28 KB src |
| rvf-solver-wasm | Self-learning temporal solver (Thompson Sampling, PolicyKernel) | wasm32-unknown-unknown (no_std + alloc, extern "C") | ~160 KB compiled |
| rvf-wasm | RVF format microkernel for browser/edge vector ops | wasm32-unknown-unknown | - |
| micro-hnsw-wasm | Neuromorphic HNSW with spiking neural nets | wasm32-unknown-unknown | 11.8 KB compiled |
| ruvector-attention-wasm | 18+ attention mechanisms (Flash, MoE, Hyperbolic) | wasm32-unknown-unknown (wasm-bindgen) | - |
| ruvector-attention-unified-wasm | Unified attention API | wasm32-unknown-unknown | 339 KB compiled |
| ruvector-learning-wasm | MicroLoRA adaptation (<100us latency) | wasm32-unknown-unknown | 39 KB compiled |
| ruvector-nervous-system-wasm | Bio-inspired neural simulation | wasm32-unknown-unknown | 178 KB compiled |
| ruvector-economy-wasm | Compute credit management | wasm32-unknown-unknown | 181 KB compiled |
| ruvector-exotic-wasm | Quantum, hyperbolic, topological | wasm32-unknown-unknown | 149 KB compiled |
| ruvector-sparse-inference-wasm | Sparse matrix inference with WASM SIMD | wasm32-unknown-unknown | - |
| ruvector-delta-wasm | Delta operations with SIMD | wasm32-unknown-unknown | - |
| ruvector-mincut-wasm | Subpolynomial-time dynamic min-cut | wasm32-unknown-unknown | - |
| ruvector-mincut-gated-transformer-wasm | Gated transformer min-cut | wasm32-unknown-unknown | - |
| ruvector-graph-wasm | Graph operations | wasm32-unknown-unknown | - |
| ruvector-gnn-wasm | Graph neural networks | wasm32-unknown-unknown | - |
| ruvector-dag-wasm | Minimal DAG for browser/embedded | wasm32-unknown-unknown | - |
| ruvector-math-wasm | Math operations (Wasserstein, manifolds, spherical) | wasm32-unknown-unknown | - |
| ruvector-router-wasm | Query routing | wasm32-unknown-unknown | - |
| ruvector-fpga-transformer-wasm | FPGA transformer simulation | wasm32-unknown-unknown | - |
| ruvector-temporal-tensor-wasm | Temporal tensor operations | wasm32-unknown-unknown | - |
| ruvector-tiny-dancer-wasm | Lightweight operations | wasm32-unknown-unknown | - |
| ruvector-hyperbolic-hnsw-wasm | Hyperbolic HNSW | wasm32-unknown-unknown | - |
| ruvector-domain-expansion-wasm | Cross-domain transfer learning | wasm32-unknown-unknown | - |
| ruvllm-wasm | LLM inference | wasm32-unknown-unknown | - |
| ruqu-wasm | Quantum operations | wasm32-unknown-unknown | - |
| exo-wasm (example) | Exo AI experiment | wasm32-unknown-unknown | - |
The codebase employs two fundamentally different WASM integration patterns:
Used by: ruvector-wasm, ruvector-attention-wasm, ruvector-math-wasm, most -wasm crates.
// crates/ruvector-wasm/src/lib.rs
use wasm_bindgen::prelude::*;
use js_sys::{Float32Array, Object, Promise};
use web_sys::{console, IdbDatabase, IdbFactory};
#[wasm_bindgen(start)]
pub fn init() {
console_error_panic_hook::set_once();
tracing_wasm::set_as_global_default();
}
#[wasm_bindgen]
pub struct VectorDB { /* ... */ }
#[wasm_bindgen]
impl VectorDB {
#[wasm_bindgen(constructor)]
pub fn new(dimensions: usize, metric: Option<String>, use_hnsw: Option<bool>)
-> Result<VectorDB, JsValue> { /* ... */ }
}
Key dependencies: wasm-bindgen, wasm-bindgen-futures, js-sys, web-sys, serde-wasm-bindgen, console_error_panic_hook.
Advantages: Rich JS interop, automatic TypeScript type generation, Promise support, access to Web APIs (IndexedDB, Workers, console).
Used by: rvf-solver-wasm, rvf-wasm, micro-hnsw-wasm.
// crates/rvf/rvf-solver-wasm/src/lib.rs
#![no_std]
extern crate alloc;
#[no_mangle]
pub extern "C" fn rvf_solver_create() -> i32 {
registry().create()
}
#[no_mangle]
pub extern "C" fn rvf_solver_train(handle: i32, count: i32, /* ... */) -> i32 { /* ... */ }
Key dependencies: dlmalloc (global allocator), libm, serde (no_std + alloc). No wasm-bindgen.
Advantages: Minimal binary size (~160 KB for rvf-solver-wasm, 11.8 KB for micro-hnsw-wasm), no JS runtime dependency, runs on bare wasm32-unknown-unknown, suitable for self-bootstrapping RVF files.
The ruvector-wasm crate includes a sophisticated Kernel Pack System (/crates/ruvector-wasm/src/kernel/) for secure, sandboxed execution of ML compute kernels via Wasmtime:
- Manifest parsing (manifest.rs): Declares kernel categories (Positional/RoPE, Normalization/RMSNorm, Activation/SwiGLU, KV-Cache, Adapter/LoRA), tensor specs, resource limits
- Ed25519 signature verification (signature.rs): Supply chain security for kernel packs
- SHA256 hash verification (hash.rs): Content integrity
- Epoch-based execution budgets (epoch.rs): Coarse-grained interruption with configurable tick intervals (10ms server, 1ms embedded)
- Shared memory protocol (memory.rs): 16-byte aligned allocation, region overlap validation, tensor layout management
- Kernel runtime (runtime.rs): KernelRuntime trait with compile/instantiate/execute lifecycle, mock runtime for testing
- Trusted allowlist (allowlist.rs): Restricts which kernel IDs may execute
This kernel pack system is directly relevant to the sublinear-time solver because it provides a ready-made infrastructure for sandboxed execution of solver kernels with resource limits.
The rvf-types crate defines a WasmHeader (/crates/rvf/rvf-types/src/wasm_bootstrap.rs) for embedding WASM modules directly inside .rvf data files:
.rvf file
+-- WASM_SEG (role=Interpreter, ~50 KB)
+-- WASM_SEG (role=Microkernel, ~5.5 KB)
+-- VEC_SEG (data)
Roles: Microkernel, Interpreter, Combined, Extension, ControlPlane.
Targets: Wasm32, WasiP1, WasiP2, Browser, BareTile.
Feature flags: WASM_FEAT_SIMD, WASM_FEAT_BULK_MEMORY, WASM_FEAT_MULTI_VALUE, WASM_FEAT_REFERENCE_TYPES, WASM_FEAT_THREADS, WASM_FEAT_TAIL_CALL, WASM_FEAT_GC, WASM_FEAT_EXCEPTION_HANDLING.
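A host that loads an embedded WASM segment would typically compare the module's required feature flags against what the runtime supports before instantiating it. The sketch below illustrates that check. The bit positions and the u16 width are assumptions made for illustration (the real values live in wasm_bootstrap.rs), and missing_features and can_run are hypothetical helper names.

```rust
// Hypothetical bit positions; the actual constants are defined in wasm_bootstrap.rs.
pub const WASM_FEAT_SIMD: u16 = 1 << 0;
pub const WASM_FEAT_BULK_MEMORY: u16 = 1 << 1;
pub const WASM_FEAT_THREADS: u16 = 1 << 4;

/// Returns the subset of `required` bits that the host does not advertise.
pub fn missing_features(required: u16, supported: u16) -> u16 {
    required & !supported
}

/// A module can run only if every feature bit it requires is supported.
pub fn can_run(required: u16, supported: u16) -> bool {
    missing_features(required, supported) == 0
}
```

Returning the missing bits rather than a plain boolean lets the host report which feature blocked loading, or fall back to a different segment.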
The @ruvector/wasm-unified npm package (/npm/packages/ruvector-wasm-unified/src/index.ts) provides a high-level TypeScript surface combining all WASM modules:
export interface UnifiedEngine {
attention: AttentionEngine; // 14+ mechanisms
learning: LearningEngine; // MicroLoRA, SONA, BTSP, RL
nervous: NervousEngine; // Bio-inspired neural simulation
economy: EconomyEngine; // Compute credits
exotic: ExoticEngine; // Quantum, hyperbolic, topological
version(): string;
getStats(): UnifiedStats;
init(): Promise<void>;
dispose(): void;
}
The root Cargo.toml defines workspace-level WASM dependencies:
# /Cargo.toml (workspace)
[workspace.dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["Worker", "MessagePort", "console"] }
getrandom = { version = "0.3", features = ["wasm_js"] }
There is also a getrandom compatibility patch for WASM:
# In ruvector-wasm/Cargo.toml
getrandom02 = { package = "getrandom", version = "0.2", features = ["js"] }
[target.'cfg(target_arch = "wasm32")'.dependencies]
getrandom = { workspace = true, features = ["wasm_js"] }
And a workspace-level patch for hnsw_rs to use rand 0.8 for WASM compatibility:
[patch.crates-io]
hnsw_rs = { path = "./patches/hnsw_rs" }
Two distinct WASM build profiles exist, shown below alongside the root (native) release profile for comparison:
# crates/ruvector-wasm/Cargo.toml
[profile.release]
opt-level = "z" # Optimize for size
lto = true # Link-time optimization
codegen-units = 1 # Single codegen unit
panic = "abort" # No unwind tables
[profile.release.package."*"]
opt-level = "z"
[package.metadata.wasm-pack.profile.release]
wasm-opt = false # Disable wasm-opt (already optimized by LTO)
# crates/rvf/rvf-solver-wasm/Cargo.toml
[profile.release]
opt-level = "z"
lto = true
codegen-units = 1
strip = true # Also strips debug symbols
# Root Cargo.toml
[profile.release]
opt-level = 3 # Optimize for speed
lto = "fat"
codegen-units = 1
strip = true
panic = "unwind" # Keeps unwind tables (unlike WASM profile)
The test script at /scripts/test/test-wasm.mjs demonstrates the build command:
wasm-pack build crates/ruvector-attention-wasm --target web --release
For no_std crates like rvf-solver-wasm, the standard cargo command with WASM target is used:
cargo build --target wasm32-unknown-unknown --release -p rvf-solver-wasm
The rvf-solver-wasm crate provides the closest precedent for a sublinear-time solver WASM build:
- Target: wasm32-unknown-unknown (no WASI dependency)
- Allocator: dlmalloc (global allocator for alloc)
- Math: libm (no_std-compatible math functions)
- Serialization: serde + serde_json (no_std + alloc features)
- Crypto: rvf-crypto (SHAKE-256 witness chain)
- Panic handler: core::arch::wasm32::unreachable()
- ABI: extern "C" exports (no wasm-bindgen overhead)
- Crate type: cdylib only (no rlib)
This approach produces binaries in the ~160 KB range, which is excellent for edge deployment.
The codebase has extensive WASM SIMD128 support across multiple crates, all using core::arch::wasm32::* intrinsics. Every SIMD function provides dual implementations: a #[cfg(target_feature = "simd128")] version using WASM SIMD intrinsics and a #[cfg(not(target_feature = "simd128"))] scalar fallback.
| Crate | File | Operations |
|---|---|---|
| ruvector-delta-wasm | src/simd.rs | f32x4 add, sub, scale, dot, L2 norm, diff, abs, clamp, count_nonzero |
| ruvector-sparse-inference | src/backend/wasm.rs | f32x4 dot product, ReLU, vector add, AXPY |
| ruvector-mincut | src/wasm/simd.rs | v128 popcount (table lookup method), XOR, boundary computation, batch membership |
| ruvector-core | src/simd_intrinsics.rs | x86_64 (AVX2, AVX-512, FMA), aarch64 (NEON, unrolled), INT8 quantized, batch operations |
use core::arch::wasm32::*;
#[cfg(target_feature = "simd128")]
pub fn simd_dot(a: &[f32], b: &[f32]) -> f32 {
let chunks = a.len() / 4;
let mut sum_vec = f32x4_splat(0.0);
for i in 0..chunks {
let offset = i * 4;
unsafe {
let a_vec = v128_load(a.as_ptr().add(offset) as *const v128);
let b_vec = v128_load(b.as_ptr().add(offset) as *const v128);
let prod = f32x4_mul(a_vec, b_vec);
sum_vec = f32x4_add(sum_vec, prod);
}
}
// Horizontal sum + remainder handling
let sum_array: [f32; 4] = unsafe { core::mem::transmute(sum_vec) };
let mut sum = sum_array[0] + sum_array[1] + sum_array[2] + sum_array[3];
for i in (chunks * 4)..a.len() { sum += a[i] * b[i]; }
sum
}
pub struct WasmBackend;
impl Backend for WasmBackend {
fn dot_product(&self, a: &[f32], b: &[f32]) -> f32 { /* SIMD dispatch */ }
fn sparse_matmul(&self, matrix: &Array2<f32>, input: &[f32], rows: &[usize]) -> Vec<f32> { /* ... */ }
fn sparse_matmul_accumulate(&self, matrix: &Array2<f32>, input: &[f32], cols: &[usize], output: &mut [f32]) { /* ... */ }
fn activation(&self, data: &mut [f32], activation_type: ActivationType) { /* ReLU via SIMD */ }
fn add(&self, a: &mut [f32], b: &[f32]) { /* ... */ }
fn axpy(&self, a: &mut [f32], b: &[f32], scalar: f32) { /* ... */ }
fn name(&self) -> &'static str { "WASM-SIMD" }
fn simd_width(&self) -> usize { 4 } // 128-bit = 4 x f32
}
Based on the sublinear-time solver's core operations, the following SIMD acceleration points are identified:
| Operation | SIMD Strategy | Expected Speedup | Existing Pattern |
|---|---|---|---|
| Distance computation (dot, cosine, euclidean) | f32x4_mul + f32x4_add accumulation | 2-4x | ruvector-delta-wasm/src/simd.rs |
| Vector normalization | f32x4_mul (scale) + f32x4_add (L2 norm) | 2-4x | simd_l2_norm_squared, simd_scale |
| Bitset operations (partition tracking) | v128_xor, v128_and, popcount via lookup | 4-8x | ruvector-mincut/src/wasm/simd.rs |
| Sparse matrix-vector multiply | SIMD dot + sparse row selection | 2-4x | WasmBackend::sparse_matmul |
| Activation functions (ReLU; GELU needs a polynomial approximation) | f32x4_max with zero splat | 2-4x | relu_wasm_simd |
| Thompson Sampling bandit updates | Scalar (branching-heavy) | 1x (no benefit) | N/A |
| Sort/selection (top-k) | Scalar (comparison-heavy) | 1x (no benefit) | N/A |
The ruvector-wasm crate exposes SIMD detection to JS:
#[wasm_bindgen(js_name = detectSIMD)]
pub fn detect_simd() -> bool {
#[cfg(target_feature = "simd128")]
{ true }
#[cfg(not(target_feature = "simd128"))]
{ false }
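}
Because target_feature is resolved at compile time, detectSIMD reports how the module was built, not what the host supports. For the sublinear-time solver, SIMD should be compiled in via RUSTFLAGS="-C target-feature=+simd128" at build time, with a separately built scalar binary for environments that do not support it.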
The native codebase (ruvector-core/src/simd_intrinsics.rs) supports:
- x86_64: AVX2 (256-bit, 8 x f32), AVX-512 (512-bit, 16 x f32), FMA, INT8 quantized
- aarch64: NEON (128-bit, 4 x f32), 4x loop unrolling, FMA via vfmaq_f32
- WASM: SIMD128 (128-bit, 4 x f32)
WASM SIMD128 provides the same width as NEON (4 x f32) but lacks FMA (f32x4_fma is not available in stable WASM SIMD). This means the sublinear-time solver WASM build will be approximately 2-3x slower than a native NEON build for distance computations, and 4-8x slower than an AVX-512 build. However, it will still be significantly faster than scalar fallback.
The kernel pack system at /crates/ruvector-wasm/src/kernel/memory.rs defines a mature shared memory protocol:
pub struct SharedMemoryProtocol {
total_size: usize, // Total memory in bytes
current_offset: usize, // Bump allocator position
alignment: usize, // Typically 16 bytes
}
impl SharedMemoryProtocol {
pub fn default_settings() -> Self {
Self::new(256, 16) // 256 pages = 16 MB, 16-byte alignment
}
pub fn allocate(&mut self, size: usize) -> Result<usize, KernelError> {
let aligned_offset = self.align_offset(self.current_offset);
// ...bounds check...
self.current_offset = aligned_offset + size;
Ok(aligned_offset)
}
}
The KernelInvocationDescriptor manages tensor memory layout:
pub struct KernelInvocationDescriptor {
pub descriptor: KernelDescriptor, // input_a, input_b, output, scratch, params offsets+sizes
protocol: SharedMemoryProtocol,
}
The MemoryLayoutValidator prevents region overlap and bounds violations.
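The allocation and validation steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the crate's implementation: BumpArena, ArenaError, and regions_overlap are hypothetical names, and the error shape is invented for the example. It shows power-of-two alignment rounding, a bounds check that leaves the allocator untouched on failure, and a half-open interval overlap test.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ArenaError {
    OutOfMemory { requested: usize, available: usize },
}

/// Hypothetical bump arena mirroring the fields shown for SharedMemoryProtocol.
pub struct BumpArena {
    total_size: usize,
    current_offset: usize,
    alignment: usize, // must be a power of two
}

impl BumpArena {
    pub fn new(pages: usize, alignment: usize) -> Self {
        assert!(alignment.is_power_of_two());
        Self { total_size: pages * 65536, current_offset: 0, alignment }
    }

    /// Rounds `offset` up to the next multiple of the alignment.
    fn align_up(&self, offset: usize) -> usize {
        (offset + self.alignment - 1) & !(self.alignment - 1)
    }

    /// Returns the aligned start offset, or an error if the arena is exhausted.
    pub fn allocate(&mut self, size: usize) -> Result<usize, ArenaError> {
        let start = self.align_up(self.current_offset);
        match start.checked_add(size) {
            Some(end) if end <= self.total_size => {
                self.current_offset = end;
                Ok(start)
            }
            _ => Err(ArenaError::OutOfMemory {
                requested: size,
                available: self.total_size.saturating_sub(start),
            }),
        }
    }

    /// Frees everything at once by rewinding the bump pointer.
    pub fn reset(&mut self) {
        self.current_offset = 0;
    }
}

/// Two non-empty regions (offset, len) overlap when each starts before the other ends.
pub fn regions_overlap(a: (usize, usize), b: (usize, usize)) -> bool {
    a.1 > 0 && b.1 > 0 && a.0 < b.0 + b.1 && b.0 < a.0 + a.1
}
```

With a one-page arena and 16-byte alignment, a 10-byte allocation lands at offset 0 and the next allocation at offset 16. A validator would run regions_overlap pairwise over the input, output, and scratch regions before a kernel executes.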
The wasm-bindgen crates use Float32Array to transfer data between JS and WASM:
// Input: JS Float32Array -> Rust Vec<f32>
pub fn insert(&self, vector: Float32Array, ...) -> Result<String, JsValue> {
let vector_data: Vec<f32> = vector.to_vec(); // Copy from JS typed array
// ...
}
// Output: Rust Vec<f32> -> JS Float32Array
pub fn vector(&self) -> Float32Array {
Float32Array::from(&self.inner.vector[..]) // Copy to JS typed array
}
Note: Float32Array::to_vec() and Float32Array::from() perform copies. True zero-copy requires accessing WASM linear memory directly from JS. The pwa-loader works on linear memory directly, although its helpers still copy once in each direction:
// Write straight into WASM linear memory (one copy, no intermediate buffer)
function wasmWrite(data) {
const ptr = wasmInstance.exports.rvf_alloc(data.length);
const mem = new Uint8Array(wasmMemory.buffer, ptr, data.length);
mem.set(data); // Copies data into linear memory at ptr
return ptr;
}
// Read from WASM linear memory; slice() returns a copy that stays valid if memory grows
function wasmRead(ptr, len) {
return new Uint8Array(wasmMemory.buffer, ptr, len).slice();
}
The no_std solver uses dlmalloc as global allocator and manages its own instance registry:
// Global mutable registry - safe in single-threaded WASM
static mut REGISTRY: Registry = Registry::new();
const MAX_INSTANCES: usize = 8;
struct SolverInstance {
solver: AdaptiveSolver,
last_result_json: Vec<u8>, // Heap-allocated via dlmalloc
policy_json: Vec<u8>,
witness_chain: Vec<u8>,
}
Memory export for external reads uses raw pointer copies:
#[no_mangle]
pub extern "C" fn rvf_solver_result_read(handle: i32, out_ptr: i32) -> i32 {
let data = &inst.last_result_json;
unsafe {
core::ptr::copy_nonoverlapping(data.as_ptr(), out_ptr as *mut u8, data.len());
}
data.len() as i32
}
| Configuration | Max Pages | Memory Limit | Context |
|---|---|---|---|
| Server runtime | 1024 | 64 MB | RuntimeConfig::server() |
| Embedded runtime | 64 | 4 MB | RuntimeConfig::embedded() |
| Default shared memory | 256 | 16 MB | SharedMemoryProtocol::default_settings() |
| Microkernel (RVF) | 2-4 | 128-256 KB | WasmHeader min/max pages |
| WASM page size | 1 | 64 KB | WASM_PAGE_SIZE = 65536 |
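The memory limits in the table follow directly from the 64 KB page size. A small sketch of the arithmetic, where pages_to_bytes and bytes_to_pages are hypothetical helper names and only WASM_PAGE_SIZE = 65536 comes from the table:

```rust
/// WASM linear memory grows in fixed 64 KiB pages.
pub const WASM_PAGE_SIZE: usize = 65536;

/// Total bytes addressable with `pages` pages.
pub fn pages_to_bytes(pages: usize) -> usize {
    pages * WASM_PAGE_SIZE
}

/// Smallest number of pages that can hold `bytes` (rounds up).
pub fn bytes_to_pages(bytes: usize) -> usize {
    (bytes + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE
}
```

For instance, 1024 pages give 64 MiB (the server limit) and 64 pages give 4 MiB (the embedded limit), while a payload of 160,000 bytes needs 3 pages.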
The ruvector-wasm crate enforces input validation at the WASM boundary:
const MAX_VECTOR_DIMENSIONS: usize = 65536;
#[wasm_bindgen(constructor)]
pub fn new(vector: Float32Array, ...) -> Result<JsVectorEntry, JsValue> {
let vec_len = vector.length() as usize;
if vec_len == 0 {
return Err(JsValue::from_str("Vector cannot be empty"));
}
if vec_len > MAX_VECTOR_DIMENSIONS {
return Err(JsValue::from_str(&format!(
"Vector dimensions {} exceed maximum allowed {}", vec_len, MAX_VECTOR_DIMENSIONS
)));
}
// ...
}
The ruvector-wasm crate is browser-first, using:
- IndexedDB persistence: web-sys features include IdbDatabase, IdbFactory, IdbObjectStore, IdbRequest, IdbTransaction, IdbOpenDbRequest (/crates/ruvector-wasm/Cargo.toml)
- Web Workers: Embedded JavaScript worker pool (/crates/ruvector-wasm/src/worker-pool.js, /crates/ruvector-wasm/src/worker.js) for parallel operations
- Tracing via console: tracing-wasm sends logs to browser dev tools
- Promise-based async: wasm-bindgen-futures for async operations
- getrandom via JS: getrandom with wasm_js feature uses crypto.getRandomValues()
- PWA support: The pwa-loader example (/examples/pwa-loader/app.js) demonstrates offline-capable WASM loading
// From examples/pwa-loader/app.js
async function loadWasm() {
const response = await fetch(WASM_PATH);
const bytes = await response.arrayBuffer();
const importObject = { env: {} };
const result = await WebAssembly.instantiate(bytes, importObject);
wasmInstance = result.instance;
wasmMemory = wasmInstance.exports.memory;
}
WASM SIMD128 is supported in Chrome 91+, Firefox 89+, Safari 16.4+, and Edge 91+. This covers >95% of active browsers as of 2026. Feature detection can be done via:
const simdSupported = WebAssembly.validate(
new Uint8Array([0,97,115,109,1,0,0,0,1,5,1,96,0,1,123,3,2,1,0,10,10,1,8,0,65,0,253,15,253,98,11])
);
The project supports Node.js via:
- wasm-pack --target nodejs: Generates CommonJS bindings
- Direct instantiation from test scripts (/scripts/test/test-wasm.mjs):
import { readFileSync } from 'fs';
import { join } from 'path';
// wasmPath and pkgPath are defined elsewhere in the script
const wasmBuffer = readFileSync(wasmPath);
const mathWasm = await import(join(pkgPath, 'ruvector_math_wasm.js'));
await mathWasm.default(wasmBuffer);
- Edge-net example: /examples/edge-net/pkg/node/ provides Node-specific WASM packages
Node.js has had WASM SIMD support since v16.4 (V8 9.1+). For the sublinear-time solver, Node.js deployment enables server-side and CLI usage with the same WASM binary.
The micro-hnsw-wasm crate (11.8 KB) and rvf-solver-wasm (~160 KB) demonstrate ultra-compact deployment:
- iOS/Swift: /examples/wasm/ios/ includes Swift resources with embedded WASM
- Self-bootstrapping: The WASM_SEG system embeds WASM interpreters inside data files
- Target platforms: WasmTarget::Wasm32, WasiP1, WasiP2, Browser, BareTile
| Target | WASM Format | Binding | SIMD | Size Budget | Persistence |
|---|---|---|---|---|---|
| Browser (Chrome/FF/Safari) | wasm-bindgen | JS glue + TS types | SIMD128 | <500 KB | IndexedDB |
| Node.js (>= 16.4) | wasm-bindgen (nodejs) or raw | CommonJS/ESM | SIMD128 | <1 MB | fs |
| Cloudflare Workers | wasm-bindgen (web) | ESM | SIMD128 | <1 MB | KV |
| iOS/Swift | raw wasm32 | C FFI | Optional | <200 KB | CoreData |
| Bare-metal / RVF | no_std cdylib | extern "C" | Optional | <200 KB | None |
Used by most WASM crates. The #[wasm_bindgen(start)] attribute runs initialization automatically:
#[wasm_bindgen(start)]
pub fn init() {
console_error_panic_hook::set_once();
tracing_wasm::set_as_global_default();
}
JS side (generated by wasm-pack):
import init, { VectorDB } from './ruvector_wasm.js';
await init(); // Loads + instantiates + runs start function
const db = new VectorDB(384, 'cosine', true);
Used by the pwa-loader and no_std modules:
const response = await fetch(WASM_PATH);
const bytes = await response.arrayBuffer();
const importObject = { env: {} };
const result = await WebAssembly.instantiate(bytes, importObject);
wasmInstance = result.instance;
wasmMemory = wasmInstance.exports.memory;
This pattern offers maximum control: the host can inspect exports before calling any function, handle errors granularly, and manage memory directly.
For large modules, WebAssembly.instantiateStreaming should be used (not currently in the codebase but recommended):
const result = await WebAssembly.instantiateStreaming(
fetch(WASM_PATH),
importObject
);
This starts compiling while bytes are still downloading, which can shorten load time for large modules. The server must send the response with the application/wasm MIME type.
The @ruvector/wasm-unified uses lazy initialization:
let defaultEngine: UnifiedEngine | null = null;
export async function getDefaultEngine(): Promise<UnifiedEngine> {
if (!defaultEngine) {
defaultEngine = await createUnifiedEngine();
await defaultEngine.init();
}
return defaultEngine;
}
The solver WASM uses a handle-based instance registry:
static mut REGISTRY: Registry = Registry::new(); // Max 8 concurrent solvers
// JS creates solver:
let handle = wasmInstance.exports.rvf_solver_create();
// JS uses solver:
wasmInstance.exports.rvf_solver_train(handle, 100, 1, 10, seedLo, seedHi);
// JS reads result:
let len = wasmInstance.exports.rvf_solver_result_len(handle);
let ptr = wasmInstance.exports.rvf_solver_alloc(len);
wasmInstance.exports.rvf_solver_result_read(handle, ptr);
let json = new TextDecoder().decode(new Uint8Array(wasmMemory.buffer, ptr, len));
// JS destroys:
wasmInstance.exports.rvf_solver_destroy(handle);
This is the recommended pattern for the sublinear-time solver because it:
- Supports multiple concurrent solver instances
- Avoids global state issues
- Enables resource cleanup
- Works across all deployment targets (browser, Node, bare-metal)
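To make the handle-based pattern concrete, here is a minimal registry sketch with a fixed number of slots. It is an illustration under assumptions rather than the rvf-solver-wasm implementation: the generic Registry type, the 1-based handle numbering, and the -1 failure code are all hypothetical choices. Only the limit of 8 instances comes from the text above.

```rust
pub const MAX_INSTANCES: usize = 8;

/// Hypothetical fixed-capacity registry; handles are 1-based slot numbers.
pub struct Registry<T> {
    slots: [Option<T>; MAX_INSTANCES],
}

impl<T> Registry<T> {
    pub fn new() -> Self {
        Self { slots: core::array::from_fn(|_| None) }
    }

    /// Maps a handle to a slot index, rejecting zero, negative, and too-large values.
    fn slot_index(handle: i32) -> Option<usize> {
        if handle >= 1 && (handle as usize) <= MAX_INSTANCES {
            Some(handle as usize - 1)
        } else {
            None
        }
    }

    /// Stores `value` in the first free slot; returns its handle or -1 when full.
    pub fn create(&mut self, value: T) -> i32 {
        match self.slots.iter().position(|slot| slot.is_none()) {
            Some(i) => {
                self.slots[i] = Some(value);
                (i + 1) as i32
            }
            None => -1,
        }
    }

    /// Looks up a live instance; stale or invalid handles yield None.
    pub fn get_mut(&mut self, handle: i32) -> Option<&mut T> {
        let idx = Self::slot_index(handle)?;
        self.slots[idx].as_mut()
    }

    /// Drops the instance and frees its slot; returns false if nothing was there.
    pub fn destroy(&mut self, handle: i32) -> bool {
        match Self::slot_index(handle) {
            Some(idx) => self.slots[idx].take().is_some(),
            None => false,
        }
    }
}
```

Passing integer handles across the extern "C" boundary keeps pointers out of the host's hands, so a stale or forged handle results in a failed lookup instead of undefined behavior.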
The ruvector-wasm crate includes a built-in benchmark export:
#[wasm_bindgen(js_name = benchmark)]
pub fn benchmark(name: &str, iterations: usize, dimensions: usize) -> Result<f64, JsValue> {
let start = Instant::now();
for i in 0..iterations {
let vector: Vec<f32> = (0..dimensions)
.map(|_| js_sys::Math::random() as f32)
.collect();
let vector_arr = Float32Array::from(&vector[..]);
db.insert(vector_arr, Some(format!("vec_{}", i)), None)?;
}
let duration = start.elapsed();
Ok(iterations as f64 / duration.as_secs_f64())
}
The /examples/benchmarks/src/bin/wasm_solver_bench.rs provides a native vs WASM comparison framework:
WASM vs Native AGI Solver Benchmark
Config: holdout=50, training=50, cycles=3, budget=200
NATIVE SOLVER RESULTS
Mode Acc% Cost Noise% Time Pass
A baseline xx.x% xxx.x xx.x% xxxms PASS
B compiler xx.x% xxx.x xx.x% xxxms PASS
C learned xx.x% xxx.x xx.x% xxxms PASS
WASM REFERENCE METRICS
Native total time: xxxms
WASM expected: ~xxxms (2-5x native)
This establishes the expected WASM overhead: 2-5x slower than native for the self-learning solver workload.
The /crates/prime-radiant/benches/simd_benchmarks.rs and /crates/ruvector-sparse-inference/benches/simd_kernels.rs provide Criterion benchmarks for SIMD operations that can be adapted for WASM SIMD.
sublinear-time-solver/benches/
wasm_bench.rs -- In-Rust Criterion benchmarks (native baseline)
wasm_bench.mjs -- Node.js WASM performance runner
wasm_bench.html -- Browser WASM performance runner
bench_harness.rs -- Shared benchmark harness (puzzle generation)
| Metric | Description | Measurement |
|---|---|---|
| solve_throughput | Puzzles solved per second | iterations / elapsed_secs |
| solve_latency_p50 | Median solve time | Percentile of individual solve times |
| solve_latency_p99 | 99th percentile solve time | Percentile of individual solve times |
| memory_peak_bytes | Peak WASM linear memory usage | memory.buffer.byteLength |
| module_load_ms | Time to instantiate WASM module | performance.now() around WebAssembly.instantiate |
| simd_speedup | SIMD vs scalar performance ratio | Compare SIMD build vs non-SIMD build |
| wasm_native_ratio | WASM-to-native performance overhead | Compare WASM throughput vs native Criterion results |
| binary_size_bytes | Compiled .wasm file size | wc -c *.wasm |
| accuracy_parity | Solver accuracy matches native | Bit-exact or epsilon comparison of results |
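The throughput and latency metrics could be computed from raw per-solve timings as sketched below. The function names are hypothetical, and the nearest-rank percentile definition is an assumption: the document does not say which percentile method the benchmark should use.

```rust
/// Nearest-rank percentile of the samples; `p` must be in (0, 100].
/// Sorts the slice in place and returns None for empty input or invalid `p`.
pub fn percentile(samples: &mut [f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(p > 0.0 && p <= 100.0) {
        return None;
    }
    samples.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank: the smallest 1-based rank covering p percent of the samples.
    let rank = (p * samples.len() as f64 / 100.0).ceil() as usize;
    Some(samples[rank.max(1) - 1])
}

/// solve_throughput = iterations / elapsed_secs.
pub fn throughput(iterations: usize, elapsed_secs: f64) -> f64 {
    iterations as f64 / elapsed_secs
}
```

Feeding the same percentile routine with timings from the native, Node.js, and browser runners keeps solve_latency_p50 and solve_latency_p99 comparable across targets.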
- Native baseline: Run the solver natively with Criterion (3+ iterations, warm-up)
- WASM baseline: Load the same solver as WASM, run identical workload in Node.js
- WASM SIMD: Build with RUSTFLAGS="-C target-feature=+simd128", measure speedup
- Browser measurement: Run in Chrome with performance.now(), measure real-world latency
- Size budget: Track .wasm binary size across commits (regression alerts if >200 KB)
- Accuracy validation: Compare solver output JSON between native and WASM (must match to f64 epsilon)
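The accuracy validation step needs a concrete notion of "match to f64 epsilon". One possible reading is a relative tolerance scaled by magnitude, sketched below. The helper names and the choice of a relative (not absolute or ULP-based) comparison are assumptions for illustration.

```rust
/// True when `a` and `b` agree within `rel_tol`, scaled by the larger magnitude
/// (with a floor of 1.0 so values near zero are compared absolutely).
pub fn approx_eq(a: f64, b: f64, rel_tol: f64) -> bool {
    if a == b {
        return true; // exact matches, including equal infinities
    }
    if !a.is_finite() || !b.is_finite() {
        return false; // NaN or mismatched infinities never match
    }
    let scale = a.abs().max(b.abs()).max(1.0);
    (a - b).abs() <= rel_tol * scale
}

/// Index of the first differing element, or None when all values match.
/// A length mismatch is reported at the end of the shorter slice.
pub fn first_mismatch(native: &[f64], wasm: &[f64], rel_tol: f64) -> Option<usize> {
    if native.len() != wasm.len() {
        return Some(native.len().min(wasm.len()));
    }
    native
        .iter()
        .zip(wasm)
        .position(|(&n, &w)| !approx_eq(n, w, rel_tol))
}
```

A tolerance of a few multiples of f64::EPSILON is a reasonable starting point. If the native and WASM builds use different math routines (for example libm versus the platform libm), a looser bound may be needed.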
For the sublinear-time solver WASM module, adopt the rvf-solver-wasm pattern:
- no_std + alloc: Minimizes binary size, avoids JS runtime dependency
- dlmalloc global allocator: Proven in rvf-solver-wasm
- extern "C" exports: Maximum portability (browser, Node, embedded, bare-metal)
- Handle-based instance registry: Supports concurrent solver instances
- Result reads via pointer+length: JSON serialization of results into WASM memory, host reads via typed array view
Do not use wasm-bindgen for the core solver. A thin wasm-bindgen wrapper can be created separately if a richer JS API is needed.
// In the solver crate
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
mod simd_wasm {
use core::arch::wasm32::*;
pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* SIMD128 */ }
}
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
mod simd_wasm {
pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* scalar fallback */ }
}
Build two variants:
- solver.wasm -- scalar fallback (maximum compatibility)
- solver-simd.wasm -- SIMD128 enabled (Chrome 91+, FF 89+, Safari 16.4+, Node 16.4+)
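For reference, a filled-in version of the dual-implementation pattern might look like the following. It is a sketch under assumptions: the function name distance_l2_squared is hypothetical, the SIMD path follows the same load/multiply/accumulate structure as the simd_dot excerpt shown earlier, and the sqrt wrapper assumes std (a no_std build would call libm instead). Only one of the two bodies is compiled for any given target.

```rust
/// Squared L2 distance, SIMD128 path (compiled only for wasm32 with +simd128).
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
pub fn distance_l2_squared(a: &[f32], b: &[f32]) -> f32 {
    use core::arch::wasm32::*;
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 4;
    let mut acc = f32x4_splat(0.0);
    for i in 0..chunks {
        // SAFETY: i * 4 + 3 < a.len() == b.len(), and v128_load accepts unaligned pointers.
        unsafe {
            let va = v128_load(a.as_ptr().add(i * 4) as *const v128);
            let vb = v128_load(b.as_ptr().add(i * 4) as *const v128);
            let d = f32x4_sub(va, vb);
            acc = f32x4_add(acc, f32x4_mul(d, d));
        }
    }
    // Horizontal sum of the four lanes, then the scalar tail.
    let mut sum = f32x4_extract_lane::<0>(acc)
        + f32x4_extract_lane::<1>(acc)
        + f32x4_extract_lane::<2>(acc)
        + f32x4_extract_lane::<3>(acc);
    for i in (chunks * 4)..a.len() {
        let d = a[i] - b[i];
        sum += d * d;
    }
    sum
}

/// Squared L2 distance, scalar fallback for every other target.
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
pub fn distance_l2_squared(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// L2 distance built on whichever implementation was compiled in.
pub fn distance_l2(a: &[f32], b: &[f32]) -> f32 {
    distance_l2_squared(a, b).sqrt()
}
```

Because the two paths accumulate in a different order, their results can differ in the last bits for long vectors, so parity checks between the two builds should use a tolerance instead of bit-exact comparison.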
Adopt the SharedMemoryProtocol pattern from the kernel pack system:
- Allocate a fixed arena at solver creation (e.g., 256 pages = 16 MB)
- Use 16-byte aligned bump allocation for tensor data
- Reset the allocator between solve invocations (amortized O(1))
- Validate memory regions before kernel execution
- Export memory so the host can directly view/write typed arrays without copying
[profile.release]
opt-level = "z"
lto = true
codegen-units = 1
strip = true
panic = "abort"
Target binary size: <200 KB (consistent with existing rvf-solver-wasm at ~160 KB).
#[no_mangle]
pub extern "C" fn solver_capabilities() -> u32 {
let mut caps = 0u32;
#[cfg(target_feature = "simd128")]
{ caps |= 0x01; } // SIMD available
#[cfg(feature = "thompson-sampling")]
{ caps |= 0x02; } // Thompson Sampling enabled
#[cfg(feature = "witness-chain")]
{ caps |= 0x04; } // Witness chain enabled
caps
}
- Use wasm-bindgen-test with run_in_browser for browser tests (existing pattern)
- Use the Node.js test harness at /scripts/test/test-wasm.mjs as a template
- Validate accuracy parity with native build via wasm_solver_bench
- Run SIMD-specific tests with RUSTFLAGS="-C target-feature=+simd128" in CI
| File | Purpose |
|---|---|
| /crates/ruvector-wasm/src/lib.rs | Main VectorDB WASM bindings (wasm-bindgen) |
| /crates/ruvector-wasm/src/kernel/mod.rs | Kernel pack system entry point |
| /crates/ruvector-wasm/src/kernel/memory.rs | Shared memory protocol, bump allocator |
| /crates/ruvector-wasm/src/kernel/runtime.rs | Kernel runtime trait, mock runtime, manager |
| /crates/ruvector-wasm/src/kernel/epoch.rs | Epoch-based execution budgets |
| /crates/ruvector-wasm/src/kernel/signature.rs | Ed25519 kernel pack verification |
| /crates/ruvector-wasm/src/kernel/manifest.rs | Kernel manifest parsing |
| /crates/ruvector-wasm/Cargo.toml | WASM dependency configuration |
| File | Purpose |
|---|---|
| /crates/ruvector-delta-wasm/src/simd.rs | WASM SIMD128 f32x4 operations |
| /crates/ruvector-sparse-inference/src/backend/wasm.rs | WASM SIMD backend with Backend trait |
| /crates/ruvector-mincut/src/wasm/simd.rs | WASM SIMD128 bitset operations |
| /crates/ruvector-core/src/simd_intrinsics.rs | Native SIMD (AVX2/AVX-512/NEON) reference |
| File | Purpose |
|---|---|
| /crates/rvf/rvf-solver-wasm/src/lib.rs | Self-learning solver WASM exports (no_std) |
| /crates/rvf/rvf-solver-wasm/src/engine.rs | Adaptive solver engine |
| /crates/rvf/rvf-solver-wasm/src/policy.rs | PolicyKernel with Thompson Sampling |
| /crates/rvf/rvf-solver-wasm/Cargo.toml | no_std WASM build configuration |
| File | Purpose |
|---|---|
| /Cargo.toml | Workspace WASM dependencies and build profiles |
| /scripts/test/test-wasm.mjs | Node.js WASM test runner |
| /examples/benchmarks/src/bin/wasm_solver_bench.rs | Native vs WASM benchmark comparison |
| /examples/pwa-loader/app.js | Browser WASM loading and memory management |
| File | Purpose |
|---|---|
| /crates/rvf/rvf-types/src/wasm_bootstrap.rs | WasmHeader, WasmRole, WasmTarget, feature flags |
| File | Purpose |
|---|---|
| /npm/packages/ruvector-wasm-unified/src/index.ts | Unified WASM engine TypeScript API |
| Binary | Size | Strategy |
|---|---|---|
| micro_hnsw.wasm | 11.8 KB | no_std, bare minimum |
| ruvector_learning_wasm_bg.wasm | 39 KB | wasm-bindgen |
| ruvector_exotic_wasm_bg.wasm | 149 KB | wasm-bindgen |
| ruvector_nervous_system_wasm_bg.wasm | 178 KB | wasm-bindgen |
| ruvector_economy_wasm_bg.wasm | 181 KB | wasm-bindgen |
| ruvector_attention_unified_wasm_bg.wasm | 339 KB | wasm-bindgen |
| rvf-solver-wasm (estimated) | ~160 KB | no_std + dlmalloc |
The sublinear-time solver should target the <200 KB range using the no_std approach, consistent with rvf-solver-wasm.