ADR-039: RVF Solver WASM — Self-Learning AGI Engine Integration

Field	Value
Status	Implemented
Date	2026-02-16 (updated 2026-02-17)
Deciders	RuVector core team
Supersedes	--
Related	ADR-032 (RVF WASM integration), ADR-037 (Publishable RVF acceptance test), ADR-038 (npx/rvlite witness verification)

Context

ADR-037 established the publishable RVF acceptance test with a SHAKE-256 witness chain, and ADR-038 planned npm integration for verifying those artifacts. However, neither the existing rvf-wasm microkernel nor the npm packages expose the actual self-learning engine that produces the AGI benchmarks.

The core AGI capabilities live exclusively in the Rust benchmarks crate (examples/benchmarks/src/):

PolicyKernel: Thompson Sampling two-signal model (safety Beta + cost EMA)
KnowledgeCompiler: Signature-based pattern cache with compiled skip-mode configs
AdaptiveSolver: Three-loop architecture (fast: solve, medium: policy, slow: compiler)
ReasoningBank: Trajectory tracking with checkpoint/rollback and non-regression gating
Acceptance test: Multi-cycle training/holdout evaluation with three ablation modes

These components have no FFI dependencies, no filesystem access during solve, and no system clock requirements — making them ideal candidates for WASM compilation.

Decision

Create `rvf-solver-wasm` as a standalone no_std WASM module

A new crate at crates/rvf/rvf-solver-wasm/ compiles the complete self-learning solver to wasm32-unknown-unknown. It is a no_std + alloc crate (same architecture as rvf-wasm) with a C ABI export surface.

Key design choices:

Choice	Rationale
no_std + alloc	Matches rvf-wasm pattern, runs in any WASM runtime (browser, Node.js, edge)
Self-contained types	Pure-integer `Date` type replaces `chrono` dependency; `BTreeMap` replaces `HashMap`
libm for float math	`sqrt`, `log`, `cos`, `pow` via `libm` crate (pure Rust, no_std compatible)
xorshift64 RNG	Deterministic, no `rand` crate dependency, identical to benchmarks RNG
C ABI exports	Maximum compatibility — works with any WASM host (no wasm-bindgen required)
Handle-based API	Up to 8 concurrent solver instances, same pattern as `rvf_store_*` exports

WASM Export Surface

┌─────────────────────────────────────────────────────┐
│              rvf-solver-wasm exports                │
├─────────────────────────────────────────────────────┤
│ Memory:                                             │
│   rvf_solver_alloc(size) -> ptr                     │
│   rvf_solver_free(ptr, size)                        │
│                                                     │
│ Lifecycle:                                          │
│   rvf_solver_create() -> handle                     │
│   rvf_solver_destroy(handle)                        │
│                                                     │
│ Training (three-loop learning):                     │
│   rvf_solver_train(handle, count,                   │
│     min_diff, max_diff, seed_lo, seed_hi) -> i32    │
│                                                     │
│ Acceptance test (full ablation):                    │
│   rvf_solver_acceptance(handle, holdout,            │
│     training, cycles, budget,                       │
│     seed_lo, seed_hi) -> i32                        │
│                                                     │
│ Result / Policy / Witness reads:                    │
│   rvf_solver_result_len(handle) -> i32              │
│   rvf_solver_result_read(handle, out_ptr) -> i32    │
│   rvf_solver_policy_len(handle) -> i32              │
│   rvf_solver_policy_read(handle, out_ptr) -> i32    │
│   rvf_solver_witness_len(handle) -> i32             │
│   rvf_solver_witness_read(handle, out_ptr) -> i32   │
└─────────────────────────────────────────────────────┘

Architecture Preserved in WASM

The WASM module preserves all five AGI capabilities:

Thompson Sampling two-signal model — Beta posterior for safety (correct & no early-commit) + EMA for cost. Gamma sampling via Marsaglia's method using libm.
18 context buckets — 3 range (small/medium/large) x 3 distractor (clean/some/heavy) x 2 noise = 18 buckets. Each bucket maintains per-arm stats for None, Weekday, Hybrid skip modes.
Speculative dual-path — When top-2 arms are within delta 0.15 and variance > 0.02, the solver speculatively executes the secondary arm. This is preserved identically in WASM.
KnowledgeCompiler — Constraint signature cache (v1:{difficulty}:{sorted_constraint_types}). Compiles successful trajectories into optimized configs with compiled skip-mode, step budget, and confidence scores.
Three-loop solver — Fast (constraint propagation + solve), Medium (PolicyKernel selection), Slow (ReasoningBank → KnowledgeCompiler). Checkpoint/rollback on accuracy regression.

Integration with RVF Ecosystem

┌──────────────────────┐         ┌──────────────────────┐
│   rvf-solver-wasm    │         │     rvf-wasm         │
│   (self-learning     │ ──────▶ │   (verification)     │
│    AGI engine)       │ witness │                      │
│                      │ chain   │ rvf_witness_verify   │
│ rvf_solver_train     │         │ rvf_witness_count    │
│ rvf_solver_acceptance│         │                      │
│ rvf_solver_witness_* │         │ rvf_store_*          │
└──────────┬───────────┘         └──────────────────────┘
           │ uses
    ┌──────▼──────┐
    │  rvf-crypto  │
    │  SHAKE-256   │
    │  witness     │
    │  chain       │
    └─────────────┘

The solver produces a SHAKE-256 witness chain (via rvf_crypto::create_witness_chain) for every acceptance test run. This chain is in the native 73-byte-per-entry format, directly verifiable by rvf_witness_verify in the rvf-wasm microkernel.

npm Integration Path

High-Level SDK (`@ruvector/rvf-solver`)

The @ruvector/rvf-solver npm package provides a typed TypeScript wrapper around the raw WASM C-ABI exports, with automatic WASM loading, memory management, and JSON deserialization.

import { RvfSolver } from '@ruvector/rvf-solver';

// Create solver (lazy-loads WASM on first call)
const solver = await RvfSolver.create();

// Train on 1000 puzzles (three-loop learning)
const result = solver.train({ count: 1000, minDifficulty: 1, maxDifficulty: 10, seed: 42n });
console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);

// Run full acceptance test (A/B/C ablation)
const manifest = solver.acceptance({ holdoutSize: 100, trainingPerCycle: 100, cycles: 5, seed: 42n });
console.log(`Mode C passed: ${manifest.allPassed}`);

// Inspect policy state (Thompson Sampling parameters, context buckets)
const policy = solver.policy();
console.log(`Context buckets: ${Object.keys(policy?.contextStats ?? {}).length}`);

// Get tamper-evident witness chain (73 bytes per entry, SHAKE-256)
const chain = solver.witnessChain();
console.log(`Witness chain: ${chain?.length ?? 0} bytes`);

solver.destroy();

The SDK also re-exports through the unified @ruvector/rvf package:

// Unified import — solver + database in one package
import { RvfDatabase, RvfSolver } from '@ruvector/rvf';

npm Package Structure

npm/packages/rvf-solver/
├── package.json          # @ruvector/rvf-solver, CJS/ESM dual exports
├── tsconfig.json         # ES2020 target, strict mode, declarations
├── pkg/
│   ├── rvf_solver.js     # WASM loader (singleton, Node CJS/ESM + browser)
│   ├── rvf_solver.d.ts   # Low-level WASM C-ABI type declarations
│   └── rvf_solver_bg.wasm  # Built from rvf-solver-wasm crate
└── src/
    ├── index.ts          # Barrel exports: RvfSolver + all types
    ├── solver.ts         # RvfSolver class (create/train/acceptance/policy/witnessChain/destroy)
    └── types.ts          # TrainOptions, AcceptanceManifest, PolicyState, etc.

Type	Fields	Purpose
`TrainOptions`	`count`, `minDifficulty?`, `maxDifficulty?`, `seed?`	Configure training run
`TrainResult`	`trained`, `correct`, `accuracy`, `patternsLearned`	Training outcome
`AcceptanceOptions`	`holdoutSize?`, `trainingPerCycle?`, `cycles?`, `stepBudget?`, `seed?`	Configure acceptance test
`AcceptanceManifest`	`modeA`, `modeB`, `modeC`, `allPassed`, `witnessEntries`, `witnessChainBytes`	Full ablation results
`PolicyState`	`contextStats`, `earlyCommitPenalties`, `prepass`, `speculativeAttempts`	Thompson Sampling state
`SkipModeStats`	`attempts`, `successes`, `alphaSafety`, `betaSafety`, `costEma`	Per-arm bandit stats

Low-Level WASM Usage (advanced)

// Direct WASM C-ABI usage (without the SDK wrapper)
const wasm = await WebAssembly.instantiate(solverModule);

const handle = wasm.exports.rvf_solver_create();
const correct = wasm.exports.rvf_solver_train(handle, 1000, 1, 10, 42, 0);

const len = wasm.exports.rvf_solver_result_len(handle);
const ptr = wasm.exports.rvf_solver_alloc(len);
wasm.exports.rvf_solver_result_read(handle, ptr);
const json = new TextDecoder().decode(new Uint8Array(wasm.memory.buffer, ptr, len));

// Witness chain verifiable by rvf-wasm
const wLen = wasm.exports.rvf_solver_witness_len(handle);
const wPtr = wasm.exports.rvf_solver_alloc(wLen);
wasm.exports.rvf_solver_witness_read(handle, wPtr);
const chain = new Uint8Array(wasm.memory.buffer, wPtr, wLen);
const verified = rvfWasm.exports.rvf_witness_verify(chainPtr, wLen);

wasm.exports.rvf_solver_destroy(handle);

Module Structure

crates/rvf/rvf-solver-wasm/
├── Cargo.toml          # no_std + alloc, dlmalloc, libm, serde_json
├── src/
│   ├── lib.rs          # WASM exports, instance registry, panic handler
│   ├── alloc_setup.rs  # dlmalloc global allocator, rvf_solver_alloc/free
│   ├── types.rs        # Date arithmetic, Constraint, Puzzle, Rng64
│   ├── policy.rs       # PolicyKernel, Thompson Sampling, KnowledgeCompiler
│   └── engine.rs       # AdaptiveSolver, ReasoningBank, PuzzleGenerator, acceptance test

File	Lines	Purpose
`types.rs`	239	Pure-integer date math (Howard Hinnant algorithm), constraints, puzzle type
`policy.rs`	~480	Full Thompson Sampling with Marsaglia gamma sampling, 18-bucket context
`engine.rs`	~490	Three-loop solver, acceptance test runner, puzzle generator
`lib.rs`	~280	12 WASM exports, handle registry (8 slots), witness chain integration

Binary Size

Build	Size
Release (wasm32-unknown-unknown)	~171 KB
After wasm-opt -Oz	132 KB

npm Package Ecosystem

The AGI solver is exposed through a layered npm package architecture:

Package	Version	Role	Install
`@ruvector/rvf-solver`	0.1.3	Typed TypeScript SDK for the self-learning solver	`npm i @ruvector/rvf-solver`
`@ruvector/rvf`	0.1.9	Unified SDK re-exporting solver + database	`npm i @ruvector/rvf`
`@ruvector/rvf-node`	0.1.7	Native NAPI bindings with AGI methods (`indexStats`, `verifyWitness`, `freeze`, `metric`)	`npm i @ruvector/rvf-node`
`@ruvector/rvf-wasm`	0.1.6	WASM microkernel with witness verification	`npm i @ruvector/rvf-wasm`

Dependency Graph

@ruvector/rvf (unified SDK)
├── @ruvector/rvf-node (required, native NAPI)
├── @ruvector/rvf-wasm (optional, browser fallback)
└── @ruvector/rvf-solver (optional, AGI solver)
    └── rvf-solver-wasm WASM binary (loaded at runtime)

AGI NAPI Methods (rvf-node)

The native NAPI bindings expose AGI-relevant methods beyond basic vector CRUD:

Method	Returns	Purpose
`indexStats()`	`RvfIndexStats`	HNSW index statistics (layers, M, ef_construction, indexed count)
`verifyWitness()`	`RvfWitnessResult`	Verify tamper-evident SHAKE-256 witness chain integrity
`freeze()`	`void`	Snapshot-freeze current state for deterministic replay
`metric()`	`string`	Get distance metric name (`l2`, `cosine`, `dotproduct`)

Consequences

Positive

The actual self-learning AGI engine runs in the browser, Node.js, and edge runtimes via WASM
No Rust toolchain required for end users — npm install + WASM load is sufficient
Deterministic: same seed → same puzzles → same learning → same witness chain
Witness chains produced in WASM are verifiable by the existing rvf_witness_verify export
PolicyKernel state is inspectable via rvf_solver_policy_read (JSON serializable)
Handle-based API supports up to 8 concurrent solver instances
132 KB binary (after wasm-opt -Oz) includes the complete solver, Thompson Sampling, and serde_json
TypeScript SDK (@ruvector/rvf-solver) provides ergonomic async API with automatic WASM memory management
Unified SDK (@ruvector/rvf) re-exports solver alongside database for single-import usage
Native NAPI bindings expose AGI methods (index stats, witness verification, freeze) for server-side usage

Negative

Date arithmetic is reimplemented (pure-integer) rather than using chrono, requiring validation against the original
HashMap → BTreeMap changes iteration order (sorted vs hash-order), which may produce different witness chain hashes than the native benchmarks
Float math via libm may have minor precision differences vs std f64 methods, affecting Thompson Sampling distributions
The puzzle generator is simplified compared to the full benchmarks generator (no cross-cultural constraints)

Neutral

The native benchmarks crate remains the reference implementation for full-fidelity acceptance tests
The WASM module is a faithful port, not a binding — both implementations should converge on the same acceptance test outcomes given identical seeds
rvf-solver-wasm is a member of the crates/rvf workspace alongside rvf-wasm

Implementation Notes (2026-02-17)

WASM loader (pkg/rvf_solver.js) rewritten as pure CJS to fix ESM/CJS interop — import.meta.url and export default removed
Snake_case → camelCase field mapping added in solver.ts for train(), policy(), and acceptance() methods
AcceptanceModeResult type updated to match actual WASM output: passed, accuracyMaintained, costImproved, robustnessImproved, zeroViolations, dimensionsImproved, cycles[]
SDK tests added at npm/packages/rvf-solver/test/solver.test.mjs (import validation, type structure, WASM integration)
Security review completed: WASM loader path validation flagged as LOW risk (library-internal API), JSON.parse on WASM memory is trusted

Appendix: Public Package Documentation

@ruvector/rvf-solver

Self-learning temporal solver with Thompson Sampling, PolicyKernel, ReasoningBank, and SHAKE-256 tamper-evident witness chains. Runs in the browser, Node.js, and edge runtimes via WebAssembly.

Install

npm install @ruvector/rvf-solver

Or via the unified SDK:

npm install @ruvector/rvf

Features

Thompson Sampling two-signal model — safety Beta distribution + cost EMA for adaptive policy selection
18 context-bucketed bandits — 3 range x 3 distractor x 2 noise levels for fine-grained context awareness
KnowledgeCompiler with signature-based pattern cache — distills learned patterns into reusable compiled configurations
Speculative dual-path execution — runs two candidate arms in parallel, picks the winner
Three-loop adaptive solver — fast: constraint propagation solve, medium: PolicyKernel skip-mode selection, slow: KnowledgeCompiler pattern distillation
SHAKE-256 tamper-evident witness chain — 73 bytes per entry, cryptographically linked proof of all operations
Full acceptance test with A/B/C ablation modes — validates learned policy outperforms fixed and compiler baselines
~132 KB WASM binary, no_std — runs anywhere WebAssembly does (browsers, Node.js, Deno, Cloudflare Workers, edge runtimes)

Quick Start

import { RvfSolver } from '@ruvector/rvf-solver';

// Create a solver instance (loads WASM on first call)
const solver = await RvfSolver.create();

// Train on 100 puzzles (difficulty 1-5)
const result = solver.train({ count: 100, minDifficulty: 1, maxDifficulty: 5 });
console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);
console.log(`Patterns learned: ${result.patternsLearned}`);

// Run full acceptance test (A/B/C ablation)
const manifest = solver.acceptance({ cycles: 3 });
console.log(`All passed: ${manifest.allPassed}`);

// Inspect Thompson Sampling policy state
const policy = solver.policy();
console.log(`Context buckets: ${Object.keys(policy?.contextStats ?? {}).length}`);
console.log(`Speculative attempts: ${policy?.speculativeAttempts}`);

// Get raw SHAKE-256 witness chain
const chain = solver.witnessChain();
console.log(`Witness chain: ${chain?.length ?? 0} bytes`);

// Free WASM resources
solver.destroy();

API Reference

`RvfSolver.create(): Promise<RvfSolver>`

Creates a new solver instance. Initializes the WASM module on the first call; subsequent calls reuse the loaded module. Up to 8 concurrent instances are supported.

`solver.train(options: TrainOptions): TrainResult`

Trains the solver on randomly generated puzzles using the three-loop architecture. The fast loop applies constraint propagation, the medium loop selects skip modes via Thompson Sampling, and the slow loop distills patterns into the KnowledgeCompiler cache.

`solver.acceptance(options?: AcceptanceOptions): AcceptanceManifest`

Runs the full acceptance test with training/holdout cycles across all three ablation modes (A, B, C). Returns a manifest with per-cycle metrics, pass/fail status, and witness chain metadata.

`solver.policy(): PolicyState | null`

Returns the current Thompson Sampling policy state including per-context-bucket arm statistics, KnowledgeCompiler cache stats, and speculative execution counters. Returns null if no training has been performed.

`solver.witnessChain(): Uint8Array | null`

Returns the raw SHAKE-256 witness chain bytes. Each entry is 73 bytes and provides tamper-evident proof of all training and acceptance operations. Returns null if the chain is empty. The returned Uint8Array is a copy safe to use after destroy().

`solver.destroy(): void`

Frees the WASM solver instance and releases all associated memory. The instance must not be used after calling destroy().

Types

TrainOptions

Field	Type	Default	Description
`count`	`number`	required	Number of puzzles to generate and solve
`minDifficulty`	`number`	`1`	Minimum puzzle difficulty (1-10)
`maxDifficulty`	`number`	`10`	Maximum puzzle difficulty (1-10)
`seed`	`bigint \| number`	random	RNG seed for reproducible runs

TrainResult

Field	Type	Description
`trained`	`number`	Number of puzzles trained on
`correct`	`number`	Number solved correctly
`accuracy`	`number`	Accuracy ratio (correct / trained)
`patternsLearned`	`number`	Patterns distilled by the ReasoningBank

AcceptanceOptions

Field	Type	Default	Description
`holdoutSize`	`number`	`50`	Number of holdout puzzles per cycle
`trainingPerCycle`	`number`	`200`	Number of training puzzles per cycle
`cycles`	`number`	`5`	Number of train/test cycles
`stepBudget`	`number`	`500`	Maximum constraint propagation steps per puzzle
`seed`	`bigint \| number`	random	RNG seed for reproducible runs

AcceptanceManifest

Field	Type	Description
`version`	`number`	Manifest schema version
`modeA`	`AcceptanceModeResult`	Mode A results (fixed heuristic)
`modeB`	`AcceptanceModeResult`	Mode B results (compiler-suggested)
`modeC`	`AcceptanceModeResult`	Mode C results (learned policy)
`allPassed`	`boolean`	`true` if Mode C passed
`witnessEntries`	`number`	Number of entries in the witness chain
`witnessChainBytes`	`number`	Total witness chain size in bytes

AcceptanceModeResult

Field	Type	Description
`passed`	`boolean`	Whether this mode met the accuracy threshold
`accuracyMaintained`	`boolean`	Accuracy maintained across cycles
`costImproved`	`boolean`	Cost per solve improved
`robustnessImproved`	`boolean`	Noise robustness improved
`zeroViolations`	`boolean`	No constraint violations
`dimensionsImproved`	`number`	Number of dimensions that improved
`cycles`	`CycleMetrics[]`	Per-cycle accuracy and cost metrics

PolicyState

Field	Type	Description
`contextStats`	`Record<string, Record<string, SkipModeStats>>`	Per-context-bucket, per-arm Thompson Sampling statistics
`earlyCommitPenalties`	`number`	Total early-commit penalty cost
`earlyCommitsTotal`	`number`	Total early-commit attempts
`earlyCommitsWrong`	`number`	Early commits that were incorrect
`prepass`	`string`	Current prepass strategy identifier
`speculativeAttempts`	`number`	Number of speculative dual-path executions
`speculativeArm2Wins`	`number`	Times the second speculative arm won

Acceptance Test Modes

The acceptance test validates the solver's learning capability through three ablation modes run across multiple train/test cycles:

Mode A (Fixed) -- Uses a fixed heuristic skip-mode policy. This establishes the baseline performance without any learning. The policy does not adapt regardless of puzzle characteristics.

Mode B (Compiler) -- Uses the KnowledgeCompiler's signature-based pattern cache to select skip modes. The compiler distills observed patterns into compiled configurations but does not perform online Thompson Sampling updates.

Mode C (Learned) -- Uses the full Thompson Sampling two-signal model with context-bucketed bandits. This is the complete system: the fast loop solves, the medium loop selects arms based on safety Beta and cost EMA, and the slow loop feeds patterns back to the compiler. Mode C should outperform both A and B, demonstrating genuine self-improvement.

The test passes when Mode C achieves the accuracy threshold on holdout puzzles. The witness chain records every training and evaluation operation for tamper-evident auditability.

Architecture

The solver uses a three-loop adaptive architecture:

+-----------------------------------------------+
|  Slow Loop: KnowledgeCompiler                  |
|  - Signature-based pattern cache               |
|  - Distills observations into compiled configs  |
+-----------------------------------------------+
        |                          ^
        v                          |
+-----------------------------------------------+
|  Medium Loop: PolicyKernel                     |
|  - Thompson Sampling (safety Beta + cost EMA)  |
|  - 18 context buckets (range x distractor x noise) |
|  - Speculative dual-path execution             |
+-----------------------------------------------+
        |                          ^
        v                          |
+-----------------------------------------------+
|  Fast Loop: Constraint Propagation Solver      |
|  - Generates and solves puzzles                |
|  - Reports outcomes back to PolicyKernel       |
+-----------------------------------------------+
        |
        v
+-----------------------------------------------+
|  SHAKE-256 Witness Chain (73 bytes/entry)      |
|  - Tamper-evident proof of all operations      |
+-----------------------------------------------+

Unified SDK

When using the @ruvector/rvf unified SDK, the solver is available as a sub-module:

import { RvfSolver } from '@ruvector/rvf';

const solver = await RvfSolver.create();
const result = solver.train({ count: 100 });
console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);
solver.destroy();

Related Packages

Package	Description
`@ruvector/rvf`	Unified TypeScript SDK
`@ruvector/rvf-node`	Native N-API bindings for Node.js
`@ruvector/rvf-wasm`	Browser WASM package
`@ruvector/rvf-mcp-server`	MCP server for AI agents

FilesExpand file tree

ADR-039-rvf-solver-wasm-agi-integration.md

Latest commit

History