Status: Proposed
Date: 2026-02-08
Parent: ADR-017 Temporal Tensor Compression, ADR-005 WASM Runtime Integration, ADR-018 Block-Based Storage Engine
Author: System Architecture Team
| Version | Date | Author | Changes |
|---|---|---|---|
| 0.1 | 2026-02-08 | Architecture Team | Initial proposal |
This ADR defines the WASM API surface for the Temporal Tensor Store (TTS),
enabling the tiering gate and quantizer to be called from Node.js and
browser environments with identical semantics. The design extends the
frame-level ttc_* FFI established in ADR-017 with a new block-level tts_*
function set, introduces host-imported IO functions for pluggable storage
backends, and specifies the cross-platform binding strategy for native, Node.js,
browser, and edge targets.
The API surface is intentionally narrow -- five core exports, three host imports, two memory-management helpers, and one error-retrieval helper -- to minimise the attack surface exposed across the WASM boundary while remaining sufficient for full tiered tensor storage operations.
ADR-017 established a Rust-native temporal tensor compressor with a WASM FFI
layer (ttc_* functions) for frame-level compression. ADR-005 established the
WASM sandboxing model with epoch-based interruption and raw ABI. ADR-018
defined the block-based storage engine with tiered placement.
However, these designs assume the storage backend is directly accessible from within the WASM module. In practice:
- Node.js: Storage lives in AgentDB/RuVector file-backed databases that the WASM module cannot access directly via filesystem calls.
- Browser: Persistent storage requires IndexedDB, which is asynchronous and unavailable from within WASM linear memory.
- Edge/Embedded: Storage may be in-memory only, with no filesystem at all.
The WASM module must delegate all IO to the host via imported functions, while retaining ownership of the tiering policy, quantization logic, and block management.
WASM's value types are limited to i32, i64, f32, and f64. The Temporal
Tensor Store uses u128 tensor identifiers internally, but u128 cannot cross
the WASM FFI boundary as a single value. The standard solution is to split the
identifier into two u64 halves (hi and lo), which the host reconstructs
on its side.
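The split and its inverse can be sketched in a few lines of Rust. The function names `split_tensor_id` and `join_tensor_id` are illustrative and not part of the ADR's API; each u64 half crosses the boundary as a WASM i64.

```rust
/// Split a 128-bit tensor identifier into the (hi, lo) pair that crosses
/// the FFI boundary. Hypothetical helper name, for illustration only.
pub fn split_tensor_id(id: u128) -> (u64, u64) {
    let hi = (id >> 64) as u64; // bits [127..64]
    let lo = id as u64; // bits [63..0]; `as` keeps the low 64 bits
    (hi, lo)
}

/// Rebuild the identifier from its two halves on the receiving side.
pub fn join_tensor_id(hi: u64, lo: u64) -> u128 {
    ((hi as u128) << 64) | (lo as u128)
}
```

Because both directions are pure bit operations, the round trip is lossless for every u128 value.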
| Goal | Rationale |
|---|---|
| Narrow API surface (< 10 exports) | Minimise WASM boundary complexity and audit scope |
| Host-delegated IO | Enable platform-specific storage without WASM recompilation |
| Zero-copy where possible | Avoid redundant copies across the WASM boundary |
| Identical semantics across platforms | Same WASM binary runs on Node.js, browser, and edge |
| Coexistence with ADR-017 ttc_* | Both function sets share the same WASM module |
We extend the WASM module with five core export functions, two memory
management helpers, and one error-retrieval helper, all using extern "C"
linkage with #[no_mangle]:
// Initialize the store with a JSON-encoded policy configuration.
// Returns 0 on success, negative error code on failure.
int32_t tts_init(const uint8_t* policy_ptr, size_t policy_len);
// Ingest a tensor block. The tensor_id is split into hi/lo halves.
// data_ptr points to data_len f32 values in WASM linear memory
// (data_len is an element count, not a byte count).
// Returns 0 on success, negative error code on failure.
int32_t tts_put(uint64_t tensor_id_hi, uint64_t tensor_id_lo,
                uint32_t block_index,
                const float* data_ptr, size_t data_len);
// Read a tensor block, dequantized back to f32.
// out_ptr is a pre-allocated buffer of out_len f32 values in WASM linear memory.
// Returns 0 on success, negative error code on failure.
int32_t tts_get(uint64_t tensor_id_hi, uint64_t tensor_id_lo,
                uint32_t block_index,
                float* out_ptr, size_t out_len);
// Run a maintenance tick: promote/demote blocks, evict to meet budgets.
// budget_bytes: maximum bytes to write during this tick.
// budget_ops: maximum IO operations during this tick.
// Returns number of blocks moved, or negative error code.
int32_t tts_tick(uint32_t budget_bytes, uint32_t budget_ops);
// Write a JSON-encoded statistics snapshot into out_ptr.
// Returns bytes written, or negative error code if buffer too small.
int32_t tts_stats(uint8_t* out_ptr, size_t out_len);
The WASM module imports three functions from the host environment for all
persistent IO. These are declared in the "tts_host" import namespace:
// Read a block from host storage into dst buffer.
// tier: 0=hot, 1=warm, 2=cold
// key_ptr/key_len: block key (tensor_id:block_index encoded as bytes)
// dst_ptr/dst_len: destination buffer in WASM linear memory
// Returns bytes read, or negative error code.
int32_t read_block(uint32_t tier, const uint8_t* key_ptr, size_t key_len,
                   uint8_t* dst_ptr, size_t dst_len);
// Write a block to host storage from src buffer.
// Returns 0 on success, negative error code on failure.
int32_t write_block(uint32_t tier, const uint8_t* key_ptr, size_t key_len,
                    const uint8_t* src_ptr, size_t src_len);
// Delete a block from host storage.
// Returns 0 on success, negative error code on failure.
int32_t delete_block(uint32_t tier, const uint8_t* key_ptr, size_t key_len);
Platform-specific host bindings:
| Platform | read_block | write_block | delete_block |
|---|---|---|---|
| Node.js | AgentDB get | AgentDB put | AgentDB delete |
| Browser | IndexedDB get | IndexedDB put | IndexedDB delete |
| Native (server) | mmap read | mmap write | unlink |
| Edge/Embedded | ArrayBuffer slice | ArrayBuffer copy | zeroed/freed |
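On the guest side, the three imports could be declared roughly as follows. This is a sketch, not the project's ffi.rs: the `host` module, the native stand-in, and the `host_read` / `host_write` / `host_delete` wrappers are assumptions added so the same guest logic can be unit-tested off-WASM. The `#[link(wasm_import_module = "...")]` attribute is what places the functions in the "tts_host" namespace.

```rust
/// Real imports: resolved by the embedder under the "tts_host" namespace.
#[cfg(target_arch = "wasm32")]
mod host {
    #[link(wasm_import_module = "tts_host")]
    extern "C" {
        pub fn read_block(tier: u32, key_ptr: *const u8, key_len: usize,
                          dst_ptr: *mut u8, dst_len: usize) -> i32;
        pub fn write_block(tier: u32, key_ptr: *const u8, key_len: usize,
                           src_ptr: *const u8, src_len: usize) -> i32;
        pub fn delete_block(tier: u32, key_ptr: *const u8, key_len: usize) -> i32;
    }
}

/// Hypothetical native stand-in with the same signatures. It stores nothing:
/// reads report "not present" (-2), writes and deletes report success.
#[cfg(not(target_arch = "wasm32"))]
mod host {
    pub unsafe fn read_block(_tier: u32, _key_ptr: *const u8, _key_len: usize,
                             _dst_ptr: *mut u8, _dst_len: usize) -> i32 { -2 }
    pub unsafe fn write_block(_tier: u32, _key_ptr: *const u8, _key_len: usize,
                              _src_ptr: *const u8, _src_len: usize) -> i32 { 0 }
    pub unsafe fn delete_block(_tier: u32, _key_ptr: *const u8, _key_len: usize) -> i32 { 0 }
}

/// Safe wrappers: slices are handed to the host as pointer/length pairs.
pub fn host_read(tier: u32, key: &[u8], dst: &mut [u8]) -> i32 {
    unsafe { host::read_block(tier, key.as_ptr(), key.len(), dst.as_mut_ptr(), dst.len()) }
}

pub fn host_write(tier: u32, key: &[u8], src: &[u8]) -> i32 {
    unsafe { host::write_block(tier, key.as_ptr(), key.len(), src.as_ptr(), src.len()) }
}

pub fn host_delete(tier: u32, key: &[u8]) -> i32 {
    unsafe { host::delete_block(tier, key.as_ptr(), key.len()) }
}
```

Keeping the unsafe calls behind three small wrappers confines pointer handling to one place, which fits the narrow-surface goal stated earlier.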
// Allocate len bytes in WASM linear memory.
// Returns pointer to allocated region, or 0 on failure.
uint32_t tts_alloc(size_t len);
// Deallocate a previously allocated region.
void tts_dealloc(uint32_t ptr, size_t len);
// Retrieve the last error message as a UTF-8 string.
// Returns bytes written, or negative if buffer too small.
int32_t tts_last_error(uint8_t* out_ptr, size_t out_len);
+========================================================================+
| WASM Linear Memory |
|========================================================================|
| |
| 0x0000 +-----------------+ |
| | WASM Stack | (grows downward, managed by WASM runtime) |
| +-----------------+ |
| | Static Data | (STORE, policy config, error buffer) |
| +-----------------+ |
| | | |
| | Heap | (managed by tts_alloc / tts_dealloc) |
| | | |
| | +-------------+ | |
| | | Input Buffer| | Host writes f32 frames here |
| | | (f32[N]) | | via tts_alloc -> memcpy -> tts_put |
| | +-------------+ | |
| | | |
| | +-------------+ | |
| | | Output Buf | | tts_get writes dequantized f32 here |
| | | (f32[N]) | | Host reads after tts_get returns |
| | +-------------+ | |
| | | |
| | +-------------+ | |
| | | IO Staging | | Temporary buffer for host import calls |
| | | Buffer | | (read_block / write_block payloads) |
| | +-------------+ | |
| | | |
| 0xFFFF +-----------------+ (grows via memory.grow as needed) |
| |
+========================================================================+
HOST (Node.js / Browser / Native) GUEST (WASM Module)
==================================== ========================
1. Load WASM module
2. Provide host imports:
- tts_host::read_block
- tts_host::write_block
- tts_host::delete_block
3. Instantiate module
|
4. Encode policy as JSON bytes ------->|
5. ptr = tts_alloc(policy_len) | allocate in linear mem
6. Write policy bytes to ptr |
7. tts_init(ptr, policy_len) ------->| parse policy, init STORE
8. tts_dealloc(ptr, policy_len) | free policy buffer
|
--- INGEST LOOP --- |
|
9. buf = tts_alloc(N * 4) | allocate f32 buffer
10. Write f32 data into buf |
11. tts_put(id_hi, id_lo, idx, ------->| quantize frame
buf, N) | tier policy selects bits
| calls write_block(tier,
| key, compressed)
<-------| write_block import
12. Host persists block |
------->| returns 0 (success)
13. tts_dealloc(buf, N * 4) |
|
--- READ LOOP --- |
|
14. out = tts_alloc(N * 4) | allocate output buffer
15. tts_get(id_hi, id_lo, idx, ------->| calls read_block(tier,
out, N) | key, staging_buf)
<-------| read_block import
16. Host reads from storage, |
writes into staging_buf |
------->| dequantize into out
17. Host reads f32 from out |
18. tts_dealloc(out, N * 4) |
|
--- MAINTENANCE --- |
|
19. tts_tick(budget_bytes, ------->| evaluate tier scores
budget_ops) | promote/demote blocks
| calls write_block,
| delete_block as needed
<-------| host import callbacks
20. Host handles IO |
------->| returns blocks_moved
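The maintenance step at the end of the sequence can be sketched as a budget-bounded loop. Everything here is an assumption made for illustration: the `BlockMeta` shape, the cost model (one write plus one delete per move), and stopping at the first move that does not fit. The thresholds correspond to the hot_min_score and warm_min_score policy fields; how scores are computed is outside this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Hot = 0,
    Warm = 1,
    Cold = 2,
}

/// Pick the tier a block belongs in, given its score and the policy thresholds.
pub fn target_tier(score: u32, hot_min_score: u32, warm_min_score: u32) -> Tier {
    if score >= hot_min_score {
        Tier::Hot
    } else if score >= warm_min_score {
        Tier::Warm
    } else {
        Tier::Cold
    }
}

/// Hypothetical per-block metadata; the real index is defined elsewhere.
pub struct BlockMeta {
    pub tier: Tier,
    pub score: u32,
    pub size_bytes: u32,
}

/// Move blocks toward their target tier until either budget runs out.
/// Returns the number of blocks moved, mirroring the tts_tick return value.
pub fn run_tick(
    blocks: &mut [BlockMeta],
    hot_min_score: u32,
    warm_min_score: u32,
    budget_bytes: u32,
    budget_ops: u32,
) -> u32 {
    let mut bytes_left = budget_bytes;
    let mut ops_left = budget_ops;
    let mut moved = 0;
    for block in blocks.iter_mut() {
        let target = target_tier(block.score, hot_min_score, warm_min_score);
        if target == block.tier {
            continue; // already in the right place
        }
        // Assumed cost: one write_block to the new tier, one delete_block
        // from the old tier, and size_bytes of write traffic.
        if ops_left < 2 || bytes_left < block.size_bytes {
            break; // budget exhausted; remaining moves wait for the next tick
        }
        ops_left -= 2;
        bytes_left -= block.size_bytes;
        block.tier = target;
        moved += 1;
    }
    moved
}
```

Bounding both bytes and operations keeps a single tick's cost predictable, which matters when each host IO call may be slow.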
Exports (WASM -> Host):
| Export | Signature (WASM types) | Description |
|---|---|---|
| tts_init | (i32, i32) -> i32 | Init store with policy JSON |
| tts_put | (i64, i64, i32, i32, i32) -> i32 | Ingest tensor block |
| tts_get | (i64, i64, i32, i32, i32) -> i32 | Read tensor block |
| tts_tick | (i32, i32) -> i32 | Maintenance tick |
| tts_stats | (i32, i32) -> i32 | Statistics snapshot |
| tts_alloc | (i32) -> i32 | Allocate linear memory |
| tts_dealloc | (i32, i32) -> () | Free linear memory |
| tts_last_error | (i32, i32) -> i32 | Get error message |
Imports (Host -> WASM), namespace tts_host:
| Import | Signature (WASM types) | Description |
|---|---|---|
| read_block | (i32, i32, i32, i32, i32) -> i32 | Read from host storage |
| write_block | (i32, i32, i32, i32, i32) -> i32 | Write to host storage |
| delete_block | (i32, i32, i32) -> i32 | Delete from host storage |
u128 tensor_id:
+----------------------------------+----------------------------------+
| hi (u64) | lo (u64) |
| bits [127..64] | bits [63..0] |
+----------------------------------+----------------------------------+
Reconstruction (host side):
tensor_id = (hi as u128) << 64 | (lo as u128)
Block key encoding (for host import calls):
key = tensor_id_hi.to_le_bytes() ++ tensor_id_lo.to_le_bytes() ++ block_index.to_le_bytes()
key_len = 8 + 8 + 4 = 20 bytes
This encoding is deterministic and platform-independent (little-endian).
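A minimal Rust sketch of the 20-byte key layout described above. The names `encode_block_key`, `decode_block_key`, and `BLOCK_KEY_LEN` are illustrative and not taken from the codebase.

```rust
pub const BLOCK_KEY_LEN: usize = 20;

/// Build the block key: hi (8 bytes LE) ++ lo (8 bytes LE) ++ block_index (4 bytes LE).
pub fn encode_block_key(
    tensor_id_hi: u64,
    tensor_id_lo: u64,
    block_index: u32,
) -> [u8; BLOCK_KEY_LEN] {
    let mut key = [0u8; BLOCK_KEY_LEN];
    key[0..8].copy_from_slice(&tensor_id_hi.to_le_bytes());
    key[8..16].copy_from_slice(&tensor_id_lo.to_le_bytes());
    key[16..20].copy_from_slice(&block_index.to_le_bytes());
    key
}

/// Inverse of `encode_block_key`, useful for host-side debugging and tests.
pub fn decode_block_key(key: &[u8; BLOCK_KEY_LEN]) -> (u64, u64, u32) {
    let mut hi = [0u8; 8];
    let mut lo = [0u8; 8];
    let mut idx = [0u8; 4];
    hi.copy_from_slice(&key[0..8]);
    lo.copy_from_slice(&key[8..16]);
    idx.copy_from_slice(&key[16..20]);
    (
        u64::from_le_bytes(hi),
        u64::from_le_bytes(lo),
        u32::from_le_bytes(idx),
    )
}
```

One consequence of the little-endian layout is that keys do not sort in numeric order of tensor_id or block_index, so a host backend should treat them as opaque identifiers rather than rely on range scans.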
Return code convention:
| Code | Name | Description |
|---|---|---|
| 0 | TTS_OK | Operation succeeded |
| -1 | TTS_ERR_INVALID_HANDLE | Store not initialized or handle invalid |
| -2 | TTS_ERR_TENSOR_EVICTED | Requested block was evicted from all tiers |
| -3 | TTS_ERR_BUDGET_EXHAUSTED | Tick budget fully consumed |
| -4 | TTS_ERR_IO | Host IO import returned an error |
| -5 | TTS_ERR_CORRUPT_BLOCK | Block data failed integrity check |
| -6 | TTS_ERR_BUFFER_TOO_SMALL | Output buffer insufficient |
| -7 | TTS_ERR_INVALID_POLICY | Policy JSON failed validation |
| -8 | TTS_ERR_NULL_POINTER | Null pointer passed for required argument |
| -9 | TTS_ERR_ALLOC_FAILED | Memory allocation failed |
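A caller written in Rust (a native host or a test harness, say) might decode these return values as sketched below. The `TtsError` enum and `decode_return` helper are hypothetical; only the numeric codes come from the table. Non-negative values are treated as success payloads, since several functions return byte or block counts.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TtsError {
    InvalidHandle,
    TensorEvicted,
    BudgetExhausted,
    Io,
    CorruptBlock,
    BufferTooSmall,
    InvalidPolicy,
    NullPointer,
    AllocFailed,
    /// A negative code this sketch does not know about.
    Unknown(i32),
}

impl TtsError {
    pub fn from_code(code: i32) -> TtsError {
        match code {
            -1 => TtsError::InvalidHandle,
            -2 => TtsError::TensorEvicted,
            -3 => TtsError::BudgetExhausted,
            -4 => TtsError::Io,
            -5 => TtsError::CorruptBlock,
            -6 => TtsError::BufferTooSmall,
            -7 => TtsError::InvalidPolicy,
            -8 => TtsError::NullPointer,
            -9 => TtsError::AllocFailed,
            other => TtsError::Unknown(other),
        }
    }

    pub fn code(&self) -> i32 {
        match *self {
            TtsError::InvalidHandle => -1,
            TtsError::TensorEvicted => -2,
            TtsError::BudgetExhausted => -3,
            TtsError::Io => -4,
            TtsError::CorruptBlock => -5,
            TtsError::BufferTooSmall => -6,
            TtsError::InvalidPolicy => -7,
            TtsError::NullPointer => -8,
            TtsError::AllocFailed => -9,
            TtsError::Unknown(code) => code,
        }
    }
}

/// Interpret a raw tts_* return value: >= 0 is a success payload
/// (0, bytes written, or blocks moved); < 0 is an error code.
pub fn decode_return(ret: i32) -> Result<u32, TtsError> {
    if ret >= 0 {
        Ok(ret as u32)
    } else {
        Err(TtsError::from_code(ret))
    }
}
```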
Error message retrieval:
// Guest-side implementation
static mut LAST_ERROR: [u8; 256] = [0u8; 256];
static mut LAST_ERROR_LEN: usize = 0;
fn set_error(msg: &str) {
    unsafe {
        let bytes = msg.as_bytes();
        // Messages longer than the buffer are truncated to 256 bytes.
        let len = bytes.len().min(256);
        LAST_ERROR[..len].copy_from_slice(&bytes[..len]);
        LAST_ERROR_LEN = len;
    }
}
#[no_mangle]
pub extern "C" fn tts_last_error(out_ptr: *mut u8, out_len: usize) -> i32 {
    if out_ptr.is_null() {
        return TTS_ERR_NULL_POINTER;
    }
    unsafe {
        // Match the documented contract: report a too-small buffer rather
        // than silently truncating the message.
        if LAST_ERROR_LEN > out_len {
            return TTS_ERR_BUFFER_TOO_SMALL;
        }
        core::ptr::copy_nonoverlapping(LAST_ERROR.as_ptr(), out_ptr, LAST_ERROR_LEN);
        LAST_ERROR_LEN as i32
    }
}
The WASM module uses linear memory exclusively. The host interacts with this
memory through the exported tts_alloc and tts_dealloc functions:
// Guest-side allocator: forwards to the module's global allocator
#[no_mangle]
pub extern "C" fn tts_alloc(len: usize) -> u32 {
    // Reject zero-length requests: calling `alloc` with a zero-size layout
    // is undefined behaviour. tts_dealloc already ignores len == 0.
    if len == 0 {
        set_error("zero-length allocation");
        return 0;
    }
    let layout = core::alloc::Layout::from_size_align(len, 4);
    match layout {
        Ok(layout) => {
            let ptr = unsafe { alloc::alloc::alloc(layout) };
            if ptr.is_null() {
                set_error("allocation failed");
                0
            } else {
                ptr as u32
            }
        }
        Err(_) => {
            set_error("invalid allocation layout");
            0
        }
    }
}
#[no_mangle]
pub extern "C" fn tts_dealloc(ptr: u32, len: usize) {
    if ptr == 0 || len == 0 {
        return;
    }
    // The layout must match the one used in tts_alloc (same len, align 4).
    let layout = core::alloc::Layout::from_size_align(len, 4);
    if let Ok(layout) = layout {
        unsafe { alloc::alloc::dealloc(ptr as *mut u8, layout); }
    }
}
Lifecycle protocol:
- Host calls tts_alloc(N) to get a pointer in WASM linear memory.
- Host writes data into that pointer region (via memory.buffer in JS).
- Host calls tts_put(...) or tts_init(...) with the pointer.
- Host calls tts_dealloc(ptr, N) to free the buffer.
- For reads: host allocates output buffer, calls tts_get(...), reads result, then deallocates.
| Platform | BlockIO Binding | MetaLog Binding | Async Model | Notes |
|---|---|---|---|---|
| Native (server) | Memory-mapped files per tier | Append-only file | Sync | mmap for zero-copy reads; direct filesystem access |
| Node.js (WASM) | AgentDB / RuVector | AgentDB | Sync wrapper over async | Host imports bridge WASM to AgentDB API |
| Browser (WASM) | IndexedDB | IndexedDB | Async wrapper needed | Requires Atomics.wait or promise-based shim |
| Edge / Embedded | In-memory buffers | In-memory ring | Sync | No persistence; eviction on budget pressure |
+------------------------------------------------------------------+
| Node.js Process |
| |
| +------------------+ +-----------------------------+ |
| | TypeScript API | | WASM Instance | |
| | | alloc | | |
| | tts.init(policy) |--------->| tts_init(ptr, len) | |
| | tts.put(id, blk, |--------->| tts_put(hi, lo, idx, | |
| | data) | | ptr, len) | |
| | tts.get(id, blk) |--------->| tts_get(hi, lo, idx, | |
| | tts.tick(budget) |--------->| ptr, len) | |
| | tts.stats() | | tts_tick(bytes, ops) | |
| +------------------+ | tts_stats(ptr, len) | |
| ^ +----------+------------------+ |
| | | |
| | host imports| |
| | v |
| +------+------+ +-----------+-----------+ |
| | AgentDB |<-------------| tts_host::read_block | |
| | (storage) |<-------------| tts_host::write_block | |
| | |<-------------| tts_host::delete_block| |
| +-------------+ +-----------------------+ |
+------------------------------------------------------------------+
In the browser, IndexedDB is asynchronous. The host imports must bridge this gap. Two strategies are available:
Strategy A: SharedArrayBuffer + Atomics (preferred for performance)
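The WASM instance runs in a Web Worker, where blocking is permitted. The host
import posts the request to a separate IO worker and blocks on Atomics.wait
over a SharedArrayBuffer. The IO worker performs the IndexedDB operation, writes
the result into the shared buffer, and signals completion via Atomics.notify.
This provides synchronous semantics from the WASM perspective.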
Strategy B: Asyncify (fallback)
For browsers without SharedArrayBuffer support, the Asyncify transform
(applied at WASM compile time via wasm-opt --asyncify) enables the WASM
module to yield execution and resume after the host completes an async
IndexedDB operation.
| Strategy | Latency | Compatibility | Complexity |
|---|---|---|---|
| SharedArrayBuffer + Atomics | ~1ms per IO | Requires COOP/COEP headers | Moderate |
| Asyncify | ~2-5ms per IO | Universal | Higher (binary transform) |
For edge and embedded deployments, all storage is in-memory:
- read_block: Returns data from a pre-allocated ArrayBuffer or Vec<u8>.
- write_block: Copies data into the in-memory store.
- delete_block: Zeros or frees the slot.
- No persistence. The tts_tick maintenance function handles eviction when the memory budget is exceeded.
- The in-memory ring for MetaLog provides bounded audit logging with automatic overwrite of the oldest entries.
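The in-memory mode described above might look like the following for a native or embedded host written in Rust. This is a sketch under assumptions: `InMemoryBlockIo` and `MetaLogRing` are hypothetical names, MetaLog entries are modelled as plain strings, a missing block is reported as -2 (as the Node.js loader in this document does), and an out-of-range tier is reported as -4.

```rust
use std::collections::{HashMap, VecDeque};

pub const TTS_ERR_TENSOR_EVICTED: i32 = -2;
pub const TTS_ERR_IO: i32 = -4;
pub const TTS_ERR_BUFFER_TOO_SMALL: i32 = -6;

/// One map per tier (0 = hot, 1 = warm, 2 = cold); nothing is persisted.
#[derive(Default)]
pub struct InMemoryBlockIo {
    tiers: [HashMap<Vec<u8>, Vec<u8>>; 3],
}

impl InMemoryBlockIo {
    pub fn new() -> Self {
        Self::default()
    }

    /// Copies the block into `dst`; returns bytes read or a negative code.
    pub fn read_block(&self, tier: u32, key: &[u8], dst: &mut [u8]) -> i32 {
        let tier_map = match self.tiers.get(tier as usize) {
            Some(map) => map,
            None => return TTS_ERR_IO,
        };
        let block = match tier_map.get(key) {
            Some(block) => block,
            None => return TTS_ERR_TENSOR_EVICTED,
        };
        if block.len() > dst.len() {
            return TTS_ERR_BUFFER_TOO_SMALL;
        }
        dst[..block.len()].copy_from_slice(block);
        block.len() as i32
    }

    /// Copies `src` into the store, replacing any previous block for the key.
    pub fn write_block(&mut self, tier: u32, key: &[u8], src: &[u8]) -> i32 {
        match self.tiers.get_mut(tier as usize) {
            Some(map) => {
                map.insert(key.to_vec(), src.to_vec());
                0
            }
            None => TTS_ERR_IO,
        }
    }

    /// Frees the slot; deleting a missing key is not an error here.
    pub fn delete_block(&mut self, tier: u32, key: &[u8]) -> i32 {
        match self.tiers.get_mut(tier as usize) {
            Some(map) => {
                map.remove(key);
                0
            }
            None => TTS_ERR_IO,
        }
    }
}

/// Bounded audit log: once full, each push overwrites the oldest entry.
pub struct MetaLogRing {
    entries: VecDeque<String>,
    capacity: usize,
}

impl MetaLogRing {
    pub fn new(capacity: usize) -> Self {
        MetaLogRing { entries: VecDeque::with_capacity(capacity), capacity }
    }

    pub fn push(&mut self, entry: String) {
        if self.capacity == 0 {
            return;
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(entry);
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn oldest(&self) -> Option<&str> {
        self.entries.front().map(|s| s.as_str())
    }
}
```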
ADR-017 defined frame-level compression functions (ttc_create, ttc_push_frame,
ttc_flush, ttc_decode_segment, etc.). ADR-022 introduces block-level storage
functions (tts_init, tts_put, tts_get, tts_tick, tts_stats).
Both function sets coexist in the same WASM module:
WASM Module Exports
===================================================
ADR-017 (frame-level compression) ADR-022 (block-level storage)
---------------------------------- ----------------------------
ttc_create tts_init
ttc_free tts_put
ttc_touch tts_get
ttc_set_access tts_tick
ttc_push_frame tts_stats
ttc_flush tts_alloc
ttc_decode_segment tts_dealloc
ttc_alloc tts_last_error
ttc_dealloc
===================================================
Shared allocator: tts_alloc and ttc_alloc use the same underlying
allocator. If both are present, either can be called; they are aliases.
Layering: tts_put internally invokes the ttc_* quantization pipeline
to compress the ingested f32 data before passing compressed blocks to the host
via write_block. tts_get reads compressed blocks via read_block and
invokes ttc_decode_segment to dequantize before writing f32 to the output
buffer.
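The layering can be illustrated with a small sketch in which host IO is passed in as closures. The codec here is a stand-in (one f32 scale followed by one signed byte per element) and is NOT the ADR-017 quantizer; `put_block`, `get_block`, and the closure signatures are likewise assumptions. Any negative host return is collapsed to TTS_ERR_IO in this sketch; whether the raw code should be propagated instead is left open by this ADR.

```rust
pub const TTS_OK: i32 = 0;
pub const TTS_ERR_IO: i32 = -4;
pub const TTS_ERR_CORRUPT_BLOCK: i32 = -5;
pub const TTS_ERR_BUFFER_TOO_SMALL: i32 = -6;

/// Stand-in codec (not the ADR-017 pipeline): 4-byte LE scale, then one i8 per value.
fn quantize(data: &[f32]) -> Vec<u8> {
    let max_abs = data.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let mut out = Vec::with_capacity(4 + data.len());
    out.extend_from_slice(&scale.to_le_bytes());
    for v in data {
        out.push((v / scale).round() as i8 as u8);
    }
    out
}

fn dequantize(block: &[u8], out: &mut [f32]) -> i32 {
    if block.len() < 4 {
        return TTS_ERR_CORRUPT_BLOCK;
    }
    if out.len() < block.len() - 4 {
        return TTS_ERR_BUFFER_TOO_SMALL;
    }
    let mut scale_bytes = [0u8; 4];
    scale_bytes.copy_from_slice(&block[..4]);
    let scale = f32::from_le_bytes(scale_bytes);
    for (dst, q) in out.iter_mut().zip(&block[4..]) {
        *dst = (*q as i8) as f32 * scale;
    }
    TTS_OK
}

/// tts_put path: compress first, then hand the compressed block to the host.
pub fn put_block<W>(key: &[u8], tier: u32, data: &[f32], mut write_block: W) -> i32
where
    W: FnMut(u32, &[u8], &[u8]) -> i32,
{
    let block = quantize(data);
    if write_block(tier, key, &block[..]) < 0 {
        TTS_ERR_IO
    } else {
        TTS_OK
    }
}

/// tts_get path: fetch into a staging buffer, then dequantize into `out`.
pub fn get_block<R>(key: &[u8], tier: u32, out: &mut [f32], mut read_block: R) -> i32
where
    R: FnMut(u32, &[u8], &mut [u8]) -> i32,
{
    let mut staging = vec![0u8; 4 + out.len()];
    let n = read_block(tier, key, &mut staging[..]);
    if n < 0 || n as usize > staging.len() {
        return TTS_ERR_IO;
    }
    dequantize(&staging[..n as usize], out)
}
```

The point of the sketch is the ordering: uncompressed f32 data never reaches the host, and the host never needs to understand the block format.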
// Single-threaded WASM: static mut is sound
static mut STORE: Option<TemporalTensorStore> = None;
// The store holds:
// - TierPolicy (from tts_init config)
// - Block metadata index (tensor_id -> block_index -> tier, size, access stats)
// - Active compressor handles (reusing ttc_* compressor pool from ADR-017)
// - IO staging buffer (reused across calls to avoid repeated allocation)
The following types define the Node.js binding surface:
/** 128-bit tensor identifier, split for WASM compatibility. */
export interface TensorId {
/** Upper 64 bits of the tensor ID. */
readonly hi: bigint;
/** Lower 64 bits of the tensor ID. */
readonly lo: bigint;
}
/** Policy configuration for the Temporal Tensor Store. */
export interface TtsPolicy {
/** Minimum score for hot tier placement (default: 512). */
hot_min_score?: number;
/** Minimum score for warm tier placement (default: 64). */
warm_min_score?: number;
/** Bit width for warm tier: 5 or 7 (default: 7). */
warm_bits?: 5 | 7;
/** Drift tolerance as Q8 fixed-point: 26 = ~10% (default: 26). */
drift_pct_q8?: number;
/** Elements per quantization group (default: 64). */
group_len?: number;
/** Maximum bytes across all tiers before eviction. */
max_total_bytes?: number;
}
/** Statistics snapshot returned by tts.stats(). */
export interface TtsStats {
/** Number of tensor blocks in each tier. */
blocks_by_tier: { hot: number; warm: number; cold: number };
/** Total bytes stored in each tier. */
bytes_by_tier: { hot: number; warm: number; cold: number };
/** Total number of unique tensor IDs tracked. */
tensor_count: number;
/** Number of blocks promoted in the last tick. */
last_tick_promotions: number;
/** Number of blocks demoted in the last tick. */
last_tick_demotions: number;
/** Number of blocks evicted in the last tick. */
last_tick_evictions: number;
}
/** Budget parameters for a maintenance tick. */
export interface TtsTickBudget {
/** Maximum bytes to write during this tick. */
bytes: number;
/** Maximum IO operations during this tick. */
ops: number;
}
/** Result of a maintenance tick. */
export interface TtsTickResult {
/** Number of blocks moved (promoted + demoted + evicted). */
blocks_moved: number;
}
/** Error codes returned by tts_* functions. */
export const enum TtsError {
OK = 0,
INVALID_HANDLE = -1,
TENSOR_EVICTED = -2,
BUDGET_EXHAUSTED = -3,
IO_ERROR = -4,
CORRUPT_BLOCK = -5,
BUFFER_TOO_SMALL = -6,
INVALID_POLICY = -7,
NULL_POINTER = -8,
ALLOC_FAILED = -9,
}
/** Host IO interface that platform bindings must implement. */
export interface TtsHostIO {
/** Read a block from storage. Returns the block bytes. */
readBlock(tier: number, key: Uint8Array): Uint8Array | null;
/** Write a block to storage. */
writeBlock(tier: number, key: Uint8Array, data: Uint8Array): void;
/** Delete a block from storage. */
deleteBlock(tier: number, key: Uint8Array): void;
}
/**
* High-level TypeScript wrapper around the TTS WASM module.
*
* Usage:
* const tts = await TtsStore.create(wasmBytes, hostIO, policy);
* tts.put(tensorId, blockIndex, float32Data);
* const data = tts.get(tensorId, blockIndex);
* const moved = tts.tick({ bytes: 1048576, ops: 100 });
* const stats = tts.stats();
* tts.dispose();
*/
export declare class TtsStore {
/**
* Instantiate the WASM module and initialize the store.
* @param wasmBytes - Compiled WASM module bytes.
* @param hostIO - Platform-specific IO implementation.
* @param policy - Tiering policy configuration.
*/
static create(
wasmBytes: ArrayBuffer,
hostIO: TtsHostIO,
policy?: TtsPolicy,
): Promise<TtsStore>;
/**
* Ingest a tensor block.
* @param id - 128-bit tensor identifier (split into hi/lo).
* @param blockIndex - Block index within the tensor.
* @param data - Float32 data to store.
* @throws TtsStoreError on failure.
*/
put(id: TensorId, blockIndex: number, data: Float32Array): void;
/**
* Read a tensor block, dequantized to f32.
* @param id - 128-bit tensor identifier.
* @param blockIndex - Block index within the tensor.
* @returns Dequantized Float32Array.
* @throws TtsStoreError if block was evicted or corrupted.
*/
get(id: TensorId, blockIndex: number): Float32Array;
/**
* Run a maintenance tick to promote, demote, or evict blocks.
* @param budget - IO budget for this tick.
* @returns Number of blocks moved.
*/
tick(budget: TtsTickBudget): TtsTickResult;
/** Get a statistics snapshot. */
stats(): TtsStats;
/** Release all WASM resources. */
dispose(): void;
}
// WASM (single-threaded): sound, no data races possible
#[cfg(target_arch = "wasm32")]
static mut STORE: Option<TemporalTensorStore> = None;
// Native targets: MUST use thread-safe alternatives.
// The two definitions below are alternatives; pick one.
// Requires: use std::cell::RefCell; use std::sync::Mutex;
#[cfg(not(target_arch = "wasm32"))]
thread_local! {
    static STORE: RefCell<Option<TemporalTensorStore>> = RefCell::new(None);
}
// Or for shared-state native:
#[cfg(not(target_arch = "wasm32"))]
static STORE: once_cell::sync::Lazy<Mutex<Option<TemporalTensorStore>>> =
    once_cell::sync::Lazy::new(|| Mutex::new(None));
All exported functions validate pointers before use:
#[no_mangle]
pub extern "C" fn tts_put(
    tensor_id_hi: u64, tensor_id_lo: u64,
    block_index: u32,
    data_ptr: *const f32, data_len: usize,
) -> i32 {
    // Null check
    if data_ptr.is_null() {
        set_error("data_ptr is null");
        return TTS_ERR_NULL_POINTER;
    }
    // Bounds check: ensure the slice is within WASM linear memory.
    // Computed in u64 so that neither the byte length nor the memory size
    // can wrap on the 32-bit wasm32 target.
    #[cfg(debug_assertions)]
    {
        let end = (data_ptr as u64)
            + (data_len as u64) * (core::mem::size_of::<f32>() as u64);
        let mem_bytes = (core::arch::wasm32::memory_size(0) as u64) * 65536;
        assert!(end <= mem_bytes, "data_ptr + data_len exceeds linear memory");
    }
    // Safe slice construction
    let data = unsafe { core::slice::from_raw_parts(data_ptr, data_len) };
    // ... proceed with quantization and storage
}
The WASM module trusts that host-imported functions (read_block,
write_block, delete_block) behave correctly with respect to the pointers
passed to them. This is the standard WASM host-guest contract:
- The host must only read from src_ptr ranges within WASM linear memory.
- The host must only write to dst_ptr ranges within WASM linear memory.
- The host must not retain pointers or typed-array views into linear memory across calls (memory.grow may relocate the memory and detaches existing views of memory.buffer).
Debug builds include additional safety checks:
| Check | Location | Purpose |
|---|---|---|
| Pointer bounds | All exported functions | Prevent out-of-bounds access |
| Block key length | read_block, write_block | Ensure 20-byte key format |
| Policy JSON validity | tts_init | Reject malformed configuration |
| Tier range | Host import calls | Ensure tier in {0, 1, 2} |
| Alloc alignment | tts_alloc | Ensure 4-byte alignment for f32 |
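The checks in the table above reduce to a handful of small predicates. The sketch below is illustrative only (all function names are assumptions); it takes pointers as u32 offsets into linear memory and does the bounds arithmetic in u64 so that no intermediate value can wrap.

```rust
pub const BLOCK_KEY_LEN: usize = 20;
pub const WASM_PAGE_BYTES: u64 = 65_536;

/// Pointer bounds: true when [ptr, ptr + count * elem_size) lies inside
/// `memory_pages` pages of linear memory.
pub fn range_in_memory(ptr: u32, count: u32, elem_size: u32, memory_pages: u32) -> bool {
    // All operands are at most 2^32 - 1, so the sum stays below 2^64.
    let len = (count as u64) * (elem_size as u64);
    let end = (ptr as u64) + len;
    end <= (memory_pages as u64) * WASM_PAGE_BYTES
}

/// Tier range: only 0 (hot), 1 (warm), and 2 (cold) are valid.
pub fn tier_is_valid(tier: u32) -> bool {
    tier <= 2
}

/// Block key length: keys are always 8 + 8 + 4 bytes.
pub fn key_len_is_valid(key_len: usize) -> bool {
    key_len == BLOCK_KEY_LEN
}

/// Alloc alignment: f32 buffers must start on a 4-byte boundary.
pub fn is_f32_aligned(ptr: u32) -> bool {
    ptr % 4 == 0
}
```

Keeping the predicates free of side effects makes them easy to call from debug assertions in each exported function and to unit-test on a native target.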
Rejected. WASI provides fd_read / fd_write for filesystem access, which
would allow the WASM module to perform IO directly. However, WASI filesystem
access is not available in browsers, and granting filesystem access to the WASM
module undermines the sandboxing model established in ADR-005. Host-imported IO
keeps the module fully sandboxed.
Rejected for now. The WASM Component Model provides richer type definitions
and automatic binding generation via WIT (WASM Interface Types). However, as
noted in ADR-005 section 3.1, the Component Model is still evolving and adds
canonical ABI overhead. The raw C ABI is stable, universally supported, and
sufficient for this narrow API surface. Migration path: the tts_* signatures
are designed to be expressible in WIT for future migration.
Rejected. Running ttc_* and tts_* in separate WASM modules would
require cross-module communication (via the host) for every put/get operation,
adding significant overhead. A single module with shared linear memory is
simpler and faster.
Rejected. While passing tensor_id as a *const u8 pointing to 16 bytes
would avoid the hi/lo split, it adds a pointer indirection and requires the
host to allocate and manage a 16-byte buffer for every call. The hi/lo split
uses value types only, which is more efficient and eliminates a class of
pointer-related bugs.
- tts_init correctly parses JSON policy and initializes the store
- tts_put quantizes f32 data and delegates to the write_block host import
- tts_get calls read_block, dequantizes, and writes f32 to output
- tts_tick evaluates tier scores and moves blocks between tiers
- tts_stats returns valid JSON with tier-level statistics
- tts_last_error returns meaningful error messages for all error codes
- Host imports are called with correct tier, key, and buffer parameters
- Same WASM binary works in Node.js and browser without recompilation
| Metric | Target | Notes |
|---|---|---|
| tts_put latency (512-dim, WASM) | < 5us | Includes quantization + host IO |
| tts_get latency (512-dim, WASM) | < 5us | Includes host IO + dequantization |
| tts_tick latency (100 blocks) | < 1ms | Budget-bounded |
| WASM binary size (tts + ttc) | < 150KB | Release build, wasm-opt -Oz |
| Memory overhead per tracked tensor | < 64 bytes | Metadata only, excludes block data |
| Platform | Requirement |
|---|---|
| Node.js 20+ | Full functionality with AgentDB backend |
| Chrome 110+ | Full functionality with IndexedDB backend |
| Firefox 110+ | Full functionality with IndexedDB backend |
| Safari 16.4+ | Full functionality (SharedArrayBuffer with COOP/COEP) |
| Deno 1.30+ | Full functionality with filesystem backend |
| Edge / Embedded | In-memory mode, no persistence |
| Risk | Severity | Likelihood | Mitigation |
|---|---|---|---|
| Browser async IO adds significant latency | High | Medium | SharedArrayBuffer + Atomics for sync semantics; batch IO in tts_tick |
| IndexedDB storage limits in browser | Medium | Medium | Implement LRU eviction in tts_tick; surface quota warnings in tts_stats |
| Host import ABI mismatch across platforms | High | Low | Comprehensive integration tests per platform; ABI versioning in policy JSON |
| WASM memory.grow invalidates host pointers | Medium | Medium | Document that host must re-read memory.buffer after any call that may allocate |
| Shared allocator contention between ttc/tts | Low | Low | Single-threaded WASM eliminates contention; native targets use separate pools |
| Future WASM multi-threading breaks static mut | Medium | Low | Replace with thread_local! for native; WASM threads require explicit opt-in |
- IndexedDB transaction granularity: Should each read_block / write_block call be a separate IndexedDB transaction, or should we batch within a tts_tick invocation?
- WASM module size budget: With both ttc_* and tts_* in one module, the 150KB target may be tight. Should we provide a tts_*-only build for environments that do not need frame-level compression?
- Policy hot-reload: Should tts_init be callable multiple times to update policy without losing block metadata, or should policy changes require a full re-initialization?
- Streaming reads: Should tts_get support partial block reads (offset + length) for large tensor blocks, or always return the full block?
- Host import error propagation: When a host import returns an error, should tts_put / tts_get propagate the raw error code or map it to a TTS-specific error?
- Define tts_* export functions in ffi.rs
- Define tts_host import declarations
- Implement tts_init with JSON policy parsing
- Implement tts_alloc / tts_dealloc / tts_last_error
- Unit tests for error handling and pointer validation
- Implement tts_put with quantization pipeline and write_block calls
- Implement tts_get with read_block calls and dequantization
- Implement block key encoding (tensor_id + block_index)
- Integration tests with mock host imports
- Implement tts_tick with tier score evaluation
- Implement block promotion/demotion with budget enforcement
- Implement tts_stats with JSON serialization
- Stress tests: 10K blocks, rapid tier transitions
- TypeScript wrapper class (TtsStore)
- AgentDB TtsHostIO implementation
- npm package build with wasm-pack
- Integration tests against live AgentDB
- IndexedDB TtsHostIO implementation
- SharedArrayBuffer + Atomics synchronization layer
- Asyncify fallback build
- Browser integration tests (Playwright)
- In-memory TtsHostIO implementation
- Ring-buffer MetaLog for audit
- Memory budget enforcement tests
- Binary size optimization (wasm-opt -Oz)
- ADR-017: Temporal Tensor Compression with Tiered Quantization (this repo)
- ADR-005: WASM Runtime Integration (this repo)
- ADR-018: Block-Based Storage Engine (this repo)
- WebAssembly Specification, Section 5: Binary Format. https://webassembly.github.io/spec/core/binary/
- WebAssembly JS API. https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface
- Asyncify: Turning WASM modules into async generators. https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html
- IndexedDB API. https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API
- SharedArrayBuffer and Atomics. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer
- wasm-bindgen: Facilitating high-level interactions between WASM and JS. https://rustwasm.github.io/docs/wasm-bindgen/
- Pelkonen, T., et al. "Gorilla: A Fast, Scalable, In-Memory Time Series Database." VLDB 2015.
import { TtsHostIO } from "./types";
import { AgentDB } from "@ruvector/agentdb";
const TIER_NAMES = ["hot", "warm", "cold"] as const;
export class AgentDBHostIO implements TtsHostIO {
constructor(private readonly db: AgentDB) {}
readBlock(tier: number, key: Uint8Array): Uint8Array | null {
const namespace = `tts:${TIER_NAMES[tier]}`;
const keyHex = Buffer.from(key).toString("hex");
return this.db.getSync(namespace, keyHex);
}
writeBlock(tier: number, key: Uint8Array, data: Uint8Array): void {
const namespace = `tts:${TIER_NAMES[tier]}`;
const keyHex = Buffer.from(key).toString("hex");
this.db.putSync(namespace, keyHex, data);
}
deleteBlock(tier: number, key: Uint8Array): void {
const namespace = `tts:${TIER_NAMES[tier]}`;
const keyHex = Buffer.from(key).toString("hex");
this.db.deleteSync(namespace, keyHex);
}
}
import { TtsHostIO } from "./types";
const DB_NAME = "tts-blocks";
const STORE_NAMES = ["hot", "warm", "cold"];
export class IndexedDBHostIO implements TtsHostIO {
private db: IDBDatabase | null = null;
async init(): Promise<void> {
return new Promise((resolve, reject) => {
const req = indexedDB.open(DB_NAME, 1);
req.onupgradeneeded = () => {
const db = req.result;
for (const store of STORE_NAMES) {
if (!db.objectStoreNames.contains(store)) {
db.createObjectStore(store);
}
}
};
req.onsuccess = () => { this.db = req.result; resolve(); };
req.onerror = () => reject(req.error);
});
}
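readBlock(tier: number, key: Uint8Array): Uint8Array | null {
// NOTE: an IDBRequest completes asynchronously, so req.result is only valid
// after onsuccess fires. This method is a placeholder for the synchronous
// TtsHostIO contract: a working binding must suspend the caller here, either
// through the Asyncify wrapper (Strategy B) or by blocking on Atomics.wait
// while another thread runs the request (Strategy A).
const tx = this.db!.transaction(STORE_NAMES[tier], "readonly");
const store = tx.objectStore(STORE_NAMES[tier]);
const keyHex = Array.from(key, (b) => b.toString(16).padStart(2, "0")).join("");
const req = store.get(keyHex);
// Reading req.result before the request completes throws InvalidStateError;
// the bridging layer must deliver the completed result to this point.
return req.result ? new Uint8Array(req.result) : null;
}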
writeBlock(tier: number, key: Uint8Array, data: Uint8Array): void {
const tx = this.db!.transaction(STORE_NAMES[tier], "readwrite");
const store = tx.objectStore(STORE_NAMES[tier]);
const keyHex = Array.from(key, (b) => b.toString(16).padStart(2, "0")).join("");
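// Copy first: data may be a view into a larger buffer, and data.buffer
// would then store the whole underlying buffer.
store.put(data.slice().buffer, keyHex);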
}
deleteBlock(tier: number, key: Uint8Array): void {
const tx = this.db!.transaction(STORE_NAMES[tier], "readwrite");
const store = tx.objectStore(STORE_NAMES[tier]);
const keyHex = Array.from(key, (b) => b.toString(16).padStart(2, "0")).join("");
store.delete(keyHex);
}
}
import { readFile } from "node:fs/promises";
import { TtsStore, TtsPolicy, TtsHostIO } from "./types";
export async function loadTtsModule(
wasmPath: string,
hostIO: TtsHostIO,
policy: TtsPolicy = {},
): Promise<TtsStore> {
const wasmBytes = await readFile(wasmPath);
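// Assumes the module is linked to import its memory (e.g. --import-memory).
// If the module exports its own memory instead, read it from
// instance.exports.memory rather than creating one here.
const wasmMemory = new WebAssembly.Memory({ initial: 256, maximum: 4096 });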
const importObject = {
env: { memory: wasmMemory },
tts_host: {
read_block: (tier: number, keyPtr: number, keyLen: number,
dstPtr: number, dstLen: number): number => {
const mem = new Uint8Array(wasmMemory.buffer);
const key = mem.slice(keyPtr, keyPtr + keyLen);
const result = hostIO.readBlock(tier, key);
if (!result) return -2; // TTS_ERR_TENSOR_EVICTED
if (result.length > dstLen) return -6; // TTS_ERR_BUFFER_TOO_SMALL
mem.set(result, dstPtr);
return result.length;
},
write_block: (tier: number, keyPtr: number, keyLen: number,
srcPtr: number, srcLen: number): number => {
const mem = new Uint8Array(wasmMemory.buffer);
const key = mem.slice(keyPtr, keyPtr + keyLen);
const data = mem.slice(srcPtr, srcPtr + srcLen);
hostIO.writeBlock(tier, key, data);
return 0;
},
delete_block: (tier: number, keyPtr: number, keyLen: number): number => {
const mem = new Uint8Array(wasmMemory.buffer);
const key = mem.slice(keyPtr, keyPtr + keyLen);
hostIO.deleteBlock(tier, key);
return 0;
},
},
};
const { instance } = await WebAssembly.instantiate(wasmBytes, importObject);
const exports = instance.exports as Record<string, Function>;
// Initialize the store with policy
const policyJson = new TextEncoder().encode(JSON.stringify(policy));
const policyPtr = exports.tts_alloc(policyJson.length) as number;
new Uint8Array(wasmMemory.buffer).set(policyJson, policyPtr);
const initResult = exports.tts_init(policyPtr, policyJson.length) as number;
exports.tts_dealloc(policyPtr, policyJson.length);
if (initResult !== 0) {
throw new Error(`tts_init failed with code ${initResult}`);
}
// Return wrapped store object
return new TtsStoreImpl(exports, wasmMemory);
}
- ADR-005: WASM Runtime Integration (sandboxing model, epoch interruption, raw ABI)
- ADR-017: Temporal Tensor Compression (frame-level ttc_* FFI, quantization pipeline)
- ADR-018: Block-Based Storage Engine (tiered placement, block format)
- ADR-001: RuVector Core Architecture (crate structure, dependency graph)
- ADR-004: KV Cache Management (three-tier cache model)