Shrink your vector data 4-10x without losing the signal.
ruvector-temporal-tensor compresses streams of floating-point tensors by exploiting two properties that most vector workloads share:
- Values within a group are similar — so a single scale factor per group captures the range, and a small integer code captures the value. This is groupwise symmetric quantization.
- Consecutive frames barely change — so the same scale factors can be reused across many frames until the data drifts. This is temporal segment reuse.
The crate automatically picks the right bit-width based on how "hot" (frequently accessed) the tensor is, giving you aggressive compression on cold data while preserving accuracy on hot data.
Zero external dependencies. Compiles to WASM. Ships with a C FFI.
```
f32 frame ──► tier policy ──► quantizer ──► bitpack ──► segment blob
                   │
         "How hot is this tensor?"
           Hot  → 8-bit (lossless-ish)
           Warm → 7 or 5-bit
           Cold → 3-bit (10x smaller)
```
Each frame of f32 values is divided into fixed-size groups (default 64). Per group, the compressor computes a single scale factor (max_abs / qmax) and maps every value to a signed integer code. Codes are packed into a tight bitstream with no byte-alignment waste.
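The per-group step described above can be sketched in a few lines of Rust. This is an illustration only, not the crate's `quantizer` module: the function names are hypothetical, and it assumes `qmax = 2^(bits-1) - 1` (127 for 8-bit, 3 for 3-bit) with round-to-nearest codes.

```rust
/// Largest positive code for a signed `bits`-wide integer (assumed definition).
fn qmax(bits: u32) -> i32 {
    (1i32 << (bits - 1)) - 1
}

/// Quantize one frame group by group. Returns (one scale per group, one code per value).
/// `group_len` must be non-zero.
fn quantize_groups(frame: &[f32], group_len: usize, bits: u32) -> (Vec<f32>, Vec<i32>) {
    let q = qmax(bits) as f32;
    let mut scales = Vec::new();
    let mut codes = Vec::with_capacity(frame.len());
    for group in frame.chunks(group_len) {
        // scale = max_abs / qmax, so the largest value maps to +/- qmax.
        let max_abs = group.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        let scale = max_abs / q;
        scales.push(scale);
        for &v in group {
            // An all-zero group has scale 0; emit zero codes instead of dividing by zero.
            let code = if scale > 0.0 { (v / scale).round() } else { 0.0 };
            codes.push(code.clamp(-q, q) as i32);
        }
    }
    (scales, codes)
}

/// Reverse the mapping: value is approximately code * scale of its group.
fn dequantize_groups(scales: &[f32], codes: &[i32], group_len: usize) -> Vec<f32> {
    codes
        .chunks(group_len)
        .zip(scales)
        .flat_map(|(group, &s)| group.iter().map(move |&c| c as f32 * s))
        .collect()
}
```

With round-to-nearest, the reconstruction error of each value is bounded by half the scale of its group, which is why smaller groups (more scales) give a tighter fit.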
When the next frame arrives, the compressor checks whether the existing scale factors still cover the new data (within a configurable drift tolerance). If they do, the frame is appended to the current segment — reusing the same scales. If they don't, the segment is finalized and a new one starts.
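One plausible form of that coverage check is sketched below. It is an assumption about the logic, not the crate's code: `frame_fits_scales` is a hypothetical helper, and the tolerance is read as a Q8 fraction (`drift_pct_q8 / 256`) by which a group's maximum may exceed the range its stored scale can represent.

```rust
/// Returns true if every group of `frame` still fits the stored scales,
/// allowing each group's max to exceed `scale * qmax` by `drift_pct_q8 / 256`.
/// Hypothetical helper; `group_len` must be non-zero.
fn frame_fits_scales(
    frame: &[f32],
    scales: &[f32],
    group_len: usize,
    bits: u32,
    drift_pct_q8: u32,
) -> bool {
    // A different group count means the scales cannot describe this frame at all.
    if frame.chunks(group_len).len() != scales.len() {
        return false;
    }
    let qmax = ((1i32 << (bits - 1)) - 1) as f32;
    let tolerance = 1.0 + drift_pct_q8 as f32 / 256.0;
    frame.chunks(group_len).zip(scales).all(|(group, &scale)| {
        let max_abs = group.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        max_abs <= scale * qmax * tolerance
    })
}
```

With a tolerance of 26 (26/256, roughly 10%), a group whose scale was computed for a maximum of 1.0 would still accept a frame peaking at 1.05 but reject one peaking at 1.2, which would close the segment.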
Segments are self-contained binary blobs with a 22-byte header, the f16-encoded scales, and the packed data. They can be decoded independently, or you can random-access a single frame by index.
| Tier | Bits | Ratio vs f32 | Typical Error | When Used |
|---|---|---|---|---|
| Hot | 8 | ~4x | < 0.5% | Frequently accessed tensors |
| Warm | 7 | ~4.6x | < 1% | Moderate access patterns |
| Warm | 5 | ~6.4x | < 3% | Aggressively compressed warm data |
| Cold | 3 | ~10.7x | < 15% | Rarely accessed / archival |
Ratios improve further with temporal reuse — the scale overhead is amortized across all frames in a segment.
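To make the amortization concrete, the following sketch estimates segment size from the documented layout (22-byte header, 2-byte f16 scales, 4-byte data length, packed codes). The function names are hypothetical, and it assumes one set of scales per segment and codes packed contiguously with padding only at the end of the data.

```rust
/// Estimated segment size in bytes under the assumptions stated above.
fn estimated_segment_bytes(tensor_len: usize, group_len: usize, bits: usize, frames: usize) -> usize {
    // One scale per group, shared by every frame in the segment.
    let scale_count = (tensor_len + group_len - 1) / group_len;
    // Total code bits, rounded up to whole bytes.
    let code_bits = tensor_len * frames * bits;
    22 + 2 * scale_count + 4 + (code_bits + 7) / 8
}

/// Raw f32 size divided by the estimated compressed size.
fn estimated_ratio(tensor_len: usize, group_len: usize, bits: usize, frames: usize) -> f64 {
    let raw_bytes = (tensor_len * frames * 4) as f64;
    raw_bytes / estimated_segment_bytes(tensor_len, group_len, bits, frames) as f64
}
```

For a 512-element tensor with 64-element groups at 3 bits, this estimate gives about 8.75x for a single-frame segment and about 10.64x for a 100-frame segment, approaching the 32/3 limit as the fixed overhead is spread over more frames.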
Add to your Cargo.toml:
```toml
[dependencies]
ruvector-temporal-tensor = "2.0"
```

```rust
use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};

// 1. Create a compressor for 128-element tensors
let mut comp = TemporalTensorCompressor::new(TierPolicy::default(), 128, 0);
comp.set_access(100, 0); // mark as hot → 8-bit quantization

let frame = vec![1.0f32; 128];
let mut segment = Vec::new();

// 2. Push frames; the segment stays empty until a boundary is crossed
comp.push_frame(&frame, 1, &mut segment);

// 3. Force-emit the current segment
comp.flush(&mut segment);

// 4. Decode back to f32
let mut decoded = Vec::new();
ruvector_temporal_tensor::segment::decode(&segment, &mut decoded);
assert_eq!(decoded.len(), 128);
```

```rust
use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};

let mut comp = TemporalTensorCompressor::new(TierPolicy::default(), 512, 0);
comp.set_access(100, 0);

let mut segments: Vec<Vec<u8>> = Vec::new();
let mut seg = Vec::new();

for t in 0..1000 {
    let frame: Vec<f32> = (0..512).map(|i| ((i + t) as f32 * 0.01).sin()).collect();
    comp.push_frame(&frame, t as u32, &mut seg);
    if !seg.is_empty() {
        segments.push(seg.clone());
    }
}

comp.flush(&mut seg);
if !seg.is_empty() {
    segments.push(seg);
}
```

```rust
use ruvector_temporal_tensor::segment;
# use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};
# let mut comp = TemporalTensorCompressor::new(TierPolicy::default(), 64, 0);
# let mut seg = Vec::new();
# comp.push_frame(&vec![1.0f32; 64], 0, &mut seg);
# comp.flush(&mut seg);

// Decode only frame 0; skips all other frames in the segment
let values = segment::decode_single_frame(&seg, 0).unwrap();
assert_eq!(values.len(), 64);

// Check compression ratio
let ratio = segment::compression_ratio(&seg);
assert!(ratio > 1.0);
```

```rust
use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};

let policy = TierPolicy {
    hot_min_score: 512,  // score threshold for 8-bit
    warm_min_score: 64,  // score threshold for warm tier
    warm_bits: 5,        // use 5-bit instead of default 7 for warm
    drift_pct_q8: 26,    // ~10% drift tolerance (Q8 fixed-point)
    group_len: 32,       // smaller groups = more scales, tighter fit
};

let mut comp = TemporalTensorCompressor::new(policy, 256, 0);
```

```toml
[dependencies]
ruvector-temporal-tensor = { version = "2.0", features = ["ffi"] }
```

| Feature | Default | Description |
|---|---|---|
| `ffi` | off | Enable `extern "C"` exports for WASM and C interop |
| `simd` | off | Reserved for future SIMD-accelerated quantization |


| Type | Description |
|---|---|
| `TemporalTensorCompressor` | Main entry point: push frames, get segments |
| `TierPolicy` | Controls bit-width selection and drift tolerance |

| Method | Description |
|---|---|
| `new(policy, len, now_ts)` | Create a compressor for tensors of `len` elements |
| `push_frame(frame, now_ts, out)` | Compress a frame; emits a segment on boundary crossings |
| `flush(out)` | Force-emit the current segment |
| `touch(now_ts)` | Record an access event (increments count + updates timestamp) |
| `set_access(count, ts)` | Set access stats directly (for restoring state) |
| `active_bits()` | Current quantization bit-width |
| `active_frame_count()` | Frames buffered in the current segment |
| `len()` / `is_empty()` | Tensor length |

| Function | Description |
|---|---|
| `segment::decode(bytes, out)` | Decode all frames from a segment |
| `segment::decode_single_frame(bytes, idx)` | Decode one frame by index |
| `segment::parse_header(bytes)` | Read segment metadata without decoding |
| `segment::compression_ratio(bytes)` | Compute raw-to-compressed ratio |
| `segment::encode(...)` | Low-level segment encoder (used internally) |

| Module | Description |
|---|---|
| `quantizer` | Groupwise symmetric quantization and dequantization |
| `bitpack` | Arbitrary-width bitstream packer and unpacker |
| `f16` | Software IEEE 754 half-precision conversion |
| `tier_policy` | Access-pattern scoring and bit-width selection |
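As an illustration of what an arbitrary-width packer like `bitpack` does, here is a minimal sketch built around a 64-bit accumulator. It is not the crate's implementation: the function names are hypothetical, and the two's-complement encoding and LSB-first bit order are assumptions that may not match the real on-disk format.

```rust
/// Pack signed codes of width `bits` (1..=8) into bytes, least-significant bit first.
fn pack_codes(codes: &[i32], bits: u32) -> Vec<u8> {
    let mask = (1u64 << bits) - 1;
    let mut out = Vec::with_capacity((codes.len() * bits as usize + 7) / 8);
    let mut acc: u64 = 0; // bit accumulator
    let mut acc_bits: u32 = 0; // number of valid bits currently in `acc`
    for &c in codes {
        // Keep only the low `bits` bits of the two's-complement value.
        acc |= ((c as u64) & mask) << acc_bits;
        acc_bits += bits;
        // Drain whole bytes as soon as they are available.
        while acc_bits >= 8 {
            out.push(acc as u8);
            acc >>= 8;
            acc_bits -= 8;
        }
    }
    // Flush the final partial byte, zero-padded in the high bits.
    if acc_bits > 0 {
        out.push(acc as u8);
    }
    out
}

/// Unpack `count` signed codes of width `bits` (1..=8) from `bytes`.
/// Returns fewer than `count` codes if the input is truncated.
fn unpack_codes(bytes: &[u8], bits: u32, count: usize) -> Vec<i32> {
    let mask = (1u64 << bits) - 1;
    let mut out = Vec::with_capacity(count);
    let mut acc: u64 = 0;
    let mut acc_bits: u32 = 0;
    let mut iter = bytes.iter();
    while out.len() < count {
        // Refill until at least one full code is available.
        while acc_bits < bits {
            match iter.next() {
                Some(&b) => {
                    acc |= (b as u64) << acc_bits;
                    acc_bits += 8;
                }
                None => return out,
            }
        }
        let raw = (acc & mask) as i32;
        // Sign-extend from `bits` to 32 bits with an arithmetic shift.
        let shift = 32 - bits;
        out.push((raw << shift) >> shift);
        acc >>= bits;
        acc_bits -= bits;
    }
    out
}
```

Ten 3-bit codes occupy 30 bits and therefore 4 bytes in this scheme, compared with 10 bytes if each code were stored byte-aligned.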
Segments are self-contained, portable, and version-tagged:
```
Offset  Size  Field
──────  ────  ─────────────────
0       4     Magic: 0x43545154 ("TQTC")
4       1     Version (currently 1)
5       1     Bits per code (3, 5, 7, or 8)
6       4     Group length
10      4     Tensor length (elements per frame)
14      4     Frame count
18      4     Scale count (S)
22      2*S   Scales (f16, little-endian)
22+2S   4     Data length (D)
26+2S   D     Packed quantization codes
```
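The fixed 22-byte prefix of this layout can be read with a few slice operations. The sketch below is independent of the crate's own `segment::parse_header`, whose signature is not shown here; the struct and function names are hypothetical, and it assumes every multi-byte integer is little-endian (the layout states this explicitly only for the scales).

```rust
/// Fixed-size fields at the start of a segment (hypothetical struct for illustration).
#[derive(Debug, PartialEq)]
struct SegmentHeader {
    version: u8,
    bits: u8,
    group_len: u32,
    tensor_len: u32,
    frame_count: u32,
    scale_count: u32,
}

const MAGIC: u32 = 0x4354_5154; // stored little-endian, the bytes spell "TQTC"
const HEADER_LEN: usize = 22;

/// Read a little-endian u32 at `offset`; the caller guarantees the bounds.
fn read_u32_le(bytes: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Parse the header, or return None if the input is too short or the magic is wrong.
fn parse_header_sketch(bytes: &[u8]) -> Option<SegmentHeader> {
    if bytes.len() < HEADER_LEN || read_u32_le(bytes, 0) != MAGIC {
        return None;
    }
    Some(SegmentHeader {
        version: bytes[4],
        bits: bytes[5],
        group_len: read_u32_le(bytes, 6),
        tensor_len: read_u32_le(bytes, 10),
        frame_count: read_u32_le(bytes, 14),
        scale_count: read_u32_le(bytes, 18),
    })
}
```

After the header, a reader would skip `2 * scale_count` bytes of scales, read the 4-byte data length, and find the packed codes immediately behind it.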
Enable the `ffi` feature and compile with `--target wasm32-unknown-unknown`:

```bash
cargo build --release --target wasm32-unknown-unknown --features ffi
```

Exported C functions:

| Function | Description |
|---|---|
| `ttc_create(len, now_ts, out_handle)` | Create compressor, get handle |
| `ttc_create_with_policy(...)` | Create with custom tier policy |
| `ttc_free(handle)` | Free a compressor |
| `ttc_touch(handle, now_ts)` | Record access |
| `ttc_set_access(handle, count, ts)` | Set access stats |
| `ttc_push_frame(handle, ts, in, len, out, cap, written)` | Compress a frame |
| `ttc_flush(handle, out, cap, written)` | Flush current segment |
| `ttc_decode_segment(seg, len, out, cap, written)` | Decode a segment |
| `ttc_alloc(size, out_ptr)` | Allocate WASM linear memory |
| `ttc_dealloc(ptr, cap)` | Free allocated memory |
See ADR-017 for the full architecture decision record, including SOTA survey, compression math, safety analysis, and integration guidance.
Key decisions:
- Groupwise symmetric (no zero-point) — simpler, faster, well-suited for normally-distributed embeddings
- f16 scales — 2 bytes per group vs 4 for f32, with negligible accuracy loss
- 64-bit bitstream accumulator — handles any sub-byte width without byte-alignment waste
- Score-based tiering: `access_count * 1024 / age` balances recency and frequency
- ~10% drift tolerance: Q8 fixed-point, configurable, default 26/256
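The score formula and thresholds combine into a tier decision roughly as follows. This is a sketch under assumptions, not the crate's `tier_policy` module: the type and function names are hypothetical, `age` is taken as the time since the last recorded access and clamped to at least 1, and the threshold values used in the usage note are the ones from the custom policy example above.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Tier {
    Hot,
    Warm,
    Cold,
}

/// Subset of policy fields needed for tier selection (hypothetical struct).
struct Thresholds {
    hot_min_score: u32,
    warm_min_score: u32,
    warm_bits: u8,
}

/// score = access_count * 1024 / age, computed in u64 to avoid overflow.
fn score(access_count: u32, last_access_ts: u32, now_ts: u32) -> u32 {
    // Assumption: clamp age to 1 so a just-accessed tensor does not divide by zero.
    let age = now_ts.saturating_sub(last_access_ts).max(1) as u64;
    let s = access_count as u64 * 1024 / age;
    s.min(u32::MAX as u64) as u32
}

/// Map a score to a tier and its bit-width: 8 for hot, `warm_bits` for warm, 3 for cold.
fn select_bits(score: u32, t: &Thresholds) -> (Tier, u8) {
    if score >= t.hot_min_score {
        (Tier::Hot, 8)
    } else if score >= t.warm_min_score {
        (Tier::Warm, t.warm_bits)
    } else {
        (Tier::Cold, 3)
    }
}
```

With thresholds of 512 and 64, a tensor with 100 recorded accesses scores 102400 one tick after its last access (hot), 102 after 999 ticks (warm), and a tensor with a single access scores 1 after 1000 ticks (cold).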
```bash
# Build
cargo build -p ruvector-temporal-tensor --release

# Run all tests (41 unit + 3 doc-tests)
cargo test -p ruvector-temporal-tensor

# Clippy
cargo clippy -p ruvector-temporal-tensor -- -W clippy::all

# Build WASM target
cargo build -p ruvector-temporal-tensor --release --target wasm32-unknown-unknown --features ffi
```

| Crate | Relationship |
|---|---|
| `ruvector-core` | Parent vector database engine; temporal tensors integrate as a storage backend |
| `ruvector-temporal-tensor-wasm` | Thin WASM re-export wrapper |
MIT License — see LICENSE for details.
Part of Ruvector