| Field | Value |
|---|---|
| Status | Proposed |
| Date | 2026-02-28 |
| Deciders | ruv |
| Relates to | ADR-003 (RVF Cognitive Containers), ADR-005 (SONA Self-Learning), ADR-015 (Public Dataset Strategy), ADR-016 (RuVector Integration), ADR-017 (RuVector-Signal-MAT), ADR-020 (Rust AI Migration), ADR-021 (Vital Sign Detection) |
The WiFi-DensePose system currently operates in two distinct modes:
- WiFi CSI sensing (working): ESP32 streams CSI frames → Rust aggregator → feature extraction → presence/motion classification. 41 tests passing, verified at ~20 Hz with real hardware.
- Heuristic pose derivation (working but approximate): The Rust sensing server generates 17 COCO keypoints from WiFi signal properties using hand-crafted rules (`derive_pose_from_sensing()` in `sensing-server/src/main.rs`). This is not a trained model: keypoint positions are derived from signal amplitude, phase variance, and motion metrics rather than learned from labeled data.
Neither mode produces DensePose-quality body surface estimation. The CMU "DensePose From WiFi" paper (arXiv:2301.00250) demonstrated that a neural network trained on paired WiFi CSI + camera pose data can produce dense body surface UV coordinates from WiFi alone. However, that approach requires:
- Environment-specific training: The model must be trained or fine-tuned for each deployment environment because CSI multipath patterns are environment-dependent.
- Paired training data: Simultaneous WiFi CSI captures + ground-truth pose annotations (or a camera-based teacher model generating pseudo-labels).
- Substantial compute: Training a modality translation network + DensePose head requires GPU time (hours to days depending on dataset size).
The Rust workspace already implements the model architecture; the MM-Fi dataset loader and subcarrier resampling are still planned:
| Component | Crate | File | Status |
|---|---|---|---|
| `WiFiDensePoseModel` | `wifi-densepose-train` | `model.rs` | Implemented (random weights) |
| `ModalityTranslator` | `wifi-densepose-train` | `model.rs` | Implemented with RuVector attention |
| `KeypointHead` | `wifi-densepose-train` | `model.rs` | Implemented (17 COCO heatmaps) |
| `DensePoseHead` | `wifi-densepose-nn` | `densepose.rs` | Implemented (25 parts + 48 UV) |
| `WiFiDensePoseLoss` | `wifi-densepose-train` | `losses.rs` | Implemented (keypoint + part + UV + transfer) |
| `MmFiDataset` loader | `wifi-densepose-train` | `dataset.rs` | Planned (ADR-015) |
| `WiFiDensePosePipeline` | `wifi-densepose-nn` | `inference.rs` | Implemented (generic over `Backend`) |
| Training proof verification | `wifi-densepose-train` | `proof.rs` | Implemented (deterministic hash) |
| Subcarrier resampling (114→56) | `wifi-densepose-train` | `subcarrier.rs` | Planned (ADR-016) |
The vendor/ruvector/ subtree provides 90+ crates. The following are directly relevant to a trained DensePose pipeline:
Already integrated (5 crates, ADR-016):
| Crate | Algorithm | Current Use |
|---|---|---|
| `ruvector-mincut` | Subpolynomial dynamic min-cut O(n^{o(1)}) | Multi-person assignment in `metrics.rs` |
| `ruvector-attn-mincut` | Attention-gated min-cut | Noise-suppressed spectrogram in `model.rs` |
| `ruvector-attention` | Scaled dot-product + geometric attention | Spatial decoder in `model.rs` |
| `ruvector-solver` | Sparse Neumann solver O(√n) | Subcarrier resampling in `subcarrier.rs` |
| `ruvector-temporal-tensor` | Tiered temporal compression | CSI frame buffering in `dataset.rs` |
Newly proposed for DensePose pipeline (6 additional crates):
| Crate | Description | Proposed Use |
|---|---|---|
| `ruvector-gnn` | Graph neural network on HNSW topology | Spatial body-graph reasoning |
| `ruvector-graph-transformer` | Proof-gated graph transformer (8 modules) | CSI-to-pose cross-attention |
| `ruvector-sparse-inference` | PowerInfer-style sparse inference engine | Edge deployment with neuron activation sparsity |
| `ruvector-sona` | Self-Optimizing Neural Architecture (LoRA + EWC++) | Online environment adaptation |
| `ruvector-fpga-transformer` | FPGA-optimized transformer | Hardware-accelerated inference path |
| `ruvector-math` | Optimal transport, information geometry | Domain adaptation loss functions |
The RuVector Format (RVF) is a segment-based binary container format designed to package intelligence artifacts — embeddings, HNSW indexes, quantized weights, WASM runtimes, witness proofs, and metadata — into a single self-contained file. Key properties:
- 64-byte segment headers (`SegmentHeader`, magic `0x52564653` "RVFS") with type discriminator, content hash, compression, and timestamp
- Progressive loading: Layer A (entry points, <5ms) → Layer B (hot adjacency, 100ms–1s) → Layer C (full graph, seconds)
- 20+ segment types: `Vec` (embeddings), `Index` (HNSW), `Overlay` (min-cut witnesses), `Quant` (codebooks), `Witness` (proof-of-computation), `Wasm` (self-bootstrapping runtime), `Dashboard` (embedded UI), `AggregateWeights` (federated SONA deltas), `Crypto` (Ed25519 signatures), and more
- Temperature-tiered quantization (`rvf-quant`): f32 / f16 / u8 / binary per-segment, with SIMD-accelerated distance computation
- AGI Cognitive Container (`agi_container.rs`): packages kernel + WASM + world model + orchestrator + evaluation harness + witness chains into a single deployable file
The trained DensePose model will be packaged as an .rvf container, making it a single
self-contained artifact that includes model weights, HNSW-indexed embedding tables, min-cut
graph overlays, quantization codebooks, SONA adaptation deltas, and the WASM inference
runtime — deployable to any host without external dependencies.
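To make the 64-byte header concrete, here is a minimal sketch of one layout that fits the stated size and carries the fields named above. Only the magic value and the 64-byte total come from this ADR; the field order, the field widths, and the `version` and `reserved` fields are assumptions for illustration and may not match the real `rvf-types` definition.

```rust
use std::mem::size_of;

/// Magic value quoted in this ADR; its big-endian bytes spell "RVFS".
pub const RVF_SEGMENT_MAGIC: u32 = 0x5256_4653;

/// Hypothetical 64-byte segment header. Field order and widths are assumed.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SegmentHeader {
    pub magic: u32,             // offset 0
    pub version: u16,           // offset 4 (assumed field)
    pub segment_type: u8,       // offset 6, type discriminator
    pub compression: u8,        // offset 7
    pub payload_len: u64,       // offset 8 (assumed field)
    pub timestamp_ns: u64,      // offset 16
    pub content_hash: [u8; 32], // offset 24
    pub reserved: [u8; 8],      // offset 56, pads the header to 64 bytes
}

// Compile-time check that the sketch really is 64 bytes.
const _: () = assert!(size_of::<SegmentHeader>() == 64);

impl SegmentHeader {
    pub fn new(
        segment_type: u8,
        compression: u8,
        payload_len: u64,
        timestamp_ns: u64,
        content_hash: [u8; 32],
    ) -> Self {
        Self {
            magic: RVF_SEGMENT_MAGIC,
            version: 1,
            segment_type,
            compression,
            payload_len,
            timestamp_ns,
            content_hash,
            reserved: [0; 8],
        }
    }

    /// A reader would reject any segment whose magic does not match.
    pub fn has_valid_magic(&self) -> bool {
        self.magic == RVF_SEGMENT_MAGIC
    }
}
```

A fixed-size header lets a reader scan segment boundaries without parsing payloads, which is what makes layer-by-layer progressive loading practical.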
Implement a fully trained DensePose model using RuVector signal intelligence as the backbone signal processing layer, packaged in the RVF container format. The pipeline has three stages: (1) offline training on public datasets, (2) teacher-student distillation for DensePose UV labels, and (3) online SONA adaptation for environment-specific fine-tuning. The trained model, its embeddings, indexes, and adaptation state are serialized into a single .rvf file.
┌─────────────────────────────────────────────────────────────────────────────┐
│ TRAINED DENSEPOSE PIPELINE │
│ │
│ ┌─────────────┐ ┌──────────────────────┐ ┌──────────────────────┐ │
│ │ ESP32 CSI │ │ RuVector Signal │ │ Trained Neural │ │
│ │ Raw I/Q │───▶│ Intelligence Layer │───▶│ Network │ │
│ │ [ant×sub×T] │ │ (preprocessing) │ │ (inference) │ │
│ └─────────────┘ └──────────────────────┘ └──────────────────────┘ │
│ │ │ │
│ ┌─────────┴─────────┐ ┌────────┴────────┐ │
│ │ 5 RuVector crates │ │ 6 RuVector │ │
│ │ (signal processing)│ │ crates (neural) │ │
│ └───────────────────┘ └─────────────────┘ │
│ │ │
│ ┌──────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────┐ │
│ │ Outputs │ │
│ │ • 17 COCO keypoints [B,17,H,W] │ │
│ │ • 25 body parts [B,25,H,W] │ │
│ │ • 48 UV coords [B,48,H,W] │ │
│ │ • Confidence scores │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
Raw CSI frames from ESP32 (56–192 subcarriers × N antennas × T time frames) are processed through the RuVector signal intelligence stack before entering the neural network. This replaces hand-crafted feature extraction with learned, graph-aware preprocessing.
Raw CSI [ant, sub, T]
│
▼
┌─────────────────────────────────────────────────────┐
│ 1. ruvector-attn-mincut: gate_spectrogram() │
│ Input: Q=amplitude, K=phase, V=combined │
│ Effect: Suppress multipath noise, keep motion- │
│ relevant subcarrier paths │
│ Output: Gated spectrogram [ant, sub', T] │
├─────────────────────────────────────────────────────┤
│ 2. ruvector-mincut: mincut_subcarrier_partition() │
│ Input: Subcarrier coherence graph │
│ Effect: Partition into sensitive (motion- │
│ responsive) vs insensitive (static) │
│ Output: Partition mask + per-subcarrier weights │
├─────────────────────────────────────────────────────┤
│ 3. ruvector-attention: attention_weighted_bvp() │
│ Input: Gated spectrogram + partition weights │
│ Effect: Compute body velocity profile with │
│ sensitivity-weighted attention │
│ Output: BVP feature vector [D_bvp] │
├─────────────────────────────────────────────────────┤
│ 4. ruvector-solver: solve_fresnel_geometry() │
│ Input: Amplitude + known TX/RX positions │
│ Effect: Estimate TX-body-RX ellipsoid distances │
│ Output: Fresnel geometry features [D_fresnel] │
├─────────────────────────────────────────────────────┤
│ 5. ruvector-temporal-tensor: compress + buffer │
│ Input: Temporal CSI window (100 frames) │
│ Effect: Tiered quantization (hot/warm/cold) │
│ Output: Compressed tensor, 50-75% memory saving │
└─────────────────────────────────────────────────────┘
│
▼
Feature tensor [B, T*tx*rx, sub] (preprocessed, noise-suppressed)
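Step 2 above splits subcarriers into motion-responsive and static groups. The sketch below is a simplified stand-in that uses per-subcarrier temporal variance instead of a min-cut on a coherence graph, only to illustrate the shape of the output (a mask plus normalized weights). The function names are hypothetical and are not the `ruvector-mincut` API.

```rust
/// Population variance of one subcarrier's amplitude over time.
fn variance(series: &[f32]) -> f32 {
    if series.is_empty() {
        return 0.0;
    }
    let n = series.len() as f32;
    let mean = series.iter().sum::<f32>() / n;
    series.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n
}

/// Stand-in for the subcarrier partition step (NOT a min-cut).
/// `amplitude[s]` is the amplitude time series of subcarrier `s`.
/// Returns (sensitive mask, weights that sum to 1 when any motion exists).
pub fn partition_by_variance(amplitude: &[Vec<f32>]) -> (Vec<bool>, Vec<f32>) {
    let vars: Vec<f32> = amplitude.iter().map(|s| variance(s)).collect();
    let total: f32 = vars.iter().sum();
    if total <= f32::EPSILON {
        // No temporal variation anywhere: treat every subcarrier as static.
        return (vec![false; vars.len()], vec![0.0; vars.len()]);
    }
    let mean_var = total / vars.len() as f32;
    // Subcarriers with above-average variance are marked motion-sensitive.
    let mask: Vec<bool> = vars.iter().map(|&v| v > mean_var).collect();
    let weights: Vec<f32> = vars.iter().map(|&v| v / total).collect();
    (mask, weights)
}
```

The real pipeline would derive the partition from graph structure, so correlated subcarriers end up on the same side of the cut; a variance threshold cannot express that.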
The neural network follows the CMU teacher-student architecture with RuVector enhancements at three critical points.
CSI features [B, T*tx*rx, sub]
│
├──amplitude──┐
│ ├─► Encoder (Conv1D stack, 64→128→256)
└──phase──────┘ │
▼
┌──────────────────────────────┐
│ ruvector-graph-transformer │
│ │
│ Treat antenna-pair×time as │
│ graph nodes. Edges connect │
│ spatially adjacent antenna │
│ pairs and temporally │
│ adjacent frames. │
│ │
│ Proof-gated attention: │
│ Each layer verifies that │
│ attention weights satisfy │
│ physical constraints │
│ (Fresnel ellipsoid bounds) │
└──────────────────────────────┘
│
▼
Decoder (ConvTranspose2d stack, 256→128→64→3)
│
▼
Visual features [B, 3, 48, 48]
RuVector enhancement: Replace standard multi-head self-attention in the bottleneck with ruvector-graph-transformer. The graph structure encodes the physical antenna topology — nodes that are closer in space (adjacent ESP32 nodes in the mesh) or time (consecutive frames) have stronger edge weights. This injects domain-specific inductive bias that standard attention lacks.
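The graph described above can be sketched as an edge list over (antenna pair, frame) nodes. This is an illustrative construction only: the node numbering, the `spatial_adjacent` input, and the unweighted edges are assumptions, and `ruvector-graph-transformer` may represent the graph differently.

```rust
/// Node index for (antenna pair, frame), numbered frame by frame.
pub fn node_id(pair: usize, frame: usize, num_pairs: usize) -> usize {
    frame * num_pairs + pair
}

/// Builds undirected edges, each stored once as (from, to).
/// `spatial_adjacent` lists antenna pairs considered physically adjacent.
pub fn build_edges(
    num_pairs: usize,
    num_frames: usize,
    spatial_adjacent: &[(usize, usize)],
) -> Vec<(usize, usize)> {
    let mut edges = Vec::new();
    for t in 0..num_frames {
        // Spatial edges: adjacent antenna pairs within the same frame.
        for &(a, b) in spatial_adjacent {
            edges.push((node_id(a, t, num_pairs), node_id(b, t, num_pairs)));
        }
        // Temporal edges: the same antenna pair in consecutive frames.
        if t + 1 < num_frames {
            for p in 0..num_pairs {
                edges.push((node_id(p, t, num_pairs), node_id(p, t + 1, num_pairs)));
            }
        }
    }
    edges
}
```

For 3 antenna pairs over 2 frames with adjacency `[(0, 1), (1, 2)]`, this yields 4 spatial edges and 3 temporal edges. Edge weights based on physical distance or frame gap would be layered on top.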
Visual features [B, 3, 48, 48]
│
▼
ResNet18 backbone → feature maps [B, 256, 12, 12]
│
▼
┌─────────────────────────────────────────┐
│ ruvector-gnn: Body Graph Network │
│ │
│ 17 COCO keypoints as graph nodes │
│ Edges: anatomical connections │
│ (shoulder→elbow, hip→knee, etc.) │
│ │
│ GNN message passing (3 rounds): │
│ h_i^{l+1} = σ(W·h_i^l + Σ_j α_ij·h_j^l) │
│ α_ij = attention(h_i, h_j, edge_ij) │
│ │
│ Enforces anatomical constraints: │
│ - Limb length ratios │
│ - Joint angle limits │
│ - Left-right symmetry priors │
└─────────────────────────────────────────┘
│
├──────────────────┬──────────────────┐
▼ ▼ ▼
KeypointHead DensePoseHead ConfidenceHead
[B,17,H,W] [B,25+48,H,W] [B,1]
heatmaps parts + UV quality score
RuVector enhancement: ruvector-gnn replaces the flat spatial decoder with a graph neural network that operates on the human body graph. WiFi CSI is inherently noisy — GNN message passing between anatomically connected joints enforces that predicted keypoints maintain plausible body structure even when individual joint predictions are uncertain.
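One round of the message-passing update can be sketched on scalar joint features. Several simplifications are assumed here: features are scalars rather than vectors, the attention weights α_ij are uniform (1/degree) instead of learned, σ is ReLU, and the 16-edge skeleton is one plausible spanning tree over the COCO joints rather than the project's actual edge list.

```rust
pub const NUM_JOINTS: usize = 17;

/// One possible 16-edge tree over COCO keypoints (assumed, not from the ADR).
/// Indices: 0 nose, 1/2 eyes, 3/4 ears, 5/6 shoulders, 7/8 elbows,
/// 9/10 wrists, 11/12 hips, 13/14 knees, 15/16 ankles (left/right).
pub const SKELETON: [(usize, usize); 16] = [
    (0, 1), (0, 2), (1, 3), (2, 4),         // head
    (0, 5), (0, 6),                         // nose to shoulders
    (5, 7), (7, 9), (6, 8), (8, 10),        // arms
    (5, 11), (6, 12),                       // torso
    (11, 13), (13, 15), (12, 14), (14, 16), // legs
];

/// h_i' = relu(w_self * h_i + w_nbr * mean of neighbor features).
pub fn message_pass(h: &[f32; NUM_JOINTS], w_self: f32, w_nbr: f32) -> [f32; NUM_JOINTS] {
    let mut nbr_sum = [0.0f32; NUM_JOINTS];
    let mut degree = [0u32; NUM_JOINTS];
    for &(a, b) in SKELETON.iter() {
        nbr_sum[a] += h[b];
        nbr_sum[b] += h[a];
        degree[a] += 1;
        degree[b] += 1;
    }
    let mut out = [0.0f32; NUM_JOINTS];
    for i in 0..NUM_JOINTS {
        let mean_nbr = if degree[i] > 0 {
            nbr_sum[i] / degree[i] as f32
        } else {
            0.0
        };
        out[i] = (w_self * h[i] + w_nbr * mean_nbr).max(0.0); // ReLU
    }
    out
}
```

Running three rounds, as in the diagram, lets evidence from a confident joint reach joints up to three edges away, for example from a shoulder to the wrist on the same arm.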
Trained model weights (full precision)
│
▼
┌─────────────────────────────────────────────┐
│ ruvector-sparse-inference │
│ │
│ PowerInfer-style activation sparsity: │
│ - Profile neuron activation frequency │
│ - Partition into hot (always active, 20%) │
│ and cold (conditionally active, 80%) │
│ - Hot neurons: GPU/SIMD fast path │
│ - Cold neurons: sparse lookup on demand │
│ │
│ Quantization: │
│ - Backbone: INT8 (4x memory reduction) │
│ - DensePose head: FP16 (2x reduction) │
│ - ModalityTranslator: FP16 │
│ │
│ Target: <50ms inference on ESP32-S3 │
│ <10ms on x86 with AVX2 │
└─────────────────────────────────────────────┘
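The hot/cold split can be illustrated by ranking neurons by profiled activation frequency and taking the top fraction as hot. This is a sketch of the idea only; the function name and the frequency-ranking rule are assumptions and do not reflect the `ruvector-sparse-inference` API.

```rust
/// Splits neuron indices into (hot, cold) by activation frequency.
/// `hot_fraction` is the share of neurons kept on the fast path (e.g. 0.2).
pub fn partition_hot_cold(activation_freq: &[f32], hot_fraction: f32) -> (Vec<usize>, Vec<usize>) {
    let mut order: Vec<usize> = (0..activation_freq.len()).collect();
    // Sort indices by descending activation frequency.
    order.sort_by(|&a, &b| activation_freq[b].total_cmp(&activation_freq[a]));
    let hot_count = ((activation_freq.len() as f32) * hot_fraction).round() as usize;
    let hot_count = hot_count.min(order.len());
    // `order` keeps the hottest neurons; the remainder becomes the cold set.
    let cold = order.split_off(hot_count);
    (order, cold)
}
```

With ten profiled neurons and `hot_fraction = 0.2`, the two most frequently active neurons land in the hot set and the other eight are evaluated on demand.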
Primary dataset: MM-Fi (NeurIPS 2023) — 40 subjects, 27 actions, 114 subcarriers, 3 RX antennas, 17 COCO keypoints + DensePose UV annotations.
Secondary dataset: Wi-Pose — 12 subjects, 12 actions, 30 subcarriers, 3×3 antenna array, 18 keypoints.
┌──────────────────────────────────────────────────────────┐
│ Data Loading Pipeline │
│ │
│ MM-Fi .npy ──► Resample 114→56 subcarriers ──┐ │
│ (ruvector-solver NeumannSolver) │ │
│ ├──► Batch│
│ Wi-Pose .mat ──► Zero-pad 30→56 subcarriers ──┘ [B,T*│
│ ant, │
│ Phase sanitize ──► Hampel filter ──► unwrap sub] │
│ (wifi-densepose-signal::phase_sanitizer) │
│ │
│ Temporal buffer ──► ruvector-temporal-tensor │
│ (100 frames/sample, tiered quantization) │
└──────────────────────────────────────────────────────────┘
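The two subcarrier alignment steps can be sketched as follows. Plain linear interpolation stands in for the `ruvector-solver` `NeumannSolver` resampling here, and both function names are hypothetical, so treat this as an illustration of the shapes involved rather than the project's method.

```rust
/// Resamples one CSI frame to `out_len` subcarriers by linear interpolation.
/// Stand-in for the solver-based resampling described above.
pub fn resample_linear(input: &[f32], out_len: usize) -> Vec<f32> {
    if input.is_empty() || out_len == 0 {
        return Vec::new();
    }
    if input.len() == 1 || out_len == 1 {
        return vec![input[0]; out_len];
    }
    let last = input.len() - 1;
    let scale = last as f32 / (out_len - 1) as f32;
    (0..out_len)
        .map(|i| {
            let pos = i as f32 * scale;
            let lo = (pos.floor() as usize).min(last);
            let hi = (lo + 1).min(last);
            let frac = pos - lo as f32;
            input[lo] * (1.0 - frac) + input[hi] * frac
        })
        .collect()
}

/// Zero-pads a frame up to `out_len` subcarriers (never truncates).
pub fn zero_pad(input: &[f32], out_len: usize) -> Vec<f32> {
    let mut out = input.to_vec();
    out.resize(out_len.max(input.len()), 0.0);
    out
}
```

Applied per antenna and per time step, `resample_linear(frame, 56)` maps a 114-subcarrier MM-Fi frame to the common width and `zero_pad(frame, 56)` does the same for a 30-subcarrier Wi-Pose frame.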
For samples with 3D keypoints but no DensePose UV maps:
- Run Detectron2 DensePose R-CNN on paired RGB frames (one-time preprocessing step on GPU workstation)
- Generate `(part_labels [H,W], u_coords [H,W], v_coords [H,W])` pseudo-labels
- Cache as `.npy` alongside original data
- Teacher model is discarded after label generation; inference uses WiFi only
```
L_total = λ_kp    · L_keypoint   // MSE on predicted vs GT heatmaps
        + λ_part  · L_part       // Cross-entropy on 25-class body part segmentation
        + λ_uv    · L_uv         // Smooth L1 on UV coordinate regression
        + λ_xfer  · L_transfer   // MSE between CSI features and teacher visual features
        + λ_ot    · L_ot         // Optimal transport regularization (ruvector-math)
        + λ_graph · L_graph      // GNN edge consistency loss (ruvector-gnn)
```

RuVector enhancement: ruvector-math provides optimal transport (Wasserstein distance) as a regularization term. This penalizes predicted body part distributions that are far from the ground truth in the Wasserstein metric, which is more geometrically meaningful than pixel-wise cross-entropy for spatial body part segmentation.
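The six-term loss is a weighted sum, which the sketch below spells out together with the Smooth L1 term used for UV regression. The struct and function names are hypothetical, the λ values are left to the caller because this ADR does not fix them, and `beta = 1.0` in the usual Smooth L1 definition is an assumption.

```rust
/// Per-term weights (the λ values); concrete values are not specified here.
pub struct LossWeights {
    pub kp: f32,
    pub part: f32,
    pub uv: f32,
    pub xfer: f32,
    pub ot: f32,
    pub graph: f32,
}

/// Already-computed scalar value of each loss term for one batch.
pub struct LossTerms {
    pub keypoint: f32,
    pub part: f32,
    pub uv: f32,
    pub transfer: f32,
    pub ot: f32,
    pub graph: f32,
}

/// Mean Smooth L1: quadratic below `beta`, linear above it.
pub fn smooth_l1(pred: &[f32], target: &[f32], beta: f32) -> f32 {
    assert_eq!(pred.len(), target.len(), "pred and target must match in length");
    if pred.is_empty() {
        return 0.0;
    }
    let sum: f32 = pred
        .iter()
        .zip(target)
        .map(|(p, t)| {
            let d = (p - t).abs();
            if d < beta { 0.5 * d * d / beta } else { d - 0.5 * beta }
        })
        .sum();
    sum / pred.len() as f32
}

/// L_total as the weighted sum of the six terms.
pub fn total_loss(w: &LossWeights, t: &LossTerms) -> f32 {
    w.kp * t.keypoint
        + w.part * t.part
        + w.uv * t.uv
        + w.xfer * t.transfer
        + w.ot * t.ot
        + w.graph * t.graph
}
```

Setting a weight to zero disables its term, which makes it straightforward to ablate the optimal transport and graph consistency terms against the four-term baseline.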
| Parameter | Value | Rationale |
|---|---|---|
| Optimizer | AdamW | Weight decay regularization |
| Learning rate | 1e-3, cosine decay to 1e-5 | Standard for modality translation |
| Batch size | 32 | Fits in 24GB GPU VRAM |
| Epochs | 100 | With early stopping (patience=15) |
| Warmup | 5 epochs | Linear LR warmup |
| Train/val split | Subjects 1-32 / 33-40 | Subject-disjoint for generalization |
| Augmentation | Time-shift ±5 frames, amplitude noise ±2dB, antenna dropout 10% | CSI-domain augmentations |
| Hardware | Single RTX 3090 or A100 | ~8 hours on A100 |
| Checkpoint | Every epoch, keep best-by-validation-PCK | Deterministic seed |
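The learning-rate rows (linear warmup for 5 epochs, then cosine decay from 1e-3 to 1e-5) can be sketched as a single function of the epoch index. The exact warmup shape is an assumption: here epoch 0 starts at `lr_max / warmup_epochs` and the last warmup epoch reaches `lr_max`.

```rust
use std::f64::consts::PI;

/// Learning rate for a zero-based `epoch` with linear warmup and cosine decay.
pub fn learning_rate(
    epoch: usize,
    total_epochs: usize,
    warmup_epochs: usize,
    lr_max: f64,
    lr_min: f64,
) -> f64 {
    if epoch < warmup_epochs {
        // Linear ramp that reaches lr_max on the last warmup epoch.
        return lr_max * (epoch + 1) as f64 / warmup_epochs as f64;
    }
    let decay_epochs = total_epochs.saturating_sub(warmup_epochs).max(1);
    let progress = ((epoch - warmup_epochs) as f64 / decay_epochs as f64).min(1.0);
    // Cosine decay from lr_max (progress 0) down to lr_min (progress 1).
    lr_min + 0.5 * (lr_max - lr_min) * (1.0 + (PI * progress).cos())
}
```

With the table's values, `learning_rate(0, 100, 5, 1e-3, 1e-5)` gives 2e-4, the rate peaks at 1e-3 when warmup ends, and it approaches 1e-5 as the epoch index nears 100.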
| Metric | Target | Description |
|---|---|---|
| PCK@0.2 | >70% on MM-Fi val | Percentage of correct keypoints (threshold = 0.2 × torso diameter) |
| OKS mAP | >0.50 on MM-Fi val | Object Keypoint Similarity, COCO-standard |
| DensePose GPS | >0.30 on MM-Fi val | Geodesic Point Similarity for UV accuracy |
| Inference latency | <50ms per frame | On x86 with ONNX Runtime |
| Model size | <25MB (FP16) | Suitable for edge deployment |
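As a reference for the PCK row, a minimal sketch of the metric on 2D keypoints follows. The function name and the `[x, y]` representation are assumptions; the threshold follows the table's definition (0.2 × torso diameter).

```rust
/// Fraction of keypoints whose error is within `alpha * torso_diameter`.
/// `pred` and `gt` hold [x, y] coordinates for the same joints, in the same order.
pub fn pck(pred: &[[f32; 2]], gt: &[[f32; 2]], torso_diameter: f32, alpha: f32) -> f32 {
    assert_eq!(pred.len(), gt.len(), "pred and gt must have the same joint count");
    if pred.is_empty() {
        return 0.0;
    }
    let threshold = alpha * torso_diameter;
    let correct = pred
        .iter()
        .zip(gt)
        .filter(|(p, g)| {
            let dx = p[0] - g[0];
            let dy = p[1] - g[1];
            (dx * dx + dy * dy).sqrt() <= threshold
        })
        .count();
    correct as f32 / pred.len() as f32
}
```

For a torso diameter of 100 px and `alpha = 0.2`, a joint counts as correct when its prediction lies within 20 px of the ground truth.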
After offline training produces a base model, SONA enables continuous adaptation to new environments without retraining from scratch.
┌──────────────────────────────────────────────────────────┐
│ SONA Online Adaptation Loop │
│ │
│ Base model (frozen weights W) │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ LoRA Adaptation Matrices │ │
│ │ W_effective = W + α · A·B │ │
│ │ │ │
│ │ Rank r=4 for translator layers │ │
│ │ Rank r=2 for backbone layers │ │
│ │ Rank r=8 for DensePose head │ │
│ │ │ │
│ │ Total trainable params: ~50K │ │
│ │ (vs ~5M frozen base) │ │
│ └──────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ EWC++ Regularizer │ │
│ │ L = L_task + λ·Σ F_i(θ-θ*)² │ │
│ │ │ │
│ │ Prevents forgetting base model │ │
│ │ knowledge when adapting to new │ │
│ │ environment │ │
│ └──────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Adaptation triggers: │
│ • First deployment in new room │
│ • PCK drops below threshold (drift detection) │
│ • User manually initiates calibration │
│ • Furniture/layout change detected (CSI baseline shift) │
│ │
│ Adaptation data: │
│ • Self-supervised: temporal consistency loss │
│ (pose at t should be similar to t-1 for slow motion) │
│ • Semi-supervised: user confirmation of presence/count │
│ • Optional: brief camera calibration session (5 min) │
│ │
│ Convergence: 10-50 gradient steps, <5 seconds on CPU │
└──────────────────────────────────────────────────────────┘
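The two formulas in the loop above, `W_effective = W + α · A·B` and the EWC penalty `λ·Σ F_i(θ-θ*)²`, can be written out on plain vectors as a sketch. The nested `Vec` matrices and all function names are illustrative assumptions; `ruvector-sona` is not used here and its API may look quite different.

```rust
pub type Matrix = Vec<Vec<f32>>;

/// Naive matrix product of `a` (m×r) and `b` (r×n).
pub fn matmul(a: &Matrix, b: &Matrix) -> Matrix {
    let n = b.first().map_or(0, |row| row.len());
    a.iter()
        .map(|a_row| {
            (0..n)
                .map(|j| a_row.iter().zip(b).map(|(x, b_row)| x * b_row[j]).sum::<f32>())
                .collect()
        })
        .collect()
}

/// W_effective = W + alpha * (A · B), with W frozen and only A, B trained.
pub fn lora_effective(w: &Matrix, a: &Matrix, b: &Matrix, alpha: f32) -> Matrix {
    let delta = matmul(a, b);
    w.iter()
        .zip(&delta)
        .map(|(w_row, d_row)| w_row.iter().zip(d_row).map(|(x, d)| x + alpha * d).collect())
        .collect()
}

/// EWC penalty: lambda * sum_i F_i * (theta_i - theta_star_i)^2.
pub fn ewc_penalty(theta: &[f32], theta_star: &[f32], fisher: &[f32], lambda: f32) -> f32 {
    lambda
        * theta
            .iter()
            .zip(theta_star)
            .zip(fisher)
            .map(|((t, s), f)| f * (t - s).powi(2))
            .sum::<f32>()
}

/// Trainable parameters added by one rank-`rank` adapter on a d_out×d_in layer.
pub fn lora_param_count(d_out: usize, d_in: usize, rank: usize) -> usize {
    rank * (d_out + d_in)
}
```

A rank-4 adapter on a 256×256 layer adds 4 × (256 + 256) = 2048 trainable parameters, which shows how a handful of adapted layers can stay in the tens of thousands of parameters.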
ESP32 CSI (UDP :5005)
│
▼
Rust Axum server (port 8080)
│
├─► RuVector signal preprocessing (Stage 1)
│ 5 crates, ~2ms per frame
│
├─► ONNX Runtime inference (Stage 2)
│ Quantized model, ~10ms per frame
│ OR ruvector-sparse-inference, ~8ms per frame
│
├─► GNN post-processing (ruvector-gnn)
│ Anatomical constraint enforcement, ~1ms
│
├─► SONA adaptation check (Stage 4)
│ <0.05ms per frame (gradient accumulation only)
│
└─► Output: DensePose results
│
├──► /api/v1/stream/pose (WebSocket, 17 keypoints)
├──► /api/v1/pose/current (REST, full DensePose)
└──► /ws/sensing (WebSocket, raw + processed)
Total inference budget: <15ms per frame at 20 Hz on x86, <50ms on ESP32-S3 (with sparse inference).
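The per-stage figures in the diagram can be summed to check the x86 budget. The constants below copy the nominal numbers from the diagram (using the ONNX Runtime path and the upper bound for the SONA check); the names are hypothetical.

```rust
/// Nominal per-frame stage costs in milliseconds, taken from the diagram.
pub const STAGE_BUDGET_MS: [(&str, f64); 4] = [
    ("ruvector signal preprocessing", 2.0),
    ("onnx runtime inference", 10.0),
    ("gnn post-processing", 1.0),
    ("sona adaptation check", 0.05),
];

/// Sum of all stage costs for one frame.
pub fn total_budget_ms() -> f64 {
    STAGE_BUDGET_MS.iter().map(|(_, ms)| ms).sum()
}

/// Time available per frame at a given frame rate.
pub fn frame_period_ms(rate_hz: f64) -> f64 {
    1000.0 / rate_hz
}
```

The stages add up to 13.05 ms, inside the 15 ms target, and well within the 50 ms frame period at 20 Hz.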
The trained model is packaged as a single .rvf file that contains everything needed for
inference — no external weight files, no ONNX runtime, no Python dependencies.
wifi-densepose-v1.rvf (single file, ~15-30 MB)
┌───────────────────────────────────────────────────────────────┐
│ SEGMENT 0: Manifest (0x05) │
│ ├── Model ID: "wifi-densepose-v1.0" │
│ ├── Training dataset: "mmfi-v1+wipose-v1" │
│ ├── Training config hash: SHA-256 │
│ ├── Target hardware: x86_64, aarch64, wasm32 │
│ ├── Segment directory (offsets to all segments) │
│ └── Level-1 TLV manifest with metadata tags │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 1: Vec (0x01) — Model Weight Embeddings │
│ ├── ModalityTranslator weights [64→128→256→3, Conv1D+ConvT] │
│ ├── ResNet18 backbone weights [3→64→128→256, residual blocks] │
│ ├── KeypointHead weights [256→17, deconv layers] │
│ ├── DensePoseHead weights [256→25+48, deconv layers] │
│ ├── GNN body graph weights [3 message-passing rounds] │
│ └── Graph transformer attention weights [proof-gated layers] │
│ Format: flat f32 vectors, 768-dim per weight tensor │
│ Total: ~5M parameters → ~20MB f32, ~10MB f16, ~5MB INT8 │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 2: Index (0x02) — HNSW Embedding Index │
│ ├── Layer A: Entry points + coarse routing centroids │
│ │ (loaded first, <5ms, enables approximate search) │
│ ├── Layer B: Hot region adjacency for frequently │
│ │ accessed weight clusters (100ms load) │
│ └── Layer C: Full adjacency graph for exact nearest │
│ neighbor lookup across all weight partitions │
│ Use: Fast weight lookup for sparse inference — │
│ only load hot neurons, skip cold neurons via HNSW routing │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 3: Overlay (0x03) — Dynamic Min-Cut Graph │
│ ├── Subcarrier partition graph (sensitive vs insensitive) │
│ ├── Min-cut witnesses from ruvector-mincut │
│ ├── Antenna topology graph (ESP32 mesh spatial layout) │
│ └── Body skeleton graph (17 COCO joints, 16 edges) │
│ Use: Pre-computed graph structures loaded at init time. │
│ Dynamic updates via ruvector-mincut insert/delete_edge │
│ as environment changes (furniture moves, new obstacles) │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 4: Quant (0x06) — Quantization Codebooks │
│ ├── INT8 codebook for backbone (4x memory reduction) │
│ ├── FP16 scale factors for translator + heads │
│ ├── Binary quantization tables for SIMD distance compute │
│ └── Per-layer calibration statistics (min, max, zero-point) │
│ Use: rvf-quant temperature-tiered quantization — │
│ hot layers stay f16, warm layers u8, cold layers binary │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 5: Witness (0x0A) — Training Proof Chain │
│ ├── Deterministic training proof (seed, loss curve, hash) │
│ ├── Dataset provenance (MM-Fi commit hash, download URL) │
│ ├── Validation metrics (PCK@0.2, OKS mAP, GPS scores) │
│ ├── Ed25519 signature over weight hash │
│ └── Attestation: training hardware, duration, config │
│ Use: Verifiable proof that model weights match a specific │
│ training run. Anyone can re-run training with same seed │
│ and verify the weight hash matches the witness. │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 6: Meta (0x07) — Model Metadata │
│ ├── COCO keypoint names and skeleton connectivity │
│ ├── DensePose body part labels (24 parts + background) │
│ ├── UV coordinate range and resolution │
│ ├── Input normalization statistics (mean, std per subcarrier)│
│ ├── RuVector crate versions used during training │
│ └── Environment calibration profiles (named, per-room) │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 7: AggregateWeights (0x36) — SONA LoRA Deltas │
│ ├── Per-environment LoRA adaptation matrices (A, B per layer)│
│ ├── EWC++ Fisher information diagonal │
│ ├── Optimal θ* reference parameters │
│ ├── Adaptation round count and convergence metrics │
│ └── Named profiles: "lab-a", "living-room", "office-3f" │
│ Use: Multiple environment adaptations stored in one file. │
│ Server loads the matching profile or creates a new one. │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 8: Profile (0x0B) — RVDNA Domain Profile │
│ ├── Domain: "wifi-csi-densepose" │
│ ├── Input spec: [B, T*ant, sub] CSI tensor format │
│ ├── Output spec: keypoints [B,17,H,W], parts [B,25,H,W], │
│ │ UV [B,48,H,W], confidence [B,1] │
│ ├── Hardware requirements: min RAM, recommended GPU │
│ └── Supported data sources: esp32, wifi-rssi, simulation │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 9: Crypto (0x0C) — Signature and Keys │
│ ├── Ed25519 public key for model publisher │
│ ├── Signature over all segment content hashes │
│ └── Certificate chain (optional, for enterprise deployment) │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 10: Wasm (0x10) — Self-Bootstrapping Runtime │
│ ├── Compiled WASM inference engine │
│ │ (ruvector-sparse-inference-wasm) │
│ ├── WASM microkernel for RVF segment parsing │
│ └── Browser-compatible: load .rvf → run inference in-browser │
│ Use: The .rvf file is fully self-contained — a WASM host │
│ can execute inference without any external dependencies. │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 11: Dashboard (0x11) — Embedded Visualization │
│ ├── Three.js-based pose visualization (HTML/JS/CSS) │
│ ├── Gaussian splat renderer for signal field │
│ └── Served at http://localhost:8080/ when model is loaded │
│ Use: Open the .rvf file → get a working UI with no install │
└───────────────────────────────────────────────────────────────┘
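The segment discriminators used in this layout can be collected into an enum, shown here as a sketch. The numeric values are the ones listed in the diagram; the enum itself, the error type, and the `weight_bytes` helper are illustrative and may differ from the real `rvf-types` definitions.

```rust
use std::convert::TryFrom;

/// Segment type discriminators as listed in the layout above.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SegmentType {
    Vec = 0x01,
    Index = 0x02,
    Overlay = 0x03,
    Manifest = 0x05,
    Quant = 0x06,
    Meta = 0x07,
    Witness = 0x0A,
    Profile = 0x0B,
    Crypto = 0x0C,
    Wasm = 0x10,
    Dashboard = 0x11,
    AggregateWeights = 0x36,
}

impl TryFrom<u8> for SegmentType {
    /// The unrecognized byte is returned so callers can skip unknown segments.
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        Ok(match value {
            0x01 => SegmentType::Vec,
            0x02 => SegmentType::Index,
            0x03 => SegmentType::Overlay,
            0x05 => SegmentType::Manifest,
            0x06 => SegmentType::Quant,
            0x07 => SegmentType::Meta,
            0x0A => SegmentType::Witness,
            0x0B => SegmentType::Profile,
            0x0C => SegmentType::Crypto,
            0x10 => SegmentType::Wasm,
            0x11 => SegmentType::Dashboard,
            0x36 => SegmentType::AggregateWeights,
            other => return Err(other),
        })
    }
}

/// Raw weight payload size, e.g. 5M parameters at 4 bytes each = 20 MB.
pub fn weight_bytes(num_params: u64, bytes_per_param: u64) -> u64 {
    num_params * bytes_per_param
}
```

The same arithmetic gives the sizes quoted for Segment 1: 5M parameters occupy 20 MB at f32, 10 MB at f16, and 5 MB at INT8, before headers and codebooks.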
1. Read tail → find_latest_manifest() → SegmentDirectory
2. Load Manifest (seg 0) → validate magic, version, model ID
3. Load Profile (seg 8) → verify input/output spec compatibility
4. Load Crypto (seg 9) → verify Ed25519 signature chain
5. Load Quant (seg 4) → prepare quantization codebooks
6. Load Index Layer A (seg 2) → entry points ready (<5ms)
↓ (inference available at reduced accuracy)
7. Load Vec (seg 1) → hot weight partitions via Layer A routing
8. Load Index Layer B (seg 2) → hot adjacency ready (100ms)
↓ (inference at full accuracy for common poses)
9. Load Overlay (seg 3) → min-cut graphs, body skeleton
10. Load AggregateWeights (seg 7) → apply matching SONA profile
11. Load Index Layer C (seg 2) → complete graph loaded
↓ (full inference with all weight partitions)
12. Load Wasm (seg 10) → WASM runtime available (optional)
13. Load Dashboard (seg 11) → UI served (optional)
Progressive availability: Inference begins after step 6 (~5ms) with approximate results. Full accuracy is reached by step 9 (~500ms). This enables instant startup with gradually improving quality — critical for real-time applications.
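The availability checkpoints annotated in the loading sequence (after steps 6, 8, and 11) can be modeled as a small state function. The `Readiness` enum and the function name are hypothetical; the step boundaries follow the arrows in the numbered sequence.

```rust
/// Inference quality available after a given loading step has completed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Readiness {
    /// Steps 1-5: manifest, profile, crypto, and quantization tables only.
    NotReady,
    /// After step 6: Index Layer A loaded, approximate results.
    Approximate,
    /// After step 8: hot weights and Layer B loaded, full accuracy for common poses.
    FullCommonPoses,
    /// After step 11: Layer C loaded, all weight partitions available.
    Complete,
}

/// Maps the last completed step (1-13) to the available inference quality.
pub fn readiness_after_step(step: u8) -> Readiness {
    match step {
        0..=5 => Readiness::NotReady,
        6..=7 => Readiness::Approximate,
        8..=10 => Readiness::FullCommonPoses,
        _ => Readiness::Complete,
    }
}
```

A server could expose this state on a health endpoint so clients know whether the poses they receive come from the approximate or the complete model.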
After training completes, the model is packaged into an .rvf file:
```bash
# Build the RVF container from trained checkpoint
cargo run -p wifi-densepose-train --bin build-rvf -- \
  --checkpoint checkpoints/best-pck.pt \
  --quantize int8,fp16 \
  --hnsw-build \
  --sign --key model-signing-key.pem \
  --include-wasm \
  --include-dashboard ../../ui \
  --output wifi-densepose-v1.rvf

# Verify the built container
cargo run -p wifi-densepose-train --bin verify-rvf -- \
  --input wifi-densepose-v1.rvf \
  --verify-signature \
  --verify-witness \
  --benchmark-inference
```

The sensing server loads the .rvf container at startup:
```bash
# Load model from RVF container
./target/release/sensing-server \
  --model wifi-densepose-v1.rvf \
  --source auto \
  --ui-from-rvf  # serve Dashboard segment instead of --ui-path
```

```rust
// In sensing-server/src/main.rs
// Note: the rvf_* calls below follow the API proposed in this ADR and may change.
use std::sync::Arc;

use rvf_index::layers::IndexLayer;
use rvf_quant::QuantizedVec;
use rvf_runtime::RvfContainer;

// Shared handle so the background loader and the request path can both use it
let container = Arc::new(RvfContainer::open("wifi-densepose-v1.rvf")?);

// Progressive load: Layer A first for instant startup
let index = container.load_index(IndexLayer::A)?;
let weights = container.load_vec_hot(&index)?; // hot partitions only

// Full load in background, kept off the async executor because it is blocking I/O
let background = Arc::clone(&container);
tokio::task::spawn_blocking(move || -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    background.load_index(IndexLayer::B)?;
    background.load_index(IndexLayer::C)?;
    background.load_vec_cold()?; // remaining partitions
    Ok(())
});

// SONA environment adaptation
let sona_deltas = container.load_aggregate_weights("office-3f")?;
model.apply_lora_deltas(&sona_deltas);

// Serve embedded dashboard
let dashboard = container.load_dashboard()?;
// Mount at /ui/* routes in Axum
```

- Implement `MmFiDataset` in `wifi-densepose-train/src/dataset.rs`
- Read MM-Fi `.npy` files with antenna correction (1TX/3RX → 3×3 zero-padding)
- Subcarrier resampling 114→56 via `ruvector-solver::NeumannSolver`
- Phase sanitization via `wifi-densepose-signal::phase_sanitizer`
- Implement `WiPoseDataset` for secondary dataset
- Temporal windowing with `ruvector-temporal-tensor`
- Deliverable: `cargo test -p wifi-densepose-train` with dataset loading tests

- Add `ruvector-graph-transformer` dependency to `wifi-densepose-train`
- Replace bottleneck self-attention in `ModalityTranslator` with proof-gated graph transformer
- Build antenna topology graph (nodes = antenna pairs, edges = spatial/temporal proximity)
- Add `ruvector-gnn` dependency for body graph reasoning
- Build COCO body skeleton graph (17 nodes, 16 anatomical edges)
- Implement GNN message passing in spatial decoder
- Deliverable: Model forward pass produces correct output shapes with graph layers

- Python script using Detectron2 DensePose to generate UV pseudo-labels from MM-Fi RGB frames
- Cache labels as `.npy` for Rust loader consumption
- Validate label quality on a random subset (visual inspection)
- Deliverable: Complete UV label set for MM-Fi training split

- Implement `WiFiDensePoseTrainer` with full loss function (6 terms)
- Add `ruvector-math` optimal transport loss term
- Integrate GNN edge consistency loss
- Training loop with cosine LR schedule, early stopping, checkpointing
- Validation metrics: PCK@0.2, OKS mAP, DensePose GPS
- Deterministic proof verification (`proof.rs`) with weight hash
- Deliverable: Trained model checkpoint achieving PCK@0.2 >70% on MM-Fi validation

- Integrate `ruvector-sona` into inference pipeline
- Implement LoRA injection at translator, backbone, and DensePose head layers
- Implement EWC++ Fisher information computation and regularization
- Self-supervised temporal consistency loss for unsupervised adaptation
- Calibration mode: 5-minute camera session for supervised fine-tuning
- Drift detection: monitor rolling PCK on temporal consistency proxy
- Deliverable: Adaptation converges in <50 gradient steps, PCK recovers within 10% of base
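The self-supervised temporal consistency term can be sketched as the mean squared displacement of keypoints between consecutive frames. This is one simple form of such a loss, assumed for illustration; the function name and the 2D keypoint representation are hypothetical.

```rust
/// Mean squared displacement between keypoints at frame t-1 (`prev`) and t (`curr`).
/// Small values mean the predicted pose moves smoothly over time.
pub fn temporal_consistency_loss(prev: &[[f32; 2]], curr: &[[f32; 2]]) -> f32 {
    assert_eq!(prev.len(), curr.len(), "frames must have the same joint count");
    if prev.is_empty() {
        return 0.0;
    }
    let sum: f32 = prev
        .iter()
        .zip(curr)
        .map(|(p, c)| {
            let dx = c[0] - p[0];
            let dy = c[1] - p[1];
            dx * dx + dy * dy
        })
        .sum();
    sum / prev.len() as f32
}
```

Because the loss needs no labels, it can run continuously; a sustained rise in its rolling average is one candidate proxy for the drift detection described above.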

- Profile neuron activation frequencies on validation set
- Apply `ruvector-sparse-inference` hot/cold neuron partitioning
- INT8 quantization for backbone, FP16 for heads
- ONNX export with quantized weights
- Benchmark on x86 (target: <10ms) and ARM (target: <50ms)
- WASM export via `ruvector-sparse-inference-wasm` for browser inference
- Deliverable: Quantized ONNX model, benchmark results, WASM binary
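The INT8 step can be illustrated with symmetric per-tensor quantization, sketched below. This is a generic scheme chosen for brevity; it is an assumption that the backbone would use something similar, and the calibration statistics mentioned elsewhere (min, max, zero-point) suggest the real pipeline may use an asymmetric variant.

```rust
/// Symmetric per-tensor INT8 quantization: q = round(x / scale), scale = max|x| / 127.
/// Returns the quantized values and the scale needed to dequantize them.
pub fn quantize_int8(values: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // Avoid dividing by zero for an all-zero tensor.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let quantized = values
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (quantized, scale)
}

/// Reconstructs approximate f32 values from INT8 codes.
pub fn dequantize_int8(quantized: &[i8], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}
```

Each weight shrinks from 4 bytes to 1, which is the 4x reduction quoted for the backbone, at the cost of a rounding error of at most half the scale per weight.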

- Implement `build-rvf` binary in `wifi-densepose-train`
- Serialize trained weights into `Vec` segment (SegmentType::Vec, 0x01)
- Build HNSW index over weight partitions for sparse inference (SegmentType::Index, 0x02)
- Serialize min-cut graph overlays: subcarrier partition, antenna topology, body skeleton (SegmentType::Overlay, 0x03)
- Generate quantization codebooks via `rvf-quant` (SegmentType::Quant, 0x06)
- Write training proof witness with Ed25519 signature (SegmentType::Witness, 0x0A)
- Store model metadata, COCO keypoint schema, normalization stats (SegmentType::Meta, 0x07)
- Store SONA LoRA adaptation deltas per environment (SegmentType::AggregateWeights, 0x36)
- Write RVDNA domain profile for WiFi CSI DensePose (SegmentType::Profile, 0x0B)
- Optionally embed WASM inference runtime (SegmentType::Wasm, 0x10)
- Optionally embed Three.js dashboard (SegmentType::Dashboard, 0x11)
- Build Level-1 manifest and segment directory (SegmentType::Manifest, 0x05)
- Implement `verify-rvf` binary for container validation
- Deliverable: `wifi-densepose-v1.rvf` single-file container, verifiable and self-contained

- Load `.rvf` container in `wifi-densepose-sensing-server` via `rvf-runtime`
- Progressive loading: Layer A first for instant startup, full graph in background
- Replace `derive_pose_from_sensing()` heuristic with trained model inference
- Add `--model` CLI flag accepting `.rvf` path (or legacy `.onnx`)
- Apply SONA LoRA deltas from `AggregateWeights` segment based on `--env` flag
- Serve embedded Dashboard segment at `/ui/*` when `--ui-from-rvf` is set
- Graceful fallback to heuristic when no model file present
- Update WebSocket protocol to include DensePose UV data
- Deliverable: Sensing server serves trained model from single `.rvf` file
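The `--model` handling and the graceful fallback can be sketched as a small selection function. The `PoseBackend` enum and `select_backend` are hypothetical names; the existence check is passed in as a closure so the logic can be exercised without touching the filesystem.

```rust
use std::path::Path;

/// Which pose source the server should use at startup.
#[derive(Debug, PartialEq, Eq)]
pub enum PoseBackend {
    /// Trained model loaded from a single .rvf container.
    RvfContainer,
    /// Trained model loaded from a legacy .onnx file.
    LegacyOnnx,
    /// Existing hand-crafted heuristic (no usable model file).
    Heuristic,
}

/// Picks a backend from the optional `--model` path.
/// `exists` reports whether a path is present, e.g. `|p| p.exists()`.
pub fn select_backend(model: Option<&Path>, exists: impl Fn(&Path) -> bool) -> PoseBackend {
    match model {
        Some(path) if exists(path) => match path.extension().and_then(|e| e.to_str()) {
            Some("rvf") => PoseBackend::RvfContainer,
            Some("onnx") => PoseBackend::LegacyOnnx,
            // Unknown extension: keep the server running on the heuristic.
            _ => PoseBackend::Heuristic,
        },
        // No flag given, or the file is missing.
        _ => PoseBackend::Heuristic,
    }
}
```

In the server this would be called as `select_backend(cli_model_path, |p| p.exists())`, so a missing or unrecognized model file degrades to the current heuristic instead of failing at startup.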
| File | Purpose |
|---|---|
| `rust-port/.../wifi-densepose-train/src/dataset_mmfi.rs` | MM-Fi dataset loader with subcarrier resampling |
| `rust-port/.../wifi-densepose-train/src/dataset_wipose.rs` | Wi-Pose dataset loader |
| `rust-port/.../wifi-densepose-train/src/graph_transformer.rs` | Graph transformer integration |
| `rust-port/.../wifi-densepose-train/src/body_gnn.rs` | GNN body graph reasoning |
| `rust-port/.../wifi-densepose-train/src/adaptation.rs` | SONA LoRA + EWC++ adaptation |
| `rust-port/.../wifi-densepose-train/src/trainer.rs` | Training loop with multi-term loss |
| `scripts/generate_densepose_labels.py` | Teacher-student UV label generation |
| `scripts/benchmark_inference.py` | Inference latency benchmarking |
| `rust-port/.../wifi-densepose-train/src/rvf_builder.rs` | RVF container build pipeline |
| `rust-port/.../wifi-densepose-train/src/bin/build_rvf.rs` | CLI binary for building `.rvf` containers |
| `rust-port/.../wifi-densepose-train/src/bin/verify_rvf.rs` | CLI binary for verifying `.rvf` containers |
| File | Change |
|---|---|
| `rust-port/.../wifi-densepose-train/Cargo.toml` | Add ruvector-gnn, graph-transformer, sona, sparse-inference, math, rvf-types, rvf-wire, rvf-manifest, rvf-index, rvf-quant, rvf-crypto, rvf-runtime deps |
| `rust-port/.../wifi-densepose-train/src/model.rs` | Integrate graph transformer + GNN layers |
| `rust-port/.../wifi-densepose-train/src/losses.rs` | Add optimal transport + GNN edge consistency loss terms |
| `rust-port/.../wifi-densepose-train/src/config.rs` | Add training hyperparameters for new components |
| `rust-port/.../sensing-server/Cargo.toml` | Add rvf-runtime, rvf-types, rvf-index, rvf-quant deps |
| `rust-port/.../sensing-server/src/main.rs` | Add `--model` flag, load `.rvf` container, progressive startup, serve embedded dashboard |
- Trained model produces accurate DensePose: Moves from heuristic keypoints to learned body surface estimation backed by public dataset evaluation
- RuVector signal intelligence is a differentiator: Graph transformers on antenna topology and GNN body reasoning are novel; to our knowledge, no prior WiFi pose system uses these techniques
- SONA enables fast deployment in new environments: New environments don't require full retraining; LoRA adaptation converges in <50 gradient steps, which takes seconds
- Sparse inference enables edge deployment: PowerInfer-style neuron partitioning brings DensePose inference to ESP32-class hardware
- Graceful degradation: Server falls back to heuristic pose when no model file is present — existing functionality is preserved
- Single-file deployment via RVF: Trained model, embeddings, HNSW index, quantization codebooks, SONA adaptation profiles, WASM runtime, and dashboard UI packaged in one `.rvf` file; deploy by copying a single file
- Progressive loading: RVF Layer A loads in <5ms for instant startup; full accuracy reached in ~500ms as remaining segments load
- Verifiable provenance: RVF Witness segment contains deterministic training proof with Ed25519 signature — anyone can re-run training and verify weight hash
- Self-bootstrapping: RVF Wasm segment enables browser-based inference with no server-side dependencies
- Open evaluation: PCK, OKS, GPS metrics on public MM-Fi dataset provide reproducible, comparable results
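
The graceful-degradation behaviour listed above can be sketched as a small selection function. This is a minimal sketch under stated assumptions, not the sensing server's actual code: `PoseSource` and `select_pose_source` are hypothetical names, and the only check made is that the path given via `--model` points at an existing file. No RVF parsing or signature verification is attempted here.

```rust
use std::path::{Path, PathBuf};

/// Which pose backend the server should run (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum PoseSource {
    /// A trained model packaged in an `.rvf` container at this path.
    Trained(PathBuf),
    /// The existing hand-crafted heuristic pose derivation.
    Heuristic,
}

/// Pick the pose backend from the optional `--model` argument.
/// Falls back to the heuristic when no path is given or the file is missing.
pub fn select_pose_source(model_arg: Option<&Path>) -> PoseSource {
    match model_arg {
        Some(path) if path.is_file() => PoseSource::Trained(path.to_path_buf()),
        _ => PoseSource::Heuristic,
    }
}
```

If the flag were stored as an `Option<PathBuf>` (an assumption about the argument struct), a caller could pass it as `select_pose_source(model.as_deref())`.
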
- Training requires GPU: Initial model training needs RTX 3090 or better (~8 hours on A100). Not all developers will have access.
- Teacher-student label generation requires Detectron2: One-time Python + CUDA dependency for generating UV pseudo-labels from RGB frames
- MM-Fi CC BY-NC license: Weights trained on MM-Fi cannot be used commercially without collecting proprietary data
- Environment-specific adaptation still required: SONA reduces the burden but a brief calibration session in each new environment is still recommended for best accuracy
- 6 additional RuVector crate dependencies: Increases compile time and binary size. Mitigated by feature flags (e.g., `--features trained-model`).
- Model size on disk: ~25MB (FP16) or ~12MB (INT8). Acceptable for server deployment, may need further pruning for WASM.
| Risk | Mitigation |
|---|---|
| MM-Fi 114→56 interpolation loses accuracy | Train at native 114 subcarriers as an alternative; the ESP32 mesh can collect 56-subcarrier data natively |
| GNN overfits to training body types | Augment with diverse body proportions; Wi-Pose adds subject diversity |
| SONA adaptation diverges in adversarial environments | EWC++ regularization caps parameter drift; roll back to base weights when divergence is detected |
| Sparse inference degrades accuracy | Benchmark INT8 vs FP16 vs FP32; fall back to full precision if quality drops |
| Training proof hash changes with RuVector version updates | Pin ruvector crate versions in Cargo.toml; regenerate hash on version bumps |
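
The drift-capping mitigation in the table can be illustrated with the quadratic penalty from the original EWC formulation (Kirkpatrick et al.), which EWC++ builds on. This is a sketch under stated assumptions: the function names are hypothetical, the online Fisher update that distinguishes EWC++ is not shown, and using an L2 drift threshold as the divergence detector is an illustrative choice rather than the pipeline's documented criterion.

```rust
/// EWC quadratic penalty: (lambda / 2) * sum_i F_i * (theta_i - theta_star_i)^2.
/// `theta_star` are the base weights, `fisher` the diagonal Fisher estimates.
pub fn ewc_penalty(theta: &[f32], theta_star: &[f32], fisher: &[f32], lambda: f32) -> f32 {
    assert_eq!(theta.len(), theta_star.len());
    assert_eq!(theta.len(), fisher.len());
    let sum: f32 = theta
        .iter()
        .zip(theta_star)
        .zip(fisher)
        .map(|((t, s), f)| f * (t - s).powi(2))
        .sum();
    0.5 * lambda * sum
}

/// Restore the base weights if the adapted weights drifted further than
/// `max_drift` (L2 norm). Returns true when a rollback happened.
/// The L2 threshold is an assumed stand-in for divergence detection.
pub fn rollback_if_diverged(theta: &mut [f32], theta_star: &[f32], max_drift: f32) -> bool {
    assert_eq!(theta.len(), theta_star.len());
    let drift = theta
        .iter()
        .zip(theta_star)
        .map(|(t, s)| (t - s).powi(2))
        .sum::<f32>()
        .sqrt();
    if drift > max_drift {
        theta.copy_from_slice(theta_star);
        true
    } else {
        false
    }
}
```

Weights with a large Fisher value are penalized more heavily for moving, which is what keeps adaptation close to the base model on the parameters that mattered most during pre-training.
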
- Geng et al., "DensePose From WiFi" (CMU, arXiv:2301.00250, 2023)
- Yang et al., "MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset" (NeurIPS 2023, arXiv:2305.10345)
- Hu et al., "LoRA: Low-Rank Adaptation of Large Language Models" (ICLR 2022)
- Kirkpatrick et al., "Overcoming Catastrophic Forgetting in Neural Networks" (PNAS, 2017)
- Song et al., "PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU" (2024)
- ADR-005: SONA Self-Learning for Pose Estimation
- ADR-015: Public Dataset Strategy for Trained Pose Estimation Model
- ADR-016: RuVector Integration for Training Pipeline
- ADR-020: Migrate AI/Model Inference to Rust with RuVector and ONNX Runtime
ruQu ("Classical nervous system for quantum machines") provides real-time coherence
assessment via dynamic min-cut. While primarily designed for quantum error correction
(syndrome decoding, surface code arbitration), its core primitive — the CoherenceGate —
is architecturally relevant to WiFi CSI processing:

- CoherenceGate uses `ruvector-mincut` to make real-time gate/pass decisions on signal streams based on structural coherence thresholds. In quantum computing, this gates qubit syndrome streams. For WiFi CSI, the same mechanism could gate CSI subcarrier streams, passing only subcarriers whose coherence (phase stability across antennas) exceeds a dynamic threshold.
- Syndrome filtering (`filters.rs`) implements Kalman-like adaptive filters that could be repurposed for CSI noise filtering, treating each subcarrier's amplitude drift as a "syndrome" stream.
- Min-cut gated transformer integration (optional feature) provides coherence-optimized attention with 50% FLOP reduction, directly applicable to the `ModalityTranslator` bottleneck.

Decision: ruQu is not included in the initial pipeline (Phase 1-8) but is marked as a
Phase 9 exploration candidate for coherence-gated CSI filtering. The CoherenceGate
primitive maps naturally to subcarrier quality assessment, and the integration path is
clean since ruQu already depends on ruvector-mincut.
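
To make the subcarrier-gating idea concrete, the sketch below scores each subcarrier by how well its phases agree across antennas and keeps only those above a threshold. It is a simplified stand-in, not ruQu's API: it uses the mean resultant length of unit phasors instead of a dynamic min-cut, the threshold is supplied by the caller instead of being adapted at runtime, and `phase_coherence` and `gate_subcarriers` are hypothetical names.

```rust
/// Phase coherence of one subcarrier across antennas: the mean resultant
/// length of the unit phasors, in [0, 1]. 1.0 means all antennas agree.
pub fn phase_coherence(phases_rad: &[f32]) -> f32 {
    if phases_rad.is_empty() {
        return 0.0;
    }
    let n = phases_rad.len() as f32;
    let (sum_cos, sum_sin) = phases_rad
        .iter()
        .fold((0.0f32, 0.0f32), |(c, s), p| (c + p.cos(), s + p.sin()));
    (sum_cos / n).hypot(sum_sin / n)
}

/// Return the indices of subcarriers whose coherence meets `threshold`.
/// `csi_phase[k]` holds the per-antenna phases (radians) of subcarrier `k`.
pub fn gate_subcarriers(csi_phase: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    csi_phase
        .iter()
        .enumerate()
        .filter(|(_, phases)| phase_coherence(phases) >= threshold)
        .map(|(k, _)| k)
        .collect()
}
```

A Phase 9 prototype could compare this baseline against the min-cut based CoherenceGate to check whether the structural approach actually selects better subcarriers.
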
The pipeline supports three data sources for training, used in combination:
| Source | Subcarriers | Pose Labels | Volume | Cost | When |
|---|---|---|---|---|---|
| MM-Fi (public) | 114 → 56 (interpolated) | 17 COCO + DensePose UV | 40 subjects, 320K frames | Free (CC BY-NC) | Phase 1 — bootstrap |
| Wi-Pose (public) | 30 → 56 (zero-padded) | 18 keypoints | 12 subjects, 166K packets | Free (research) | Phase 1 — diversity |
| ESP32 self-collected | 56 (native) | Teacher-student from camera | Unlimited, environment-specific | Hardware only ($54) | Phase 4+ — fine-tuning |
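
The subcarrier alignment in the table (114 → 56 by interpolation, 30 → 56 by zero-padding) can be sketched as follows. This is an illustrative sketch, not the training crate's code: the function names are hypothetical, linear interpolation is assumed as the interpolation method, and the zeros are assumed to be appended at the end of the frame.

```rust
/// Linearly resample one CSI amplitude frame to `target` subcarriers,
/// keeping the first and last subcarriers aligned.
pub fn resample_linear(src: &[f32], target: usize) -> Vec<f32> {
    if src.is_empty() || target == 0 {
        return Vec::new();
    }
    if src.len() == 1 || target == 1 {
        return vec![src[0]; target];
    }
    // Map output index i onto a fractional position in the source frame.
    let scale = (src.len() - 1) as f32 / (target - 1) as f32;
    (0..target)
        .map(|i| {
            let pos = i as f32 * scale;
            let lo = pos.floor() as usize;
            let hi = (lo + 1).min(src.len() - 1);
            let frac = pos - lo as f32;
            src[lo] * (1.0 - frac) + src[hi] * frac
        })
        .collect()
}

/// Zero-pad (or truncate) a frame to exactly `target` subcarriers.
/// Trailing padding is an assumption; the source does not say where zeros go.
pub fn zero_pad(src: &[f32], target: usize) -> Vec<f32> {
    let mut out = src.to_vec();
    out.resize(target, 0.0);
    out
}
```

With these helpers, an MM-Fi frame would go through `resample_linear(frame, 56)` and a Wi-Pose frame through `zero_pad(frame, 56)`, so that all three sources share the ESP32's native 56-subcarrier layout.
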
Recommended approach: Both public + ESP32 data.

- Pre-train on MM-Fi + Wi-Pose (public data, Phase 1-4): Provides the base model with diverse subjects and actions. The 114→56 subcarrier interpolation is acceptable for learning general CSI-to-pose mappings.
- Fine-tune on ESP32 self-collected data (Phase 5+, SONA adaptation): Collect 5-30 minutes of paired ESP32 CSI + camera data in each target environment. The camera serves as the teacher model (Detectron2 generates pseudo-labels). SONA LoRA adaptation takes <50 gradient steps to converge.
- Continuous adaptation (runtime): SONA's self-supervised temporal consistency loss refines the model without any camera, using the assumption that poses change smoothly over short time windows.

This three-tier strategy gives you:
- A working model from day one (public data)
- Environment-specific accuracy (ESP32 fine-tuning)
- Ongoing drift correction (SONA runtime adaptation)
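
The smoothness assumption behind runtime adaptation can be written as a loss over consecutive pose predictions. The sketch below is one plausible form, not SONA's actual loss: it assumes 2D keypoints, penalizes the mean squared displacement between adjacent frames, and uses the hypothetical names `Pose` and `temporal_consistency_loss`.

```rust
/// One pose frame: 17 COCO keypoints as (x, y).
pub type Pose = [[f32; 2]; 17];

/// Mean squared keypoint displacement between consecutive frames.
/// Returns 0.0 when fewer than two frames are available.
pub fn temporal_consistency_loss(frames: &[Pose]) -> f32 {
    if frames.len() < 2 {
        return 0.0;
    }
    let mut total = 0.0f32;
    for pair in frames.windows(2) {
        for (a, b) in pair[0].iter().zip(pair[1].iter()) {
            total += (a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2);
        }
    }
    // Normalize by the number of frame transitions times keypoints per frame.
    total / ((frames.len() - 1) * 17) as f32
}
```

Because the loss needs only the model's own outputs over a short window, it can be evaluated without a camera, which is what makes it usable for ongoing drift correction.
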