ADR-085: RuVector Neural Trader — Dynamic Market Graphs, MinCut Coherence Gating, and Proof-Gated Mutation

Status

Proposed

Date

2026-03-06

Deciders

ruv

Related

ADR-016 RuVector integration patterns
ADR-030 RVF computational container
ADR-040 WASM programmable sensing
ADR-041 curated module registry
ADR-042 Security RVF AIDefence TEE
ADR-047 Proof-gated mutation protocol
examples/neural-trader/ existing example scaffold
Cognitive MinCut Engine
Mincut Gated Transformer
ruvector-postgres architecture
Cognitum Gate coherence layer

Context

Most trading systems still split the problem the wrong way.

They keep market data in one system, features in another, models in a third, audit logs somewhere else, and risk logic in handwritten code wrapped around the outside. That split adds latency, lets components drift out of sync, and makes it hard to explain why a model made a decision or why a learning update happened.

A neural trader built on RuVector should treat the market as a living graph, not a table of candles. The limit order book, executions, cancellations, venue changes, and cross-asset interactions form a dynamic relational structure. That structure is exactly where short-horizon edge exists.

The design goal is a single substrate where:

Raw market events become typed graph state
Vector embeddings represent evolving microstructure state
GNN and temporal attention operate directly on that state
Dynamic mincut acts as a first-class coherence and fragility signal
Every state mutation and policy action is proof-gated and attestable
Online learning remains bounded, replayable, and auditable

This ADR defines the RuVector-native implementation for Neural Trader as a coherence-first trading substrate for prediction, risk control, and bounded execution research.

Decision

We will implement Neural Trader as a RuVector-native market intelligence stack with six layers:

Ingest and normalization layer
Dynamic heterogeneous market graph in RuVector
Vector and graph learning layer using temporal GNN and attention
Two-stage memory selection and replay layer
MinCut-based coherence gate for write, retrieve, learn, and act
Policy and actuation layer with proof-gated mutation and witness logs

The system will use Postgres as the relational source of record, with ruvector-postgres as the embedded vector engine and the RuVector graph substrate for dynamic structural reasoning.

No model output may directly mutate live strategy state, place orders, or update memory without passing coherence, risk, and policy gates.

Why This Decision

This approach matches the actual shape of markets.

A limit order book is not just a time series. It is a dynamic graph with queue locality, price adjacency, event causality, hidden liquidity hints, and regime-dependent cross-symbol coupling. A graph-plus-vector substrate captures that directly.

RuVector also gives us something most trading systems do not have:

Dynamic mincut as a real-time structural integrity signal
Unified vector-plus-graph storage
Replayable witness logs
Proof-gated state mutation
Local-first deployment paths from server to WASM to edge nodes

The result is a trading research platform that optimizes for bounded intelligence rather than blind prediction.

Scope

In scope

Market data representation
RuVector schema
Embedding and learning design
Memory selection
Coherence gating
Actuation policy
Verification and auditability
Deployment topology

Out of scope

Broker-specific adapters in detail
Exchange colocation engineering
Final production capital allocation policy
Regulatory filing requirements by jurisdiction

Assumptions

Primary use case is short-horizon market making, execution assistance, or micro-alpha research.
Input streams include order book updates, trades, cancels, modifies, venue metadata, and optional cross-asset feeds.
Latency budget is sub-second for research serving, with optional lower-latency kernels for action gating.
Hidden liquidity cannot be observed directly, so proxies are inferred from event patterns.
Online learning must remain bounded and reversible.
Correctness is not assumed; it is stress-tested adversarially rather than treated as guaranteed.

Architecture

1. Ingest and Normalization

Input streams:

L2 or L3 order book deltas
Trades and fills
Order lifecycle events (new, modify, cancel, expire)
Venue state and session markers
Symbol metadata
Optional news, macro, or derived volatility streams

Normalization output:

Canonical event envelopes
Sequence-aligned timestamps
Symbol and venue partition keys
Side, price, size, aggressor, queue, and microstructure features
Compact hashes for traceability

Canonical event envelope:

pub struct MarketEvent {
    pub event_id: [u8; 16],
    pub ts_exchange_ns: u64,
    pub ts_ingest_ns: u64,
    pub venue_id: u16,
    pub symbol_id: u32,
    pub event_type: EventType,
    pub side: Option<Side>,
    pub price_fp: i64,
    pub qty_fp: i64,
    pub order_id_hash: Option<[u8; 16]>,
    pub participant_id_hash: Option<[u8; 16]>,
    pub flags: u32,
    pub seq: u64,
}
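
The envelope refers to `EventType` and `Side` without defining them, and stores price and quantity as fixed-point integers. The following is a minimal sketch of what those supporting pieces might look like. The variant names, the `FP_SCALE` of 1e8, and the helper functions are assumptions for illustration and are not specified by this ADR.

```rust
/// Hypothetical event kinds, mirroring the order lifecycle events listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum EventType {
    NewOrder = 0,
    ModifyOrder = 1,
    CancelOrder = 2,
    ExpireOrder = 3,
    Trade = 4,
    BookDelta = 5,
    SessionMarker = 6,
}

/// Hypothetical book side.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Side {
    Bid,
    Ask,
}

/// Assumed fixed-point scale: 1e8 integer units per 1.0 of price or quantity.
pub const FP_SCALE: i64 = 100_000_000;

/// Convert a decimal value to the assumed fixed-point representation.
pub fn to_fp(value: f64) -> i64 {
    (value * FP_SCALE as f64).round() as i64
}

/// Convert an assumed fixed-point value back to a float for display.
pub fn from_fp(fp: i64) -> f64 {
    fp as f64 / FP_SCALE as f64
}

/// Ingest latency in nanoseconds; saturates at zero when the exchange clock
/// runs ahead of the ingest clock.
pub fn ingest_latency_ns(ts_exchange_ns: u64, ts_ingest_ns: u64) -> u64 {
    ts_ingest_ns.saturating_sub(ts_exchange_ns)
}
```

Integer prices keep comparisons and hashing exact, which matters if event hashes are to be reproducible on replay.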

2. RuVector Graph Model

The order book becomes a typed heterogeneous dynamic graph.

Node kinds:

#	Kind	Description
1	Symbol	Tradable instrument
2	Venue	Exchange or dark pool
3	PriceLevel	Individual price level in the book
4	Order	Resting or aggressing order proxy
5	Trade	Matched execution
6	Event	Raw market event
7	Participant	Anonymized participant proxy
8	TimeBucket	Discretized time window
9	Regime	Market regime classification
10	StrategyState	Current strategy context

Edge kinds:

#	Edge	From → To
1	`AT_LEVEL`	Order → PriceLevel
2	`NEXT_TICK`	PriceLevel ↔ PriceLevel
3	`GENERATED`	Event → Order or Trade
4	`MATCHED`	Aggressor ↔ Resting order proxy
5	`MODIFIED_FROM`	Order → Order (prior version)
6	`CANCELED_BY`	Event → Order
7	`BELONGS_TO_SYMBOL`	* → Symbol
8	`ON_VENUE`	* → Venue
9	`IN_WINDOW`	* → TimeBucket
10	`CORRELATED_WITH`	Symbol ↔ Symbol
11	`IN_REGIME`	TimeBucket → Regime
12	`AFFECTS_STATE`	* → StrategyState

Core properties — PriceLevel:

Visible depth
Estimated hidden depth
Queue length
Local imbalance
Refill rate
Depletion rate
Spread distance
Local realized volatility

Core properties — Order:

Side
Limit price
Current queue estimate
Age
Modify count
Cancel hazard score
Fill hazard score

Core properties — Trade:

Aggressor side
Size
Slippage to mid
Post-trade impact window

Core properties — Edge:

Event time delta
Transition count
Influence score
Coherence contribution
Venue confidence
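
One way to keep the graph well typed is to validate each edge against the node kinds it is allowed to connect. The sketch below covers only a subset of the edge kinds in the table above. `MarketGraph`, `edge_allowed`, and the `u64` node identifiers are hypothetical names chosen for illustration, not part of the RuVector API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum NodeKind {
    Symbol, Venue, PriceLevel, Order, Trade,
    Event, Participant, TimeBucket, Regime, StrategyState,
}

/// Subset of the edge kinds from the table, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum EdgeKind {
    AtLevel, NextTick, Generated, CanceledBy, CorrelatedWith, InRegime,
}

/// True when the (from, to) node kinds are legal for the given edge kind.
pub fn edge_allowed(kind: EdgeKind, from: NodeKind, to: NodeKind) -> bool {
    matches!(
        (kind, from, to),
        (EdgeKind::AtLevel, NodeKind::Order, NodeKind::PriceLevel)
            | (EdgeKind::NextTick, NodeKind::PriceLevel, NodeKind::PriceLevel)
            | (EdgeKind::Generated, NodeKind::Event, NodeKind::Order)
            | (EdgeKind::Generated, NodeKind::Event, NodeKind::Trade)
            | (EdgeKind::CanceledBy, NodeKind::Event, NodeKind::Order)
            | (EdgeKind::CorrelatedWith, NodeKind::Symbol, NodeKind::Symbol)
            | (EdgeKind::InRegime, NodeKind::TimeBucket, NodeKind::Regime)
    )
}

/// Minimal in-memory typed graph; a stand-in for the real graph substrate.
#[derive(Default)]
pub struct MarketGraph {
    kinds: HashMap<u64, NodeKind>,
    edges: Vec<(u64, u64, EdgeKind)>,
}

impl MarketGraph {
    pub fn add_node(&mut self, id: u64, kind: NodeKind) {
        self.kinds.insert(id, kind);
    }

    /// Rejects edges with unknown endpoints or an illegal kind pairing.
    pub fn add_edge(&mut self, from: u64, to: u64, kind: EdgeKind) -> Result<(), String> {
        let (fk, tk) = match (self.kinds.get(&from), self.kinds.get(&to)) {
            (Some(f), Some(t)) => (*f, *t),
            _ => return Err(format!("unknown endpoint in edge {from} -> {to}")),
        };
        if !edge_allowed(kind, fk, tk) {
            return Err(format!("{kind:?} not allowed from {fk:?} to {tk:?}"));
        }
        self.edges.push((from, to, kind));
        Ok(())
    }

    pub fn edge_count(&self) -> usize {
        self.edges.len()
    }
}
```

Rejecting ill-typed edges at insertion time keeps later message passing from having to re-check schema invariants.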

3. Vector Representation

Each important subgraph window is embedded into RuVector.

Embedding families:

Book state embedding
Queue state embedding
Event stream embedding
Cross-symbol regime embedding
Strategy context embedding
Risk context embedding

Recommended representation split:

Dense float embeddings for state similarity
Compressed low-bit serving vectors for fast retrieval
Graph neighborhood fingerprints for structural similarity
Contrastive delta embeddings for regime shift detection

Example keyspaces in ruvector-postgres:

-- Event log: range-partitioned by ts_exchange_ns for bounded retention
CREATE TABLE nt_event_log (
    event_id       BYTEA NOT NULL,
    ts_exchange_ns BIGINT NOT NULL,
    ts_ingest_ns   BIGINT NOT NULL,
    venue_id       INT NOT NULL,
    symbol_id      INT NOT NULL,
    event_type     INT NOT NULL,
    payload        JSONB NOT NULL,
    witness_hash   BYTEA,
    PRIMARY KEY (ts_exchange_ns, event_id)
) PARTITION BY RANGE (ts_exchange_ns);

CREATE INDEX idx_event_log_symbol_ts
    ON nt_event_log (symbol_id, ts_exchange_ns);
CREATE INDEX idx_event_log_venue_ts
    ON nt_event_log (venue_id, ts_exchange_ns);

-- Embeddings: composite index for time-range similarity queries
CREATE TABLE nt_embeddings (
    embedding_id   BIGSERIAL PRIMARY KEY,
    symbol_id      INT NOT NULL,
    venue_id       INT NOT NULL,
    ts_ns          BIGINT NOT NULL,
    embedding_type TEXT NOT NULL,
    dim            INT NOT NULL,
    metadata       JSONB NOT NULL,
    embedding      vector(256)
);

CREATE INDEX idx_embeddings_symbol_ts
    ON nt_embeddings (symbol_id, ts_ns DESC);
CREATE INDEX idx_embeddings_type_ts
    ON nt_embeddings (embedding_type, ts_ns DESC);
CREATE INDEX idx_embeddings_vec_hnsw
    ON nt_embeddings USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 200);

-- Replay segments: partitioned by start_ts_ns for retention management
CREATE TABLE nt_segments (
    segment_id   BIGSERIAL NOT NULL,
    symbol_id    INT NOT NULL,
    start_ts_ns  BIGINT NOT NULL,
    end_ts_ns    BIGINT NOT NULL,
    segment_kind TEXT NOT NULL,
    rvf_blob     BYTEA,
    signature    BYTEA,
    witness_hash BYTEA,
    metadata     JSONB,
    PRIMARY KEY (start_ts_ns, segment_id)
) PARTITION BY RANGE (start_ts_ns);

CREATE INDEX idx_segments_symbol_ts
    ON nt_segments (symbol_id, start_ts_ns DESC);
CREATE INDEX idx_segments_kind
    ON nt_segments (segment_kind, start_ts_ns DESC);
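
Range-partitioned tables need child partitions created ahead of the data. A small sketch of how partition bounds could be derived from a nanosecond timestamp follows. The one-hour width, the `_p<start>` naming scheme, and both function names are assumptions for illustration.

```rust
/// Assumed partition width: one hour in nanoseconds.
pub const PARTITION_INTERVAL_NS: u64 = 3_600_000_000_000;

/// Half-open [start, end) bounds of the partition containing `ts_ns`.
pub fn partition_bounds(ts_ns: u64, interval_ns: u64) -> (u64, u64) {
    assert!(interval_ns > 0, "partition interval must be positive");
    let start = ts_ns - ts_ns % interval_ns;
    (start, start + interval_ns)
}

/// Render DDL for the partition covering `ts_ns`.
/// The child-table naming scheme is hypothetical.
pub fn partition_ddl(parent: &str, ts_ns: u64, interval_ns: u64) -> String {
    let (start, end) = partition_bounds(ts_ns, interval_ns);
    format!(
        "CREATE TABLE IF NOT EXISTS {parent}_p{start} PARTITION OF {parent} FOR VALUES FROM ({start}) TO ({end});"
    )
}
```

Postgres range bounds are inclusive at the lower end and exclusive at the upper end, which matches the half-open interval returned here.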

4. Learning Layer

We will use a temporal graph learning stack.

Model family:

Typed message passing over dynamic graph neighborhoods
Temporal attention over recent event windows
Optional sequence head for action or risk outputs
Auxiliary contrastive loss for regime separation
Coherence regularization using mincut and boundary stability

Primary prediction heads:

Next-window mid-price move probability
Fill probability for candidate placements
Cancel probability for resting liquidity
Slippage risk
Local volatility jump risk
Regime transition probability

Control heads:

Place or do-not-place
Modify or hold
Size scaling factor
Venue selection
Learning write admission score

Loss design — total loss:

L = L_pred + λ₁·L_fill + λ₂·L_risk + λ₃·L_contrast + λ₄·L_coherence + λ₅·L_budget

Where:

L_pred — predicts short-horizon outcome
L_fill — estimates execution quality
L_risk — penalizes unstable high-drawdown actions
L_contrast — separates regimes and recurrent motifs
L_coherence — penalizes representation drift across stable partitions
L_budget — penalizes actions that exceed risk or actuation budgets
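
The total loss is a plain weighted sum, which can be sketched as follows. The struct and field names are illustrative, and the ADR does not fix values for the λ weights.

```rust
/// Per-batch loss terms, named after the definitions above.
#[derive(Debug, Clone, Copy)]
pub struct LossTerms {
    pub pred: f32,
    pub fill: f32,
    pub risk: f32,
    pub contrast: f32,
    pub coherence: f32,
    pub budget: f32,
}

/// Weights λ1..λ5; L_pred carries an implicit weight of 1.
#[derive(Debug, Clone, Copy)]
pub struct LossWeights {
    pub fill: f32,      // λ1
    pub risk: f32,      // λ2
    pub contrast: f32,  // λ3
    pub coherence: f32, // λ4
    pub budget: f32,    // λ5
}

/// L = L_pred + λ1·L_fill + λ2·L_risk + λ3·L_contrast + λ4·L_coherence + λ5·L_budget
pub fn total_loss(t: &LossTerms, w: &LossWeights) -> f32 {
    t.pred
        + w.fill * t.fill
        + w.risk * t.risk
        + w.contrast * t.contrast
        + w.coherence * t.coherence
        + w.budget * t.budget
}
```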

5. Memory Design

Memory must be selective, bounded, and useful.

Stage A: Streaming Sketch

Keep cheap summaries for recent heavy hitters.

Structures:

Count-Min sketch for repeated motifs
Top-K for impactful levels, venues, regimes
Rolling range sketches for volatility and imbalance bands
Delta histograms for event transitions

Purpose:

Detect recurring market motifs
Prioritize candidate memory writes
Reduce storage pressure
Preserve streaming summaries even when raw fragments age out
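
As an illustration of the Stage A structures, here is a minimal Count-Min sketch. It uses the standard library's `DefaultHasher` seeded with the row index, which is an assumption made for brevity; a production sketch would more likely use independent hash functions with explicit seeds.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Count-Min sketch: `depth` rows of `width` counters, stored row-major.
pub struct CountMinSketch {
    width: usize,
    depth: usize,
    counts: Vec<u32>,
}

impl CountMinSketch {
    pub fn new(width: usize, depth: usize) -> Self {
        assert!(width > 0 && depth > 0);
        Self { width, depth, counts: vec![0; width * depth] }
    }

    /// Counter index for `item` in `row`; the row index acts as a per-row seed.
    fn slot<T: Hash>(&self, item: &T, row: usize) -> usize {
        let mut h = DefaultHasher::new();
        row.hash(&mut h);
        item.hash(&mut h);
        row * self.width + (h.finish() as usize) % self.width
    }

    /// Record `count` occurrences of `item` in every row.
    pub fn add<T: Hash>(&mut self, item: &T, count: u32) {
        for row in 0..self.depth {
            let i = self.slot(item, row);
            self.counts[i] = self.counts[i].saturating_add(count);
        }
    }

    /// Estimated frequency: the minimum across rows. This never
    /// underestimates, but may overestimate when items collide.
    pub fn estimate<T: Hash>(&self, item: &T) -> u32 {
        (0..self.depth)
            .map(|row| self.counts[self.slot(item, row)])
            .min()
            .unwrap_or(0)
    }
}
```

A motif key could be, for example, a hash of a short event-type sequence; estimates above a threshold would then mark the motif as a candidate for Stage B.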

Stage B: Uncertainty-Guided Reservoir

Store high-value replay fragments when one or more conditions hold:

High model uncertainty
Large realized PnL impact
Regime transition
Structural anomaly
Rare queue pattern
High disagreement between model heads

Each stored fragment becomes an RVF or signed segment containing:

Compact subgraph
Embeddings
Labels and realized outcomes
Coherence statistics
Lineage metadata
Witness hash and signature
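
The admission conditions above can be expressed as a function that returns the reasons a fragment qualifies. The signal fields, threshold names, and reason strings below are hypothetical. The ADR lists the conditions but does not say how each is measured.

```rust
/// Hypothetical per-fragment signals feeding the admission decision.
#[derive(Debug, Clone, Copy, Default)]
pub struct FragmentSignals {
    pub uncertainty: f32,
    pub realized_impact_bp: f32,
    pub regime_transition: bool,
    pub structural_anomaly: bool,
    pub rare_queue_pattern: bool,
    pub head_disagreement: f32,
}

/// Illustrative thresholds; values would come from configuration.
#[derive(Debug, Clone, Copy)]
pub struct AdmissionThresholds {
    pub min_uncertainty: f32,
    pub min_realized_impact_bp: f32,
    pub min_head_disagreement: f32,
}

/// Returns every condition that holds; an empty list means "do not store".
pub fn admission_reasons(s: &FragmentSignals, t: &AdmissionThresholds) -> Vec<&'static str> {
    let mut reasons = Vec::new();
    if s.uncertainty >= t.min_uncertainty {
        reasons.push("high_uncertainty");
    }
    if s.realized_impact_bp.abs() >= t.min_realized_impact_bp {
        reasons.push("large_realized_impact");
    }
    if s.regime_transition {
        reasons.push("regime_transition");
    }
    if s.structural_anomaly {
        reasons.push("structural_anomaly");
    }
    if s.rare_queue_pattern {
        reasons.push("rare_queue_pattern");
    }
    if s.head_disagreement >= t.min_head_disagreement {
        reasons.push("head_disagreement");
    }
    reasons
}
```

Returning reasons rather than a bare boolean lets the stored fragment's lineage metadata record why it was admitted.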

6. Coherence Gate

Dynamic mincut is the central gate.

We compute a compact induced subgraph linking:

Incoming market events
Local price levels
Relevant prior memories
Current strategy state
Risk nodes
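
For intuition about what a cut over such a subgraph looks like, the sketch below computes a global minimum cut with the classic Stoer-Wagner algorithm on a dense adjacency matrix. This is a static recomputation chosen for clarity; it is not the dynamic mincut engine this ADR relies on, and the function names are hypothetical.

```rust
/// Build a symmetric weight matrix from undirected edges (no self-loops expected).
pub fn graph_from_edges(n: usize, edges: &[(usize, usize, u64)]) -> Vec<Vec<u64>> {
    let mut w = vec![vec![0u64; n]; n];
    for &(a, b, weight) in edges {
        w[a][b] += weight;
        w[b][a] += weight;
    }
    w
}

/// Stoer-Wagner global minimum cut.
/// Returns the cut value and the sorted node ids on one side of the cut.
pub fn stoer_wagner(mut w: Vec<Vec<u64>>) -> (u64, Vec<usize>) {
    let n = w.len();
    assert!(n >= 2, "need at least two nodes");
    // groups[v] lists the original nodes merged into super-node v.
    let mut groups: Vec<Vec<usize>> = (0..n).map(|i| vec![i]).collect();
    let mut active: Vec<usize> = (0..n).collect();
    let mut best: (u64, Vec<usize>) = (u64::MAX, Vec::new());

    while active.len() > 1 {
        // Minimum cut phase: repeatedly add the most tightly connected vertex.
        let mut weights = vec![0u64; n];
        let mut added = vec![false; n];
        let mut prev = usize::MAX;
        let mut last = usize::MAX;
        for _ in 0..active.len() {
            let next = *active
                .iter()
                .filter(|&&v| !added[v])
                .max_by_key(|&&v| weights[v])
                .unwrap();
            added[next] = true;
            prev = last;
            last = next;
            for &v in &active {
                if !added[v] {
                    weights[v] += w[next][v];
                }
            }
        }
        // Cut of the phase: `last` versus everything else.
        if weights[last] < best.0 {
            best = (weights[last], groups[last].clone());
        }
        // Merge `last` into `prev` and drop it from the active set.
        let merged = std::mem::take(&mut groups[last]);
        groups[prev].extend(merged);
        for &v in &active {
            let extra = w[last][v];
            w[prev][v] += extra;
            let merged_weight = w[prev][v];
            w[v][prev] = merged_weight;
        }
        w[prev][prev] = 0;
        active.retain(|&v| v != last);
    }
    best.1.sort_unstable();
    best
}
```

On two tightly connected clusters joined by a single weak edge, the minimum cut is that weak edge and the partition separates the clusters. That is the kind of fragility signal the gate is meant to track.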

From this graph we derive:

Canonical mincut partition
Cut value
Boundary node identities
Cut drift over time
Embedding drift by partition
CUSUM alarms over cut metrics
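
A CUSUM alarm over the cut value can be sketched as a one-sided accumulator that tracks sustained shortfalls below an expected level. The `target`, `slack`, and `threshold` parameters are assumptions; in practice the input would likely be standardized per regime before accumulation.

```rust
/// One-sided CUSUM detecting a sustained drop below `target`.
pub struct Cusum {
    target: f64,    // expected level of the monitored metric
    slack: f64,     // allowance subtracted from each deviation
    threshold: f64, // alarm when the accumulated score exceeds this
    score: f64,
}

impl Cusum {
    pub fn new(target: f64, slack: f64, threshold: f64) -> Self {
        Self { target, slack, threshold, score: 0.0 }
    }

    /// Feed one observation; returns true while the alarm condition holds.
    pub fn update(&mut self, x: f64) -> bool {
        let shortfall = self.target - x - self.slack;
        self.score = (self.score + shortfall).max(0.0);
        self.score > self.threshold
    }

    pub fn score(&self) -> f64 {
        self.score
    }

    /// Clear the accumulator, for example after a rollback or regime change.
    pub fn reset(&mut self) {
        self.score = 0.0;
    }
}
```

Because the score is clamped at zero, brief dips are forgotten once the metric recovers, while a persistent drop accumulates until it crosses the threshold.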

Gate uses:

Memory write admission
Memory retrieval confidence
Online learner update permission
Action permission
Early rollback trigger
Anomaly escalation

Gate policy — allow only when ALL are true:

Cut value above floor for current regime
Boundary identity stable across last N windows
No sustained CUSUM breach
Risk budgets available
Policy allows actuation
Model confidence exceeds threshold conditioned on coherence

Gate result type:

pub struct CoherenceDecision {
    pub allow_retrieve: bool,
    pub allow_write: bool,
    pub allow_learn: bool,
    pub allow_act: bool,
    pub mincut_value: u64,
    pub partition_hash: [u8; 16],
    pub drift_score: f32,
    pub cusum_score: f32,
    pub reasons: Vec<String>,
}
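
The gate policy can be sketched as a function from measured inputs to a `CoherenceDecision`. `GateInputs` and the reason strings are hypothetical. The sketch also assumes that retrieve, write, and learn depend only on the three structural checks while act requires all six conditions; the ADR does not spell out that split.

```rust
/// Result type as declared in this ADR.
#[derive(Debug, Clone, PartialEq)]
pub struct CoherenceDecision {
    pub allow_retrieve: bool,
    pub allow_write: bool,
    pub allow_learn: bool,
    pub allow_act: bool,
    pub mincut_value: u64,
    pub partition_hash: [u8; 16],
    pub drift_score: f32,
    pub cusum_score: f32,
    pub reasons: Vec<String>,
}

/// Hypothetical gate inputs; field names are illustrative.
#[derive(Debug, Clone)]
pub struct GateInputs {
    pub mincut_value: u64,
    pub mincut_floor: u64,                // floor for the current regime
    pub recent_partitions: Vec<[u8; 16]>, // last N windows, newest last
    pub cusum_score: f32,
    pub cusum_threshold: f32,
    pub drift_score: f32,
    pub risk_budget_available: bool,
    pub policy_allows_actuation: bool,
    pub model_confidence: f32,
    pub confidence_threshold: f32,
}

pub fn evaluate_gate(g: &GateInputs) -> CoherenceDecision {
    let mut reasons: Vec<String> = Vec::new();
    // Structural checks: cut value, boundary stability, CUSUM.
    if g.mincut_value <= g.mincut_floor {
        reasons.push("cut_not_above_floor".to_string());
    }
    let stable = !g.recent_partitions.is_empty()
        && g.recent_partitions.windows(2).all(|p| p[0] == p[1]);
    if !stable {
        reasons.push("boundary_unstable".to_string());
    }
    if g.cusum_score > g.cusum_threshold {
        reasons.push("cusum_breach".to_string());
    }
    let structural_ok = reasons.is_empty();
    // Additional checks that apply to actuation.
    if !g.risk_budget_available {
        reasons.push("risk_budget_exhausted".to_string());
    }
    if !g.policy_allows_actuation {
        reasons.push("policy_block".to_string());
    }
    if g.model_confidence < g.confidence_threshold {
        reasons.push("low_confidence".to_string());
    }
    let all_ok = reasons.is_empty();
    CoherenceDecision {
        allow_retrieve: structural_ok,
        allow_write: structural_ok,
        allow_learn: structural_ok,
        allow_act: all_ok,
        mincut_value: g.mincut_value,
        partition_hash: g.recent_partitions.last().copied().unwrap_or([0u8; 16]),
        drift_score: g.drift_score,
        cusum_score: g.cusum_score,
        reasons,
    }
}
```

Every failed condition is recorded in `reasons`, so a denied action can be explained from the decision alone.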

Proof-Gated Mutation

No state mutation occurs without a proof token.

This includes:

Memory writes
Model promotion
Policy threshold changes
Live order intents
Strategy state transitions

Mutation protocol:

Compute features and local graph
Compute coherence decision
Evaluate policy kernel
Mint verified token if allowed
Apply mutation
Append witness receipt

Receipt fields:

Timestamp
Model ID
Input segment hash
Coherence witness hash
Policy hash
Action intent
Verified token ID
Resulting state hash
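
The mutation protocol can be illustrated with a token type that is consumed by value, so each token authorizes at most one mutation. Everything below is a sketch under stated assumptions. `digest` uses the non-cryptographic `DefaultHasher` as a stand-in for a real hash and signature scheme, the receipt carries only a subset of the fields listed above, and all names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in digest. NOT cryptographic; a real witness chain would use a
/// cryptographic hash plus signatures.
fn digest<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

#[derive(Debug, Clone, Hash, PartialEq, Eq)]
pub struct StrategyState {
    pub position: i64,
}

/// Proof token. In a real crate the field would stay private to the gate
/// module so tokens cannot be forged elsewhere.
#[derive(Debug)]
pub struct ProofToken {
    id: u64,
}

/// Subset of the receipt fields listed above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WitnessReceipt {
    pub ts_ns: u64,
    pub policy_hash: u64,
    pub action_intent: String,
    pub token_id: u64,
    pub resulting_state_hash: u64,
}

/// Mint a token only when both the coherence gate and the policy kernel allow.
pub fn mint_token(
    gate_allows: bool,
    policy_allows: bool,
    input_hash: u64,
    policy_hash: u64,
) -> Option<ProofToken> {
    if gate_allows && policy_allows {
        Some(ProofToken { id: digest(&(input_hash, policy_hash)) })
    } else {
        None
    }
}

/// Apply the mutation and append a receipt. Taking `token` by value means
/// it cannot be reused for a second mutation.
pub fn apply_mutation(
    state: &mut StrategyState,
    delta: i64,
    token: ProofToken,
    ts_ns: u64,
    policy_hash: u64,
    log: &mut Vec<WitnessReceipt>,
) {
    state.position += delta;
    log.push(WitnessReceipt {
        ts_ns,
        policy_hash,
        action_intent: format!("adjust_position:{delta}"),
        token_id: token.id,
        resulting_state_hash: digest(&*state),
    });
}
```

Because `apply_mutation` cannot be called without a `ProofToken`, the compiler enforces the rule that no mutation occurs without a token, at least within this module boundary.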

Serving Flow

Research or Paper Trading Path

Ingest market events
Update graph and embeddings
Retrieve similar memory fragments
Compute model outputs
Run coherence gate
Run policy and budget checks
Emit action recommendation
Store replay artifacts if admitted

Live Bounded Execution Path

Ingest event burst
Update local graph cache
Score candidate actions
Compute mincut coherence
Check exposure and slippage budgets
Require proof token
Publish broker intent
Record signed receipt

Policy Kernel

The policy kernel is explicit and auditable.

Inputs:

Coherence decision
Model outputs
Position state
Exposure limits
Venue constraints
Liquidity conditions
Market halts or macro blocks

Rules:

Never place if coherence is unstable
Never upsize in regime uncertainty spike
Never write memory during adversarial drift burst unless explicitly quarantined
Never learn online when realized slippage exceeds bound and cut drift is rising
Always throttle actuation when order rate or cancel rate limits approach venue thresholds
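
The rules above translate directly into boolean logic. The sketch below is one possible encoding. `PolicyInput`, `PolicyVerdict`, and the 80% `THROTTLE_FRACTION` used to interpret "approach venue thresholds" are assumptions, not values from this ADR.

```rust
/// Hypothetical flattened policy inputs.
#[derive(Debug, Clone, Copy)]
pub struct PolicyInput {
    pub coherence_stable: bool,
    pub regime_uncertainty_spike: bool,
    pub adversarial_drift_burst: bool,
    pub write_quarantined: bool,
    pub realized_slippage_bp: f32,
    pub max_slippage_bp: f32,
    pub cut_drift_rising: bool,
    pub order_rate_per_sec: f32,
    pub max_order_rate_per_sec: f32,
    pub cancel_rate_per_sec: f32,
    pub max_cancel_rate_per_sec: f32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PolicyVerdict {
    pub allow_place: bool,
    pub allow_upsize: bool,
    pub allow_memory_write: bool,
    pub allow_online_learning: bool,
    pub throttle: bool,
}

/// Assumed fraction of a venue limit at which throttling starts.
pub const THROTTLE_FRACTION: f32 = 0.8;

pub fn decide(p: &PolicyInput) -> PolicyVerdict {
    let near_limit = p.order_rate_per_sec >= THROTTLE_FRACTION * p.max_order_rate_per_sec
        || p.cancel_rate_per_sec >= THROTTLE_FRACTION * p.max_cancel_rate_per_sec;
    PolicyVerdict {
        // Never place if coherence is unstable.
        allow_place: p.coherence_stable,
        // Never upsize in a regime uncertainty spike.
        allow_upsize: p.coherence_stable && !p.regime_uncertainty_spike,
        // Never write memory during a drift burst unless quarantined.
        allow_memory_write: !p.adversarial_drift_burst || p.write_quarantined,
        // Never learn online when slippage exceeds bound and cut drift rises.
        allow_online_learning: !(p.realized_slippage_bp > p.max_slippage_bp
            && p.cut_drift_rising),
        // Always throttle when order or cancel rate nears the venue limit.
        throttle: near_limit,
    }
}
```

Keeping the kernel as a pure function of its inputs makes each verdict reproducible from a logged `PolicyInput`, which supports the auditability goal.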

Data Retention and Lineage

Three Tiers

Hot tier:

Recent event graph state
Recent embeddings
Recent witness chain
Active memory reservoir

Warm tier:

Signed replay segments
Compressed embeddings
Model evaluation sets
Daily partition statistics

Cold tier:

Long-horizon archives
Training corpora
Promoted model lineage
Audit snapshots

Lineage requirements:

Every model maps to training fragments
Every live action maps to model and policy version
Every mutation maps to a verified token and witness chain
Every rollback maps to explicit trigger and prior state hash

RuVector Implementation Details

Collections

Recommended logical collections:

nt_market_graph
nt_embeddings_hot
nt_embeddings_archive
nt_memory_segments
nt_policy_receipts
nt_model_registry
nt_regime_index

Indexing

HNSW or RuVector ANN for embedding retrieval
Graph neighborhood cache for local subgraph extraction
Time-partitioned relational tables in Postgres
Quantized serving vectors for low-latency retrieval
Optional hyperbolic geometry for regime and hierarchy embeddings

Retrieval Strategy

Hybrid retrieval score:

S = α·similarity + β·structural_overlap + γ·regime_match + δ·coherence_bonus

Where:

similarity: vector similarity between query and fragment embeddings (for example cosine similarity, where higher means closer)
structural_overlap: graph neighborhood match
regime_match: volatility and spread regime comparison
coherence_bonus: reward for fragments from stable partitions

The weights are constrained so that α + β + γ + δ = 1. Defaults are α=0.4, β=0.25, γ=0.2, δ=0.15, tuned per regime via walk-forward validation.
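
A minimal sketch of the hybrid score with the default weights is shown below. It assumes each component has already been normalized to the range [0, 1] with higher meaning better; the type and function names are illustrative.

```rust
/// Retrieval weights α, β, γ, δ.
#[derive(Debug, Clone, Copy)]
pub struct RetrievalWeights {
    pub alpha: f32,
    pub beta: f32,
    pub gamma: f32,
    pub delta: f32,
}

impl Default for RetrievalWeights {
    /// Defaults stated in this ADR.
    fn default() -> Self {
        Self { alpha: 0.4, beta: 0.25, gamma: 0.2, delta: 0.15 }
    }
}

impl RetrievalWeights {
    /// Checks the constraint α + β + γ + δ = 1 within a small tolerance.
    pub fn is_normalized(&self) -> bool {
        (self.alpha + self.beta + self.gamma + self.delta - 1.0).abs() < 1e-6
    }
}

/// S = α·similarity + β·structural_overlap + γ·regime_match + δ·coherence_bonus
pub fn hybrid_score(
    w: &RetrievalWeights,
    similarity: f32,
    structural_overlap: f32,
    regime_match: f32,
    coherence_bonus: f32,
) -> f32 {
    w.alpha * similarity
        + w.beta * structural_overlap
        + w.gamma * regime_match
        + w.delta * coherence_bonus
}
```

With normalized weights and components in [0, 1], the score also stays in [0, 1], which keeps thresholds comparable across regimes.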

Rust Module Layout

crates/
  neural-trader-core/         # Event schema, types, ingest
  neural-trader-graph/         # Dynamic heterogeneous market graph
  neural-trader-features/      # Feature extraction and embedding
  neural-trader-memory/        # Two-stage memory selection
  neural-trader-coherence/     # MinCut coherence gate
  neural-trader-policy/        # Policy kernel and risk budgets
  neural-trader-execution/     # Broker adapters, order intent
  neural-trader-replay/        # RVF replay segments, witness logs
  neural-trader-rvf/           # RVF serialization bindings
  neural-trader-server/        # gRPC/HTTP serving layer

Core Traits

pub trait EventIngestor {
    fn ingest(&mut self, event: MarketEvent) -> anyhow::Result<()>;
}

pub trait GraphUpdater {
    fn apply_event(&mut self, event: &MarketEvent) -> anyhow::Result<GraphDelta>;
}

pub trait Embedder {
    fn embed_state(&self, ctx: &StateWindow) -> anyhow::Result<Vec<f32>>;
}

pub trait MemoryStore {
    fn retrieve(&self, query: &MemoryQuery) -> anyhow::Result<Vec<MemorySegment>>;
    fn maybe_write(
        &mut self,
        seg: MemorySegment,
        gate: &CoherenceDecision,
    ) -> anyhow::Result<bool>;
}

pub trait CoherenceGate {
    fn evaluate(&self, ctx: &GateContext) -> anyhow::Result<CoherenceDecision>;
}

pub trait PolicyKernel {
    fn decide(&self, input: &PolicyInput) -> anyhow::Result<ActionDecision>;
}

pub trait WitnessLogger {
    fn append_receipt(&mut self, receipt: WitnessReceipt) -> anyhow::Result<()>;
}

Training Plan

Offline Phase

Ingest historical L2 or L3 streams
Build dynamic graph windows
Create replay segments
Train temporal GNN and retrieval heads
Calibrate confidence
Validate on walk-forward splits
Measure coherence-aware versus non-coherence baselines

Online Bounded Adaptation

Allowed:

Calibration updates
Retrieval weighting
Memory admission thresholds
Narrow regime adaptation

Forbidden without manual promotion:

Major architecture changes
Policy kernel changes
Risk budget changes
Output head rewiring

Evaluation

Core Metrics

Prediction:

Fill probability calibration
Short-horizon direction AUC
Slippage error
Realized adverse selection

Trading:

PnL
Sharpe or information ratio
Max drawdown
Inventory risk
Cancel-to-fill quality
Venue quality

Coherence:

Average mincut by regime
Partition stability
Drift detection precision
False-positive gate rate
Rollback trigger quality

Systems:

p50 / p95 / p99 latency
Retrieval latency
Write amplification
Storage growth
Witness overhead

Acceptance Criteria

Phase 1 — Research

Replayable end-to-end pipeline
Deterministic witness logs
Measurable improvement from graph-plus-coherence over price-only baseline
Bounded online updates with rollback

Phase 2 — Paper Trading

Stable gate behavior under live feed noise
No uncontrolled action bursts
No unverified mutations
Explainable receipts for every recommendation

Phase 3 — Live Small Capital

Strict exposure limits enforced
Slippage within approved band
Rollback tested in production shadow mode
Daily audit completeness at 100%

Safety and Governance

Mandatory Controls

Notional exposure caps
Per-symbol limits
Sector or cross-asset correlation caps
Order rate and cancel rate caps
Slippage budget
Venue health checks
Market halt awareness
Human override and kill switch

Governance Requirements

All policy changes versioned
All model promotions signed
All live mutations proof-gated
All replay sets immutable after seal
All exceptions logged with witness chain

Failure Modes

1. Regime Shift Masquerading as Edge

Symptom: Model confidence rises while execution deteriorates.

Fix: Increase weight of coherence gate, reduce online learning scope, quarantine new memory writes.

2. Retrieval Poisoning

Symptom: Bad fragments dominate replay or inference retrieval.

Fix: Signed segment lineage, structural overlap thresholding, memory deprecation, reservoir diversity constraints.

3. Feedback Loop with Market Impact

Symptom: Strategy reacts to its own footprint.

Fix: Actuation throttles, self-impact features, venue split, delayed reinforcement of impacted samples.

4. Overfitting to Stable Partitions

Symptom: System ignores true novelty.

Fix: Maintain novelty quota in memory reservoir, adversarial validation, regime-balanced evaluation.

5. Latency Creep

Symptom: Graph growth degrades serving time.

Fix: Compact local subgraphs, quantized embeddings, hot-path kernels, bounded neighborhood extraction.

Alternatives Considered

Alternative A: Pure Time-Series Transformer

A transformer trained directly over candles and book tensors.

Rejected: Ignores explicit queue topology, event causality, and structural integrity.

Alternative B: Traditional Feature Engineering + Boosted Trees

Rejected: Works in narrow slices, but memory, structure, and drift handling remain bolted on rather than native.

Alternative C: End-to-End RL Trader

Rejected: Action-space instability, reward hacking risk, and poor auditability for early deployment.

Consequences

Positive

Unified substrate for data, memory, learning, and governance
Explicit structural reasoning over market microstructure
Bounded and auditable online learning
First-class drift and fragility detection
Reproducible replays and mutation receipts

Negative

More complex graph engineering
Higher initial systems effort than plain tensor pipelines
Policy design must be disciplined
Coherence thresholds require calibration by regime

Implementation Plan

Phase 1 — Foundation

Define canonical market event schema
Implement RuVector graph projection
Implement hot embedding pipeline
Implement replay segment writer
Implement mincut gate service
Implement witness receipts

Phase 2 — Learning

Train baseline GNN plus temporal attention
Add retrieval-augmented prediction
Add uncertainty scoring
Add reservoir memory writer
Compare against price-only baseline

Phase 3 — Bounded Action

Implement policy kernel
Implement paper trading adapter
Add risk budgets and throttles
Test rollback
Certify live shadow mode

Phase 4 — Live Research

Small capital deployment
Conservative venue set
Daily audit review
Promote only signed models
Continuous regime monitoring

Minimal Example Configuration

neural_trader:
  symbol_universe:
    - ES
    - NQ
    - CL

  ingest:
    venue_clock_tolerance_ns: 500000
    reorder_buffer_events: 2048

  graph:
    max_local_levels_per_side: 32
    max_orders_per_window: 5000
    neighborhood_hops: 2

  embeddings:
    dim: 256
    quantized_dim: 256
    similarity_metric: cosine

  memory:
    stage_a:
      count_min_width: 4096
      count_min_depth: 4
      topk: 256
    stage_b:
      reservoir_size: 50000
      min_uncertainty: 0.18
      min_realized_impact_bp: 1.5

  coherence:
    mincut_floor_by_regime:
      calm: 12
      normal: 9
      volatile: 6
    cusum_threshold: 4.5
    boundary_stability_windows: 8

  policy:
    max_notional_usd: 250000
    max_symbol_notional_usd: 50000
    max_order_rate_per_sec: 10
    max_cancel_rate_per_sec: 15
    max_slippage_bp: 2.0
    require_verified_token: true

  learning:
    online_mode: bounded
    allow_calibration_updates: true
    allow_memory_write: true
    allow_weight_updates: false

  retention:
    hot_window_hours: 4
    warm_retention_days: 30
    cold_archive_days: 365
    partition_interval_ns: 3600000000000  # 1 hour per partition
    vacuum_schedule_cron: "0 */6 * * *"

Decision Summary

Neural Trader will be built as a RuVector-native dynamic market graph system where vectors, graphs, temporal learning, and dynamic mincut work together as one bounded intelligence loop.

The core principle is simple:

Do not trust prediction alone. Trust prediction only when the surrounding market structure is coherent enough to justify learning, remembering, or acting.

That gives us a trader that is not just neural, but structurally self-aware.

Implementation Priority

The best immediate path is to build three crates first:

neural-trader-core — ingest, canonical types, event schema
neural-trader-coherence — mincut gating, coherence decisions
neural-trader-replay — witnessable segments, RVF integration

That gets ingest, witnessable segments, and mincut gating working before the full model stack is finalized.

Stretch option: Add a Mincut Gated Transformer head for early exit and sparse compute during regime instability.

Frontier option: Deploy the coherence gate as a tiny deterministic kernel on Cognitum-style edge nodes or WASM workers so action permission stays cheap, bounded, and independently attestable.

Benchmark test: On replay, the coherence-gated model should beat a tensor-only baseline on slippage-adjusted PnL while reducing unstable memory writes and false actuation during regime shifts.
