
Neural Self-Learning DAG Architecture

Overview

The Neural Self-Learning DAG system transforms RuVector-Postgres from a static query executor into an adaptive system that learns optimal configurations from query patterns.

System Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                        NEURAL DAG RUVECTOR-POSTGRES                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                         SQL INTERFACE LAYER                          │   │
│  │  ruvector_enable_neural_dag() | ruvector_dag_patterns() | ...       │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│  ┌─────────────────────────────────┴───────────────────────────────────┐   │
│  │                      QUERY OPTIMIZER LAYER                           │   │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐  │   │
│  │  │   Pattern   │  │  Attention  │  │    Cost     │  │   Plan     │  │   │
│  │  │   Matcher   │  │  Selector   │  │  Estimator  │  │  Rewriter  │  │   │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘  │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│  ┌─────────────────────────────────┴───────────────────────────────────┐   │
│  │                       DAG ATTENTION LAYER                            │   │
│  │  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐           │   │
│  │  │Topological│ │  Causal   │ │ Critical  │ │  MinCut   │           │   │
│  │  │ Attention │ │   Cone    │ │   Path    │ │  Gated    │           │   │
│  │  └───────────┘ └───────────┘ └───────────┘ └───────────┘           │   │
│  │  ┌───────────┐ ┌───────────┐ ┌───────────┐                         │   │
│  │  │Hierarchic │ │ Parallel  │ │ Temporal  │                         │   │
│  │  │  Lorentz  │ │  Branch   │ │   BTSP    │                         │   │
│  │  └───────────┘ └───────────┘ └───────────┘                         │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│  ┌─────────────────────────────────┴───────────────────────────────────┐   │
│  │                        SONA LEARNING LAYER                           │   │
│  │  ┌─────────────────────────────────────────────────────────────┐    │   │
│  │  │  INSTANT LOOP (<100μs)           BACKGROUND LOOP (hourly)   │    │   │
│  │  │  ┌─────────────┐                 ┌─────────────┐            │    │   │
│  │  │  │  MicroLoRA  │                 │  BaseLoRA   │            │    │   │
│  │  │  │  (rank 1-2) │                 │  (rank 8)   │            │    │   │
│  │  │  └─────────────┘                 └─────────────┘            │    │   │
│  │  │  ┌─────────────┐                 ┌─────────────┐            │    │   │
│  │  │  │ Trajectory  │ ──────────────► │ ReasoningBk │            │    │   │
│  │  │  │   Buffer    │                 │  (K-means)  │            │    │   │
│  │  │  └─────────────┘                 └─────────────┘            │    │   │
│  │  │                                  ┌─────────────┐            │    │   │
│  │  │                                  │   EWC++     │            │    │   │
│  │  │                                  │ (forgetting)│            │    │   │
│  │  │                                  └─────────────┘            │    │   │
│  │  └─────────────────────────────────────────────────────────────┘    │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│  ┌─────────────────────────────────┴───────────────────────────────────┐   │
│  │                      OPTIMIZATION LAYER                              │   │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐  │   │
│  │  │   MinCut    │  │    HDC      │  │    BTSP     │  │   Self-    │  │   │
│  │  │  Analysis   │  │   State     │  │   Memory    │  │  Healing   │  │   │
│  │  │ O(n^0.12)   │  │ Compression │  │  One-Shot   │  │   Engine   │  │   │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘  │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│  ┌─────────────────────────────────┴───────────────────────────────────┐   │
│  │                      STORAGE LAYER                                   │   │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐  │   │
│  │  │   Pattern   │  │  Embedding  │  │  Trajectory │  │   Index    │  │   │
│  │  │   Store     │  │   Cache     │  │   History   │  │  Metadata  │  │   │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘  │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                   OPTIONAL: QUDAG CONSENSUS LAYER                    │   │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐  │   │
│  │  │  Federated  │  │  Pattern    │  │   ML-DSA    │  │    rUv     │  │   │
│  │  │  Learning   │  │  Consensus  │  │  Signatures │  │   Tokens   │  │   │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘  │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Component Descriptions

1. SQL Interface Layer

Provides PostgreSQL-native functions for interacting with the Neural DAG system.

Key Components:

  • ruvector_enable_neural_dag() - Enable learning for a table
  • ruvector_dag_patterns() - View learned patterns
  • ruvector_attention_*() - DAG attention functions
  • ruvector_dag_learn() - Trigger learning cycle

Location: crates/ruvector-postgres/src/dag/operators.rs

2. Query Optimizer Layer

Intercepts queries and applies learned optimizations.

Key Components:

  • Pattern Matcher: Finds similar past query patterns via cosine similarity
  • Attention Selector: UCB bandit for choosing optimal attention type
  • Cost Estimator: Adaptive cost model with micro-LoRA updates
  • Plan Rewriter: Applies learned operator ordering and parameters

Location: crates/ruvector-postgres/src/dag/optimizer.rs
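
The Attention Selector's UCB strategy can be sketched as follows. This is an illustrative toy, not the code in optimizer.rs; the `Arm` struct and `ucb1_select` function are hypothetical names, and the real selector's reward signal and state layout may differ:

```rust
/// Per-arm statistics for one attention type (illustrative).
struct Arm {
    name: &'static str,
    pulls: u64,
    total_reward: f64,
}

/// Pick the arm maximizing mean reward + c * sqrt(ln(t) / pulls),
/// the classic UCB1 exploration/exploitation trade-off.
fn ucb1_select(arms: &[Arm], c: f64) -> usize {
    let t: u64 = arms.iter().map(|a| a.pulls).sum::<u64>().max(1);
    let mut best = 0;
    let mut best_score = f64::NEG_INFINITY;
    for (i, a) in arms.iter().enumerate() {
        // Unpulled arms score infinity so each gets tried at least once.
        let score = if a.pulls == 0 {
            f64::INFINITY
        } else {
            a.total_reward / a.pulls as f64
                + c * ((t as f64).ln() / a.pulls as f64).sqrt()
        };
        if score > best_score {
            best_score = score;
            best = i;
        }
    }
    best
}

fn main() {
    let arms = vec![
        Arm { name: "topological", pulls: 10, total_reward: 8.0 },
        Arm { name: "critical_path", pulls: 5, total_reward: 4.5 },
        Arm { name: "mincut_gated", pulls: 0, total_reward: 0.0 },
    ];
    // The never-tried arm is explored first.
    println!("selected: {}", arms[ucb1_select(&arms, 1.414)].name);
}
```

The `c` value corresponds to the `ucb_exploration_c` default of 1.414 (≈ √2) in the configuration below.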

3. DAG Attention Layer

Seven specialized attention mechanisms for DAG structures.

| Attention Type       | Use Case                   | Complexity         |
|----------------------|----------------------------|--------------------|
| Topological          | Respect DAG ordering       | O(n·k)             |
| Causal Cone          | Distance-weighted ancestors | O(n·d)            |
| Critical Path        | Focus on bottlenecks       | O(n + critical_len) |
| MinCut Gated         | Gate by criticality        | O(n^0.12 + n·k)    |
| Hierarchical Lorentz | Deep nesting               | O(n·d)             |
| Parallel Branch      | Coordinate branches        | O(n·b)             |
| Temporal BTSP        | Time-correlated patterns   | O(n·w)             |

Location: crates/ruvector-postgres/src/dag/attention/
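
To make the O(n·k) bound for topological attention concrete, here is a minimal sketch in which each node attends only to its k direct predecessors, softmax-weighted by a score. The function names and flat score/value arrays are illustrative assumptions, not the attention/topological.rs API:

```rust
/// Numerically stable softmax over a slice of raw scores.
fn softmax(scores: &[f64]) -> Vec<f64> {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Attention context for one node: softmax over its predecessors'
/// scores, applied to their values. O(k) per node, O(n·k) over the DAG.
fn topo_attend(preds: &[usize], scores: &[f64], values: &[f64]) -> f64 {
    if preds.is_empty() {
        return 0.0;
    }
    let s: Vec<f64> = preds.iter().map(|&p| scores[p]).collect();
    let w = softmax(&s);
    preds.iter().zip(w).map(|(&p, wi)| wi * values[p]).sum()
}

fn main() {
    // DAG edges 0 -> 2 and 1 -> 2: node 2 attends to {0, 1}.
    let scores = [1.0, 1.0, 0.0];
    let values = [2.0, 4.0, 0.0];
    // Equal scores yield equal weights, so the context is the mean.
    println!("node 2 context: {}", topo_attend(&[0, 1], &scores, &values));
}
```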

4. SONA Learning Layer

Two-tier learning system for continuous optimization.

Instant Loop (per-query):

  • MicroLoRA adaptation (rank 1-2)
  • Trajectory recording
  • <100μs overhead

Background Loop (hourly):

  • K-means++ pattern extraction
  • BaseLoRA updates (rank 8)
  • EWC++ constraint application

Location: crates/ruvector-postgres/src/dag/learning/
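
The MicroLoRA forward pass in the instant loop amounts to adding a low-rank correction to a frozen base output: y = Wx + B(Ax), where A is r×n and B is m×r with r small (rank 1-2). The sketch below assumes dense `Vec`-backed matrices and an illustrative `MicroLora` type; the real implementation may use a different representation:

```rust
/// Low-rank adapter: A (r x n) projects down, B (m x r) projects up.
struct MicroLora {
    a: Vec<Vec<f64>>,
    b: Vec<Vec<f64>>,
}

/// Dense matrix-vector product.
fn matvec(m: &[Vec<f64>], x: &[f64]) -> Vec<f64> {
    m.iter()
        .map(|row| row.iter().zip(x).map(|(w, xi)| w * xi).sum())
        .collect()
}

impl MicroLora {
    /// Add the low-rank correction B(Ax) to the frozen base output Wx.
    /// Cost is O(r * (n + m)) rather than O(n * m) for a full update.
    fn forward(&self, base_out: &[f64], x: &[f64]) -> Vec<f64> {
        let h = matvec(&self.a, x); // r-dim bottleneck
        let delta = matvec(&self.b, &h); // back to m dims
        base_out.iter().zip(delta).map(|(y, d)| y + d).collect()
    }
}

fn main() {
    // Rank-1 example for brevity: A = [[1, 0]], B = [[0.5], [0.0]].
    let lora = MicroLora {
        a: vec![vec![1.0, 0.0]],
        b: vec![vec![0.5], vec![0.0]],
    };
    println!("{:?}", lora.forward(&[1.0, 1.0], &[2.0, 3.0]));
}
```

The tiny rank is what keeps the per-query overhead under the 100μs budget: at rank 2 the adapter touches only a few hundred parameters per layer.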

5. Optimization Layer

Advanced optimization components.

Key Components:

  • MinCut Analysis: O(n^0.12) bottleneck detection
  • HDC State: 10K-bit hypervector compression
  • BTSP Memory: One-shot pattern recall
  • Self-Healing: Proactive index repair

Location: crates/ruvector-postgres/src/dag/optimization/
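
The HDC state compression relies on two standard hypervector operations: XOR binding (self-inverse) and Hamming similarity. A toy sketch, using a small dimension for speed where the text specifies 10K bits, and with function names that are illustrative rather than the hdc_state.rs API:

```rust
/// Deterministic pseudo-random bit hypervector from an xorshift64 seed.
fn random_hv(dims: usize, mut seed: u64) -> Vec<bool> {
    (0..dims)
        .map(|_| {
            seed ^= seed << 13;
            seed ^= seed >> 7;
            seed ^= seed << 17;
            seed & 1 == 1
        })
        .collect()
}

/// Binding: elementwise XOR. Self-inverse: bind(bind(a, b), b) == a.
fn bind(a: &[bool], b: &[bool]) -> Vec<bool> {
    a.iter().zip(b).map(|(x, y)| x ^ y).collect()
}

/// Normalized Hamming similarity in [0, 1]; ~0.5 for unrelated vectors.
fn similarity(a: &[bool], b: &[bool]) -> f64 {
    let same = a.iter().zip(b).filter(|(x, y)| x == y).count();
    same as f64 / a.len() as f64
}

fn main() {
    let d = 1024;
    let state = random_hv(d, 42);
    let key = random_hv(d, 7);
    let bound = bind(&state, &key);
    // Unbinding with the same key recovers the original exactly.
    println!("recovered: {}", similarity(&bind(&bound, &key), &state));
}
```

At 10K dimensions a state snapshot packs into ~1.2KB of bits, matching the per-snapshot budget in the memory table below.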

6. Storage Layer

Persistent storage for learned patterns and state.

Key Components:

  • Pattern Store: DashMap + PostgreSQL tables
  • Embedding Cache: LRU cache for hot embeddings
  • Trajectory History: Ring buffer for recent queries
  • Index Metadata: Pattern-to-index mappings

Location: crates/ruvector-postgres/src/dag/storage/

7. QuDAG Consensus Layer (Optional)

Distributed learning via quantum-resistant consensus.

Key Components:

  • Federated Learning: Privacy-preserving pattern sharing
  • Pattern Consensus: QR-Avalanche for pattern validation
  • ML-DSA Signatures: Quantum-resistant pattern signing
  • rUv Tokens: Incentivize learning contributions

Location: crates/ruvector-postgres/src/dag/qudag/

Data Flow

Query Execution Flow

SQL Query
    │
    ▼
┌─────────────────────────────────────┐
│ 1. Pattern Matching                 │
│    - Embed query plan               │
│    - Find similar patterns in       │
│      ReasoningBank (cosine sim)     │
│    - Return top-k matches           │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 2. Optimization Decision            │
│    - If pattern found (conf > 0.8): │
│      Apply learned configuration    │
│    - Else:                          │
│      Use defaults + micro-LoRA      │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 3. Attention Selection              │
│    - UCB bandit selects attention   │
│    - Based on query pattern type    │
│    - Exploration vs exploitation    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 4. Plan Execution                   │
│    - Execute with optimized params  │
│    - Record operator timings        │
│    - Track intermediate results     │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 5. Trajectory Recording             │
│    - Store query embedding          │
│    - Store operator activations     │
│    - Store outcome metrics          │
│    - Compute quality score          │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 6. Instant Learning                 │
│    - MicroLoRA gradient accumulate  │
│    - Auto-flush at 100 queries      │
│    - Update attention selector      │
└─────────────────────────────────────┘
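
Steps 1-2 of the flow above can be sketched as a cosine-similarity top-k search followed by a confidence gate. This is a simplified illustration with hypothetical names, not the pattern_matcher.rs code:

```rust
/// Cosine similarity between two embeddings; 0.0 for zero vectors.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Indices of the top-k stored patterns by similarity to `query`.
fn top_k(query: &[f64], patterns: &[Vec<f64>], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..patterns.len()).collect();
    idx.sort_by(|&i, &j| {
        cosine(query, &patterns[j])
            .partial_cmp(&cosine(query, &patterns[i]))
            .unwrap()
    });
    idx.truncate(k);
    idx
}

fn main() {
    let patterns = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let query = [1.0, 0.1];
    let best = top_k(&query, &patterns, 1)[0];
    let conf = cosine(&query, &patterns[best]);
    // Step 2: apply the learned configuration only when conf > 0.8;
    // otherwise fall back to defaults plus micro-LoRA.
    println!("best pattern {} at confidence {:.3}", best, conf);
}
```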

Learning Cycle Flow

Hourly Trigger
    │
    ▼
┌─────────────────────────────────────┐
│ 1. Drain Trajectory Buffer          │
│    - Collect 1000+ trajectories     │
│    - Filter by quality threshold    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 2. K-means++ Clustering             │
│    - 100 clusters                   │
│    - Deterministic initialization   │
│    - Max 100 iterations             │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 3. Pattern Extraction               │
│    - Compute cluster centroids      │
│    - Extract optimal parameters     │
│    - Calculate confidence scores    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 4. EWC++ Constraint Check           │
│    - Compute Fisher information     │
│    - Apply forgetting prevention    │
│    - Detect task boundaries         │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 5. BaseLoRA Update                  │
│    - Apply constrained gradients    │
│    - Update all layers              │
│    - Merge weights if needed        │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 6. ReasoningBank Update             │
│    - Store new patterns             │
│    - Consolidate similar patterns   │
│    - Evict low-confidence patterns  │
└─────────────────────────────────────┘
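
The EWC++ constraint check in step 4 applies a quadratic penalty of the form λ/2 · Σᵢ Fᵢ(θᵢ − θ*ᵢ)², which discourages moving parameters the Fisher information marks as important for past patterns. A minimal sketch, with illustrative names rather than the ewc.rs API:

```rust
/// EWC penalty for current parameters `theta` against the consolidated
/// optimum `theta_star`, with per-parameter Fisher weights `fisher`.
fn ewc_penalty(theta: &[f64], theta_star: &[f64], fisher: &[f64], lambda: f64) -> f64 {
    0.5 * lambda
        * theta
            .iter()
            .zip(theta_star)
            .zip(fisher)
            .map(|((t, s), f)| f * (t - s).powi(2))
            .sum::<f64>()
}

/// Gradient of the penalty, added to the task gradient before an update.
fn ewc_grad(theta: &[f64], theta_star: &[f64], fisher: &[f64], lambda: f64) -> Vec<f64> {
    theta
        .iter()
        .zip(theta_star)
        .zip(fisher)
        .map(|((t, s), f)| lambda * f * (t - s))
        .collect()
}

fn main() {
    let theta = [1.0, 2.0];
    let theta_star = [0.0, 2.0];
    let fisher = [0.5, 10.0]; // second parameter is "important" but undrifted
    // Only the drifted first parameter contributes to the penalty.
    println!("penalty = {}", ewc_penalty(&theta, &theta_star, &fisher, 2000.0));
    println!("grad = {:?}", ewc_grad(&theta, &theta_star, &fisher, 2000.0));
}
```

The λ of 2000 matches the `ewc_lambda` default in the configuration section; EWC++ additionally decays the Fisher estimate over time (`ewc_fisher_decay`), which this sketch omits.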

Module Dependencies

ruvector-postgres/src/dag/
├── mod.rs                    # Module root, re-exports
├── operators.rs              # SQL function definitions
│
├── attention/
│   ├── mod.rs               # Attention trait and registry
│   ├── topological.rs       # TopologicalAttention
│   ├── causal_cone.rs       # CausalConeAttention
│   ├── critical_path.rs     # CriticalPathAttention
│   ├── mincut_gated.rs      # MinCutGatedAttention
│   ├── hierarchical.rs      # HierarchicalLorentzAttention
│   ├── parallel_branch.rs   # ParallelBranchAttention
│   ├── temporal_btsp.rs     # TemporalBTSPAttention
│   └── ensemble.rs          # EnsembleAttention
│
├── learning/
│   ├── mod.rs               # Learning coordinator
│   ├── sona_engine.rs       # SONA integration wrapper
│   ├── trajectory.rs        # Trajectory buffer
│   ├── patterns.rs          # Pattern extraction
│   ├── reasoning_bank.rs    # Pattern storage
│   ├── ewc.rs               # EWC++ integration
│   └── attention_selector.rs # UCB bandit selector
│
├── optimizer/
│   ├── mod.rs               # Optimizer coordinator
│   ├── pattern_matcher.rs   # Pattern matching
│   ├── cost_estimator.rs    # Adaptive costs
│   └── plan_rewriter.rs     # Plan transformation
│
├── optimization/
│   ├── mod.rs               # Optimization utilities
│   ├── mincut.rs            # Min-cut integration
│   ├── hdc_state.rs         # HDC compression
│   ├── btsp_memory.rs       # BTSP one-shot
│   └── self_healing.rs      # Self-healing engine
│
├── storage/
│   ├── mod.rs               # Storage coordinator
│   ├── pattern_store.rs     # Pattern persistence
│   ├── embedding_cache.rs   # Embedding LRU
│   └── trajectory_store.rs  # Trajectory history
│
├── qudag/
│   ├── mod.rs               # QuDAG integration
│   ├── federated.rs         # Federated learning
│   ├── consensus.rs         # Pattern consensus
│   ├── signatures.rs        # ML-DSA signing
│   └── tokens.rs            # rUv token interface
│
└── types/
    ├── mod.rs               # Type definitions
    ├── neural_plan.rs       # NeuralDagPlan
    ├── trajectory.rs        # DagTrajectory
    ├── pattern.rs           # LearnedDagPattern
    └── metrics.rs           # ExecutionMetrics

Configuration

Default Configuration

pub struct NeuralDagConfig {
    // Learning
    pub learning_enabled: bool,           // true
    pub max_trajectories: usize,          // 10000
    pub pattern_clusters: usize,          // 100
    pub quality_threshold: f32,           // 0.3
    pub background_interval_ms: u64,      // 3600000 (1 hour)

    // Attention
    pub default_attention: DagAttentionType, // Topological
    pub attention_exploration: f32,       // 0.1
    pub ucb_exploration_c: f32,           // 1.414

    // SONA
    pub micro_lora_rank: usize,           // 2
    pub micro_lora_lr: f32,               // 0.002
    pub base_lora_rank: usize,            // 8
    pub base_lora_lr: f32,                // 0.001

    // EWC++
    pub ewc_lambda: f32,                  // 2000.0
    pub ewc_max_lambda: f32,              // 15000.0
    pub ewc_fisher_decay: f32,            // 0.999

    // MinCut
    pub mincut_enabled: bool,             // true
    pub mincut_threshold: f32,            // 0.5

    // HDC
    pub hdc_dimensions: usize,            // 10000

    // Self-Healing
    pub healing_enabled: bool,            // true
    pub healing_check_interval_ms: u64,   // 300000 (5 min)
}
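
A `Default` impl matching the commented defaults might look like the following. This is a hypothetical sketch: `DagAttentionType` is stubbed to keep it self-contained, and only a subset of fields is shown:

```rust
/// Stub of the attention-type enum so the sketch compiles standalone.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DagAttentionType {
    Topological,
}

struct NeuralDagConfig {
    learning_enabled: bool,
    max_trajectories: usize,
    pattern_clusters: usize,
    quality_threshold: f32,
    background_interval_ms: u64,
    default_attention: DagAttentionType,
    ucb_exploration_c: f32,
    micro_lora_rank: usize,
    base_lora_rank: usize,
    ewc_lambda: f32,
    hdc_dimensions: usize,
}

impl Default for NeuralDagConfig {
    fn default() -> Self {
        Self {
            learning_enabled: true,
            max_trajectories: 10_000,
            pattern_clusters: 100,
            quality_threshold: 0.3,
            background_interval_ms: 3_600_000, // 1 hour
            default_attention: DagAttentionType::Topological,
            ucb_exploration_c: 1.414,
            micro_lora_rank: 2,
            base_lora_rank: 8,
            ewc_lambda: 2000.0,
            hdc_dimensions: 10_000,
        }
    }
}

fn main() {
    let cfg = NeuralDagConfig::default();
    println!("clusters = {}", cfg.pattern_clusters);
}
```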

PostgreSQL GUC Variables

-- Enable/disable neural DAG
SET ruvector.neural_dag_enabled = true;

-- Learning parameters
SET ruvector.dag_learning_rate = 0.002;
SET ruvector.dag_pattern_clusters = 100;
SET ruvector.dag_quality_threshold = 0.3;

-- Attention parameters
SET ruvector.dag_attention_type = 'auto';
SET ruvector.dag_attention_exploration = 0.1;

-- EWC parameters
SET ruvector.dag_ewc_lambda = 2000.0;

-- MinCut parameters
SET ruvector.dag_mincut_enabled = true;
SET ruvector.dag_mincut_threshold = 0.5;
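
Combining the GUCs above with the SQL interface functions from section 1, a per-table workflow might look like this. The exact function signatures and arguments are assumptions for illustration:

```sql
-- Enable learning for one table (argument form is illustrative)
SELECT ruvector_enable_neural_dag('embeddings');

-- Tune learning at the session level
SET ruvector.dag_attention_type = 'auto';
SET ruvector.dag_quality_threshold = 0.3;

-- Inspect what has been learned so far
SELECT * FROM ruvector_dag_patterns();

-- Trigger a learning cycle instead of waiting for the hourly interval
SELECT ruvector_dag_learn();
```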

Performance Targets

| Operation             | Target Latency | Notes                  |
|-----------------------|----------------|------------------------|
| Pattern matching      | <1ms           | Top-5 similar patterns |
| Attention computation | <500μs         | Per operator           |
| MicroLoRA forward     | <100μs         | Per query              |
| Trajectory recording  | <50μs          | Non-blocking           |
| Background learning   | <5s            | 1000 trajectories      |
| MinCut analysis       | <10ms          | O(n^0.12)              |
| HDC encoding          | <100μs         | 10K dimensions         |

Memory Budget

| Component         | Budget | Notes                     |
|-------------------|--------|---------------------------|
| Pattern Store     | 50MB   | ~1000 patterns per table  |
| Embedding Cache   | 20MB   | LRU for hot embeddings    |
| Trajectory Buffer | 20MB   | 10K trajectories          |
| MicroLoRA         | 10KB   | Per active query          |
| BaseLoRA          | 400KB  | Per table                 |
| HDC State         | 1.2KB  | Per state snapshot        |

Total per table: ~100MB maximum

Thread Safety

All components use thread-safe primitives:

  • DashMap for concurrent pattern storage
  • parking_lot::RwLock for embedding cache
  • crossbeam::ArrayQueue for trajectory buffer
  • AtomicU64 for counters and statistics
  • PostgreSQL background workers for learning cycles

Error Handling

pub enum NeuralDagError {
    // Configuration errors
    InvalidConfig(String),
    TableNotEnabled(String),

    // Learning errors
    InsufficientTrajectories,
    PatternExtractionFailed,
    EwcConstraintViolation,

    // Attention errors
    AttentionComputationFailed,
    InvalidDagStructure,

    // Storage errors
    PatternStoreFull,
    EmbeddingCacheMiss,

    // MinCut errors
    MinCutComputationFailed,
    GraphDisconnected,

    // QuDAG errors (optional)
    ConsensusTimeout,
    SignatureVerificationFailed,
}

All errors are logged and non-fatal: on error, the system falls back to default (non-learned) behavior.