
Lattice Mesh Protocol (LMP)

Security Enhancements

This document collects security hardening ideas and design notes that complement the normative rules in MAIN.md.


1. Enhanced Handshake Security

Problem: Introduction token theft enables MITM before handshake completion

Solution: PAKE-augmented initial contact

// Add SPAKE2+ for introduction token protection
struct IntroductionToken {
    bob_ltik_pub: PublicKey,
    bob_mtsk_pub: PublicKey,
    temp_dht_address: Address,
    expiration: Timestamp,
    pake_salt: [u8; 32],  // NEW: SPAKE2+ salt
    signature: Signature,
}

// Alice initiates with password-based authentication
impl ClientHello {
    fn new_with_pake(
        intro_token: &IntroductionToken,
        shared_secret: &str  // From QR code or out-of-band
    ) -> Self {
        // SPAKE2+ prevents MITM even if token stolen
        let pake = SPAKE2Plus::new(
            b"LMP-introduction",
            shared_secret.as_bytes(),
            &intro_token.pake_salt
        );
        
        let (pake_msg, pake_state) = pake.start();
        
        ClientHello {
            // ... existing fields ...
            pake_message: pake_msg,  // NEW
            pake_commitment: hash(pake_state),  // NEW
        }
    }
}

Benefits:

  • Even if the introduction token is leaked, the attacker still needs the shared secret from the QR code
  • Backward compatible: falls back to signature-only verification if PAKE is not supported
  • Adds ~50ms latency, acceptable for a handshake

2. Adaptive Timestamp Tolerance

Problem: 60-second timestamp window too strict for high-latency mesh networks

Solution: Sliding Window with Network Condition Adaptation

struct TimestampValidator {
    base_tolerance: Duration,
    adaptive_bonus: Duration,
    recent_latencies: RingBuffer<Duration, 100>,
}

impl TimestampValidator {
    fn new() -> Self {
        Self {
            base_tolerance: Duration::from_secs(60),
            adaptive_bonus: Duration::from_secs(0),
            recent_latencies: RingBuffer::new(),
        }
    }
    
    fn validate(&mut self, msg_timestamp: i64) -> bool {
        // `current_timestamp()` is assumed to return milliseconds since the epoch,
        // so `observed_latency` below is in milliseconds as well.
        let now = current_timestamp();
        let observed_latency = (now - msg_timestamp).abs();
        
        // Update adaptive tolerance based on network conditions
        self.recent_latencies.push(Duration::from_millis(observed_latency as u64));
        let p95_latency = self.percentile(0.95);
        
        // Allow timestamp if within base + adaptive window
        let max_tolerance = self.base_tolerance + self.adaptive_bonus;
        
        if observed_latency <= max_tolerance.as_millis() as i64 {
            true
        } else {
            // Check if network degraded recently
            if p95_latency > Duration::from_secs(30) {
                self.adaptive_bonus = p95_latency; // Increase tolerance
                log::warn!("High network latency detected, adjusting tolerance to {}s", 
                          (self.base_tolerance + self.adaptive_bonus).as_secs());
            }
            false
        }
    }
    
    fn percentile(&self, p: f64) -> Duration {
        let mut sorted = self.recent_latencies.to_vec();
        if sorted.is_empty() {
            return Duration::ZERO;
        }
        sorted.sort();
        // Clamp the index so p = 1.0 cannot read past the end of the buffer.
        let idx = ((sorted.len() as f64 * p) as usize).min(sorted.len() - 1);
        sorted[idx]
    }
}

Benefits:

  • Adapts to degraded network conditions automatically
  • Maintains strict validation (60s) under normal conditions
  • Prevents false rejections during mesh congestion

3. Multi-Device Nonce Collision Prevention

Problem: Multiple devices with same conversation_id could reuse (key, nonce) pairs

Solution: Device-Specific Nonce Derivation

// Nonce is 96 bits (12 bytes) for ChaCha20-Poly1305.
// Derive a deterministic 32-bit prefix from (conversation_id, sender_device_id), then append a 64-bit counter.
// This avoids cross-device collisions when multiple devices participate in the same conversation.
fn derive_nonce(conversation_id: &[u8; 32], sender_device_id: &[u8; 16], message_number: u64) -> [u8; 12] {
    let prefix32 = hkdf_sha3_256_extract(
        &[conversation_id.as_slice(), sender_device_id.as_slice()].concat(),
        b"LMP-nonce-prefix",
        4,
    );

    let mut nonce = [0u8; 12];
    nonce[0..4].copy_from_slice(&prefix32);
    nonce[4..12].copy_from_slice(&message_number.to_le_bytes());
    nonce
}
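
For a quick sanity check of the cross-device separation, here is a self-contained sketch that derives the 32-bit prefix using the hkdf and sha3 crates as stand-ins for hkdf_sha3_256_extract (that substitution, and the crate choice, are assumptions made for this example only):

use hkdf::Hkdf;
use sha3::Sha3_256;

// Stand-in for hkdf_sha3_256_extract(ikm, info, 4) using the hkdf + sha3 crates.
fn nonce_prefix(conversation_id: &[u8; 32], sender_device_id: &[u8; 16]) -> [u8; 4] {
    let ikm = [conversation_id.as_slice(), sender_device_id.as_slice()].concat();
    let hk = Hkdf::<Sha3_256>::new(None, &ikm);
    let mut prefix = [0u8; 4];
    hk.expand(b"LMP-nonce-prefix", &mut prefix)
        .expect("4 bytes is a valid HKDF output length");
    prefix
}

fn main() {
    let conversation_id = [0x11u8; 32];
    let phone = [0x01u8; 16];
    let laptop = [0x02u8; 16];

    // Same conversation, different devices -> different 32-bit prefixes, so the full
    // 12-byte nonces differ even when both devices are at the same counter value.
    let p1 = nonce_prefix(&conversation_id, &phone);
    let p2 = nonce_prefix(&conversation_id, &laptop);
    assert_ne!(p1, p2);
    println!("phone prefix:  {:02x?}", p1);
    println!("laptop prefix: {:02x?}", p2);
}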

Benefits:

  • Deterministic per-device prefix makes (key, nonce) reuse across devices cryptographically negligible
  • Same conversation, different devices → different nonces
  • Preserves the full 64-bit message counter space per device

Trade-off: Slightly higher implementation complexity due to deterministic prefix derivation

4. Ratchet-State Authenticated Replay Protection

Problem: Attacker could replay old messages from previous ratchet iterations during out-of-order delivery

Solution: Ratchet-Bound MAC with Forward Processing

struct MessageHeader {
    conversation_id: [u8; 32],
    sender_device_id: [u8; 16],
    message_number: u64,
    prev_chain_length: u64,
    ratchet_id: [u8; 16],  // NEW: Unique per ratchet iteration
    timestamp: i64,
    eph_x25519_pub: Option<[u8; 32]>,
    eph_kyber_pub: Option<Vec<u8>>,
    mac: [u8; 16],  // Now covers ratchet_id
}

impl RatchetState {
    fn new_iteration(&mut self) -> [u8; 16] {
        // Generate unique ratchet ID
        let mut ratchet_id = [0u8; 16];
        let input = [
            &self.root_key[..],
            &self.iteration_count.to_le_bytes(),
            &current_timestamp().to_le_bytes()
        ].concat();
        
        let id_bytes = hkdf_sha3_256_extract(
            &input,
            b"ratchet-iteration-id",
            16
        );
        ratchet_id.copy_from_slice(&id_bytes);
        
        self.current_ratchet_id = ratchet_id;
        self.iteration_count += 1;
        ratchet_id
    }
    
    fn validate_message(&self, header: &MessageHeader) -> bool {
        // Reject if ratchet_id doesn't match current or recent iterations
        if header.ratchet_id != self.current_ratchet_id &&
           !self.recent_ratchet_ids.contains(&header.ratchet_id) {
            log::warn!("Message from unknown ratchet iteration, possible replay");
            return false;
        }
        
        // Verify MAC includes ratchet_id (prevents cross-ratchet replay)
        self.verify_mac(header)
    }
}

// Forward-processing: Accept new ratchet even if old messages pending
impl MessageProcessor {
    fn handle_out_of_order_ratchet(&mut self, new_ratchet_msg: Message) {
        // Don't wait for old messages; advance ratchet immediately
        log::info!("New ratchet detected with {} skipped messages, advancing", 
                  new_ratchet_msg.header.message_number);
        
        // Mark old messages as permanently lost
        self.mark_messages_lost(
            self.current_ratchet_id,
            self.last_received_msg..new_ratchet_msg.header.message_number
        );
        
        // Process new ratchet
        self.ratchet_state.advance(new_ratchet_msg);
    }
}

Benefits:

  • Prevents cross-ratchet replay attacks
  • Attacker cannot replay M_old from ratchet_N during ratchet_N+1
  • Forward-processing prevents temporary state confusion
  • Old messages explicitly marked lost rather than held indefinitely

5. Skipped Message Key DoS Mitigation

Problem: Attacker floods with gaps just below MAX_SKIP repeatedly, exhausting memory

Solution: Per-Sender Rate Limiting + Exponential Backoff

struct SkippedKeyManager {
    skipped_keys: HashMap<MessageId, MessageKey>,
    sender_skip_counts: HashMap<DeviceId, SkipStats>,
    global_skip_limit: usize,  // 1000
    per_sender_limit: usize,   // 100 per sender
}

struct SkipStats {
    total_skips: usize,
    last_skip_time: Instant,
    consecutive_skips: usize,
}

impl SkippedKeyManager {
    fn try_skip(&mut self, 
                sender: &DeviceId, 
                skip_count: usize) -> Result<(), SkipError> {
        
        let stats = self.sender_skip_counts
            .entry(*sender)
            // NOTE: assumes a Default impl for SkipStats (e.g. last_skip_time = Instant::now()).
            .or_insert_with(SkipStats::default);
        
        // Check per-sender limit
        if stats.total_skips + skip_count > self.per_sender_limit {
            log::warn!("Sender {} exceeded skip limit, possible DoS", sender);
            return Err(SkipError::SenderLimitExceeded);
        }
        
        // Check global limit
        if self.skipped_keys.len() + skip_count > self.global_skip_limit {
            log::warn!("Global skip limit reached, rejecting message");
            return Err(SkipError::GlobalLimitExceeded);
        }
        
        // Exponential backoff for consecutive skips (shift capped to avoid overflow)
        let min_delay = Duration::from_millis(100u64 << stats.consecutive_skips.min(20));
        if stats.last_skip_time.elapsed() < min_delay {
            log::warn!("Sender {} skipping too rapidly, applying backoff", sender);
            return Err(SkipError::RateLimited);
        }
        
        // Update stats
        stats.total_skips += skip_count;
        stats.last_skip_time = Instant::now();
        stats.consecutive_skips += 1;
        
        Ok(())
    }
    
    fn on_successful_decrypt(&mut self, sender: &DeviceId) {
        // Reset consecutive counter on normal message
        if let Some(stats) = self.sender_skip_counts.get_mut(sender) {
            stats.consecutive_skips = 0;
        }
    }
}

Benefits:

  • Limits attacker impact to 100 skipped keys per sender
  • Exponential backoff prevents rapid skip flooding
  • Normal out-of-order delivery unaffected (resets on successful decryption)

6. Multi-Path Mesh Routing

Problem: 30% compromised relays could correlate traffic or disrupt delivery

Solution: Redundant Parallel Paths + Path Diversity

struct MeshRouter {
    active_paths: HashMap<ConversationId, Vec<Path>>,
    path_metrics: HashMap<PathId, PathMetrics>,
}

struct Path {
    id: PathId,
    hops: Vec<NodeId>,
    latency: Duration,
    reliability: f64,
}

impl MeshRouter {
    fn send_message(&mut self, msg: &Message, conversation_id: &ConversationId) {
        let paths = self.select_diverse_paths(conversation_id, 3);  // 3 parallel paths
        
        for (i, path) in paths.iter().enumerate() {
            let fragment = if i == 0 {
                msg.clone()  // Primary: full message
            } else {
                msg.create_redundancy_shard(i)  // Secondary: erasure-coded shards
            };
            
            self.route_via_path(fragment, path);
        }
    }
    
    fn select_diverse_paths(&self, 
                            conversation_id: &ConversationId, 
                            count: usize) -> Vec<Path> {
        let mut paths = Vec::new();
        let mut used_nodes = HashSet::new();
        
        for _ in 0..count {
            // Select path with minimal node overlap
            let candidate = self.find_path_avoiding(&used_nodes);
            
            for hop in &candidate.hops {
                used_nodes.insert(*hop);
            }
            
            paths.push(candidate);
        }
        
        paths
    }
    
    fn find_path_avoiding(&self, excluded_nodes: &HashSet<NodeId>) -> Path {
        // Dijkstra's algorithm with node exclusion
        // Prioritize: low latency, high reliability, disjoint from excluded
        // ...
        todo!("path selection elided in this sketch")
    }
}

// Erasure coding for redundancy
impl Message {
    fn create_redundancy_shard(&self, shard_id: usize) -> Message {
        // Use Reed-Solomon: 2 of 3 shards sufficient to reconstruct
        let encoder = ReedSolomon::new(2, 1).unwrap();
        let shards = encoder.encode(&self.payload);
        
        Message {
            payload: shards[shard_id].clone(),
            shard_metadata: ShardMetadata {
                shard_id,
                total_shards: 3,
                reconstruction_needed: 2,
            },
            ..self.clone()
        }
    }
}
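
To make the 2-of-3 reconstruction property concrete, here is a small self-contained example using the reed-solomon-erasure crate (the crate choice is an assumption; the Message code above remains a sketch with its own API):

use reed_solomon_erasure::galois_8::ReedSolomon;

fn main() {
    // 2 data shards + 1 parity shard: any 2 of the 3 reconstruct the payload.
    let r = ReedSolomon::new(2, 1).unwrap();

    // Shards must be equal length; real code would split and pad the message payload.
    let mut shards: Vec<Vec<u8>> = vec![
        b"hello wo".to_vec(),   // data shard 0
        b"rld!!!!!".to_vec(),   // data shard 1
        vec![0u8; 8],           // parity shard (filled in by encode)
    ];
    r.encode(&mut shards).unwrap();

    // Simulate losing one path: drop shard 0, then reconstruct from the other two.
    let mut received: Vec<Option<Vec<u8>>> = shards.iter().cloned().map(Some).collect();
    received[0] = None;
    r.reconstruct(&mut received).unwrap();

    assert_eq!(received[0].as_deref(), Some(b"hello wo".as_slice()));
    println!("reconstructed from 2 of 3 shards");
}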

Benefits:

  • Reduces correlation risk: attacker needs to compromise multiple disjoint paths
  • Improves reliability: message delivered if any 2 of 3 paths succeed
  • Increases cost for attackers: must control 30% of nodes on multiple parallel paths

Trade-offs:

  • 3x bandwidth usage (mitigated by erasure coding)
  • Slightly higher latency (wait for 2 of 3 paths)

7. Adaptive Cover Traffic

Problem: Regular patterns in cover traffic may still leak metadata during long-term observation

Solution: Behavioral Mimicking + Random Bursts

struct CoverTrafficGenerator {
    baseline_rate: f64,  // 1 cell per 5 min
    user_activity_profile: ActivityProfile,
    last_burst: Instant,
}

struct ActivityProfile {
    hourly_distribution: [f64; 24],  // Probability per hour
    burst_probability: f64,
    burst_size_range: (usize, usize),
}

impl CoverTrafficGenerator {
    fn generate_schedule(&mut self) -> Vec<Instant> {
        let mut schedule = Vec::new();
        let now = Instant::now();
        
        // Baseline Poisson process
        let mut next_event = now + self.poisson_sample(self.baseline_rate);
        
        for _ in 0..100 {  // Next 100 events
            schedule.push(next_event);
            next_event += self.poisson_sample(self.baseline_rate);
        }
        
        // Add random bursts (mimic user typing sessions)
        if rand::random::<f64>() < self.user_activity_profile.burst_probability {
            // rand 0.8-style API: gen_range lives on an Rng (requires `use rand::Rng;`).
            let mut rng = rand::thread_rng();
            let burst_time = now + Duration::from_secs(rng.gen_range(0u64..3600));
            let burst_size = rng.gen_range(
                self.user_activity_profile.burst_size_range.0..
                self.user_activity_profile.burst_size_range.1
            );
            
            for i in 0..burst_size {
                schedule.push(burst_time + Duration::from_millis(i as u64 * 500));
            }
        }
        
        // Sort and deduplicate
        schedule.sort();
        schedule.dedup();
        schedule
    }
    
    fn adapt_to_user_behavior(&mut self, real_message_times: &[DateTime<Local>]) {
        // Learn the user's messaging pattern.
        // NOTE: takes wall-clock timestamps (e.g. chrono::DateTime<Local>) rather than
        // Instant, since Instant carries no calendar information such as the hour of day.
        let mut hourly_counts = [0usize; 24];
        
        for time in real_message_times {
            let hour = time.hour() as usize;
            hourly_counts[hour] += 1;
        }
        
        // Normalize to a probability distribution (skip if there are no samples yet)
        let total: usize = hourly_counts.iter().sum();
        if total > 0 {
            self.user_activity_profile.hourly_distribution =
                hourly_counts.map(|c| c as f64 / total as f64);
        }
        
        log::info!("Cover traffic adapted to user's activity pattern");
    }
}
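
The poisson_sample helper referenced above is not shown. One standard way to draw the inter-arrival time of a Poisson process with a given rate is to sample an exponential variate; the following is a minimal sketch of that approach (assuming this is the intended meaning):

use rand::Rng;
use std::time::Duration;

// Inter-arrival times of a Poisson process with `rate` events per second are
// exponentially distributed: dt = -ln(U) / rate, with U uniform in (0, 1].
fn poisson_sample(rate_per_sec: f64) -> Duration {
    // gen::<f64>() is uniform in [0, 1); 1.0 - x shifts it to (0, 1] so ln() is finite.
    let u: f64 = 1.0 - rand::thread_rng().gen::<f64>();
    Duration::from_secs_f64(-u.ln() / rate_per_sec)
}

fn main() {
    // Baseline of 1 cover cell per 5 minutes => rate = 1/300 events per second.
    let dt = poisson_sample(1.0 / 300.0);
    println!("next cover cell in {:.1} s", dt.as_secs_f64());
}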

Benefits:

  • Cover traffic mimics the user's real behavior (e.g., active 9AM-5PM, quiet at night)
  • Random bursts prevent "too regular" patterns
  • Adapts over time to changing user habits

8. Post-Quantum Library Hardening

Problem: PQ libraries (Kyber, Dilithium) are new and may have side-channel vulnerabilities

Solution: Constant-Time Verification + Dual-Library Validation

// Use TWO independent PQ implementations, verify both agree
struct HardenedPQCrypto {
    primary_kyber: liboqs::Kyber768,
    secondary_kyber: pqcrypto::Kyber768,  // Different implementation
}

impl HardenedPQCrypto {
    fn encapsulate(&self, public_key: &[u8]) -> Result<(Vec<u8>, Vec<u8>), Error> {
        // Encapsulate with both libraries.
        // CAVEAT: Kyber encapsulation is randomized, so two independent calls will not
        // agree. This cross-check only works if both libraries can be driven with the
        // same coins through a derandomized/seeded interface; if no such interface is
        // available, dual-library validation applies to decapsulation only (below).
        let (ct1, ss1) = self.primary_kyber.encapsulate(public_key)?;
        let (_ct2, ss2) = self.secondary_kyber.encapsulate(public_key)?;
        
        // Verify both produce the same result (catches implementation bugs)
        if constant_time_eq(&ss1, &ss2) {
            Ok((ct1, ss1))
        } else {
            log::error!("PQ library mismatch detected, possible bug or attack");
            Err(Error::PQLibraryMismatch)
        }
    }
    
    fn decapsulate(&self, 
                   ciphertext: &[u8], 
                   secret_key: &[u8]) -> Result<Vec<u8>, Error> {
        let ss1 = self.primary_kyber.decapsulate(ciphertext, secret_key)?;
        let ss2 = self.secondary_kyber.decapsulate(ciphertext, secret_key)?;
        
        if constant_time_eq(&ss1, &ss2) {
            Ok(ss1)
        } else {
            Err(Error::PQLibraryMismatch)
        }
    }
}

// Side-channel resistant comparison
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    
    diff == 0
}

Benefits:

  • Catches implementation bugs before deployment
  • Two independent codebases reduce risk of common vulnerabilities
  • Constant-time operations prevent timing attacks

Trade-off: 2x computation for PQ operations (acceptable for handshakes, which are infrequent)


Security Policy (Responsible Disclosure)

If you believe you’ve found a vulnerability in the protocol or its implementations:

  • Do not open a public issue with exploit details.
  • Do contact the maintainers privately with:
    • affected component/version
    • reproduction steps or proof-of-concept
    • impact assessment
    • any suggested mitigation

We aim to acknowledge reports within 7 days and provide a remediation plan or fix timeline as soon as possible.


9. Implemented Security Fixes (v2.0)

The following critical and high-severity fixes have been implemented following a hostile security audit. See SECURITY_REMEDIATION.md for the complete remediation specification.

9.1 Handshake Shared Secret Fix (CRITICAL)

Issue: The InitiatorHandshake::complete() function re-encapsulated to the recipient's Kyber prekey instead of using the stored shared secret from ClientHello creation, causing Alice and Bob to derive DIFFERENT shared secrets.

Fix Implemented:

  • Added new_with_secret() method to ClientHello that returns the Kyber shared secret
  • Added kyber_ss_ab field to InitiatorHandshake to store the shared secret
  • Updated complete() to use the stored secret instead of re-encapsulating
  • Deprecated the old new() method with compile-time warning

Location: lmp-core/src/protocol/handshake.rs
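
For illustration, a minimal toy sketch of the corrected flow. The real types live in handshake.rs; every name below other than new_with_secret, kyber_ss_ab, and complete is a placeholder, and kyber_encapsulate stands in for the real (randomized) Kyber-768 encapsulation:

#[derive(Clone, PartialEq, Debug)]
struct SharedSecret([u8; 32]);

struct ClientHello { kyber_ciphertext: Vec<u8> }

// Placeholder for the real Kyber-768 encapsulation.
fn kyber_encapsulate(_recipient_prekey: &[u8]) -> (Vec<u8>, SharedSecret) {
    (vec![0u8; 1088], SharedSecret([7u8; 32]))
}

impl ClientHello {
    // new_with_secret(): encapsulate once and hand the shared secret back to the caller.
    fn new_with_secret(recipient_prekey: &[u8]) -> (Self, SharedSecret) {
        let (ct, ss) = kyber_encapsulate(recipient_prekey);
        (ClientHello { kyber_ciphertext: ct }, ss)
    }
}

struct InitiatorHandshake {
    client_hello: ClientHello,
    kyber_ss_ab: SharedSecret, // stored when the ClientHello is built
}

impl InitiatorHandshake {
    fn start(recipient_prekey: &[u8]) -> Self {
        let (hello, ss) = ClientHello::new_with_secret(recipient_prekey);
        Self { client_hello: hello, kyber_ss_ab: ss }
    }

    // complete() reuses the stored secret; re-encapsulating here is exactly what
    // caused Alice and Bob to derive different secrets before the fix.
    fn complete(&self) -> SharedSecret {
        self.kyber_ss_ab.clone()
    }
}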

9.2 Unknown Key-Share (UKS) Attack Prevention (HIGH)

Issue: An attacker could replace the kyber_ciphertext with one encapsulated to their own key, causing identity misbinding.

Fix Implemented:

  • Added recipient_identity_hash field to ClientHello
  • Hash is computed as HKDF(recipient_ed25519 || recipient_dilithium, "recipient-identity-commit")
  • Identity hash is included in the signed portion of ClientHello
  • Responder verifies identity hash before processing

Location: lmp-core/src/protocol/handshake.rs
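
A sketch of how the commitment described above might be computed, using the hkdf and sha3 crates; the exact KDF invocation and output length are assumptions, and the implementation in handshake.rs is authoritative:

use hkdf::Hkdf;
use sha3::Sha3_256;

// Commit to the intended recipient's long-term identity keys so a swapped
// kyber_ciphertext can no longer bind Alice to the wrong peer.
fn recipient_identity_hash(recipient_ed25519: &[u8; 32], recipient_dilithium: &[u8]) -> [u8; 32] {
    let ikm = [recipient_ed25519.as_slice(), recipient_dilithium].concat();
    let hk = Hkdf::<Sha3_256>::new(None, &ikm);
    let mut out = [0u8; 32];
    hk.expand(b"recipient-identity-commit", &mut out)
        .expect("32 bytes is a valid HKDF output length");
    out
}

// The responder recomputes the hash over its own identity keys and rejects the
// ClientHello if it does not match the (signed) recipient_identity_hash field.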

9.3 HKDF RFC 5869 Compliance Fix (MEDIUM)

Issue: HKDF used H(salt || ikm) instead of proper HMAC(salt, ikm).

Fix Implemented:

  • Updated Hkdf::extract() to use HMAC-SHA3-256(salt, ikm)
  • Updated Hkdf::expand() to use HMAC-SHA3-256(PRK, T(i-1) || info || i)
  • Added hmac crate dependency

Location: lmp-core/src/crypto/hkdf.rs
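
For reference, a minimal RFC 5869-style extract/expand with HMAC-SHA3-256 using the hmac and sha3 crates; this is a sketch of the corrected construction, not the repository's exact code:

use hmac::{Hmac, Mac};
use sha3::Sha3_256;

type HmacSha3 = Hmac<Sha3_256>;

// Extract: PRK = HMAC(salt, IKM)  (RFC 5869, Section 2.2)
fn hkdf_extract(salt: &[u8], ikm: &[u8]) -> [u8; 32] {
    let mut mac = HmacSha3::new_from_slice(salt).expect("HMAC accepts any key length");
    mac.update(ikm);
    mac.finalize().into_bytes().into()
}

// Expand: T(i) = HMAC(PRK, T(i-1) || info || i)  (RFC 5869, Section 2.3)
fn hkdf_expand(prk: &[u8; 32], info: &[u8], okm: &mut [u8]) {
    assert!(okm.len() <= 255 * 32, "output length limit is 255 * HashLen");
    let mut t_prev: Vec<u8> = Vec::new();
    for (i, chunk) in okm.chunks_mut(32).enumerate() {
        let mut mac = HmacSha3::new_from_slice(prk).expect("HMAC accepts any key length");
        mac.update(&t_prev);
        mac.update(info);
        mac.update(&[(i + 1) as u8]);
        t_prev = mac.finalize().into_bytes().to_vec();
        chunk.copy_from_slice(&t_prev[..chunk.len()]);
    }
}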

9.4 Skip Rate Limiter for DoS Mitigation (MEDIUM)

Issue: Attackers could send messages with large gaps in message numbers, forcing victims to derive many skipped keys.

Fix Implemented:

  • New SkipRateLimiter module with per-sender and global limits
  • Exponential backoff for consecutive skip events
  • Normal message receipt resets consecutive skip counter
  • Configurable limits via SkipLimitConfig

Location: lmp-core/src/protocol/skip_limiter.rs

9.5 Peer Ratchet Staleness Enforcement (HIGH)

Issue: Malicious peers could refuse to advance DH ratchet, keeping compromised chain keys valid indefinitely.

Fix Implemented:

  • Added peer_last_ratchet_time and peer_ratchet_epoch tracking to RatchetState
  • New RatchetPolicy struct with configurable staleness thresholds
  • check_peer_staleness() returns RatchetStatus (Healthy/Warning/Stale/CriticallyStale)
  • Sessions can be rejected or terminated based on peer staleness

Location: lmp-core/src/protocol/ratchet.rs
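
A hedged sketch of what the staleness check might look like; the real field names, return type, and thresholds are defined in ratchet.rs, and the durations below are illustrative only:

use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum RatchetStatus { Healthy, Warning, Stale, CriticallyStale }

struct RatchetPolicy {
    warn_after: Duration,
    stale_after: Duration,
    critical_after: Duration,
}

fn check_peer_staleness(peer_last_ratchet_time: Instant, policy: &RatchetPolicy) -> RatchetStatus {
    let idle = peer_last_ratchet_time.elapsed();
    if idle >= policy.critical_after {
        RatchetStatus::CriticallyStale   // e.g. terminate the session
    } else if idle >= policy.stale_after {
        RatchetStatus::Stale             // e.g. refuse to send until the peer ratchets
    } else if idle >= policy.warn_after {
        RatchetStatus::Warning
    } else {
        RatchetStatus::Healthy
    }
}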

9.6 New Error Types

Added structured error types for security conditions:

  • SkipLimitExceeded - Per-sender skip limit exceeded
  • GlobalSkipLimitExceeded - Global skip limit exceeded
  • SkipRateLimited - Exponential backoff not satisfied
  • IdentityMismatch - UKS attack prevention
  • PeerRatchetStale - Peer not ratcheting
  • NonceCounterOverflow - Requires mandatory ratchet
  • SecurityDowngrade - PAKE requirement not met
  • PakeRequired - PAKE authentication required

Location: lmp-core/src/error.rs
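
The variants above might appear in the error enum roughly as follows; this is a sketch only, and the actual enum in error.rs may use a different name and attach context (sender id, limits, timestamps) to each variant:

#[derive(Debug)]
enum SecurityError {
    SkipLimitExceeded,        // per-sender skip limit exceeded
    GlobalSkipLimitExceeded,  // global skip limit exceeded
    SkipRateLimited,          // exponential backoff not satisfied
    IdentityMismatch,         // UKS attack prevention
    PeerRatchetStale,         // peer not ratcheting within policy
    NonceCounterOverflow,     // requires mandatory ratchet
    SecurityDowngrade,        // PAKE requirement not met
    PakeRequired,             // PAKE authentication required
}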


Protocol Invariants (Enforced)

ID          | Invariant                                         | Enforcement
INV-HS-1    | Both parties MUST derive identical shared secrets | Kyber secret stored, not re-encapsulated
INV-HS-2    | Identity commitment MUST bind Kyber encapsulation | recipient_identity_hash field
INV-HKDF-1  | Extract MUST use HMAC-SHA3-256, not raw hash      | Proper HMAC implementation
INV-SKIP-1  | Per-sender skipped keys MUST NOT exceed limit     | SkipRateLimiter enforcement
INV-RATCH-1 | Peer MUST ratchet within policy time limits       | Staleness checking
