This document collects security hardening ideas and design notes that complement the normative rules in MAIN.md.
- 1. Enhanced Handshake Security
- 2. Adaptive Timestamp Tolerance
- 3. Multi-Device Nonce Collision Prevention
- 4. Ratchet-State Authenticated Replay Protection
- 5. Skipped Message Key DoS Mitigation
- 6. Multi-Path Mesh Routing
- 7. Adaptive Cover Traffic
- 8. Post-Quantum Library Hardening
- Security Policy (Responsible Disclosure)
Problem: Introduction token theft enables MITM before handshake completion
Solution: PAKE-augmented initial contact
// Add SPAKE2+ for introduction token protection
struct IntroductionToken {
bob_ltik_pub: PublicKey,
bob_mtsk_pub: PublicKey,
temp_dht_address: Address,
expiration: Timestamp,
pake_salt: [u8; 32], // NEW: SPAKE2+ salt
signature: Signature,
}
// Alice initiates with password-based authentication
impl ClientHello {
fn new_with_pake(
intro_token: &IntroductionToken,
shared_secret: &str // From QR code or out-of-band
) -> Self {
// SPAKE2+ prevents MITM even if token stolen
let pake = SPAKE2Plus::new(
b"LMP-introduction",
shared_secret.as_bytes(),
&intro_token.pake_salt
);
let (pake_msg, pake_state) = pake.start();
ClientHello {
// ... existing fields ...
pake_message: pake_msg, // NEW
pake_commitment: hash(pake_state), // NEW
}
}
}

Benefits:
- Even if the introduction token leaks, the attacker still needs the shared secret from the QR code
- Backward compatible: falls back to signature-only authentication if PAKE is not supported
- Adds ~50ms of latency, acceptable for a handshake
Problem: 60-second timestamp window too strict for high-latency mesh networks
Solution: Sliding Window with Network Condition Adaptation
struct TimestampValidator {
base_tolerance: Duration,
adaptive_bonus: Duration,
recent_latencies: RingBuffer<Duration, 100>,
}
impl TimestampValidator {
fn new() -> Self {
Self {
base_tolerance: Duration::from_secs(60),
adaptive_bonus: Duration::from_secs(0),
recent_latencies: RingBuffer::new(),
}
}
fn validate(&mut self, msg_timestamp: i64) -> bool {
let now = current_timestamp(); // seconds since the Unix epoch
let observed_skew = (now - msg_timestamp).abs();
// Track skew samples so tolerance can adapt to network conditions
self.recent_latencies.push(Duration::from_secs(observed_skew as u64));
let p95_latency = self.percentile(0.95);
// Allow the timestamp if it falls within the base + adaptive window
let max_tolerance = self.base_tolerance + self.adaptive_bonus;
if observed_skew <= max_tolerance.as_secs() as i64 {
true
} else {
// Check if network degraded recently
if p95_latency > Duration::from_secs(30) {
self.adaptive_bonus = p95_latency; // Increase tolerance
log::warn!("High network latency detected, adjusting tolerance to {}s",
(self.base_tolerance + self.adaptive_bonus).as_secs());
}
false
}
}
fn percentile(&self, p: f64) -> Duration {
let mut sorted = self.recent_latencies.to_vec();
sorted.sort();
// Clamp the index so p = 1.0 or an empty buffer cannot index out of bounds
let idx = ((sorted.len() as f64 * p) as usize).min(sorted.len().saturating_sub(1));
sorted.get(idx).copied().unwrap_or(Duration::ZERO)
}
}

Benefits:
- Adapts to degraded network conditions automatically
- Maintains strict validation (60s) under normal conditions
- Prevents false rejections during mesh congestion
Problem: Multiple devices with same conversation_id could reuse (key, nonce) pairs
Solution: Device-Specific Nonce Derivation
// Nonce is 96 bits (12 bytes) for ChaCha20-Poly1305.
// Derive a deterministic 32-bit prefix from (conversation_id, sender_device_id), then append a 64-bit counter.
// This avoids cross-device collisions when multiple devices participate in the same conversation.
fn derive_nonce(conversation_id: &[u8; 32], sender_device_id: &[u8; 16], message_number: u64) -> [u8; 12] {
let prefix32 = hkdf_sha3_256_extract(
&[conversation_id.as_slice(), sender_device_id.as_slice()].concat(),
b"LMP-nonce-prefix",
4,
);
let mut nonce = [0u8; 12];
nonce[0..4].copy_from_slice(&prefix32);
nonce[4..12].copy_from_slice(&message_number.to_le_bytes());
nonce
}

Benefits:
- Mathematically prevents nonce reuse across devices
- Same conversation, different devices → different nonces guaranteed
- Maintains a 64-bit message counter space per device

Trade-off: Slightly higher implementation complexity due to deterministic prefix derivation
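As a sanity check on the layout, a toy version of the prefix derivation (a simple XOR standing in for the HKDF call, illustration only) shows the collision-avoidance property:

```rust
// Toy stand-in for the HKDF-derived prefix (illustration only; real code
// must use the hkdf_sha3_256_extract derivation shown above).
fn toy_prefix(conversation_id: &[u8; 32], device_id: &[u8; 16]) -> [u8; 4] {
    let mut p = [0u8; 4];
    for i in 0..4 {
        p[i] = conversation_id[i] ^ device_id[i];
    }
    p
}

// Same layout as derive_nonce: 32-bit device-bound prefix + 64-bit counter.
fn derive_nonce_demo(conversation_id: &[u8; 32], device_id: &[u8; 16], message_number: u64) -> [u8; 12] {
    let mut nonce = [0u8; 12];
    nonce[0..4].copy_from_slice(&toy_prefix(conversation_id, device_id));
    nonce[4..12].copy_from_slice(&message_number.to_le_bytes());
    nonce
}

fn main() {
    let conv = [7u8; 32];
    // Two devices in the same conversation, same message counter: distinct nonces.
    assert_ne!(
        derive_nonce_demo(&conv, &[1u8; 16], 0),
        derive_nonce_demo(&conv, &[2u8; 16], 0)
    );
    // Same device, consecutive counters: distinct nonces.
    assert_ne!(
        derive_nonce_demo(&conv, &[1u8; 16], 0),
        derive_nonce_demo(&conv, &[1u8; 16], 1)
    );
    println!("ok");
}
```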
Problem: Attacker could replay old messages from previous ratchet iterations during out-of-order delivery
Solution: Ratchet-Bound MAC with Forward Processing
struct MessageHeader {
conversation_id: [u8; 32],
sender_device_id: [u8; 16],
message_number: u64,
prev_chain_length: u64,
ratchet_id: [u8; 16], // NEW: Unique per ratchet iteration
timestamp: i64,
eph_x25519_pub: Option<[u8; 32]>,
eph_kyber_pub: Option<Vec<u8>>,
mac: [u8; 16], // Now covers ratchet_id
}
impl RatchetState {
fn new_iteration(&mut self) -> [u8; 16] {
// Generate unique ratchet ID
let mut ratchet_id = [0u8; 16];
let input = [
&self.root_key[..],
&self.iteration_count.to_le_bytes(),
&current_timestamp().to_le_bytes()
].concat();
ratchet_id.copy_from_slice(&hkdf_sha3_256_extract(
&input,
b"ratchet-iteration-id",
16
));
self.current_ratchet_id = ratchet_id;
self.iteration_count += 1;
ratchet_id
}
fn validate_message(&self, header: &MessageHeader) -> bool {
// Reject if ratchet_id doesn't match current or recent iterations
if header.ratchet_id != self.current_ratchet_id &&
!self.recent_ratchet_ids.contains(&header.ratchet_id) {
log::warn!("Message from unknown ratchet iteration, possible replay");
return false;
}
// Verify MAC includes ratchet_id (prevents cross-ratchet replay)
self.verify_mac(header)
}
}
// Forward-processing: Accept new ratchet even if old messages pending
impl MessageProcessor {
fn handle_out_of_order_ratchet(&mut self, new_ratchet_msg: Message) {
// Don't wait for old messages; advance ratchet immediately
log::info!("New ratchet detected with {} skipped messages, advancing",
new_ratchet_msg.header.message_number);
// Mark old messages as permanently lost
self.mark_messages_lost(
self.current_ratchet_id,
self.last_received_msg..new_ratchet_msg.header.message_number
);
// Process new ratchet
self.ratchet_state.advance(new_ratchet_msg);
}
}

Benefits:
- Prevents cross-ratchet replay attacks: an attacker cannot replay M_old from ratchet_N during ratchet_N+1
- Forward-processing prevents temporary state confusion
- Old messages are explicitly marked lost rather than held indefinitely
Problem: Attacker floods with gaps just below MAX_SKIP repeatedly, exhausting memory
Solution: Per-Sender Rate Limiting + Exponential Backoff
struct SkippedKeyManager {
skipped_keys: HashMap<MessageId, MessageKey>,
sender_skip_counts: HashMap<DeviceId, SkipStats>,
global_skip_limit: usize, // 1000
per_sender_limit: usize, // 100 per sender
}
struct SkipStats {
total_skips: usize,
last_skip_time: Instant,
consecutive_skips: usize,
}
impl SkippedKeyManager {
fn try_skip(&mut self,
sender: &DeviceId,
skip_count: usize) -> Result<(), SkipError> {
let stats = self.sender_skip_counts
.entry(*sender)
.or_insert(SkipStats::default());
// Check per-sender limit
if stats.total_skips + skip_count > self.per_sender_limit {
log::warn!("Sender {} exceeded skip limit, possible DoS", sender);
return Err(SkipError::SenderLimitExceeded);
}
// Check global limit
if self.skipped_keys.len() + skip_count > self.global_skip_limit {
log::warn!("Global skip limit reached, rejecting message");
return Err(SkipError::GlobalLimitExceeded);
}
// Exponential backoff for consecutive skips
let min_delay = Duration::from_millis(100 * (1u64 << stats.consecutive_skips.min(10)));
if stats.last_skip_time.elapsed() < min_delay {
log::warn!("Sender {} skipping too rapidly, applying backoff", sender);
return Err(SkipError::RateLimited);
}
// Update stats
stats.total_skips += skip_count;
stats.last_skip_time = Instant::now();
stats.consecutive_skips += 1;
Ok(())
}
fn on_successful_decrypt(&mut self, sender: &DeviceId) {
// Reset consecutive counter on normal message
if let Some(stats) = self.sender_skip_counts.get_mut(sender) {
stats.consecutive_skips = 0;
}
}
}

Benefits:
- Limits attacker impact to 100 skipped keys per sender
- Exponential backoff prevents rapid skip flooding
- Normal out-of-order delivery unaffected (counter resets on successful decryption)
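The backoff schedule can be computed as a pure function; capping the exponent (at 10 here, an assumed value) keeps the shift from overflowing no matter how many consecutive skips an attacker provokes:

```rust
use std::time::Duration;

// Exponential backoff for consecutive skip events: 100ms, 200ms, 400ms, ...
// The exponent cap (10, i.e. ~102s max) is an illustrative assumption.
fn backoff_delay(consecutive_skips: u32) -> Duration {
    let exp = consecutive_skips.min(10);
    Duration::from_millis(100 * (1u64 << exp))
}

fn main() {
    assert_eq!(backoff_delay(0), Duration::from_millis(100));
    assert_eq!(backoff_delay(3), Duration::from_millis(800));
    assert_eq!(backoff_delay(10), Duration::from_millis(102_400));
    // Far beyond the cap: no shift overflow, delay stays at the maximum.
    assert_eq!(backoff_delay(64), Duration::from_millis(102_400));
    println!("ok");
}
```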
Problem: 30% compromised relays could correlate traffic or disrupt delivery
Solution: Redundant Parallel Paths + Path Diversity
struct MeshRouter {
active_paths: HashMap<ConversationId, Vec<Path>>,
path_metrics: HashMap<PathId, PathMetrics>,
}
struct Path {
id: PathId,
hops: Vec<NodeId>,
latency: Duration,
reliability: f64,
}
impl MeshRouter {
fn send_message(&mut self, msg: &Message, conversation_id: &ConversationId) {
let paths = self.select_diverse_paths(conversation_id, 3); // 3 parallel paths
for (i, path) in paths.iter().enumerate() {
let fragment = if i == 0 {
msg.clone() // Primary: full message
} else {
msg.create_redundancy_shard(i) // Secondary: erasure-coded shards
};
self.route_via_path(fragment, path);
}
}
fn select_diverse_paths(&self,
conversation_id: &ConversationId,
count: usize) -> Vec<Path> {
let mut paths = Vec::new();
let mut used_nodes = HashSet::new();
for _ in 0..count {
// Select path with minimal node overlap
let candidate = self.find_path_avoiding(&used_nodes);
for hop in &candidate.hops {
used_nodes.insert(*hop);
}
paths.push(candidate);
}
paths
}
fn find_path_avoiding(&self, excluded_nodes: &HashSet<NodeId>) -> Path {
// Dijkstra's algorithm with node exclusion
// Prioritize: low latency, high reliability, disjoint from excluded
// ...
}
}
// Erasure coding for redundancy
impl Message {
fn create_redundancy_shard(&self, shard_id: usize) -> Message {
// Use Reed-Solomon: 2 of 3 shards sufficient to reconstruct
let encoder = ReedSolomon::new(2, 1).unwrap();
let shards = encoder.encode(&self.payload);
Message {
payload: shards[shard_id].clone(),
shard_metadata: ShardMetadata {
shard_id,
total_shards: 3,
reconstruction_needed: 2,
},
..self.clone()
}
}
}

Benefits:
- Reduces correlation risk: an attacker must compromise multiple disjoint paths
- Improves reliability: the message is delivered if any 2 of 3 paths succeed
- Increases attacker cost: controlling 30% of nodes is not enough without covering multiple parallel paths

Trade-offs:
- 3x bandwidth usage (mitigated by erasure coding)
- Slightly higher latency (wait for 2 of 3 paths)
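With 2 data shards and 1 parity shard, Reed-Solomon reduces to plain XOR parity, which makes the 2-of-3 reconstruction property easy to demonstrate with std only (a sketch; a real implementation would use a Reed-Solomon crate as above):

```rust
// Minimal 2-of-3 erasure sketch: with 2 data shards and 1 parity shard,
// Reed-Solomon degenerates to XOR parity, so any 2 shards reconstruct the
// payload. Illustration only, not the production erasure coder.
fn encode(payload: &[u8]) -> [Vec<u8>; 3] {
    let half = (payload.len() + 1) / 2;
    let d0 = payload[..half].to_vec();
    let mut d1 = payload[half..].to_vec();
    d1.resize(half, 0); // pad the second shard to equal length
    let parity: Vec<u8> = d0.iter().zip(&d1).map(|(a, b)| a ^ b).collect();
    [d0, d1, parity]
}

fn xor(a: &[u8], b: &[u8]) -> Vec<u8> {
    a.iter().zip(b).map(|(x, y)| x ^ y).collect()
}

fn main() {
    let payload = b"multi-path mesh routing".to_vec();
    let [d0, d1, parity] = encode(&payload);
    // Lose the first data shard: recover it from the other two.
    let recovered_d0 = xor(&d1, &parity);
    assert_eq!(recovered_d0, d0);
    println!("ok");
}
```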
Problem: Regular patterns in cover traffic may still leak metadata during long-term observation
Solution: Behavioral Mimicking + Random Bursts
struct CoverTrafficGenerator {
baseline_rate: f64, // 1 cell per 5 min
user_activity_profile: ActivityProfile,
last_burst: Instant,
}
struct ActivityProfile {
hourly_distribution: [f64; 24], // Probability per hour
burst_probability: f64,
burst_size_range: (usize, usize),
}
impl CoverTrafficGenerator {
fn generate_schedule(&mut self) -> Vec<Instant> {
let mut schedule = Vec::new();
let now = Instant::now();
// Baseline Poisson process
let mut next_event = now + self.poisson_sample(self.baseline_rate);
for _ in 0..100 { // Next 100 events
schedule.push(next_event);
next_event += self.poisson_sample(self.baseline_rate);
}
// Add random bursts (mimic user typing sessions)
if rand::random::<f64>() < self.user_activity_profile.burst_probability {
let mut rng = rand::thread_rng(); // needs `use rand::Rng;` in scope
let burst_time = now + Duration::from_secs(rng.gen_range(0..3600));
let burst_size = rng.gen_range(
self.user_activity_profile.burst_size_range.0..
self.user_activity_profile.burst_size_range.1
);
for i in 0..burst_size {
schedule.push(burst_time + Duration::from_millis(i as u64 * 500));
}
}
// Sort and deduplicate
schedule.sort();
schedule.dedup();
schedule
}
fn adapt_to_user_behavior(&mut self, real_message_times: &[Instant]) {
// Learn user's messaging pattern
let mut hourly_counts = [0usize; 24];
for time in real_message_times {
// Assumes wall-clock times: std's Instant has no calendar hour, so a real
// implementation would take e.g. chrono::DateTime values here.
let hour = time.hour();
hourly_counts[hour] += 1;
}
// Normalize to a probability distribution (guard against an empty sample)
let total: usize = hourly_counts.iter().sum();
if total > 0 {
self.user_activity_profile.hourly_distribution =
hourly_counts.map(|c| c as f64 / total as f64);
}
log::info!("Cover traffic adapted to user's activity pattern");
}
}

Benefits:
- Cover traffic mimics the user's real behavior (e.g., active 9AM-5PM, quiet at night)
- Random bursts prevent "too regular" patterns
- Adapts over time to changing user habits
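The Poisson baseline amounts to exponentially distributed inter-arrival gaps. A std-only sketch of the sampling via the inverse CDF (the tiny LCG is a stand-in for the rand crate, not a recommendation):

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Toy LCG, just enough to illustrate inverse-CDF exponential sampling.
struct Lcg(u64);
impl Lcg {
    fn next_f64(&mut self) -> f64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        // Map the top 53 bits into the open interval (0, 1).
        ((self.0 >> 11) as f64 + 0.5) / (1u64 << 53) as f64
    }
}

// Inter-arrival times of a Poisson process are Exp(rate)-distributed:
// sample as -ln(U)/rate. rate is events per second (1 cell / 5 min = 1/300).
fn poisson_sample(rng: &mut Lcg, rate: f64) -> Duration {
    let u = rng.next_f64();
    Duration::from_secs_f64(-u.ln() / rate)
}

fn main() {
    let seed = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().subsec_nanos() as u64;
    let mut rng = Lcg(seed | 1);
    let samples: Vec<Duration> = (0..100).map(|_| poisson_sample(&mut rng, 1.0 / 300.0)).collect();
    // Every sampled gap is positive and finite.
    assert!(samples.iter().all(|d| d.as_secs_f64() > 0.0 && d.as_secs_f64().is_finite()));
    println!("ok");
}
```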
Problem: PQ libraries (Kyber, Dilithium) are new and may have side-channel vulnerabilities
Solution: Constant-Time Verification + Dual-Library Validation
// Use TWO independent PQ implementations, verify both agree
struct HardenedPQCrypto {
primary_kyber: liboqs::Kyber768,
secondary_kyber: pqcrypto::Kyber768, // Different implementation
}
impl HardenedPQCrypto {
fn encapsulate(&self, public_key: &[u8]) -> Result<(Vec<u8>, Vec<u8>), Error> {
// Encapsulate with both libraries
let (ct1, ss1) = self.primary_kyber.encapsulate(public_key)?;
let (ct2, ss2) = self.secondary_kyber.encapsulate(public_key)?;
// Verify both produce same result (catch implementation bugs)
if constant_time_eq(&ss1, &ss2) {
Ok((ct1, ss1))
} else {
log::error!("PQ library mismatch detected, possible bug or attack");
Err(Error::PQLibraryMismatch)
}
}
fn decapsulate(&self,
ciphertext: &[u8],
secret_key: &[u8]) -> Result<Vec<u8>, Error> {
let ss1 = self.primary_kyber.decapsulate(ciphertext, secret_key)?;
let ss2 = self.secondary_kyber.decapsulate(ciphertext, secret_key)?;
if constant_time_eq(&ss1, &ss2) {
Ok(ss1)
} else {
Err(Error::PQLibraryMismatch)
}
}
}
// Side-channel resistant comparison
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
if a.len() != b.len() {
return false;
}
let mut diff = 0u8;
for (x, y) in a.iter().zip(b.iter()) {
diff |= x ^ y;
}
diff == 0
}

Benefits:
- Catches implementation bugs before deployment
- Two independent codebases reduce the risk of a shared vulnerability
- Constant-time comparison prevents timing attacks

Trade-off: 2x computation for PQ operations (acceptable for handshakes, which are infrequent)
If you believe you’ve found a vulnerability in the protocol or its implementations:
- Do not open a public issue with exploit details.
- Do contact the maintainers privately with:
- affected component/version
- reproduction steps or proof-of-concept
- impact assessment
- any suggested mitigation
We aim to acknowledge reports within 7 days and provide a remediation plan or fix timeline as soon as possible.
The following critical and high-severity fixes have been implemented following a hostile security audit. See SECURITY_REMEDIATION.md for the complete remediation specification.
Issue: The InitiatorHandshake::complete() function re-encapsulated to the recipient's Kyber prekey instead of using the stored shared secret from ClientHello creation, causing Alice and Bob to derive DIFFERENT shared secrets.
Fix Implemented:
- Added a `new_with_secret()` method to `ClientHello` that returns the Kyber shared secret
- Added a `kyber_ss_ab` field to `InitiatorHandshake` to store the shared secret
- Updated `complete()` to use the stored secret instead of re-encapsulating
- Deprecated the old `new()` method with a compile-time warning
Location: lmp-core/src/protocol/handshake.rs
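The difference between the buggy and fixed `complete()` can be seen with a toy randomized KEM (a sketch: `ToyKem` and its randomness are stand-ins I introduce here; only the `kyber_ss_ab` field name comes from the fix above):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Toy model of CRITICAL-1: KEM encapsulation is randomized, so re-running it
// at complete() yields a DIFFERENT shared secret from the one Bob decapsulates.
// In this toy "KEM" the ciphertext and shared secret are the same random value,
// which makes the mismatch directly visible.
struct ToyKem { state: u64 }

impl ToyKem {
    fn new() -> Self {
        let seed = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().subsec_nanos() as u64;
        ToyKem { state: seed | 1 }
    }
    // "Fresh randomness" per call: successive calls never return the same value.
    fn encapsulate(&mut self) -> (u64, u64) {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        (self.state, self.state) // (ciphertext, shared_secret)
    }
}

struct InitiatorHandshake { kyber_ss_ab: u64 } // stored at ClientHello creation

fn main() {
    let mut kem = ToyKem::new();
    // ClientHello creation: encapsulate once, send ct, STORE the shared secret.
    let (ct, ss) = kem.encapsulate();
    let hs = InitiatorHandshake { kyber_ss_ab: ss };
    // Bob "decapsulates" ct (toy: recovers the same value).
    let bob_ss = ct;
    // Buggy complete(): re-encapsulating produces a different secret.
    let (_ct2, ss_buggy) = kem.encapsulate();
    assert_ne!(ss_buggy, bob_ss); // Alice and Bob would disagree
    // Fixed complete(): use the stored secret, which matches Bob's.
    assert_eq!(hs.kyber_ss_ab, bob_ss);
    println!("ok");
}
```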
Issue: An attacker could replace the kyber_ciphertext with one encapsulated to their own key, causing identity misbinding.
Fix Implemented:
- Added a `recipient_identity_hash` field to `ClientHello`
- Hash is computed as `HKDF(recipient_ed25519 || recipient_dilithium, "recipient-identity-commit")`
- Identity hash is included in the signed portion of ClientHello
- Responder verifies identity hash before processing
Location: lmp-core/src/protocol/handshake.rs
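The binding check can be sketched with std's `DefaultHasher` standing in for the HKDF commitment (toy only; `identity_commit` is a hypothetical helper, and a real commitment must use the HKDF construction above):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for HKDF(recipient_ed25519 || recipient_dilithium, ...):
// commits to BOTH recipient public keys. Not cryptographically suitable.
fn identity_commit(ed25519_pub: &[u8], dilithium_pub: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    ed25519_pub.hash(&mut h);
    dilithium_pub.hash(&mut h);
    h.finish()
}

fn main() {
    let bob_ed = [1u8; 32];
    let bob_dl = [2u8; 64];
    let mallory_ed = [9u8; 32];
    // Alice binds Bob's identity into the signed portion of ClientHello.
    let committed = identity_commit(&bob_ed, &bob_dl);
    // The responder recomputes the commitment over its own keys:
    assert_eq!(identity_commit(&bob_ed, &bob_dl), committed); // genuine recipient passes
    assert_ne!(identity_commit(&mallory_ed, &bob_dl), committed); // swapped identity fails
    println!("ok");
}
```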
Issue: HKDF used H(salt || ikm) instead of proper HMAC(salt, ikm).
Fix Implemented:
- Updated `Hkdf::extract()` to use `HMAC-SHA3-256(salt, ikm)`
- Updated `Hkdf::expand()` to use `HMAC-SHA3-256(PRK, T(i-1) | info | i)`
- Added the `hmac` crate dependency
Location: lmp-core/src/crypto/hkdf.rs
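For intuition, the structural difference between the broken `H(salt || ikm)` and a proper two-pass keyed HMAC can be shown with a stand-in hash (FNV-1a here; the shipped code uses HMAC-SHA3-256 via the `hmac` crate, not this toy):

```rust
// HMAC skeleton over a stand-in 64-bit hash, showing the ipad/opad two-pass
// keyed structure that replaced the broken H(salt || ikm) extract.
const BLOCK: usize = 8; // toy block size; SHA3-256's rate is 136 bytes

fn fnv1a(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

fn hmac_toy(key: &[u8], msg: &[u8]) -> u64 {
    // Zero-pad the key to the block size (a real HMAC hashes over-long keys first).
    let mut k = [0u8; BLOCK];
    for (i, &b) in key.iter().take(BLOCK).enumerate() { k[i] = b; }
    let ipad: Vec<u8> = k.iter().map(|b| b ^ 0x36).collect();
    let opad: Vec<u8> = k.iter().map(|b| b ^ 0x5c).collect();
    // inner = H((K ^ ipad) || msg); result = H((K ^ opad) || inner)
    let inner = fnv1a(&[ipad.as_slice(), msg].concat());
    fnv1a(&[opad.as_slice(), &inner.to_be_bytes()[..]].concat())
}

fn main() {
    let prk = hmac_toy(b"salt", b"input-keying-material");
    // The keyed extract differs from the broken unkeyed H(salt || ikm).
    assert_ne!(prk, fnv1a(b"saltinput-keying-material"));
    println!("ok");
}
```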
Issue: Attackers could send messages with large gaps in message numbers, forcing victims to derive many skipped keys.
Fix Implemented:
- New `SkipRateLimiter` module with per-sender and global limits
- Exponential backoff for consecutive skip events
- Normal message receipt resets the consecutive skip counter
- Configurable limits via `SkipLimitConfig`
Location: lmp-core/src/protocol/skip_limiter.rs
Issue: Malicious peers could refuse to advance DH ratchet, keeping compromised chain keys valid indefinitely.
Fix Implemented:
- Added `peer_last_ratchet_time` and `peer_ratchet_epoch` tracking to `RatchetState`
- New `RatchetPolicy` struct with configurable staleness thresholds
- `check_peer_staleness()` returns a `RatchetStatus` (Healthy/Warning/Stale/CriticallyStale)
- Sessions can be rejected or terminated based on peer staleness
Location: lmp-core/src/protocol/ratchet.rs
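A minimal sketch of the staleness classification, assuming illustrative thresholds (the actual `RatchetPolicy` values are configurable and not specified here):

```rust
use std::time::Duration;

// The four statuses named above; thresholds below are assumptions for the sketch.
#[derive(Debug, PartialEq)]
enum RatchetStatus { Healthy, Warning, Stale, CriticallyStale }

struct RatchetPolicy {
    warn_after: Duration,
    stale_after: Duration,
    critical_after: Duration,
}

impl RatchetPolicy {
    // Classify a peer by how long it has gone without advancing its DH ratchet.
    fn check_peer_staleness(&self, since_last_ratchet: Duration) -> RatchetStatus {
        if since_last_ratchet >= self.critical_after {
            RatchetStatus::CriticallyStale
        } else if since_last_ratchet >= self.stale_after {
            RatchetStatus::Stale
        } else if since_last_ratchet >= self.warn_after {
            RatchetStatus::Warning
        } else {
            RatchetStatus::Healthy
        }
    }
}

fn main() {
    let policy = RatchetPolicy {
        warn_after: Duration::from_secs(3600),              // 1 hour
        stale_after: Duration::from_secs(24 * 3600),        // 1 day
        critical_after: Duration::from_secs(7 * 24 * 3600), // 1 week
    };
    assert_eq!(policy.check_peer_staleness(Duration::from_secs(60)), RatchetStatus::Healthy);
    assert_eq!(policy.check_peer_staleness(Duration::from_secs(2 * 3600)), RatchetStatus::Warning);
    assert_eq!(policy.check_peer_staleness(Duration::from_secs(8 * 24 * 3600)), RatchetStatus::CriticallyStale);
    println!("ok");
}
```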
Added structured error types for security conditions:
- `SkipLimitExceeded` - Per-sender skip limit exceeded
- `GlobalSkipLimitExceeded` - Global skip limit exceeded
- `SkipRateLimited` - Exponential backoff not satisfied
- `IdentityMismatch` - UKS attack prevention
- `PeerRatchetStale` - Peer not ratcheting
- `NonceCounterOverflow` - Requires mandatory ratchet
- `SecurityDowngrade` - PAKE requirement not met
- `PakeRequired` - PAKE authentication required
Location: lmp-core/src/error.rs
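One possible shape for these variants, as a sketch (unit variants only; the real `error.rs` may attach context fields, and the attack-vs-resource split below is my assumption, not part of the spec):

```rust
// Variant names taken from the list above; everything else is illustrative.
#[derive(Debug, PartialEq)]
enum SecurityError {
    SkipLimitExceeded,
    GlobalSkipLimitExceeded,
    SkipRateLimited,
    IdentityMismatch,
    PeerRatchetStale,
    NonceCounterOverflow,
    SecurityDowngrade,
    PakeRequired,
}

impl SecurityError {
    // Hypothetical helper: callers might treat likely-attack conditions
    // (identity misbinding, downgrade) differently from resource pressure.
    fn is_potential_attack(&self) -> bool {
        matches!(self, SecurityError::IdentityMismatch | SecurityError::SecurityDowngrade)
    }
}

fn main() {
    assert!(SecurityError::IdentityMismatch.is_potential_attack());
    assert!(SecurityError::SecurityDowngrade.is_potential_attack());
    assert!(!SecurityError::SkipRateLimited.is_potential_attack());
    println!("ok");
}
```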
| ID | Invariant | Enforcement |
|---|---|---|
| INV-HS-1 | Both parties MUST derive identical shared secrets | Kyber secret stored, not re-encapsulated |
| INV-HS-2 | Identity commitment MUST bind Kyber encapsulation | recipient_identity_hash field |
| INV-HKDF-1 | Extract MUST use HMAC-SHA3-256, not raw hash | Proper HMAC implementation |
| INV-SKIP-1 | Per-sender skipped keys MUST NOT exceed limit | SkipRateLimiter enforcement |
| INV-RATCH-1 | Peer MUST ratchet within policy time limits | Staleness checking |