[DB-013] Fix NonceCache epoch increment logic #688

@umwelt

Description

Priority: MEDIUM

Phase: 4 - Production Readiness

Problem

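The NonceCache epoch increment logic is broken because NonceCache is instantiated once per handshake. The epoch is supposed to increment once per restart to invalidate nonces from earlier runs, but the increment is currently disabled: with a per-handshake open() it ran on every handshake and desynchronized the epoch between client and server.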

Affected Files

  • lib-network/src/handshake/nonce_cache.rs (lines 188-197)

Current State

// ALPHA FIX: Do NOT increment epoch on open - this was causing epoch desync
// between client and server because NonceCache::open() is called per-handshake.
// The epoch increment belongs in a singleton initialization, not per-open.

// TODO: Implement proper network-derived epoch (genesis hash or chain height)

Impact: Cross-restart replay protection doesn't work.

Root Cause

NonceCache is instantiated per-handshake call, not as a singleton:

// In zhtp handler - called for EACH handshake
let nonce_cache = NonceCache::open(nonce_db_path, 3600, 100_000)?;
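
To make the root cause concrete, here is a minimal toy model, not the real NonceCache. `PersistedEpoch` stands in for the epoch stored in the nonce database, and `open_and_increment` is a hypothetical model of an `open()` that bumps the epoch on every call. All names are illustrative assumptions.

```rust
/// Toy stand-in for the epoch value persisted in the nonce database.
struct PersistedEpoch {
    value: u64,
}

/// Hypothetical model of an `open()` that bumps the epoch every time it runs.
fn open_and_increment(store: &mut PersistedEpoch) -> u64 {
    store.value += 1;
    store.value
}

/// One process lifetime that opens the cache once per handshake,
/// as the handler currently does.
fn epoch_after_per_handshake_opens(store: &mut PersistedEpoch, handshakes: u32) -> u64 {
    let mut epoch = store.value;
    for _ in 0..handshakes {
        epoch = open_and_increment(store);
    }
    epoch
}

/// One process lifetime that opens the cache once at startup and
/// reuses it for every handshake, however many there are.
fn epoch_after_single_open(store: &mut PersistedEpoch, _handshakes: u32) -> u64 {
    open_and_increment(store)
}
```

In this model, three handshakes in one run move the epoch by three with per-handshake opens, but by exactly one with a single open. The epoch only tracks restarts in the second case.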

Solution

Option A: Singleton NonceCache (Recommended)

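use std::sync::Arc;
use lazy_static::lazy_static;
use tokio::sync::RwLock; // assumption: an async RwLock, since the lock is awaited below

lazy_static! {
    // Initialized once per process, so an epoch increment inside
    // open_singleton() runs once per restart. path, ttl and max_size
    // must be available at initialization time.
    static ref NONCE_CACHE: Arc<RwLock<NonceCache>> = {
        let cache = NonceCache::open_singleton(path, ttl, max_size)
            .expect("Failed to initialize nonce cache");
        Arc::new(RwLock::new(cache))
    };
}

// In handshake - use singleton
// (a read lock only works if check_and_store takes &self; otherwise use write())
let cache = NONCE_CACHE.read().await;
cache.check_and_store(&nonce)?;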
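
For comparison, a self-contained sketch of the same singleton idea using only the standard library. It swaps in `std::sync::OnceLock` and a blocking `Mutex` for `lazy_static!` and the async `RwLock`, and `ToyNonceCache` is a hypothetical in-memory stand-in for the real on-disk cache. The `previous_epoch` argument models the value that would be read from disk.

```rust
use std::collections::HashSet;
use std::sync::{Mutex, OnceLock};

/// In-memory stand-in for the real NonceCache (hypothetical, for illustration).
struct ToyNonceCache {
    epoch: u64,
    seen: HashSet<[u8; 32]>,
}

impl ToyNonceCache {
    /// Runs once per process; `previous_epoch` models the value read from disk.
    fn open_singleton(previous_epoch: u64) -> Self {
        ToyNonceCache { epoch: previous_epoch + 1, seen: HashSet::new() }
    }

    /// Records the nonce, or returns Err if it was already seen.
    fn check_and_store(&mut self, nonce: &[u8; 32]) -> Result<(), String> {
        if self.seen.insert(*nonce) {
            Ok(())
        } else {
            Err("replayed nonce".to_string())
        }
    }
}

static NONCE_CACHE: OnceLock<Mutex<ToyNonceCache>> = OnceLock::new();

/// Every handshake goes through this accessor; the init closure runs at most once.
fn nonce_cache() -> &'static Mutex<ToyNonceCache> {
    NONCE_CACHE.get_or_init(|| Mutex::new(ToyNonceCache::open_singleton(0)))
}

/// Per-handshake entry point: returns the epoch in effect on success.
fn handle_handshake(nonce: &[u8; 32]) -> Result<u64, String> {
    let mut cache = nonce_cache().lock().map_err(|e| e.to_string())?;
    cache.check_and_store(nonce)?;
    Ok(cache.epoch)
}
```

Because check and store happen under one exclusive lock, two concurrent handshakes cannot both accept the same nonce. The epoch stays fixed for the life of the process no matter how many handshakes run.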

Option B: Network-Derived Epoch

impl NonceCache {
    pub fn open_with_epoch(
        path: PathBuf,
        ttl: u64,
        max_size: usize,
        network_epoch: u64,  // From genesis hash or chain height
    ) -> Result<Self> {
        // Use network_epoch instead of incrementing local counter
        todo!("open the store at `path` and record `network_epoch` as the epoch")
    }
}
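
How `network_epoch` would be computed is not specified here. The sketch below shows two possible derivations under stated assumptions: a 32-byte genesis hash folded into a `u64`, and a chain height bucketed into fixed-length epochs. Both helper names and the `blocks_per_epoch` parameter are hypothetical.

```rust
/// Hypothetical helper: folds a 32-byte genesis hash into a u64 epoch
/// by reading its first 8 bytes as a big-endian integer.
fn epoch_from_genesis_hash(genesis_hash: &[u8; 32]) -> u64 {
    let mut first8 = [0u8; 8];
    first8.copy_from_slice(&genesis_hash[..8]);
    u64::from_be_bytes(first8)
}

/// Hypothetical helper: buckets chain height into epochs of a fixed length.
fn epoch_from_chain_height(height: u64, blocks_per_epoch: u64) -> u64 {
    assert!(blocks_per_epoch > 0, "blocks_per_epoch must be non-zero");
    height / blocks_per_epoch
}
```

Either value is computed from data both peers already share, so client and server arrive at the same epoch without coordinating local counters. Note that a genesis-derived epoch is constant for a given network, whereas a height-derived epoch advances over time.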

Tests to Un-ignore

  • test_epoch_increments_on_reopen (line 699)
  • test_epoch_persistence (line 741)
  • test_cross_restart_replay_protection (line 779)

Acceptance Criteria

  • NonceCache is a singleton or uses a network-derived epoch
  • Epoch increments correctly on restart
  • Cross-restart replay protection works
  • All ignored tests un-ignored and passing
  • No epoch desync between client/server

Related

  • Part of storage layer overhaul: docs/architecture/STORAGE_OVERHAUL.md

Metadata

Labels

  • P2-Medium: Medium priority - post-beta improvements
  • bug: Something isn't working
  • network: Network and P2P communication
  • phase-4: Phase 4: Integration testing
