
[Safety Bug] Non-durable PreAccept responses enable conflicting fast commits #28

@SherlockShemol

Summary

Replicas in this EPaxos implementation can respond to PreAccept messages before durably persisting the instance state. After a crash and restart, a replica may "forget" its participation in a fast quorum, breaking quorum intersection guarantees. Two conflicting commands can then both fast-commit with empty dependency sets, leading to execution order divergence and state inconsistency across replicas.

Bug Location

File: src/epaxos/epaxos.go
Functions: handlePreAccept (lines 900-998) and sync (lines 218-224)

Problematic Code

1. The sync() function may be a no-op:

// sync with the stable store
func (r *Replica) sync() {
    if !r.Durable {     // ⚠️ If Durable=false, nothing is persisted!
        return
    }
    r.StableStore.Sync()
}

2. The PreAccept handler sends its response right after sync(), even when sync() persisted nothing:

func (r *Replica) handlePreAccept(preAccept *epaxosproto.PreAccept) {
    // ... update instance state in memory ...
    
    r.InstanceSpace[preAccept.Replica][preAccept.Instance] = &Instance{
        preAccept.Command,
        preAccept.Ballot,
        status,
        seq,
        deps,
        // ...
    }
    
    r.recordInstanceMetadata(r.InstanceSpace[preAccept.Replica][preAccept.Instance])
    r.recordCommands(preAccept.Command)
    r.sync()  // ← This may do nothing if Durable=false!
    
    // Then send response (lines 982-995)
    if changed || uncommittedDeps || ... {
        r.replyPreAccept(preAccept.LeaderId, &epaxosproto.PreAcceptReply{...})
    } else {
        r.SendMsg(preAccept.LeaderId, r.preAcceptOKRPC, pok)
    }
}

Root Cause

  1. Default configuration: Durable = false by default, meaning sync() does nothing
  2. Even with Durable = true: the reply is safe only if StableStore.Sync() has actually flushed the instance to disk before the network send
  3. Memory-only state: If a crash occurs after PreAcceptOK is sent but before the instance reaches disk, the instance state is lost

Attack Scenario

Consider N=5 replicas (R0-R4) with fast quorums of three replicas that overlap only in R2:

  1. Command A issued to R0:

    • R0 broadcasts PreAccept(A) to fast quorum {R0, R1, R2}
    • All reply PreAcceptOK(A, seq=1, deps=∅)
    • R0 fast-commits A
  2. R2 crashes after sending PreAcceptOK but before durable persistence:

    • R2's in-memory state of A is lost
    • On restart, R2 has no record of A
  3. Command B (conflicting with A) issued to R4:

    • R4 broadcasts PreAccept(B) to fast quorum {R2, R3, R4}
    • R2 (having "forgotten" A, and the only replica in both quorums) replies PreAcceptOK(B, seq=1, deps=∅)
    • R4 fast-commits B with no dependency on A
  4. Result: Both A and B are committed with seq=1 and deps=∅

    • R0, R1 execute: A then B → final value = B
    • R3, R4 execute: B then A → final value = A
    • State divergence!
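
The steps above can be replayed with a toy model. This is a minimal sketch, not the repository's code: the `replica` type, `fastCommit`, and `runScenario` are hypothetical names, the fast-path rule is reduced to "every quorum member reported empty deps", and the two fast quorums are assumed to overlap only in R2 ({R0, R1, R2} and {R2, R3, R4}).

```go
package main

import "fmt"

// replica is a toy acceptor that remembers, in memory only, which
// commands it has pre-accepted for each key.
type replica struct {
	seen map[string][]string // key -> IDs of commands pre-accepted on that key
}

func newReplica() *replica { return &replica{seen: map[string][]string{}} }

// preAccept returns the conflicting commands already known for key
// (the deps this replica would report), then records cmd in memory.
func (r *replica) preAccept(key, cmd string) []string {
	deps := append([]string(nil), r.seen[key]...)
	r.seen[key] = append(r.seen[key], cmd)
	return deps
}

// crash models a restart without durable storage: all state is lost.
func (r *replica) crash() { r.seen = map[string][]string{} }

// fastCommit sends PreAccept to every replica in quorum and reports
// whether all of them returned empty deps (simplified fast-path rule).
func fastCommit(rs []*replica, quorum []int, key, cmd string) bool {
	ok := true
	for _, i := range quorum {
		if len(rs[i].preAccept(key, cmd)) > 0 {
			ok = false
		}
	}
	return ok
}

// runScenario replays the steps; crashR2 toggles the non-durable restart.
func runScenario(crashR2 bool) (aFast, bFast bool) {
	rs := make([]*replica, 5)
	for i := range rs {
		rs[i] = newReplica()
	}
	aFast = fastCommit(rs, []int{0, 1, 2}, "k", "A")
	if crashR2 {
		rs[2].crash()
	}
	bFast = fastCommit(rs, []int{2, 3, 4}, "k", "B")
	return aFast, bFast
}

func main() {
	a, b := runScenario(true)
	fmt.Println("R2 crashes:  A fast-committed:", a, " B fast-committed with empty deps:", b)
	a, b = runScenario(false)
	fmt.Println("R2 survives: A fast-committed:", a, " B fast-committed with empty deps:", b)
}
```

In this model B takes the empty-deps fast path only when R2 has lost its record of A; if R2 keeps its state, it reports A as a dependency and the fast path is refused.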

Test Case

func TestCrashThenForgetFastQuorumVotes(t *testing.T)

Test Output:

=== RUN   TestCrashThenForgetFastQuorumVotes
    crash_then_forget_fast_quorum_test.go:167: Both A and B are COMMITTED 
        with no dependency edges between them on all replicas.
    crash_then_forget_fast_quorum_test.go:220: Final values for key k 
        across replicas: [2 2 2 1 1]
    crash_then_forget_fast_quorum_test.go:223: execution-order agreement 
        violation: replicas disagree on final value of k (min=1, max=2)
--- FAIL: TestCrashThenForgetFastQuorumVotes (0.12s)
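
The agreement check reported in the last log line could look roughly like the helper below. This is a sketch of the idea only: `checkAgreement` is a hypothetical name and is not taken from the actual test file. It is fed the final values printed in the output above.

```go
package main

import "fmt"

// checkAgreement returns an error if the replicas do not all hold the
// same final value for a key.
func checkAgreement(finals []int) error {
	if len(finals) == 0 {
		return nil
	}
	lo, hi := finals[0], finals[0]
	for _, v := range finals[1:] {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	if lo != hi {
		return fmt.Errorf("replicas disagree on final value (min=%d, max=%d)", lo, hi)
	}
	return nil
}

func main() {
	// Final values for key k as reported in the test output.
	fmt.Println(checkAgreement([]int{2, 2, 2, 1, 1}))
	// A run in which all replicas agree.
	fmt.Println(checkAgreement([]int{2, 2, 2, 2, 2}))
}
```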

Impact

  • Severity: Critical
  • Type: Safety/Agreement violation
  • Impact: Replicas can permanently diverge in state, violating linearizability

Suggested Fix

Option 1: Force durable persistence before responding

func (r *Replica) handlePreAccept(preAccept *epaxosproto.PreAccept) {
    // ... update instance state ...
    
    r.recordInstanceMetadata(inst)
    r.recordCommands(preAccept.Command)
    
    // MUST sync before responding, regardless of Durable flag
    r.StableStore.Sync()  // Force sync
    
    // Only then send response
    r.replyPreAccept(...)
}

Option 2: Make Durable=true the default

func NewReplica(...) *Replica {
    r := &Replica{
        // ...
        Durable: true,  // Default to durable for safety
        // ...
    }
    // ...
    return r
}
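
If the setting is exposed on the command line, the same default can be applied there. The sketch below assumes a hypothetical `-durable` flag and a hypothetical `parseDurable` helper; how the repository actually wires its flags is not shown in this issue.

```go
package main

import (
	"flag"
	"fmt"
)

// parseDurable parses args with a hypothetical -durable flag that
// defaults to true, so non-durable mode has to be requested explicitly.
func parseDurable(args []string) (bool, error) {
	fs := flag.NewFlagSet("replica", flag.ContinueOnError)
	durable := fs.Bool("durable", true, "persist instance state before replying")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *durable, nil
}

func main() {
	d, err := parseDurable(nil)
	fmt.Println("no flags:", d, err)
	d, err = parseDurable([]string{"-durable=false"})
	fmt.Println("explicit opt-out:", d, err)
}
```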

Option 3: Use write-ahead logging

Ensure all state changes are written to a WAL before any response is sent, and replay the WAL on recovery.

Notes

This is a known class of vulnerability in distributed consensus systems. The EPaxos paper assumes durable storage semantics, but the implementation allows non-durable mode for testing/performance, which breaks safety guarantees.
