NodeSync is a fault-tolerant distributed key-value store built in Python that demonstrates core distributed systems concepts including replication, leader election, failure detection, and automatic failover.
This project was built as a learning-oriented yet production-inspired system to explore how real-world distributed databases maintain availability and consistency under failures.
- Distributed key-value storage using TCP sockets
- Multi-node replication
- Heartbeat-based failure detection
- Leader election (Raft-lite)
- Leader-based writes with automatic forwarding
- Automatic leader re-election on node failure
- Concurrent client handling using threads
- Each node runs an independent TCP server
- Nodes communicate peer-to-peer
- One node acts as the leader
- All SET operations go through the leader
- Followers forward writes to the leader
- Writes are replicated to all alive peers
- Heartbeats continuously monitor node health (see the sketch after this list)
- Leader is re-elected automatically on failure
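A minimal sketch of the heartbeat-based failure detection described above, assuming peers accept a plain TCP connection and a `PING` line (the interval, timeout, and message format are illustrative, not the exact protocol in `node.py`):

```python
import socket
import time

HEARTBEAT_INTERVAL = 2.0   # illustrative interval; node.py may use a different value
FAILURE_TIMEOUT = 6.0      # a peer is treated as dead after this long without a response

last_seen = {}             # (host, port) -> timestamp of the last successful heartbeat

def heartbeat_loop(peers):
    """Periodically ping every peer and record who is still responding."""
    while True:
        for host, port in peers:
            try:
                with socket.create_connection((host, port), timeout=1) as s:
                    s.sendall(b"PING\n")
                    last_seen[(host, port)] = time.time()
            except OSError:
                pass  # unreachable peer: its last_seen entry simply goes stale
        time.sleep(HEARTBEAT_INTERVAL)

def alive_peers(peers):
    """Peers whose last heartbeat falls within the failure timeout."""
    now = time.time()
    return [p for p in peers if now - last_seen.get(p, 0) < FAILURE_TIMEOUT]
```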
Leader election is deterministic: the alive node with the highest port number becomes the leader.
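A minimal sketch of that rule (the helper name and the alive-peer bookkeeping are assumptions for illustration):

```python
def elect_leader(own_port, alive_peer_ports):
    """Deterministic election: the highest port among all alive nodes wins.

    Every node can compute this locally from its own failure-detector view,
    so no extra coordination round is required.
    """
    candidates = set(alive_peer_ports) | {own_port}
    return max(candidates)

# Example: node 5000 sees 5001 as dead and 5002 as alive -> 5002 becomes leader.
assert elect_leader(5000, [5002]) == 5002
```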
Read the detailed NodeSync architecture in Architecture.md
NodeSync/
├── nodes/
│   ├── node.py          # Distributed node implementation
│   └── __init__.py
├── client/
│   └── __init__.py
├── logs.md
├── README.md
└── requirements.txt
Each node runs as an independent process.
- The first argument is the node's port (used as its ID)
- Remaining arguments are peer addresses
python nodes/node.py 5000 127.0.0.1:5001 127.0.0.1:5002
python nodes/node.py 5001 127.0.0.1:5000 127.0.0.1:5002
python nodes/node.py 5002 127.0.0.1:5000 127.0.0.1:5001
The node with the highest port becomes the leader.
16:55:49 [NodeSync ] Node 5000 running on 127.0.0.1:5000
16:55:49 [NodeSync ] Peers: [('127.0.0.1', 5001), ('127.0.0.1', 5002)]
16:55:49 [ELECTION ] New leader elected: 5002
# Connect to the current leader (port 5002 in this example)
$client = New-Object System.Net.Sockets.TcpClient("127.0.0.1", 5002)
$stream = $client.GetStream()
$writer = New-Object System.IO.StreamWriter($stream)
$reader = New-Object System.IO.StreamReader($stream)
$writer.AutoFlush = $true

# Write a key, then read it back
$writer.WriteLine("SET hello world")
$reader.ReadLine()
$writer.WriteLine("GET hello")
$reader.ReadLine()
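The same interaction from Python, using a generic socket client for the line-based protocol shown above (this helper is not part of the repository; the replies shown in the comments depend on the node's response format):

```python
import socket

def send_command(command, host="127.0.0.1", port=5002):
    """Send one newline-terminated command and return the single-line reply."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall((command + "\n").encode())
        reply = b""
        while not reply.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:
                break
            reply += chunk
        return reply.decode().strip()

print(send_command("SET hello world"))   # acknowledgement from the leader
print(send_command("GET hello"))         # expected: world
```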
NodeSync supports runtime-switchable consistency models, allowing experimentation with CAP trade-offs.
- Eventual consistency (default)
  - Writes are replicated asynchronously
  - Lower latency
  - Temporary inconsistencies may occur
- Strong consistency (quorum-based)
  - Leader waits for acknowledgements from a majority of nodes
  - Higher latency
  - Linearizable writes as long as a quorum is available
Consistency can be changed dynamically using a client command:
CONSISTENCY eventual
CONSISTENCY strong
The consistency mode applies to the node receiving the command (typically the leader).
$writer.WriteLine("CONSISTENCY strong")
$reader.ReadLine()
$writer.WriteLine("SET key value")
$reader.ReadLine()
If quorum cannot be reached in strong mode, the write fails with:
FAIL: quorum not reached
This enables direct comparison of consistency-performance trade-offs.
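For illustration, a simplified sketch of how the leader might count acknowledgements in strong mode (`replicate_to_peer` is a hypothetical callable; see `node.py` for the actual replication logic):

```python
def replicate_with_quorum(key, value, alive_peers, replicate_to_peer):
    """Apply the write locally, then wait for acks from a majority of the cluster.

    `replicate_to_peer` sends the write to one peer and returns True on ack.
    """
    cluster_size = len(alive_peers) + 1          # peers plus the leader itself
    quorum = cluster_size // 2 + 1
    acks = 1                                     # the leader's own local write counts

    for peer in alive_peers:
        try:
            if replicate_to_peer(peer, key, value):
                acks += 1
        except OSError:
            continue                             # unreachable peer: no ack

    if acks >= quorum:
        return "OK"
    return "FAIL: quorum not reached"
```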
NodeSync was benchmarked under both eventual and strong consistency modes.
Results show that quorum-based strong consistency increases write latency because the leader must wait for replica acknowledgements, while eventual consistency keeps write latency lower by replicating asynchronously.
- Detailed observations & results are available in benchmark/RESULTS.md.
python benchmark/benchmark.py
[Benchmark] Running Eventual Consistency test...
[Benchmark] Running Strong Consistency test...
======== RESULTS ========
Eventual Consistency:
Avg latency: 0.0139s
Max latency: 0.0371s
Strong Consistency:
Avg latency: 0.0153s
Max latency: 0.0429s
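A latency measurement of this kind can be reproduced with a loop like the sketch below (a generic example, not the contents of `benchmark/benchmark.py`; the port, key names, and request count are assumptions):

```python
import socket
import time

def send_command(command, host="127.0.0.1", port=5002):
    """Send one newline-terminated command and return the reply."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall((command + "\n").encode())
        return conn.recv(1024).decode().strip()

def measure_write_latency(n=100):
    """Issue n writes and report average and worst-case round-trip latency."""
    latencies = []
    for i in range(n):
        start = time.perf_counter()
        send_command(f"SET bench-{i} value-{i}")
        latencies.append(time.perf_counter() - start)
    return sum(latencies) / len(latencies), max(latencies)

avg, worst = measure_write_latency()
print(f"Avg latency: {avg:.4f}s")
print(f"Max latency: {worst:.4f}s")
```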
- Kill the leader node
- Remaining nodes elect a new leader automatically
- Writes continue without downtime
- Replication
- Leader election
- Failover
- Heartbeats
- Eventual consistency
- Concurrency control
- Persistent write-ahead logs
- Strong consistency using Raft log replication
- Sharding and partitioning
- Client library abstraction
NodeSync demonstrates hands-on understanding of distributed systems fundamentals commonly taught in graduate-level courses, including fault tolerance, leader-based coordination, and replication strategies.