Skip to content

SwathiSBhat/Raft2PC-Bank

Repository files navigation

Raft2PC-Bank

Objective

This project implements a fault-tolerant distributed transaction processing system that supports a simple banking application. The data for users with ID ranging from 1-3000 is sharded across 3 clusters. Each cluster consists of 3 servers to replicate the data to ensure fault-tolerance. The application mainly supports 2 kinds of transactions - intra-shard and cross-shard. Intra-shard transaction occurs between users whose data is part of the same shard while Cross-shard transaction occurs between users whose data is stored across different shards.

Client shard mapping

Cluster User IDs
C1 [1, 2, 3, ..., 1000]
C2 [1001, 1002, 1003, ..., 2000]
C3 [2001, 2002, 2003, ..., 3000]

Intra-shard transaction

To process intra-shard transactions, Raft protocol is used to reach consensus for each individual transaction (of the form x y amt). The leader checks for 2 conditions:

  • there are no locks on data items x and y
  • the balance of x is at least equal to amt If both conditions are met, the leader needs to obtain locks and all other nodes obtain locks too. The leader executes the Raft protocol. Once then entry is committed on the log, the data store is updated. The leader then sends a message back to the client letting the client know that the transaction has been committed. The client waits for a reply from the leader to accept the result. In case the client does not receive a response within a specified time, the client retries the same request.

Cross-shard transaction

To process cross-shard transactions, 2 Phase-Commit (2PC) protocol is used across the clusters involved in the transaction.

  • The client acts as the co-ordinator for 2PC. The client initiates the request by sending the transaction to the respective clusters which initiate the Raft protocol.
  • To prevent concurrent updates to the data items, all servers in both clusters acquire locks on accessed items (item x in one cluster and item y in the other). Upon reaching consensus, the transaction is considered committed, but it does not execute the operation on the data store (as would be normal in Raft). We wait till the transaction is committed.
  • The leader of each cluster sends either a prepared/yes or abort message back to the transaction coordinator.
  • Upon receiving prepared/yes messages from both involved clusters, the client (coordinator) sends a commit message to all servers in both clusters. At this point, the transaction is truly considered committed, and the corresponding data store is updated.
  • Conversely, if any cluster aborts the transaction or if the transaction coordinator times out, the coordinator will issue an abort message to all servers in both clusters. Now, the transaction is not executed on the data store.
  • If the outcome is a commit, each server releases its locks. If the outcome is an abort or if a timeout occurs, the server will not execute the operation and releases its locks. In either scenario, the server sends an Ack message back to the coordinating client.
Screenshot 2025-03-31 at 11 05 09 AM

Architecture

The architecture consists of multiple clients (the diagram shows just 3), a network server and 3 clusters of servers each containing 3 servers. All messages go through the network server. The network server provides commands to create network partititoning and fail servers.

Screenshot 2025-03-30 at 7 06 21 PM Screenshot 2025-03-31 at 11 04 48 AM Screenshot 2025-03-31 at 11 04 58 AM

Running the code

The code requires Python3 to be installed and there are no other dependencies. Follow the below order to start each of the processes in different instances of your terminal. NOTE - Make sure all the files in /data folder are empty or deleted before starting the servers. These files store the committed entries on disk based on Raft consensus.

  1. Start network server
python3 network_server.py
  1. Start clients - Syntax is python3 client.py <client_num>
python3 client.py 1
python3 client.py 2
...
  1. Start servers - Syntax is python3 server.py <server_num>
python3 server.py 1
python3 server.py 2
...
python3 server.py 9

Supported commands

  1. Printing balance of an account - print_balance <user id> prints the balance of user with the given id on all the 3 servers in the cluster it is stored in.
Screenshot 2025-03-31 at 3 35 15 PM
  1. Intra-shard or cross shard transaction - sender_id, receiver_id, amount initiates a transfer of amount from the sender to the receiver.
Screenshot 2025-03-31 at 3 37 32 PM Screenshot 2025-03-31 at 3 38 03 PM
  1. Printing committed transactions in Raft log - print_committed_txns will print all the committed transactions in the logs of all the servers. This log will be consistent across servers in the same cluster.
Screenshot 2025-03-31 at 3 39 14 PM Screenshot 2025-03-31 at 3 38 56 PM
  1. Performance - Displays the performance of all transactions initiated from the given client.
Screenshot 2025-03-31 at 3 39 29 PM

Supported scenarios

  1. All combinations of concurrent transactions - intra-shard and cross-shard.
  2. Servers make progress as long as majority of the servers are up. If not, the client retries after a certain time until a result is obtained.
  3. All network partitioning scenarios - If a network partition has a majority of servers, it will make progress and on recovery from partitioning, the partitioned server will update it's log.
  4. Recovery from failure and partitioning - The recovered server will automatically update it's log to the latest.

Contributors

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages