ArditZubaku/distributed-file-storage

Distributed File Storage System

A peer-to-peer distributed file storage system built in Go that enables multiple nodes to store, retrieve, and manage files across a decentralized network with automatic replication and encryption.

🚀 Features

  • Peer-to-Peer Network: TCP-based communication between nodes
  • File Encryption: AES encryption for all stored files
  • Content Addressable Storage (CAS): Files are stored using SHA-1 hashing for deduplication
  • Automatic Replication: Files are automatically replicated across network nodes
  • Network Discovery: Bootstrap nodes for network formation
  • File Operations: Store, retrieve, and delete files across the network
  • Decentralized Architecture: No single point of failure

🏗️ Architecture

Core Components

1. P2P Network Layer (p2p/)

  • Transport Interface: Abstraction for network communication protocols
  • TCP Transport: TCP-based implementation for reliable peer communication
  • Peer Management: Handles connections, handshakes, and message routing
  • Message Encoding: Gob-based message serialization

2. Storage Layer (store/)

  • Content Addressable Storage: Files stored with SHA-1 hash-based paths
  • Path Transformation: Hierarchical directory structure for efficient storage
  • Encryption/Decryption: Transparent file encryption using AES
  • File Operations: Read, write, delete, and existence checks

3. Server Layer (server/)

  • File Server: Main orchestrator combining P2P and storage layers
  • Message Handling: Processes store, get, and delete requests
  • Network Operations: Broadcasts operations to peer nodes
  • Peer Discovery: Bootstrap and maintain peer connections

4. Crypto Layer (crypto/)

  • Key Generation: Secure random key generation for encryption
  • File Hashing: MD5 and SHA-1 hashing utilities
  • AES Encryption: Stream encryption/decryption for file content
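
As a rough illustration of the key generation and stream encryption listed above, the sketch below encrypts and decrypts an io.Reader with AES. The CTR mode, the 32-byte key, the IV-prefix layout, and every function name here (newKey, encryptStream, decryptStream, roundTrip) are assumptions made for this example and are not taken from the project's crypto package.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"io"
)

// newKey returns 32 random bytes, which selects AES-256 (assumed key size).
func newKey() ([]byte, error) {
	key := make([]byte, 32)
	_, err := io.ReadFull(rand.Reader, key)
	return key, err
}

// encryptStream writes a random IV to dst, followed by the AES-CTR
// ciphertext of everything read from src.
func encryptStream(key []byte, src io.Reader, dst io.Writer) error {
	block, err := aes.NewCipher(key)
	if err != nil {
		return err
	}
	iv := make([]byte, block.BlockSize())
	if _, err := io.ReadFull(rand.Reader, iv); err != nil {
		return err
	}
	if _, err := dst.Write(iv); err != nil {
		return err
	}
	w := cipher.StreamWriter{S: cipher.NewCTR(block, iv), W: dst}
	_, err = io.Copy(w, src)
	return err
}

// decryptStream reads the IV prefix from src, then decrypts the rest into dst.
func decryptStream(key []byte, src io.Reader, dst io.Writer) error {
	block, err := aes.NewCipher(key)
	if err != nil {
		return err
	}
	iv := make([]byte, block.BlockSize())
	if _, err := io.ReadFull(src, iv); err != nil {
		return err
	}
	r := cipher.StreamReader{S: cipher.NewCTR(block, iv), R: src}
	_, err = io.Copy(dst, r)
	return err
}

// roundTrip encrypts plain with a fresh key and decrypts it again,
// returning both the ciphertext (IV included) and the recovered bytes.
func roundTrip(plain []byte) ([]byte, []byte, error) {
	key, err := newKey()
	if err != nil {
		return nil, nil, err
	}
	var enc, dec bytes.Buffer
	if err := encryptStream(key, bytes.NewReader(plain), &enc); err != nil {
		return nil, nil, err
	}
	ciphertext := append([]byte(nil), enc.Bytes()...)
	if err := decryptStream(key, &enc, &dec); err != nil {
		return nil, nil, err
	}
	return ciphertext, dec.Bytes(), nil
}
```

Calling roundTrip([]byte("my big data file")) should return the original bytes as the second value. CTR mode provides confidentiality only and does not detect tampering, so a design that needs integrity would typically use an authenticated mode such as AES-GCM.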

📁 Project Structure

├── main.go              # Application entry point and demo
├── go.mod               # Go module definition
├── Makefile             # Build and test commands
├── p2p/                 # Peer-to-peer networking
│   ├── transport.go     # Transport interface
│   ├── tcp_transport.go # TCP implementation
│   ├── message.go       # Message types and constants
│   ├── encoding.go      # Message encoding/decoding
│   └── handshake.go     # Peer handshake logic
├── server/              # File server implementation
│   └── server.go        # Main server logic
├── store/               # Storage management
│   ├── store.go         # Storage interface and CAS
│   └── store_test.go    # Storage tests
└── crypto/              # Cryptographic utilities
    └── crypto.go        # Encryption and hashing

🛠️ How It Works

1. Network Formation

  • Nodes connect to bootstrap peers to join the network
  • TCP connections are established between peers
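  • A pluggable handshake function runs when a peer connects; note that the sample transport configuration in this README uses p2p.NOPHandShakeFunc, which, going by its name, is a no-op and performs no authentication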
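
The network formation steps above can be sketched roughly as follows: dial each bootstrap address over TCP, run a handshake hook on the new connection, and keep only the connections that pass. The names handshakeFunc, nopHandshake, dialBootstrap, and demoBootstrap are hypothetical, as are the two-second dial timeout and the choice to skip empty addresses; none of this is taken from the project's p2p package.

```go
package main

import (
	"errors"
	"net"
	"time"
)

// handshakeFunc is a hypothetical hook that runs on every new connection.
type handshakeFunc func(net.Conn) error

// nopHandshake accepts every peer without performing any check.
func nopHandshake(net.Conn) error { return nil }

// dialBootstrap connects to each bootstrap address, runs the handshake,
// and returns the connections that passed plus any errors encountered.
func dialBootstrap(addrs []string, hs handshakeFunc) ([]net.Conn, error) {
	var conns []net.Conn
	var errs []error
	for _, addr := range addrs {
		if addr == "" {
			continue // assumed convention: an empty entry means "no bootstrap peer"
		}
		conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
		if err != nil {
			errs = append(errs, err)
			continue
		}
		if err := hs(conn); err != nil {
			conn.Close() // drop peers that fail the handshake
			errs = append(errs, err)
			continue
		}
		conns = append(conns, conn)
	}
	return conns, errors.Join(errs...)
}

// demoBootstrap starts a listener on an ephemeral loopback port, dials it
// as the only bootstrap node, and reports how many peers were connected.
func demoBootstrap() (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			c.Close()
		}
	}()
	conns, err := dialBootstrap([]string{"", ln.Addr().String()}, nopHandshake)
	for _, c := range conns {
		c.Close()
	}
	return len(conns), err
}
```

Swapping nopHandshake for a function that returns an error on unknown peers is where real authentication would plug in.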

2. File Storage Process

Client → Store File → Local Storage + Network Broadcast → Peer Replication

  1. File is encrypted using AES with a random key
  2. File is stored locally using CAS (content-addressable storage)
  3. Store message is broadcast to all connected peers
  4. Peers receive and store the encrypted file locally
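
To make the local-write-plus-broadcast step concrete, here is a minimal sketch that streams a file once and fans it out to a local store and every peer with io.MultiWriter. It uses in-memory buffers in place of disk and TCP and omits encryption; memStore, storeAndReplicate, and storeBytes are hypothetical names, and the project may replicate differently.

```go
package main

import (
	"bytes"
	"io"
)

// memStore is a hypothetical in-memory stand-in for a node's storage.
type memStore struct {
	files map[string]*bytes.Buffer
}

func newMemStore() *memStore {
	return &memStore{files: map[string]*bytes.Buffer{}}
}

// create registers an empty buffer for key and returns it as a writer.
func (s *memStore) create(key string) io.Writer {
	buf := &bytes.Buffer{}
	s.files[key] = buf
	return buf
}

// read returns the stored contents for key, if any.
func (s *memStore) read(key string) (string, bool) {
	buf, ok := s.files[key]
	if !ok {
		return "", false
	}
	return buf.String(), true
}

// storeAndReplicate reads r exactly once and writes every chunk to the
// local store and to each peer store. It returns the number of bytes copied.
func storeAndReplicate(key string, r io.Reader, local *memStore, peers ...*memStore) (int64, error) {
	writers := []io.Writer{local.create(key)}
	for _, p := range peers {
		writers = append(writers, p.create(key))
	}
	return io.Copy(io.MultiWriter(writers...), r)
}

// storeBytes is a convenience wrapper for data already held in memory.
func storeBytes(key string, data []byte, local *memStore, peers ...*memStore) (int64, error) {
	return storeAndReplicate(key, bytes.NewReader(data), local, peers...)
}
```

Because io.MultiWriter stops at the first failing writer, a real implementation would need to decide whether one unreachable peer should abort the whole store.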

3. File Retrieval Process

Client → Request File → Check Local → Network Query → Stream Response

  1. Check if file exists in local storage
  2. If not found locally, broadcast get request to peers
  3. Peer with the file streams it back encrypted
  4. File is decrypted and returned to client
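
The lookup order in the steps above (local first, then peers) might look like the sketch below. It queries peers one by one instead of broadcasting, works on byte slices instead of streams, and leaves out decryption; fileSource, mapSource, get, and errNotFound are hypothetical names used only for illustration.

```go
package main

import "errors"

// errNotFound is returned when neither the local store nor any peer has the key.
var errNotFound = errors.New("file not found on any node")

// fileSource is a hypothetical view of anything that can serve a file by key.
type fileSource interface {
	fetch(key string) ([]byte, bool)
}

// mapSource is an in-memory fileSource used in place of disk or a remote peer.
type mapSource map[string][]byte

func (m mapSource) fetch(key string) ([]byte, bool) {
	data, ok := m[key]
	return data, ok
}

// get checks local storage first and only then asks the peers in order.
// The boolean result reports whether the data came from a peer.
func get(key string, local fileSource, peers []fileSource) ([]byte, bool, error) {
	if data, ok := local.fetch(key); ok {
		return data, false, nil
	}
	for _, p := range peers {
		if data, ok := p.fetch(key); ok {
			return data, true, nil
		}
	}
	return nil, false, errNotFound
}
```

With an empty local mapSource and one peer holding "picture.png", get returns the peer's bytes with the boolean set to true; once the key is present locally, the peers are never consulted.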

4. File Deletion Process

Client → Delete Request → Network Broadcast → Peer Deletion → Local Cleanup

  1. Broadcast delete message to all peers first
  2. Peers delete their local copies
  3. Finally delete from local storage
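
The ordering above (peers first, local copy last) can be sketched as below. Collecting peer errors with errors.Join instead of aborting on the first failure is a design choice made for this example, and deleter, memNode, and deleteEverywhere are hypothetical names, not the project's API.

```go
package main

import (
	"errors"
	"fmt"
)

// deleter is a hypothetical handle to a node that can drop its copy of a file.
type deleter interface {
	remove(key string) error
}

// memNode is an in-memory stand-in for a node; files records which keys it holds.
type memNode struct {
	name  string
	files map[string]bool
}

// remove deletes key from the node or reports that the node never had it.
func (n *memNode) remove(key string) error {
	if !n.files[key] {
		return fmt.Errorf("%s: %q not found", n.name, key)
	}
	delete(n.files, key)
	return nil
}

// deleteEverywhere asks every peer to delete key first and removes the
// local copy last. Failures are collected so that one failing peer does
// not prevent the remaining deletions.
func deleteEverywhere(key string, local deleter, peers []deleter) error {
	var errs []error
	for _, p := range peers {
		if err := p.remove(key); err != nil {
			errs = append(errs, err)
		}
	}
	if err := local.remove(key); err != nil {
		errs = append(errs, err)
	}
	return errors.Join(errs...) // nil when every deletion succeeded
}
```

Deleting the local copy last means the initiating node still holds the file if the broadcast fails part-way, which leaves room for a retry.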

5. Content Addressable Storage

Files are stored in a directory hierarchy derived from a SHA-1 hash:

/storage_root/
  └── node_id/
      └── c08bf/
          └── 03b70/
              └── 1e356/
                  └── 06cf8/
                      └── ef77d12e1a44fc5e31071a750e3ea90d9872352a7a13
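
A path transform of this kind can be sketched as follows: hash the key with SHA-1, hex-encode the digest, and split it into fixed-size directory segments. The five-character segment size is inferred from the example tree above; hashing the file key (as opposed to the file contents), the function name casPath, and the treatment of the final path component are assumptions, so the project's store.CASPathTransformFunc may produce a different layout.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"strings"
)

// casPath hashes key with SHA-1 and splits the 40-character hex digest
// into path segments of blockSize characters each.
func casPath(key string, blockSize int) string {
	sum := sha1.Sum([]byte(key))
	digest := hex.EncodeToString(sum[:])
	if blockSize <= 0 {
		return digest // no splitting requested
	}
	var segments []string
	for i := 0; i < len(digest); i += blockSize {
		end := i + blockSize
		if end > len(digest) {
			end = len(digest)
		}
		segments = append(segments, digest[i:end])
	}
	return strings.Join(segments, "/")
}
```

With a block size of 5, every key maps to eight nested segments, which keeps any single directory from accumulating a very large number of entries.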

🚀 Getting Started

Prerequisites

  • Go 1.24+ installed
  • Network ports 3000, 4000, 5000 available for demo

Installation

# Clone the repository
git clone https://github.com/ArditZubaku/distributed-file-storage.git
cd distributed-file-storage

# Install dependencies
go mod tidy

# Build the project
make build

Running the Demo

# Run the demonstration
make run

# Or run directly
go run main.go

Running Tests

make test

💡 Usage Example

main.go demonstrates a complete workflow:

// Create a 3-node network
s1 := makeServer("3000", "")                    // Bootstrap node
s2 := makeServer("4000", ":3000")              // Connects to s1
s3 := makeServer("5000", ":3000", ":4000")     // Connects to s1 and s2

// Store a file (automatically replicated)
data := bytes.NewReader([]byte("my big data file"))
s3.StoreFile("picture.png", data)

// Retrieve the file (from local storage or, failing that, from a peer)
reader, err := s3.Get("picture.png")
if err != nil {
    log.Fatal(err)
}

// Delete from all nodes
s3.DeleteFile("picture.png")

🔧 Configuration

Server Configuration

fileServerOpts := server.FileServerOpts{
    ID:                "unique-node-id",           // Auto-generated if empty
    EncKey:            crypto.NewEncryptionKey(),  // AES encryption key
    StorageRoot:       "storage_directory",       // Local storage path
    PathTransformFunc: store.CASPathTransformFunc, // CAS path transform
    Transport:         tcpTransport,              // P2P transport
    BootstrapNodes:    []string{":3000"},         // Bootstrap peers
}

Transport Configuration

tcpTransportOpts := p2p.TCPTransportOpts{
    ListenAddr:    ":3000",                    // Listening address
    HandShakeFunc: p2p.NOPHandShakeFunc,       // Handshake function
    Decoder:       p2p.DefaultDecoder{},       // Message decoder
    OnPeer:        server.OnPeer,              // Peer connection handler
}

🔐 Security Features

  • File Encryption: All files encrypted with AES before storage
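  • Content Integrity: SHA-1 content hashes help detect accidental corruption (SHA-1 is not collision-resistant, so it should not be relied on against deliberate tampering)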
  • Peer Handshake: TCP connections pass through a handshake function before a peer is accepted; TCP by itself does not encrypt traffic in transit
  • Random Key Generation: Cryptographically secure random keys

🧪 Testing

The project includes tests covering:

  • Unit tests for storage operations
  • P2P transport testing
  • Integration tests for file operations

# Run all tests
go test ./... -v

# Test specific package
go test ./store -v
go test ./p2p -v

🔄 Protocol Messages

Message Types

  • MessageStoreFile: Replicates file to peers
  • MessageGetFile: Requests file from network
  • MessageDeleteFile: Removes file from all nodes

Message Flow

[Node A] → Store Message → [Node B, Node C]
[Node A] ← File Stream ← [Node B] (if has file)
[Node A] → Delete Message → [Node B, Node C]
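
One way the three message types could travel over a gob-encoded connection is sketched below: each payload type is registered with encoding/gob, wrapped in an envelope, and dispatched on the receiving side with a type switch. Only the three type names come from the list above; their fields, the Message envelope, and the roundTrip and describe helpers are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Hypothetical payload shapes; only the type names appear in this README.
type MessageStoreFile struct {
	Key  string
	Size int64
}

type MessageGetFile struct {
	Key string
}

type MessageDeleteFile struct {
	Key string
}

// Message is an assumed envelope whose payload is one of the types above.
type Message struct {
	Payload any
}

// gob needs every concrete type carried in an interface field to be registered.
func init() {
	gob.Register(MessageStoreFile{})
	gob.Register(MessageGetFile{})
	gob.Register(MessageDeleteFile{})
}

// roundTrip encodes m as a sender would and decodes it as a receiver would.
func roundTrip(m Message) (Message, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(&m); err != nil {
		return Message{}, err
	}
	var out Message
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

// describe dispatches on the concrete payload type, as a message handler might.
func describe(m Message) string {
	switch p := m.Payload.(type) {
	case MessageStoreFile:
		return fmt.Sprintf("store %s (%d bytes)", p.Key, p.Size)
	case MessageGetFile:
		return "get " + p.Key
	case MessageDeleteFile:
		return "delete " + p.Key
	default:
		return "unknown"
	}
}
```

Sending a size along with the store message, as assumed here, would let a receiver know how many bytes of file data to read from the stream that follows.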

🚧 Future Enhancements

  • Web API interface for file operations
  • Node discovery via DHT (Distributed Hash Table)
  • Configurable replication factor
  • File versioning and conflict resolution
  • Performance metrics and monitoring
  • Support for large file streaming
  • Network partition tolerance
  • Authentication and authorization
