This repository contains a high-performance, multi-threaded HTTP server implemented from scratch using Python's low-level socket programming and the threading module. The objective of this project was to gain a deep understanding of the HTTP/1.1 protocol, concurrent programming principles, and fundamental network security measures.

PRAteek-singHWY/MULTI-THREADED-HTTP-SERVER

Multi-threaded HTTP Server (Python Socket Implementation)

This project presents a high-performance, multi-threaded HTTP/1.1 server built entirely from scratch using Python's low-level socket and threading modules. It serves as a comprehensive demonstration of concurrent programming, network protocol handling, and fundamental security practices.


1. Build and Execution Instructions 🚀

This server requires Python 3.x and relies only on standard library modules.

A. Project Setup and File Structure

  1. Repository Structure: Ensure the following directory structure is present. The resources directory serves as the web root for static content.

    project/
    ├── server.py
    └── resources/
        ├── index.html
        ├── about.html
        ├── contact.html
        ├── sample.txt
        ├── large_image.png (> 1MB)
        ├── photo.jpg
        └── uploads/ (directory for POST results; must be created and writable)
  2. Test Files: The resources/ directory must be populated with the required test content: three HTML files, two text files, two PNG images (one exceeding 1MB), and two JPEG images. The tree above shows only a subset of these files.

B. Server Execution

The server accepts up to three command-line arguments, overriding the defaults: <Port> <Host Address> <Max Thread Pool Size>.

| Argument | Default Value | Description |
|---|---|---|
| Port | 8080 | The TCP port for the server to bind and listen on. |
| Host Address | 127.0.0.1 | The interface address (e.g., 0.0.0.0 for all interfaces). |
| Pool Size | 10 | The maximum number of concurrent threads available to handle clients. |

  • Default Run:

    python3 server.py

    (Starts server on 127.0.0.1:8080 with 10 worker threads)

  • Custom Run Example:

    python3 server.py 8000 0.0.0.0 20

    (Starts server on 0.0.0.0:8000 with 20 worker threads)
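
The positional arguments described above could be parsed roughly as follows. This is a minimal sketch, not the code in server.py: the function name `parse_args` and the constant names are hypothetical, and the range checks are an added assumption.

```python
# Hypothetical defaults mirroring the table above.
DEFAULT_PORT = 8080
DEFAULT_HOST = "127.0.0.1"
DEFAULT_POOL_SIZE = 10


def parse_args(argv):
    """Return (port, host, pool_size) from positional arguments.

    argv is the argument list without the script name, in the order
    <Port> <Host Address> <Max Thread Pool Size>. Missing values fall
    back to the defaults.
    """
    port = int(argv[0]) if len(argv) > 0 else DEFAULT_PORT
    host = argv[1] if len(argv) > 1 else DEFAULT_HOST
    pool_size = int(argv[2]) if len(argv) > 2 else DEFAULT_POOL_SIZE

    # Sanity checks (an assumption; the README does not specify them).
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if pool_size < 1:
        raise ValueError(f"pool size must be positive: {pool_size}")
    return port, host, pool_size
```

In a real entry point this would typically be called as `parse_args(sys.argv[1:])`.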


2. Concurrency Architecture: Thread Pool & Queue ⚙️

The server implements the Producer-Consumer pattern using a fixed-size thread pool for concurrent request processing.

  • Core Components:
    • ThreadPool: The manager responsible for spawning and maintaining a fixed count of WorkerThread instances.
    • WorkerThread (Consumers): Threads that block on the connection queue, dequeue client sockets, and execute the RequestHandler logic.
    • connection_queue (Shared Buffer): A global list where the main thread (Producer) places accepted client sockets.
  • Synchronization:
    • threading.Lock (queue_lock): A mutex that guards every read and write of connection_queue, preventing race conditions on the shared list.
    • threading.Condition (pool_condition): Used for inter-thread signaling. Worker threads call wait() when the queue is empty, and the main thread calls notify() when a new client arrives, so idle workers sleep instead of busy-waiting.
  • Saturation Handling: If the queue exceeds capacity while all threads are busy, the system returns a 503 Service Unavailable response with a Retry-After header directly to the client and logs the saturation event.
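
The producer-consumer arrangement described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the repository's code: the Condition is built on top of queue_lock, the `max_queue` limit, the `submit`/`shutdown` method names, and the `build_503` helper with its Retry-After value are all hypothetical.

```python
import threading


class ThreadPool:
    """Fixed-size pool; workers consume items from a shared list."""

    def __init__(self, size, handler, max_queue=50):
        self.handler = handler              # called with each dequeued item
        self.max_queue = max_queue          # assumed saturation threshold
        self.connection_queue = []
        self.queue_lock = threading.Lock()
        # Assumption: the condition shares the queue's lock.
        self.pool_condition = threading.Condition(self.queue_lock)
        self.running = True
        self.workers = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(size)
        ]
        for worker in self.workers:
            worker.start()

    def submit(self, conn):
        """Producer side. Returns False when the queue is saturated."""
        with self.pool_condition:
            if len(self.connection_queue) >= self.max_queue:
                return False                # caller would answer with a 503
            self.connection_queue.append(conn)
            self.pool_condition.notify()    # wake one idle worker
        return True

    def _worker(self):
        while True:
            with self.pool_condition:
                # Sleep until there is work or the pool is shutting down.
                while self.running and not self.connection_queue:
                    self.pool_condition.wait()
                if not self.connection_queue:
                    return                  # shut down and nothing left to do
                conn = self.connection_queue.pop(0)
            self.handler(conn)              # run outside the lock

    def shutdown(self):
        with self.pool_condition:
            self.running = False
            self.pool_condition.notify_all()
        for worker in self.workers:
            worker.join()


def build_503(retry_after_seconds=5):
    """Build a minimal 503 response; the Retry-After value is an assumption."""
    body = b"Service Unavailable"
    head = (
        "HTTP/1.1 503 Service Unavailable\r\n"
        f"Retry-After: {retry_after_seconds}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n\r\n"
    )
    return head.encode("ascii") + body
```

The handler runs after the lock is released, so a slow client never prevents the main thread from enqueuing new connections. When `submit` returns False, the accepting thread would send `build_503()` on the socket and close it.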

3. Robust Binary and File Transfer Implementation 📦

The server handles both static HTML rendering and robust binary data streaming with strict adherence to HTTP headers.

  • Content Identification and MIME Types:
    • .html files are served as text/html; charset=utf-8.
    • Files with extensions .png, .jpg, .jpeg, and .txt are treated as binary downloads and served with Content-Type: application/octet-stream.
  • Forced Download Headers:
    • Content-Disposition: attachment; filename="[filename]" is explicitly included in binary responses. This instructs the client (browser) to download the file rather than attempting to display or render the content inline.
  • Efficient Data Streaming:
    • File I/O uses Python's binary read mode ('rb') to ensure raw byte data integrity.
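    • Files are read and sent over the socket in 8192-byte chunks (BUFFER_SIZE). Chunking keeps memory usage bounded for large files, since only one chunk is held in memory at a time. Chunking by itself does not make the transmission non-blocking: on a standard blocking socket, each send occupies the worker thread until the OS accepts the data.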
    • Content-Length is always set to the exact file size in bytes for accurate transmission and client verification.
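
A download response of the kind described above might be assembled like this. The sketch assumes plain ASCII filenames without quote characters, and the function names `build_download_headers` and `send_file` are hypothetical rather than taken from server.py.

```python
import os

BUFFER_SIZE = 8192  # chunk size named in this section


def build_download_headers(file_path):
    """Return the status line and headers for a forced binary download."""
    size = os.path.getsize(file_path)
    name = os.path.basename(file_path)
    lines = [
        "HTTP/1.1 200 OK",
        "Content-Type: application/octet-stream",
        f"Content-Length: {size}",
        f'Content-Disposition: attachment; filename="{name}"',
        "",  # these two empty strings produce the blank line
        "",  # that terminates the header section
    ]
    return "\r\n".join(lines).encode("utf-8")


def send_file(sock, file_path):
    """Send headers, then stream the file in BUFFER_SIZE chunks.

    sock only needs a sendall() method. Returns the number of body
    bytes sent.
    """
    sock.sendall(build_download_headers(file_path))
    sent = 0
    with open(file_path, "rb") as f:      # binary mode keeps bytes intact
        while True:
            chunk = f.read(BUFFER_SIZE)
            if not chunk:                 # empty bytes object means EOF
                break
            sock.sendall(chunk)
            sent += len(chunk)
    return sent
```

Because only one chunk is in memory at a time, peak memory stays near BUFFER_SIZE regardless of file size.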

4. Critical Security Measures Implemented 🛡️

The RequestHandler incorporates two mandatory security validation steps to prevent common network attacks.

A. Path Traversal Protection

  • Function: _safe_path_resolve(request_path)
  • Mechanism: The function uses os.path.abspath() to resolve the requested path into a normalized absolute path. Note that abspath() collapses . and .. segments but does not resolve symbolic links.
  • Strict Validation: The normalized path is checked to ensure it begins with the absolute path of the designated resources directory. Requests that use sequences such as .. or ./ to reach files outside the server root are blocked, resulting in a 403 Forbidden response.
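
A check along these lines could look like the sketch below. It is not the repository's `_safe_path_resolve`: it swaps the plain "begins with" string test for `os.path.commonpath`, which compares whole path components, so that a sibling directory such as `resources_backup` is not mistaken for the web root. Mapping `/` to `index.html`, dropping the query string, and returning `None` for rejected paths are assumptions, and percent-decoding is not handled.

```python
import os

RESOURCES_DIR = os.path.abspath("resources")  # web root, relative to the cwd


def safe_path_resolve(request_path, root=RESOURCES_DIR):
    """Return the absolute path for request_path, or None if it escapes root.

    A None result would be answered with 403 Forbidden by the caller.
    """
    root = os.path.abspath(root)
    # Drop any query string and the leading slash; "/" maps to index.html
    # (an assumption for this sketch).
    relative = request_path.split("?", 1)[0].lstrip("/") or "index.html"
    # abspath() collapses "." and ".." segments but does not follow symlinks.
    candidate = os.path.abspath(os.path.join(root, relative))
    # The candidate is inside root only if root is their common ancestor.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

For example, `safe_path_resolve("/../server.py")` returns `None`, while `safe_path_resolve("/about.html")` returns the path of about.html inside the resources directory.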

B. Host Header Validation

  • Function: _validate_host()
  • Mechanism: The mandatory Host header is checked against the server's configured host/port tuple ([Host:Port]).
  • Validation Rules:
    1. Missing Host: Returns 400 Bad Request.
    2. Mismatch: Returns 403 Forbidden if the header value does not match the expected server address (e.g., 127.0.0.1:8080, localhost:8080, or the server's configured IP).
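
The two rules above can be expressed as a small function. This is a sketch under assumptions: headers are assumed to arrive as a dict with lowercase names, the port is assumed to be required in the Host value, and the function returns a status code rather than sending a response. The name `validate_host` and its parameters are hypothetical.

```python
def validate_host(headers, server_host="127.0.0.1", server_port=8080):
    """Return 200 if the Host header is acceptable, otherwise 400 or 403."""
    host_value = headers.get("host")
    if not host_value:
        return 400  # Host is mandatory in HTTP/1.1

    # Accept the configured address plus the usual loopback names.
    allowed = {
        f"{name}:{server_port}"
        for name in (server_host, "127.0.0.1", "localhost")
    }
    if host_value.strip().lower() in allowed:
        return 200
    return 403
```

With the defaults, `validate_host({"host": "localhost:8080"})` is accepted, while a request carrying `Host: evil.example:8080` is rejected with 403.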

5. Known Limitations and Design Notes ⚠️

While the server meets all specified requirements, it has inherent limitations due to its low-level, educational implementation:

  • Brittle HTTP Parsing: The parser relies on simple string splitting for the request line and headers. It is not resilient against complex, non-standard, or deliberately malformed HTTP payloads, unlike a production-grade library.
  • Sequential Keep-Alive Processing: HTTP/1.1 connection persistence is implemented, but all requests within a single persistent connection are handled sequentially by the same thread. The concurrency benefits apply only to simultaneous connections from different clients.
  • Basic Timeout Handling: The socket timeout mechanism is limited to detecting connection idleness (no data received). It lacks advanced features for monitoring slow data transmission rates to protect against Denial of Service (DoS) attacks like Slowloris.
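
To make the first limitation concrete, a split-based parser of the kind described might look like this. It is an illustrative sketch, not the code in server.py, and the name `parse_request` is hypothetical.

```python
def parse_request(raw):
    """Parse raw request bytes into (method, path, version, headers, body).

    Raises ValueError on input that does not split cleanly.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: exactly three space-separated tokens are expected.
    parts = lines[0].split(" ")
    if len(parts) != 3:
        raise ValueError("malformed request line")
    method, path, version = parts

    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line")
        # A repeated header silently overwrites the earlier value.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body
```

This sketch shows where the brittleness comes from: it assumes CRLF line endings throughout, keeps only the last value of a repeated header, and rejects any request line with extra spaces.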
