⚡ hyprTrader: Low-Latency Limit Order Book & Matching Engine

hyprTrader is a high-frequency trading (HFT) simulation engine built in C++20. It implements a Limit Order Book (LOB) with Price-Time Priority matching logic, optimized for sub-microsecond latency on commodity hardware.

The system bypasses standard library overheads by utilizing custom memory arenas, lock-free concurrency patterns, and cache-friendly data structures to minimize Tick-to-Trade latency.

🚀 Key Technical Features

Custom Memory Arena: Implemented a slab allocator (MemoryPool) to eliminate malloc/free overhead on the hot path, ensuring O(1) allocation/deallocation and preventing memory fragmentation.
Lock-Free Concurrency: Utilizes a Single-Producer Single-Consumer (SPSC) ring buffer using std::atomic with explicit memory ordering (acquire/release) for wait-free thread communication.
Cache Locality: Order Book data structures are designed with contiguous memory layouts (flat arrays over node-based trees) to minimize CPU cache misses.
Zero-Copy Architecture: Pointers/Indices are passed through the pipeline rather than copying full Order objects.
Determinism: Engine logic is separated from IO to allow for deterministic replay of market data (PCAP) for backtesting.

🏗️ System Architecture

High-Level Data Flow

The system separates the Network/Feed Handler (Producer) from the Matching Engine (Consumer) to ensure the core logic never blocks on I/O.

graph LR
    %% Nodes
    A[Market Data Feed] -->|TCP/UDP| B(Gateway / Feed Handler)
    B -->|Order Object| C{Lock-Free Ring Buffer}
    C -->|Pop| D[Matching Engine Core]
    D -->|Match| E[(Limit Order Book)]
    E -->|Execution Report| F[Trade Log / Gateway]

    %% Styling - Professional "Systems" Palette
    classDef hardware fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#333;
    classDef logic fill:#d4e1f5,stroke:#333,stroke-width:2px,color:#333;
    classDef storage fill:#fff9c4,stroke:#d4a017,stroke-width:2px,color:#333;
    classDef critical fill:#ffccbc,stroke:#bf360c,stroke-width:2px,color:#333;

    %% Apply Styles
    class A,F hardware;
    class B,D logic;
    class E storage;

    %% Specific Style for the Buffer (The "Hot Path")
    style C fill:#b2dfdb,stroke:#00695c,stroke-width:2px,color:#004d40

Memory Layout Strategy

Instead of using std::map (Red-Black Tree) which scatters nodes in heap memory, hyprTrader uses pre-allocated vectors and intrusive linked lists.

classDiagram
    direction TB
    %% Styling
    classDef mainClass fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000;
    classDef subClass fill:#ffffff,stroke:#333,stroke-width:1px,color:#000;

    class MemoryPool {
        +vector~Order~ store
        +vector~size_t~ free_indices
        +allocate() Order*
        +deallocate(Order*)
    }
    class Order {
        +uint32_t id
        +uint32_t price
        +uint32_t next_order_index
        +uint32_t prev_order_index
    }

    %% Apply Styles
    class MemoryPool mainClass
    class Order subClass

    MemoryPool "1" *-- "100,000" Order : Pre-Allocated Contiguous Block

📊 Performance Benchmarks

Hardware: Fedora Linux (WSL2), GCC 15.x, Intel/AMD Host CPU

1. Memory Allocation Latency

Comparison between standard new/delete and hyprTrader::MemoryPool.

Allocator Type	Operations	Avg Latency (ns)	99th % Latency (ns)
std::allocator (Heap)	1,000,000	~120 ns	~450 ns
Custom MemoryPool	1,000,000	~4 ns	~6 ns

2. Order Matching Latency (Tick-to-Trade)

Time taken from receiving an order to generating a fill (match).

Scenario	Throughput (msgs/sec)	Avg Latency (µs)
Add Order (No Match)	2.5 Million	0.35 µs
Match Execution	1.8 Million	0.85 µs

> Note: Benchmarks run using Google Benchmark framework with CPU isolation.

🛠️ Build & Run

Prerequisites

Linux Environment (Fedora/Ubuntu)
C++20 Compiler (GCC 11+ or Clang 14+)
CMake 3.15+

Build Instructions

# Clone the repository
git clone git@github.com:0xSaksham/hyprTrader.git
cd hyprTrader

# Create build directory
mkdir build && cd build

# Configure and Build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

Running Tests & Benchmarks

# Run Unit Tests (GTest)
./tests/hyprTrader_tests

# Run Micro-benchmarks
./benchmarks/hyprTrader_bench

🧠 Design Decisions & Trade-offs

Why C++20?

Used std::span for safe buffer views and concepts for constraining template types in the Memory Pool, enforcing type safety at compile time without runtime penalties.

Why SPSC Queue?

A Multi-Producer queue would require CAS (Compare-And-Swap) loops or heavy locking. Since we have a dedicated Gateway thread, an SPSC queue allows us to use simple atomic loads/stores, which is the fastest possible thread sync mechanism.

Intrusive Linked Lists

Standard std::list performs a specialized allocation for every node. By embedding next/prev indices directly into the Order struct, we keep the data cache-hot and avoid pointer chasing across the heap.

🗺️ Roadmap

Core Order Book Data Structures
Custom Arena Allocator
ITCH 5.0 Market Data Parser
TCP Order Entry Gateway
Docker containerization for deployment

👤 Author

Saksham Gupta Aspiring Low-Latency Systems Engineer GitHub | LinkedIn | Blog

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.github/workflows		.github/workflows
benchmarks		benchmarks
src		src
tests		tests
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

⚡ hyprTrader: Low-Latency Limit Order Book & Matching Engine

🚀 Key Technical Features

🏗️ System Architecture

High-Level Data Flow

Memory Layout Strategy

📊 Performance Benchmarks

1. Memory Allocation Latency

2. Order Matching Latency (Tick-to-Trade)

🛠️ Build & Run

Prerequisites

Build Instructions

Running Tests & Benchmarks

🧠 Design Decisions & Trade-offs

Why C++20?

Why SPSC Queue?

Intrusive Linked Lists

🗺️ Roadmap

👤 Author

About

Uh oh!

Releases

Packages

Languages

0xSaksham/hyprTrader

Folders and files

Latest commit

History

Repository files navigation

⚡ hyprTrader: Low-Latency Limit Order Book & Matching Engine

🚀 Key Technical Features

🏗️ System Architecture

High-Level Data Flow

Memory Layout Strategy

📊 Performance Benchmarks

1. Memory Allocation Latency

2. Order Matching Latency (Tick-to-Trade)

🛠️ Build & Run

Prerequisites

Build Instructions

Running Tests & Benchmarks

🧠 Design Decisions & Trade-offs

Why C++20?

Why SPSC Queue?

Intrusive Linked Lists

🗺️ Roadmap

👤 Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages