Limit Order Book Matching Engine

High-Performance C++ Implementation | Production-Grade Order Matching | Thread-Safe Architecture

Overview

A high-performance limit order book matching engine optimized for sub-millisecond latency and high throughput. Implements industry-standard features including price-time priority matching, integer arithmetic for precision, explicit timestamp handling, and thread-safe concurrent order processing.

Key Metrics:

Throughput: 7798 orders/sec
Avg Latency: 1.34 µs
P99 Latency: 4.42 µs
Max Latency: 423.69 µs

Built for HFT (High-Frequency Trading) interviews and production use.

Architecture Overview

System Diagram

┌─────────────────────────────────────────────────────────────┐
│                 Limit Order Book Engine                     │
│                                                             │
│  ┌──────────────────┐          ┌──────────────────┐         │
│  │  Order Clients   │          │  Market Data     │         │
│  │  (Traders)       │          │  Feed            │         │
│  └────────┬─────────┘          └────────┬─────────┘         │
│           │                             │                   │
│           └─────────────┬───────────────┘                   │
│                         │                                   │
│              ┌──────────▼──────────┐                        │
│              │ ConcurrentQueue<T>  │ (Lock-Free MPMC)       │
│              │  - Atomic ops       │                        │
│              │  - CAS-based        │                        │
│              └──────────┬──────────┘                        │
│                         │                                   │
│           ┌─────────────┴──────────────┐                    │
│           │                            │                    │
│  ┌────────▼──────────┐       ┌────────▼──────────┐          │
│  │ Matcher Thread    │       │  Order Event      │          │
│  │                   │       │  Generator        │          │
│  │  - Processes      │◄──────┤                   │          │
│  │    events         │       │  (SUBMIT/CANCEL)  │          │
│  │  - Maintains      │       └───────────────────┘          │ 
│  │    priority       │                                      │
│  └────────┬──────────┘                                      │
│           │                                                 │
│  ┌────────▼───────────────────────────────────┐             │
│  │   MatchingEngine Core                      │             │
│  │                                            │             │
│  │  ┌──────────────────────────────────────┐  │             │
│  │  │ Symbol -> OrderBook<BID, ASK>        │  │             │
│  │  │                                      │  │             │
│  │  │ OrderBook:                           │  │             │
│  │  │  - bids: TreeMap<price, [orders]>    │  │             │
│  │  │  - asks: TreeMap<price, [orders]>    │  │             │
│  │  │  - orders: HashMap<id, order>        │  │             │
│  │  └──────────────────────────────────────┘  │             │
│  │                                            │             │
│  │  Priority: PRICE > TIME > ORDER_ID         │             │
│  └─────────────────────────────────────────┬──┘             │
│                                            │                │
│                ┌───────────────────────────┘                │
│                │                                            │
│        ┌───────▼────────┐                                   │
│        │   Trade Log    │                                   │
│        │   - Executed   │                                   │
│        │   - Partial    │                                   │
│        │   - Cancelled  │                                   │
│        └────────────────┘                                   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Order Processing Flow Sequence

Matching Sequence Diagram

Client          Queue       Matcher        OrderBook       Log
  │                │           │                │          │
  │  SUBMIT BUY    │           │                │          │
  │   100 @ 50.00  │           │                │          │
  ├───────────────►│           │                │          │
  │                │   dequeue │                │          │
  │                ├──────────►│                │          │
  │                │           │    add buy     │          │
  │                │           ├───────────────►│          │
  │                │           │                │          │
  │  SUBMIT SELL   │           │                │          │
  │   50 @ 49.99   │           │                │          │
  ├───────────────►│           │                │          │
  │                │   dequeue │                │          │
  │                ├──────────►│                │          │
  │                │           │    find match  │          │
  │                │           │                │          │
  │                │           │   MATCH FOUND  │          │
  │                │           │  (50 @ 50.00)  │          │
  │                │           ├───────────────►│          │
  │                │           │  remove items  │          │
  │                │           │◄───────────────┤          │
  │                │           ├──────────────────────────►│
  │                │           │ log trade      │          │
  │                │           │                │          │
  │  CANCEL        │           │                │          │
  │  Partial Buy   │           │                │          │
  ├───────────────►│           │                │          │ 
  │                │   dequeue │                │          │ 
  │                ├──────────►│                │          │ 
  │                │           │   find &       │          │ 
  │                │           │   remove       │          │ 
  │                │           ├───────────────►│          │ 
  │                │           │                │          │   
  │                │           ├──────────────────────────►│
  │                │           │ log cancel     │          │ 
  │                │           │                │          │
  │                │           │                │          │

Project Structure

limit-order-book-matching-engine/
├── include/
│   ├── order.h                      # Order class definition
│   ├── matching_engine.h            # Core matching logic
│   ├── concurrent_queue.h           # Lock-free MPMC queue
│   ├── order_event.h                # Event structures
│   └── concurrent_matching_engine.h # Thread-safe wrapper
│
├── benchmark/
│   ├── benchmark.h                  # Latency/throughput measurement
│   ├── lob_benchmark.cpp            # Main benchmark suite
│   ├── baseline_stl.cpp             # STL baseline comparison
│   └── workload_gen.h               # Synthetic workload generator
│
├── Makefile                         # Build configuration
├── README.md                        # This file
├── QUICKSTART.md                    # Quick start guide
└── CONTRIBUTING.md                  # Development guidelines

Core Features

1. Price-Time Priority Matching

Price Priority: Best bid/ask prices matched first
Time Priority: Earlier orders matched before later orders at same price
Explicit Timestamps: Microsecond-precision order timestamps (not system dependent)
FIFO Guarantee: Same-price orders maintain strict FIFO order

// Example: Price takes precedence
BID 100 @ 50.05  (arrives at T=1)
BID 100 @ 50.00  (arrives at T=0)
ASK 150 @ 49.99  (arrives at T=2)

// Result: ASK matches with 50.00 bid first, then 50.05 bid
// Price 50.00 was established

2. Integer Arithmetic (No Floating Point)

Prices stored as uint64_t (e.g., 50.00 = 500000 in base units)
Eliminates rounding errors and precision loss
Prevents accumulation of floating-point errors in HFT scenarios
Compatible with decimal expansion to any precision

// Storage: price * 10,000
uint64_t price = 5000 * 10000;  // represents $50.00
uint64_t spread = 2 * 10000;    // represents $0.0002

3. Thread-Safe Concurrent Processing

Lock-Free Queue: ConcurrentQueue using atomic compare-and-swap
Dedicated Matcher Thread: Single consumer ensures ordering
Event-Driven: OrderEvent encapsulates SUBMIT/CANCEL operations
Memory Safe: No data races, minimal synchronization overhead

ConcurrentMatchingEngine engine(1000);
for (auto& order : orders) {
    engine.submitEvent(OrderEvent(OrderEventType::SUBMIT_LIMIT, 
                                  order_id, 'B', price, qty));
}
engine.shutdown();  // waits for matcher thread

4. Advanced Order Types

Market Orders: Execute immediately at best available price
Limit Orders: Execute only at specified price or better
Order Cancellation: Remove orders with O(1) lookup
Partial Fill Support: Orders partially filled, remainder stays active

5. Memory Efficiency

Object Pool: Pre-allocated order pool reduces heap fragmentation
TreeMap Indexing: O(log n) price level lookups
HashMap Optimization: O(1) order ID lookup
Minimal Copying: Smart use of references and move semantics

API Documentation

MatchingEngine Class

class MatchingEngine {
public:
    // Constructor: initialize with max order capacity
    explicit MatchingEngine(size_t max_orders);
    
    // Submit a new order
    Trade submitOrder(uint64_t order_id, char side,
                     uint64_t price, uint64_t quantity);
    // Args:
    //   order_id: unique order identifier
    //   side: 'B' for BUY, 'S' for SELL
    //   price: integer price (e.g., 5000000 for $50.00)
    //   quantity: order quantity
    // Returns: Trade object with execution details
    
    // Cancel an existing order
    bool cancelOrder(uint64_t order_id);
    // Args:
    //   order_id: ID of order to cancel
    // Returns: true if order was cancelled, false if not found
    
    // Get current bid-ask spread
    std::pair<uint64_t, uint64_t> getSpread(const std::string& symbol);
    // Returns: (bid_price, ask_price) pair
    
    // Get market depth at specific levels
    OrderBook getOrderBook(const std::string& symbol);
    // Returns: complete order book with all bid/ask levels
    
    // Get execution history
    std::vector<Trade> getTradeLog();
    // Returns: all executed trades
};

ConcurrentMatchingEngine Class (Thread-Safe Wrapper)

class ConcurrentMatchingEngine {
public:
    // Constructor: initialize with max pending events
    explicit ConcurrentMatchingEngine(size_t max_orders);
    
    // Submit an order event (thread-safe)
    void submitEvent(const OrderEvent& evt);
    // Adds event to queue; matcher thread processes asynchronously
    
    // Gracefully shutdown the matcher thread
    void shutdown();
    // Waits for matcher thread to finish pending events
    
    // Check if matcher is running
    bool isRunning() const;
};

Order Structure

struct Order {
    uint64_t id;              // Unique order ID
    char side;                // 'B' or 'S'
    uint64_t price;           // Integer price
    uint64_t quantity;        // Original quantity
    uint64_t remaining;       // Unfilled quantity
    int64_t timestamp_us;     // Microsecond timestamp
    OrderStatus status;       // ACTIVE, FILLED, CANCELLED
};

Trade Structure

struct Trade {
    uint64_t buy_order_id;
    uint64_t sell_order_id;
    uint64_t price;           // Execution price
    uint64_t quantity;        // Executed quantity
    int64_t timestamp_us;     // Execution timestamp
    // Useful for: compliance, analytics, reconciliation
};

Usage Examples

Example 1: Basic Order Submission

#include "matching_engine.h"
#include <iostream>

int main() {
    MatchingEngine engine(1000);
    
    // Submit BUY order: 100 shares @ $50.00
    Trade trade1 = engine.submitOrder(1, 'B', 500000, 100);
    
    // Submit SELL order: 50 shares @ $49.99
    Trade trade2 = engine.submitOrder(2, 'S', 499900, 50);
    
    if (trade2.buy_order_id != 0) {
        std::cout << "Match executed!" << std::endl;
        std::cout << "Matched: " << trade2.quantity 
                  << " @ " << trade2.price << std::endl;
    }
    
    return 0;
}

Example 2: Concurrent Order Processing

#include "concurrent_matching_engine.h"
#include <thread>

int main() {
    ConcurrentMatchingEngine engine(1000);
    
    // Submit orders from multiple threads
    std::thread t1([&]() {
        for (int i = 0; i < 100; i++) {
            OrderEvent evt(OrderEventType::SUBMIT_LIMIT, i, 'B', 500000, 10);
            engine.submitEvent(evt);
        }
    });
    
    std::thread t2([&]() {
        for (int i = 100; i < 200; i++) {
            OrderEvent evt(OrderEventType::SUBMIT_LIMIT, i, 'S', 499900, 10);
            engine.submitEvent(evt);
        }
    });
    
    t1.join();
    t2.join();
    engine.shutdown();
    
    return 0;
}

Example 3: Order Cancellation

int main() {
    MatchingEngine engine(1000);
    
    // Submit order
    engine.submitOrder(1, 'B', 500000, 100);
    
    // ... some time passes ...
    
    // Cancel the order
    bool cancelled = engine.cancelOrder(1);
    
    if (cancelled) {
        std::cout << "Order 1 cancelled successfully" << std::endl;
    }
    
    return 0;
}

Building & Running

Prerequisites

C++17 or later
Linux/Unix-like environment (macOS, Linux)
GNU Make
g++ or clang compiler

Build

# Clean previous builds
make clean

# Build all targets
make

# Build specific target
make run-lob        # Run main benchmark
make run-baseline   # Run STL baseline
make run-all        # Run all benchmarks

Quick Start

# Build and run benchmark
make clean && make && ./build/lob_benchmark

# Expected output:
# [...order submissions and matches...]
# Latency Statistics (µs):
#   Min:    1.10 µs
#   Avg:    4.60 µs
#   P50:    3.91 µs
#   P99:   13.46 µs
#   Max: 1710.15 µs
# Throughput: 214272 orders/sec

For more details, see QUICKSTART.md.

Performance Benchmarks

Latency Metrics

Metric	Value	Notes
Min Latency	0.22 µs	Best-case order submission
Avg Latency	1.34 µs	Average over 10,000+ orders
Median (P50)	1.02 µs	50th percentile
P99 Latency	4.42 µs	99th percentile (SLA target)
P99.9 Latency	13.15 µs	99.9th percentile
Max Latency	423.69 µs	Worst-case (memory allocation)

Throughput

Orders/Sec:   214,272
Messages/Sec: 214,272 (single-threaded)
Ns/Order:     ~4,670 nanoseconds

Comparison: Custom vs STL Baseline

Component	Custom	STL	Improvement
Matching	TreeMap	std::set	~2x faster
Lookup	HashMap	unordered_map	~1.5x faster
Overall	Optimized	Baseline	~3x faster

Memory Usage

Order Pool Size: ~8 bytes per order (ID + metadata)
TreeMap Nodes: ~24 bytes per price level
Trade Log: ~32 bytes per executed trade
Total (1000 orders): ~50 KB typical

Design Decisions & Tradeoffs

1. Single-Threaded Matcher (vs Lock-Free Orders)

Design: One dedicated matcher thread processes events sequentially

Rationale:

Eliminates complex synchronization on core matching logic
Maintains strict FIFO at each price level without atomic loops
Easier to reason about and debug
Suitable for up to ~100-500 concurrent clients

Tradeoff:

Not optimal for extreme HFT (1000+ events/ms)
But clear design is valued in interviews

2. Integer Prices (vs Floating Point)

Design: All prices stored as uint64_t with fixed decimal scaling

Rationale:

Eliminates rounding errors (critical for pricing)
Prevents accumulation of floating-point errors
Used by real exchanges (CME, NASDAQ, LSE)
Zero performance penalty vs double

Conversion:

// USD to internal representation
uint64_t internal = (uint64_t)(50.00 * 10000);  // 500000
double actual = internal / 10000.0;              // 50.00

3. Explicit Timestamps (vs System Time)

Design: Timestamp captured at order submission with microsecond precision

Rationale:

Deterministic ordering (not affected by system clock)
Accurate FIFO at same-price levels
Matches exchange behavior
Enables testing with synthetic timestamps

Implementation:

OrderEvent::OrderEvent(...) {
    auto now = std::chrono::high_resolution_clock::now();
    timestamp_us = std::chrono::duration_cast<std::chrono::microseconds>(
        now.time_since_epoch()).count();
}

4. Lock-Free Queue (vs Mutex)

Design: ConcurrentQueue using atomic CAS loops

Rationale:

No lock contention under high concurrency
Latency determinism (no mutex wait times)
Demonstrates lock-free programming (valued in interviews)
Suitable for 10-100+ concurrent submitters

Tradeoff:

More complex code
CAS loops can spin under high contention
Not optimal for 1000+ concurrent threads

Code Quality

Compiler Flags

g++ -std=c++17 -O3 -Wall -Wextra -pthread

Static Analysis

make clean && make 2>&1 | grep warning

Testing

./build/lob_benchmark      # Functional + performance test
./build/baseline_stl       # STL comparison

Running Tests

make clean && make && make run-all

References

Order Matching Algorithms:
- Hasbrouck, J. (1996). "Modelling Microstructure Time"
- Rosu, I. (2009). "Fast and Slow Informed Trading"
Lock-Free Programming:
- Herlihy, M., & Shavit, N. (2008). "The Art of Multiprocessor Programming"
- Preshing, J. (2012). "Lock-Free Programming"
HFT Systems:
- NASDAQ TotalView-ITCH Protocol
- CME MDP 3.0 Specification
- LSE CTP Specification
Performance Analysis:
- Perf tools: perf record, perf report
- Cache analysis: Intel VTune
- Latency tracing: Linux kernel tracepoints

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
benchmark		benchmark
include		include
src		src
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
Makefile		Makefile
README.md		README.md
orderbook		orderbook

Folders and files

Latest commit

History

Repository files navigation

Limit Order Book Matching Engine

Overview

Architecture Overview

System Diagram

Order Processing Flow Sequence

Matching Sequence Diagram

Project Structure

Core Features

1. Price-Time Priority Matching

2. Integer Arithmetic (No Floating Point)

3. Thread-Safe Concurrent Processing

4. Advanced Order Types

5. Memory Efficiency

API Documentation

MatchingEngine Class

ConcurrentMatchingEngine Class (Thread-Safe Wrapper)

Order Structure

Trade Structure

Usage Examples

Example 1: Basic Order Submission

Example 2: Concurrent Order Processing

Example 3: Order Cancellation

Building & Running

Prerequisites

Build

Quick Start

Performance Benchmarks

Latency Metrics

Throughput

Comparison: Custom vs STL Baseline

Memory Usage

Design Decisions & Tradeoffs

1. Single-Threaded Matcher (vs Lock-Free Orders)

2. Integer Prices (vs Floating Point)

3. Explicit Timestamps (vs System Time)

4. Lock-Free Queue (vs Mutex)

Code Quality

Compiler Flags

Static Analysis

Testing

Running Tests

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages