MBP-10 Orderbook Reconstruction - Usage Guide

Quick Start

```bash
# Basic usage
./reconstruction_orderbook mbo.csv

# Specify output file
./reconstruction_orderbook mbo.csv my_output.csv

# Enable verbose logging
./reconstruction_orderbook mbo.csv --verbose

# Enable data validation
./reconstruction_orderbook mbo.csv --validate

# Combine options
./reconstruction_orderbook mbo.csv output.csv --verbose --validate
```

Command Line Options

| Option | Description |
|--------|-------------|
| --verbose, -v | Enable detailed processing logs |
| --validate | Enable comprehensive data validation |
| --help, -h | Show help message |

Understanding the Output

Processing Statistics

The program provides detailed statistics about the processing:

```
📊 Input Analysis:
  Total records: 5885
  • Add orders (A): 2915 (49.5%)
  • Cancel orders (C): 2913 (49.5%)
  • Trades (T): 46 (0.8%)
  • 'N' side records: 35 (ignored)

📈 Output Analysis:
  MBP records generated: 5828
  Conversion ratio: 99.0%

⚡ Performance Metrics:
  Processing time: 106 ms
  Processing rate: 55519 operations/second
```
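The two derived figures in the sample can be reproduced from the raw counts. The guide does not define either metric, so the formulas below are assumptions: conversion ratio as MBP records divided by total input records, and processing rate as total input records divided by elapsed time. Both happen to match the sample numbers.

```python
# Figures copied from the sample output in this guide.
total_records = 5885
mbp_records = 5828
elapsed_ms = 106

# Assumed definitions; the guide does not spell them out.
conversion_ratio = mbp_records / total_records * 100
ops_per_second = total_records / (elapsed_ms / 1000)

print(f"Conversion ratio: {conversion_ratio:.1f}%")                 # 99.0%
print(f"Processing rate: {ops_per_second:.0f} operations/second")   # 55519
```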

Validation Checks

When using --validate, the program performs additional checks:

  • Conversion Ratio: Should be >50% for healthy data
  • Orderbook Integrity: Ensures consistent state
  • Add/Cancel Balance: Checks for reasonable ratios
  • Bid/Ask Presence: Validates two-sided market
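As a rough illustration of how checks like these might be expressed, here is a minimal sketch. The `validate` function and the `stats` dictionary are hypothetical, the 50% threshold comes from this guide, and the add/cancel skew limit of 25% is an arbitrary placeholder because the guide does not say what counts as a reasonable ratio. Orderbook integrity is not modelled here.

```python
def validate(stats, min_conversion=50.0, max_add_cancel_skew=0.25):
    """Return a list of warning strings; an empty list means all checks passed.

    `stats` is a hypothetical dict of counters. The skew limit is an
    illustrative assumption, not a value taken from the program.
    """
    warnings = []

    # Conversion ratio: share of input records that produced an MBP record.
    total = stats["total_records"]
    ratio = 100.0 * stats["mbp_records"] / total if total else 0.0
    if ratio <= min_conversion:
        warnings.append(f"low conversion ratio: {ratio:.1f}%")

    # Add/cancel balance: relative gap between the two counts.
    adds, cancels = stats["adds"], stats["cancels"]
    busiest = max(adds, cancels)
    if busiest and abs(adds - cancels) / busiest > max_add_cancel_skew:
        warnings.append(f"add/cancel imbalance: {adds} adds vs {cancels} cancels")

    # Bid/ask presence: a healthy book has at least one level on each side.
    if stats["bid_levels"] == 0 or stats["ask_levels"] == 0:
        warnings.append("one-sided book: missing bids or asks")

    return warnings
```

Called with the counts from the sample run (5885 records, 5828 MBP records, 2915 adds, 2913 cancels) and a two-sided book, this sketch returns an empty list.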

Final Orderbook State

The program displays the final orderbook with:

  • Top 5 price levels on each side
  • Order counts per level
  • Current spread
  • Total orders and levels
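A summary of this kind could be derived from the set of resting orders roughly as follows. This is a minimal sketch: the `summarize_book` name, the order tuple layout, integer prices, and the 'B'/'A' side codes are assumptions made for illustration, not the program's actual data structures.

```python
from collections import defaultdict

def summarize_book(orders, depth=5):
    """Aggregate resting orders into price levels and report the top of book.

    `orders` maps order_id -> (side, price, size), with side 'B' or 'A'.
    """
    levels = {"B": defaultdict(lambda: [0, 0]), "A": defaultdict(lambda: [0, 0])}
    for side, price, size in orders.values():
        level = levels[side][price]
        level[0] += size   # total size resting at this price
        level[1] += 1      # number of orders at this price

    # Best bid is the highest price, best ask the lowest.
    bids = sorted(levels["B"].items(), reverse=True)[:depth]
    asks = sorted(levels["A"].items())[:depth]
    spread = asks[0][0] - bids[0][0] if bids and asks else None

    return {
        "bids": [(price, size, count) for price, (size, count) in bids],
        "asks": [(price, size, count) for price, (size, count) in asks],
        "spread": spread,
        "total_orders": len(orders),
        "total_levels": len(levels["B"]) + len(levels["A"]),
    }
```

The spread is reported as `None` when either side is empty, which mirrors the two-sided market check performed under --validate.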

Data Requirements

Input Format (MBO CSV)

The input file should contain MBO records with these columns:

```
ts_recv,ts_event,rtype,publisher_id,instrument_id,action,side,price,size,flags,order_id,priority,channel_id,ts_in_delta,symbol
```
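Before running a large file, it can be useful to confirm that the header matches. The helper below is a sketch using only the Python standard library; `missing_columns` is a hypothetical name, and the column list is copied from this guide.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "ts_recv", "ts_event", "rtype", "publisher_id", "instrument_id",
    "action", "side", "price", "size", "flags", "order_id", "priority",
    "channel_id", "ts_in_delta", "symbol",
]

def missing_columns(handle):
    """Return the expected columns that are absent from the CSV header row.

    `handle` is any text file object positioned at the start of the file.
    """
    header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in EXPECTED_COLUMNS if name not in present]

# Example with an in-memory file that lacks the last two columns.
sample = io.StringIO(",".join(EXPECTED_COLUMNS[:-2]) + "\n")
print(missing_columns(sample))  # ['ts_in_delta', 'symbol']
```

For a file on disk, pass `open("mbo.csv", newline="")` instead of the `StringIO` object.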

Supported Actions

| Action | Description | Processing |
|--------|-------------|------------|
| A | Add order | Creates new order in book |
| C | Cancel order | Removes order from book |
| M | Modify order | Updates existing order |
| T | Trade | Part of T→F→C sequence |
| F | Fill | Part of T→F→C sequence |
| R | Clear | Ignored (as per requirements) |

Special Handling

  1. Initial Clear: First 'R' action is automatically skipped
  2. Trade Sequences: T→F→C sequences are combined into single 'T' records
  3. 'N' Side Trades: Completely ignored
  4. Unknown Actions: Counted but ignored
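The filtering rules above can be sketched as a single pass over the input. This is an illustration under stated assumptions, not the program's code: `filter_events` and the event dictionaries are hypothetical, and the sketch drops every 'R' record, which covers the initial clear. Combining T→F→C sequences is left out.

```python
from collections import Counter

KNOWN_ACTIONS = {"A", "C", "M", "T", "F", "R"}

def filter_events(events):
    """Split events into those that reach the book and a tally of the rest.

    `events` is an iterable of dicts with 'action' and 'side' keys.
    Returns (kept, skipped), where skipped is a Counter keyed by reason.
    """
    kept, skipped = [], Counter()
    for event in events:
        action, side = event["action"], event["side"]
        if action not in KNOWN_ACTIONS:
            skipped["unknown_action"] += 1   # counted but otherwise ignored
        elif action == "R":
            skipped["clear"] += 1            # clear records never touch the book
        elif action == "T" and side == "N":
            skipped["n_side_trade"] += 1     # trades with no side are dropped
        else:
            kept.append(event)
    return kept, skipped
```

Keeping the skipped counts separate by reason makes it easier to explain a low conversion ratio later.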

Performance Expectations

Typical Performance

  • Processing Rate: 50,000-100,000 operations/second
  • Memory Usage: ~64 bytes per active order
  • Conversion Ratio: 90-99% for healthy data
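These figures allow a back-of-the-envelope capacity estimate. The helpers below are hypothetical and simply apply the numbers quoted above; real memory use will be higher once allocator and container overhead are included.

```python
BYTES_PER_ORDER = 64  # approximate figure quoted in this guide

def estimate_book_memory_mib(active_orders):
    """Rough size of the live order set in MiB, ignoring allocator overhead."""
    return active_orders * BYTES_PER_ORDER / (1024 * 1024)

def estimate_seconds(records, ops_per_second=50_000):
    """Pessimistic runtime, using the low end of the quoted processing rate."""
    return records / ops_per_second

print(f"{estimate_book_memory_mib(1_000_000):.2f} MiB")  # 61.04 MiB
print(f"{estimate_seconds(10_000_000):.0f} s")           # 200 s
```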

Optimization Tips

  1. Use release build (make) for maximum performance
  2. Run on dedicated CPU cores for consistent timing
  3. Ensure sufficient RAM for large datasets
  4. Use SSD storage for faster I/O

Troubleshooting

Common Issues

Low Conversion Ratio (<50%)

  • Check for excessive 'N' side trades
  • Verify T→F→C sequence completeness
  • Look for data corruption or missing records

Size Underflow Warnings

  • Indicates cancel operations exceeding available quantity
  • May suggest out-of-order data or missing records
  • Usually handled gracefully with warnings
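One common way to handle this situation is to clamp the remaining size at zero and log a warning rather than abort. The sketch below assumes that approach; `apply_cancel` and the `orders` mapping are hypothetical, and the program's actual behaviour may differ.

```python
import logging

log = logging.getLogger("orderbook_sketch")

def apply_cancel(orders, order_id, cancel_size):
    """Reduce a resting order, clamping at zero instead of going negative.

    `orders` maps order_id -> remaining size. Returns True if the cancel
    applied cleanly, False if it was clamped or the order was unknown.
    """
    remaining = orders.get(order_id)
    if remaining is None:
        log.warning("cancel for unknown order %s", order_id)
        return False

    clean = cancel_size <= remaining
    if not clean:
        log.warning("size underflow on order %s: cancel %d > resting %d",
                    order_id, cancel_size, remaining)

    new_size = max(remaining - cancel_size, 0)
    if new_size == 0:
        del orders[order_id]          # fully cancelled orders leave the book
    else:
        orders[order_id] = new_size
    return clean
```

Counting the `False` results over a run gives a quick measure of how much out-of-order or missing data the input contains.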

Missing Orders in Final State

  • Normal if most orders were cancelled during session
  • Check add/cancel ratio for balance

Debug Mode

For detailed debugging, use the debug build:

```bash
make debug
./reconstruction_orderbook_debug mbo.csv --verbose --validate
```

This enables:

  • Address sanitizer for memory issues
  • Undefined behavior detection
  • Additional validation checks
  • Detailed error reporting

Output Format (MBP CSV)

The output file contains MBP records with the same column structure as the input, but each record represents an aggregated price level rather than an individual order.

Key Differences from Input

  • Aggregated Quantities: Size represents total at price level
  • Trade Records: T→F→C sequences become single 'T' records
  • Order IDs: May be 0 for trades (no specific order)
  • Reduced Volume: Filtered and combined records
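To make the trade-record difference concrete, the sketch below collapses each run of three consecutive records with actions T, F, C into one 'T' record. The adjacency rule, the choice to copy fields from the 'T' record, and setting `order_id` to 0 are all assumptions for illustration; the guide does not specify how the program matches or merges the three records.

```python
def collapse_trade_sequences(records):
    """Replace each consecutive T, F, C triple with a single 'T' record.

    `records` is a list of dicts with at least 'action' and 'order_id' keys.
    Incomplete sequences are passed through unchanged.
    """
    out, i = [], 0
    while i < len(records):
        window = [r["action"] for r in records[i:i + 3]]
        if window == ["T", "F", "C"]:
            trade = dict(records[i])   # copy so the input is not mutated
            trade["order_id"] = 0      # assumed: no single order owns the trade
            out.append(trade)
            i += 3
        else:
            out.append(records[i])
            i += 1
    return out
```

Because incomplete sequences pass through untouched, a stray 'T' or 'F' in the result is a quick signal that the input is missing records.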

Integration

Batch Processing

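```bash
# Process multiple files
for file in *.csv; do
    # Skip outputs from an earlier run, which also match *.csv
    case "$file" in mbp_*) continue ;; esac
    ./reconstruction_orderbook "$file" "mbp_$file"
done
```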

Pipeline Integration

```bash
# Use in data pipeline
cat mbo_data.csv | ./reconstruction_orderbook /dev/stdin /dev/stdout | next_processor
```

Performance Monitoring

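```bash
# Time execution
time ./reconstruction_orderbook large_file.csv

# Memory profiling
# Note: Valgrind generally cannot run binaries instrumented with
# AddressSanitizer; if this fails, profile a build without the sanitizer.
valgrind --tool=massif ./reconstruction_orderbook_debug mbo.csv
```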