```bash
./reconstruction_orderbook mbo.csv
./reconstruction_orderbook mbo.csv my_output.csv
./reconstruction_orderbook mbo.csv --verbose
./reconstruction_orderbook mbo.csv --validate
./reconstruction_orderbook mbo.csv output.csv --verbose --validate
```
| Option | Description |
|---|---|
| `--verbose`, `-v` | Enable detailed processing logs |
| `--validate` | Enable comprehensive data validation |
| `--help`, `-h` | Show help message |
The program provides detailed statistics about the processing:
```
📊 Input Analysis:
   Total records: 5885
   - Add orders (A): 2915 (49.5%)
   - Cancel orders (C): 2913 (49.5%)
   - Trades (T): 46 (0.8%)
   - 'N' side records: 35 (ignored)

📈 Output Analysis:
   MBP records generated: 5828
   Conversion ratio: 99.0%

⚡ Performance Metrics:
   Processing time: 106 ms
   Processing rate: 55519 operations/second
```
When using `--validate`, the program performs additional checks:
- Conversion Ratio: Should be >50% for healthy data
- Orderbook Integrity: Ensures consistent state
- Add/Cancel Balance: Checks for reasonable ratios
- Bid/Ask Presence: Validates two-sided market
The program displays the final orderbook with:
- Top 5 price levels on each side
- Order counts per level
- Current spread
- Total orders and levels
The input file should contain MBO records with these columns:

```
ts_recv,ts_event,rtype,publisher_id,instrument_id,action,side,price,size,flags,order_id,priority,channel_id,ts_in_delta,symbol
```
| Action | Description | Processing |
|---|---|---|
| A | Add order | Creates new order in book |
| C | Cancel order | Removes order from book |
| M | Modify order | Updates existing order |
| T | Trade | Part of T→F→C sequence |
| F | Fill | Part of T→F→C sequence |
| R | Clear | Ignored (as per requirements) |
- Initial Clear: First 'R' action is automatically skipped
- Trade Sequences: T→F→C sequences are combined into single 'T' records
- 'N' Side Trades: Completely ignored
- Unknown Actions: Counted but ignored
- Processing Rate: 50,000-100,000 operations/second
- Memory Usage: ~64 bytes per active order
- Conversion Ratio: 90-99% for healthy data
- Use the release build (`make`) for maximum performance
- Run on dedicated CPU cores for consistent timing
- Ensure sufficient RAM for large datasets
- Use SSD storage for faster I/O
**Low Conversion Ratio (<50%)**
- Check for excessive 'N' side trades
- Verify T→F→C sequence completeness
- Look for data corruption or missing records
**Size Underflow Warnings**
- Indicates cancel operations exceeding available quantity
- May suggest out-of-order data or missing records
- Usually handled gracefully with warnings
**Missing Orders in Final State**
- Normal if most orders were cancelled during session
- Check add/cancel ratio for balance
For detailed debugging, use the debug build:

```bash
make debug
./reconstruction_orderbook_debug mbo.csv --verbose --validate
```
This enables:
- Address sanitizer for memory issues
- Undefined behavior detection
- Additional validation checks
- Detailed error reporting
The output file contains MBP records with the same column structure as input, but represents aggregated price levels rather than individual orders.
- Aggregated Quantities: Size represents total at price level
- Trade Records: T→F→C sequences become single 'T' records
- Order IDs: May be 0 for trades (no specific order)
- Reduced Volume: Filtered and combined records
```bash
for file in *.csv; do
  ./reconstruction_orderbook "$file" "mbp_$file"
done
```
```bash
cat mbo_data.csv | ./reconstruction_orderbook /dev/stdin /dev/stdout | next_processor
```
```bash
time ./reconstruction_orderbook large_file.csv
valgrind --tool=massif ./reconstruction_orderbook_debug mbo.csv
```