Subquery Performance Benchmarks
Create system-level benchmarks for subqueries to assess performance, scalability, and memory usage.
Context
Subqueries are implemented using in-memory Materializer processes that maintain indexes of subquery results and trigger move-in/move-out events when data changes. We need benchmarks that run the full system (sync service + Postgres) and measure real-world performance characteristics.
Memory Locations to Monitor
Consumer Process State (Consumer.State)
| Field | What It Stores | Memory Impact |
|---|---|---|
| `buffer` | Transactions queued while buffering (reverse order) | O(buffered_txns × txn_size) - can grow unbounded during initial snapshot or move-in waits |
| `txn_offset_mapping` | List of `{shape_offset, txn_boundary}` tuples for flush alignment | O(unflushed_txns × 32 bytes) |
| `writer` | Storage writer state (ETS refs, file handles, buffers) | Varies by storage backend |
| `shape` | Full `Shape` struct including `shape_dependencies` | O(query_complexity + nested_shapes) |
| `transaction_builder` | Partial transaction fragments being assembled | O(fragment_size) - transient |
MoveIns State (Consumer.MoveIns)
| Field | What It Stores | Memory Impact |
|---|---|---|
| `waiting_move_ins` | Map of name → `{pg_snapshot, {ref_key, MapSet[moved_values]}}` | O(concurrent_move_ins × values_per_move_in) |
| `filtering_move_ins` | List of `{pg_snapshot, MapSet[keys]}` for completed but filtering move-ins | O(filtering_move_ins × keys_per_move_in) - can be large if many rows moved in |
| `touch_tracker` | Map of key → xid tracking which keys were touched | O(touched_keys) - grows with change volume, GC'd periodically |
| `in_flight_values` | Precalculated map of all moved-in values to skip in WHERE evaluation | O(total_in_flight_values) |
| `moved_out_tags` | Map of move_in_name → `MapSet[tags]` for move-outs during move-in | O(concurrent_move_ins × moved_out_tags) |
| `move_in_buffering_snapshot` | Union snapshot `{xmin, xmax, xip_list}` | ~100 bytes (but `xip_list` can grow) |
| `maximum_resolved_snapshot` | Snapshot for visibility boundary | ~100 bytes |
| `minimum_unresolved_snapshot` | Snapshot for visibility boundary | ~100 bytes |
InitialSnapshot State (Consumer.InitialSnapshot)
| Field | What It Stores | Memory Impact |
|---|---|---|
| `pg_snapshot` | Tuple `{xmin, xmax, xip_list}` | O(in_progress_txns) - `xip_list` can be large under high concurrency |
| `awaiting_snapshot_start` | List of `GenServer.from()` references | O(waiting_clients) |
Materializer Process State (Consumer.Materializer)
| Field | What It Stores | Memory Impact |
|---|---|---|
| `index` | All rows in subquery result: key → value | O(rows × row_size) - primary memory consumer |
| `tag_indices` | Reverse index: tag_hash → `MapSet[keys]` | O(unique_tags × keys_per_tag) |
| `value_counts` | Reference counting: value → count | O(unique_values × 16 bytes) |
| `subscribers` | MapSet of subscribed PIDs | O(subscriber_count) |
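
The Memory Impact columns can be turned into a back-of-envelope lower bound to compare against measured numbers. Below is a minimal Python sketch for the Materializer only. It assumes the dominant terms are `index` (rows × row_size) and `value_counts` (unique_values × 16 bytes, the figure from the table). The class and function names and the `row_size_bytes` input are hypothetical, and `tag_indices`, `subscribers`, and any runtime term overhead are deliberately ignored.

```python
from dataclasses import dataclass

# Per-entry cost taken from the value_counts row of the Materializer table.
VALUE_COUNT_ENTRY_BYTES = 16


@dataclass(frozen=True)
class MaterializerLoad:
    """Hypothetical description of one subquery's result set."""
    rows: int              # rows in the subquery result
    row_size_bytes: int    # assumed average size of one stored row
    unique_values: int     # distinct values tracked by value_counts


def estimate_materializer_bytes(load: MaterializerLoad) -> dict:
    """Return a rough lower-bound estimate, broken down by field."""
    index_bytes = load.rows * load.row_size_bytes
    value_counts_bytes = load.unique_values * VALUE_COUNT_ENTRY_BYTES
    return {
        "index": index_bytes,
        "value_counts": value_counts_bytes,
        "total": index_bytes + value_counts_bytes,
    }


if __name__ == "__main__":
    load = MaterializerLoad(rows=100_000, row_size_bytes=200, unique_values=50_000)
    print(estimate_materializer_bytes(load))
```

A measured footprint far above this estimate would suggest that one of the ignored fields (for example `tag_indices`) or per-term overhead is worth profiling separately.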
Proposed Benchmarks
1. Concurrent Shape Creation
Measure:
- Time to create all shapes
- Memory usage
Vary:
- Number of shapes
- Number of subqueries per shape
- Size of subquery result set
- Subquery nesting depth
- DB latency (higher latency increases buffer size)
- Composite vs single-column key
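
One way to drive the "Vary" dimensions above is a full cross-product of parameter values. The sketch below is only an illustration: the dimension names mirror the list, but every concrete value (10 vs 100 shapes, 0 vs 50 ms latency, and so on) is a placeholder rather than a proposed setting, and `expand_matrix` is a hypothetical helper.

```python
from itertools import product


def expand_matrix(dimensions: dict) -> list:
    """Expand {name: [values...]} into one dict per parameter combination."""
    names = list(dimensions)
    value_lists = [dimensions[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Placeholder values only; the real ranges still need to be decided.
SHAPE_CREATION_MATRIX = {
    "shapes": [10, 100],
    "subqueries_per_shape": [1, 3],
    "subquery_result_rows": [1_000, 100_000],
    "nesting_depth": [1, 2],
    "db_latency_ms": [0, 50],
    "composite_key": [False, True],
}

if __name__ == "__main__":
    runs = expand_matrix(SHAPE_CREATION_MATRIX)
    print(len(runs), "benchmark configurations")
    print(runs[0])
```

A full cross-product grows quickly (two values per dimension already gives 64 runs), so in practice it may be preferable to vary one dimension at a time around a fixed baseline.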
2. Replication Throughput
Measure:
- Latency from insert to client receipt
- Memory usage
Vary:
- Number of shapes
- Number of transactions
- Number of rows per transaction
- Number of subqueries per shape
- Size of subquery result set
- Composite vs single-column key
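
For the insert-to-client latency measurement, a harness could record a send timestamp per row and a receive timestamp when the client sees it, then report percentiles. The following is a minimal sketch under those assumptions. The function names, the dictionary-based bookkeeping, and the choice of nearest-rank percentiles are all illustrative, not part of the existing tooling.

```python
import math


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile (p in 0..100) of a non-empty sample list."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_latencies(sent_at: dict, received_at: dict) -> dict:
    """Summarise per-row latency from send/receive timestamps keyed by row id.

    Rows that were sent but never received are counted as lost instead of
    being silently dropped from the statistics.
    """
    latencies = [received_at[k] - sent_at[k] for k in sent_at if k in received_at]
    summary = {"count": len(latencies), "lost": len(sent_at) - len(latencies)}
    if latencies:
        summary.update(
            p50=percentile(latencies, 50),
            p95=percentile(latencies, 95),
            max=max(latencies),
        )
    return summary


if __name__ == "__main__":
    sent = {"row-1": 0.00, "row-2": 0.10, "row-3": 0.20}
    received = {"row-1": 0.03, "row-2": 0.18}
    print(summarize_latencies(sent, received))
```

Reporting tail percentiles alongside the median matters here because buffering during move-ins or snapshots is likely to show up as occasional large delays rather than a shift in the typical case.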
3. Move-In/Move-Out
Measure:
- Latency from insert/delete to client receipt
- Memory usage (peak and steady-state)
- Connection pool utilisation
- GC pause times
- Message queue lengths
Vary:
- Move-in vs move-out
- Number of subqueries per shape
- Number of shapes
- Batch size (rows affected per move-in/out)
- DB latency (higher latency increases buffer size)
- Subquery nesting depth
- Composite vs single-column key
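
To separate peak from steady-state memory, the harness could sample memory at a fixed interval during each run and post-process the series. The sketch below assumes one simple definition: steady state is the median of the trailing portion of the samples, once the move-in or move-out has settled. That definition, the `tail_fraction` parameter, and the function name are assumptions for illustration. How the samples are collected from the sync service is out of scope here.

```python
from statistics import median


def memory_profile(samples_bytes: list, tail_fraction: float = 0.2) -> dict:
    """Derive peak and steady-state memory from a time-ordered sample series.

    Steady state is taken as the median of the last `tail_fraction` of the
    samples (at least one sample), which is an assumed definition.
    """
    if not samples_bytes:
        raise ValueError("no samples")
    if not 0 < tail_fraction <= 1:
        raise ValueError("tail_fraction must be in (0, 1]")
    tail_len = max(1, int(len(samples_bytes) * tail_fraction))
    tail = samples_bytes[-tail_len:]
    peak = max(samples_bytes)
    steady = median(tail)
    return {"peak": peak, "steady_state": steady, "peak_over_steady": peak / steady}


if __name__ == "__main__":
    series = [100, 500, 900, 400, 200, 200, 210, 200, 200, 200]
    print(memory_profile(series, tail_fraction=0.5))
```

The ratio of peak to steady state is a convenient single number for comparing batch sizes or DB latencies, since both are expected to inflate buffers temporarily.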