apollo_network_benchmark: add ReveresedSqmrSender struct and implementation#11554
Conversation
|
There hasn't been any activity on this pull request recently, and in order to prioritize active work, it has been marked as stale. |
| break; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Only one query stored despite broadcast-to-all intent
Medium Severity
collect_new_queries replaces active_query each iteration, so only the most recent query is retained and all prior queries are silently dropped. The comment on line 143 says "broadcast the message to all active queries," and the method is named broadcast_to_queries (plural), but the struct only holds a single Option — meaning in a multi-receiver benchmark scenario, only the last peer to connect actually receives data. If broadcasting to all connected peers is intended, active_query needs to be a collection.
Additional Locations (1)
a86c6c8 to
b16cd92
Compare
471fd3f to
a929882
Compare
a929882 to
a354cd2
Compare
b16cd92 to
55e1b2d
Compare
| break; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
1ms timeout per send inflates benchmark latency metrics
Medium Severity
collect_new_queries uses a tokio::time::timeout of 1ms in its while loop. When no queries are pending (the common steady-state case), the final iteration always blocks for 1ms before the timeout fails and the loop exits. Since collect_new_queries is called on every send_message invocation, this adds at least 1ms of artificial latency to every send. In a stress test benchmark, this caps throughput at ~1000 msg/s and inflates the BROADCAST_MESSAGE_SEND_DELAY_SECONDS metric recorded by the caller, producing misleading benchmark results.
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Silent message drop when no active query exists
Medium Severity
broadcast_to_queries silently discards the message when active_query is None — no warning or error is logged. The caller in handlers.rs unconditionally increments BROADCAST_MESSAGE_COUNT and records send delay after calling send_message, so dropped messages are counted as successfully sent. This produces inaccurate benchmark metrics. The other protocol variants either panic or log errors on failure, making this inconsistency especially easy to miss during debugging.
a354cd2 to
a153019
Compare
55e1b2d to
8067a3e
Compare
a153019 to
4541183
Compare
8067a3e to
4015380
Compare



Note
Medium Risk
Adds new stateful async sending behavior for the
reversed-sqmrbenchmark protocol, which could change stress-test traffic patterns and introduce subtle timing issues (timeouts/query replacement), but is confined to benchmarking code.Overview
Adds a new
MessageSender::ReveresedSqmrvariant backed byReveresedSqmrSender, which tracks the latest incoming SQMR server query and replies to it with outgoing broadcast messages.send_messagenow supports this mode by opportunistically polling for new queries (short timeout), replacing the active query, and sending responses with improvedinfo/trace/errorlogging; protocol channel registration forreversed-sqmrremainstodo!().Written by Cursor Bugbot for commit 4541183. This will update automatically on new commits. Configure here.