Should the latency measurements include the results from unsuccessful queries? #56
DenhamPreen started this conversation in General
👋 Introduced myself here
In doing some initial exploring of flood, it looks like the latency metrics include unsuccessful queries as well, which prevents apples-to-apples latency comparisons. The behavior I was expecting was that if a request was unsuccessful, it wouldn't be included in the latency measurements.
Here are the request and summary that lead me to this understanding.
```
┌────────────────────────┐
│ Load test: eth_getLogs │
└────────────────────────┘
- sample rates: [10, 100]
- sample duration: 10
- extra args: None
- output directory: /home/flood/control

Gathering node data...
──────────────────────
[WARNING] ctc config file does not exist; use `ctc setup` on command line to generate a config file
[WARNING] ctc config file does not exist; use `ctc setup` on command line to generate a config file
[WARNING] ctc config file does not exist; use `ctc setup` on command line to generate a config file

node  │ url                      │ metadata
──────┼──────────────────────────┼──────────
llama │ https://eth.llamarpc.com │ rpc-proxy
──────┼──────────────────────────┼──────────
ankr  │ https://rpc.ankr.com/eth │
──────┼──────────────────────────┼──────────
drpc  │ https://eth.drpc.org     │ Geth
      │                          │ v10.0.0
      │                          │ drpc

Running load tests...
─────────────────────
[2024-02-25 08:21:57] Starting
[2024-02-25 08:21:57] Running load test for llama
[2024-02-25 08:21:58] Running attack at rate = 10 rps
[2024-02-25 08:22:09] Running attack at rate = 100 rps
[2024-02-25 08:22:23] Running load test for ankr
[2024-02-25 08:22:24] Running attack at rate = 10 rps
[2024-02-25 08:22:34] Running attack at rate = 100 rps
[2024-02-25 08:22:44] Running load test for drpc
[2024-02-25 08:22:44] Running attack at rate = 10 rps
[2024-02-25 08:22:56] Running attack at rate = 100 rps
[2024-02-25 08:23:21] Load tests completed.

Saving results to output directory...
─────────────────────────────────────
- test.json
- results.json
- figures

Summarizing performance metrics...
──────────────────────────────────
┌─────────────────┐
│ success vs load │
└─────────────────┘
rate (rps) │ llama  │ ankr │ drpc
───────────┼────────┼──────┼───────
        10 │ 100.0% │ 0.0% │ 100.0%
       100 │   5.2% │ 0.0% │  83.9%

┌────────────────────┐
│ throughput vs load │
└────────────────────┘
rate (rps) │ llama (rps) │ ankr (rps) │ drpc (rps)
───────────┼─────────────┼────────────┼───────────
        10 │    9.542747 │   0.000000 │   9.151187
       100 │    3.854604 │   0.000000 │  64.204534

┌─────────────┐
│ p90 vs load │
└─────────────┘
rate (rps) │ llama (s) │ ankr (s) │ drpc (s)
───────────┼───────────┼──────────┼─────────
        10 │  2.224673 │ 0.184944 │ 1.585222
       100 │  0.308716 │ 1.973127 │ 8.537348
```

The findings from the above are that ankr was completely unsuccessful (maybe down, maybe rate limited from a prior request), so I assume the [p90 vs load] results are based on the latency of the unsuccessful requests.
Knowing that the ankr requests had a 0% success rate, I can safely ignore its latency results. However, for the llama rpc tests at a rate of 100 rps, the success rate was only 5.2% with a p90 latency of 0.309 s, which at first glance suggests the llama rpc was much faster. I think this metric is skewed by the fact that unsuccessful requests have a much smaller latency than successful ones.
The discussion point I'm opening is whether the latency measurements should only include successful requests. I would say the [p90 vs load] results in isolation are not a good representation of latency, because you can't compare one rpc against another unless they have the same success rates.
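To make the skew concrete, here is a minimal sketch (not flood's actual implementation; the latency values are made up to mirror the llama 100 rps case) showing how fast failures drag down a p90 computed over all requests:

```python
import math

def p90(samples):
    """Nearest-rank 90th percentile of a list of latencies (seconds)."""
    s = sorted(samples)
    idx = max(0, math.ceil(0.9 * len(s)) - 1)
    return s[idx]

# Hypothetical sample: ~5% success rate, similar to llama at 100 rps.
# Failed requests (e.g. rate-limit rejections) return almost instantly.
successful = [2.0]       # one successful query, ~2 s
failed = [0.05] * 19     # nineteen fast failures, ~50 ms each

print(p90(successful + failed))  # over all requests: 0.05 s
print(p90(successful))           # over successes only: 2.0 s
```

With failures included, the p90 reflects the failure path's round-trip time rather than the cost of actually serving the query, which is why the two numbers diverge by 40x here.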
Additional context: I am running via Docker, testing against free RPC endpoints sourced from chainlist.