performance testing: number of subscriptions vs. function latency #111

diana-qing wants to merge 34 commits into stanford-esrg:main from
Conversation
…already profiled the given sub count
> "examples/log_ssh",
> "examples/streaming",
> "examples/ip_subs",
Can you move this into the tests/perf folder?
> @@ -0,0 +1,49 @@
> use retina_core::{Runtime, config::load_config};
Add a README for this example.
> @@ -0,0 +1,31 @@
> # Performance Testing
Add an intro with the high-level motivation for this and what it does! What you shared at the EOQ lab meeting was great.
Also mention the initial testing you did to ensure that this approach is accurate!
Here's my best understanding of what you found:
- You compared results with Retina's current timing infrastructure, which inlines cycle counts. You found that the uprobes add a constant overhead. That is, this will accurately surface patterns for the use-case of comparing function latency across different implementations or applications.
- You can't run this at super high throughputs. IIRC, we were able to handle ~5Gbps of live traffic (unless you got more on the passive box). This gives plenty of data points for saying something about function latency.
- You confirmed this separates entry/exit points by thread, so it'll be accurate even if there are multiple cores. (IMO this was a bit unclear in the documentation.)
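The per-thread separation described above can be illustrated with a small standalone sketch (illustrative only, not the actual `func_latency.py` logic): entry/exit timestamps are keyed by thread ID, so interleaved calls on different cores pair correctly.

```python
def pair_latencies(events):
    """Pair function entry/exit timestamps per thread.

    events: iterable of (tid, kind, ts_ns), where kind is "entry" or "exit".
    Returns latencies in nanoseconds, in exit order.
    """
    pending = {}  # tid -> entry timestamp
    latencies = []
    for tid, kind, ts in events:
        if kind == "entry":
            pending[tid] = ts
        elif kind == "exit" and tid in pending:
            latencies.append(ts - pending.pop(tid))
    return latencies

# Two threads with interleaved calls: pairing without the tid key
# would match thread 1's entry with thread 2's exit.
events = [
    (1, "entry", 100), (2, "entry", 110),
    (1, "exit", 150),  (2, "exit", 300),
]
print(pair_latencies(events))  # [50, 190]
```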
> ```
>
> ## Number of Subscriptions vs. Function Latency
> `generate_ip_subs.py` shards the IPv4 address space into `n` subnets to generate `n` Retina subscriptions, where `n` is passed in by the user. The subscriptions are written to `spec.toml`.
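The sharding step could look roughly like this, using Python's standard `ipaddress` module (this is an illustrative sketch; the actual `generate_ip_subs.py`, and the subscription/`spec.toml` syntax it emits, may differ):

```python
import ipaddress

def shard_ipv4(n):
    """Split 0.0.0.0/0 into n equal subnets (n must be a power of two)."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    diff = n.bit_length() - 1  # prefixlen_diff of d yields 2**d subnets
    net = ipaddress.ip_network("0.0.0.0/0")
    return [str(s) for s in net.subnets(prefixlen_diff=diff)]

print(shard_ipv4(4))  # ['0.0.0.0/2', '64.0.0.0/2', '128.0.0.0/2', '192.0.0.0/2']
```

Each returned subnet would then become one `ipv4.addr in <subnet>`-style subscription filter.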
Maybe clarify that this is a sample / basic application and more can easily be added. The main goal of your project was to set up the infrastructure.
> `run_app.py` runs the `ip_subs` application and measures how the latency of a function changes as the number of subscriptions changes. It generates subscriptions with `generate_ip_subs.py`, then runs `ip_subs` with these subscriptions and measures latency using `func_latency.py`. The latencies are written to `stats/ip_subs_latency_stats.csv`, and plots of subscription count vs. latency for different statistics (e.g. average, 99th percentile) are saved in the `figs` directory. The `stats` and `figs` directories are created by the script if they don't already exist.
> When running `run_app.py`, you can specify which function to profile, the number of subscriptions, and the config file path. For example, to measure the latency of the `process_packet` function in online mode when the number of subscriptions is 64 and 256, you can run:
Probably mention that you can profile multiple functions, but because it just records entry/exit timestamps, keep in mind that profiling functions that overlap will cause interference. (You observed this!)
> `func_latency.py` uses bcc to profile function latency while an application runs, by attaching eBPF programs to uprobes at the entry and exit points of functions. Latency is measured in nanoseconds by default. The profiling code is based on the [example provided by bcc](https://github.com/iovisor/bcc/blob/master/tools/funclatency.py).
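The core of that approach can be sketched as follows. This is a simplified version in the spirit of bcc's `funclatency.py`, not the PR's actual code; the binary path and symbol name in `attach` are hypothetical, and attaching requires root plus bcc installed (Rust symbol mangling is also ignored here):

```python
# The eBPF C program bcc would compile: stamp the entry per thread id,
# take the delta at exit, and bucket it into a log2 histogram.
BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>
BPF_HASH(start, u32, u64);
BPF_HISTOGRAM(dist);
int trace_entry(struct pt_regs *ctx) {
    u32 tid = bpf_get_current_pid_tgid();
    u64 ts = bpf_ktime_get_ns();
    start.update(&tid, &ts);
    return 0;
}
int trace_exit(struct pt_regs *ctx) {
    u32 tid = bpf_get_current_pid_tgid();
    u64 *tsp = start.lookup(&tid);
    if (tsp == 0)
        return 0;  // missed the entry event
    dist.increment(bpf_log2l(bpf_ktime_get_ns() - *tsp));
    start.delete(&tid);
    return 0;
}
"""

def attach(binary, symbol):
    """Attach entry/exit probes (needs root; names are hypothetical)."""
    from bcc import BPF
    b = BPF(text=BPF_TEXT)
    b.attach_uprobe(name=binary, sym=symbol, fn_name="trace_entry")
    b.attach_uretprobe(name=binary, sym=symbol, fn_name="trace_exit")
    return b
```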
> `run_app.py` runs the `ip_subs` application and measures how the latency of a function changes as the number of subscriptions changes. It generates subscriptions with `generate_ip_subs.py`, then runs `ip_subs` with these subscriptions and measures latency using `func_latency.py`. The latencies are written to `stats/ip_subs_latency_stats.csv`, and plots of subscription count vs. latency for different statistics (e.g. average, 99th percentile) are saved in the `figs` directory. The `stats` and `figs` directories are created by the script if they don't already exist.
Should this be run from a specific directory within the Retina repo?
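The statistics step the quoted README describes (average and 99th percentile per subscription count, written to a CSV) might be computed along these lines (a sketch; the exact column names in `ip_subs_latency_stats.csv` are an assumption):

```python
import csv
import io

def latency_stats(latencies):
    """Average and 99th-percentile (nearest-rank) of latency samples."""
    s = sorted(latencies)
    avg = sum(s) / len(s)
    p99 = s[min(len(s) - 1, int(0.99 * len(s)))]
    return avg, p99

# One CSV row per subscription count (column names are a guess).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["num_subs", "avg_ns", "p99_ns"])
for n, samples in [(64, [100, 120, 500]), (256, [200, 210, 900])]:
    avg, p99 = latency_stats(samples)
    writer.writerow([n, avg, p99])
print(buf.getvalue())
```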
> @@ -0,0 +1,206 @@
> # code for profiling function latency with bcc based on https://github.com/iovisor/bcc/blob/master/tools/funclatency.py
A couple of weeks ago we talked a bit about managing output in online mode by consuming the subprocess output and filtering it before printing:
- Making it so that the "samples lost" alert isn't printed
- Consuming the output and printing the updates on Gbps processed, packets lost, etc.
Did you try this and run into challenges? (I think this is not critical for accuracy, but it is extremely helpful for usability if it's reasonably easy to do.)
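For reference, the filtering suggested above could be sketched like this (the substrings to suppress are hypothetical; the real alert text may differ):

```python
import subprocess
import sys

NOISY = ("samples lost",)  # hypothetical alert text to suppress

def filter_lines(lines):
    """Drop noisy alert lines; keep throughput/drop updates and the rest."""
    return [l for l in lines if not any(s in l for s in NOISY)]

def run_filtered(cmd):
    """Run cmd, streaming its filtered output to stdout; return its exit code."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    for line in proc.stdout:
        if filter_lines([line]):
            sys.stdout.write(line)
    return proc.wait()
```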
> @@ -0,0 +1,49 @@
> import argparse

> @@ -0,0 +1,128 @@
> import argparse
This PR adds scripts to measure how the latency of a function in the `ip_subs` application changes as the number of subscriptions changes.