
Commit 570ebaa

add working test examples, some docs, small code changes

1 parent 305088e
23 files changed: +1880 -296 lines

book/src/SUMMARY.md

Lines changed: 14 additions & 1 deletion

```diff
@@ -69,14 +69,27 @@
   - [Profile](./libs/wasp/components/profile.md)
   - [Sampler](./libs/wasp/components/sampler.md)
   - [Schedule](./libs/wasp/components/schedule.md)
+  - [BenchSpy](./libs/wasp/benchspy/overview.md)
+  - [Getting started](./libs/wasp/benchspy/getting_started.md)
+  - [Your first test](./libs/wasp/benchspy/first_test.md)
+  - [Simplest metrics](./libs/wasp/benchspy/simplest_metrics.md)
+  - [Standard Loki metrics](./libs/wasp/benchspy/loki_std.md)
+  - [Custom Loki metrics](./libs/wasp/benchspy/loki_custom.md)
+  - [Standard Prometheus metrics](./libs/wasp/benchspy/prometheus_std.md)
+  - [Custom Prometheus metrics](./libs/wasp/benchspy/prometheus_custom.md)
+  - [Defining a new report]()
+  - [Adding new QueryExecutor]()
+  - [Adding new storage]()
+  - [Adding new standard load metric]()
+  - [Adding new standard resource metric]()
   - [How to](./libs/wasp/how-to/overview.md)
   - [Start local observability stack](./libs/wasp/how-to/start_local_observability_stack.md)
   - [Try it out quickly](./libs/wasp/how-to/run_included_tests.md)
   - [Chose between RPS and VUs](./libs/wasp/how-to/chose_rps_vu.md)
   - [Define NFRs and check alerts](./libs/wasp/how-to/define_nfr_check_alerts.md)
   - [Use labels](./libs/wasp/how-to/use_labels.md)
   - [Incorporate load tests in your workflow](./libs/wasp/how-to/incorporate_load_tests.md)
-  - [Reuse dashboard components](./libs/wasp/how-to/reuse_dashboard_components.md)
+  - [Reuse dashboard components](./libs/wasp/how-to/reuse_dashboard_components.md)
   - [Parallelize load](./libs/wasp/how-to/parallelise_load.md)
   - [Debug Loki errors](./libs/wasp/how-to/debug_loki_errors.md)
   - [Havoc](./libs/havoc.md)
```
book/src/libs/wasp/benchspy/first_test.md

Lines changed: 99 additions & 0 deletions (new file)
# BenchSpy - Your first test

Let's start with the simplest case, which doesn't require any of the observability stack, only `WASP` and the application you are testing.
`BenchSpy` comes with some built-in `QueryExecutors`, each of which additionally has predefined metrics that you can use. One of these executors is the
`GeneratorQueryExecutor`, which fetches metrics directly from `WASP` generators.

Our first test will follow this logic:
* Run a simple load test
* Generate the performance report and store it
* Run the load again
* Generate a new report and compare it to the previous one

We will use some very simplified assertions, chosen only for the sake of example, and expect the performance to remain unchanged.

Let's start by defining and running a generator that uses a mocked service:
```go
gen, err := wasp.NewGenerator(&wasp.Config{
	T:           t,
	GenName:     "vu",
	CallTimeout: 100 * time.Millisecond,
	LoadType:    wasp.VU,
	Schedule:    wasp.Plain(10, 15*time.Second),
	VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
		CallSleep: 50 * time.Millisecond,
	}),
})
require.NoError(t, err)
gen.Run(true)
```

Now that we have load data, let's generate a baseline performance report and store it in local storage:
```go
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

baseLineReport, err := benchspy.NewStandardReport(
	// random hash; this should be a commit or hash of the Application Under Test (AUT)
	"e7fc5826a572c09f8b93df3b9f674113372ce924",
	// use built-in queries for an executor that fetches data directly from the WASP generator
	benchspy.WithStandardQueryExecutorType(benchspy.StandardQueryExecutor_Generator),
	// WASP generators
	benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create original report")

fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch data for original report")

path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store current report", path)
```

> [!NOTE]
> There's quite a lot to unpack here, and you are encouraged to read more about the built-in `QueryExecutors` and
> the standard metrics each comes with [here](./built_in_query_executors.md), and about the `StandardReport` [here](./standard_report.md).
>
> For now, it's enough to know that the standard metrics that `StandardQueryExecutor_Generator` comes with are the following:
> * median latency
> * p95 latency (95th percentile)
> * error rate

With the baseline report ready, let's run the load test again, but this time let's use a wrapper function
that will automatically load the previous report, generate a new one, and make sure the two are actually comparable.
```go
// define a new generator using the same config values
newGen, err := wasp.NewGenerator(&wasp.Config{
	T:           t,
	GenName:     "vu",
	CallTimeout: 100 * time.Millisecond,
	LoadType:    wasp.VU,
	Schedule:    wasp.Plain(10, 15*time.Second),
	VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
		CallSleep: 50 * time.Millisecond,
	}),
})
require.NoError(t, err)

// run the load
newGen.Run(true)

fetchCtx, cancelFn = context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

// currentReport is the newly generated report; previousReport is the baseLineReport loaded from storage
currentReport, previousReport, err := benchspy.FetchNewStandardReportAndLoadLatestPrevious(
	fetchCtx,
	"e7fc5826a572c09f8b93df3b9f674113372ce925",
	benchspy.WithStandardQueryExecutorType(benchspy.StandardQueryExecutor_Generator),
	benchspy.WithGenerators(newGen),
)
require.NoError(t, err, "failed to fetch current report or load the previous one")
```

> [!NOTE]
> In a real-world case, once you have the first report generated, you should only need the
> `benchspy.FetchNewStandardReportAndLoadLatestPrevious` function.

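If you're curious what such a comparison might look like, here's a minimal sketch. The `MustAllGeneratorResults` caster and the flat `metric name -> float64` result shape are assumptions made purely for illustration (the actual casting helpers are covered in the next chapter), and the 1% threshold is arbitrary:

```go
// hypothetical convenience caster, for illustration only;
// the real helper for generator results is introduced in the next chapter
currentAsFloats := benchspy.MustAllGeneratorResults(currentReport)
previousAsFloats := benchspy.MustAllGeneratorResults(previousReport)

// assume median latency shouldn't drift by more than 1% between runs (illustrative threshold)
currentMedian := currentAsFloats[string(benchspy.MedianLatency)]
previousMedian := previousAsFloats[string(benchspy.MedianLatency)]
assert.InDelta(t, previousMedian, currentMedian, previousMedian*0.01, "median latency drifted by more than 1%")
```
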
Okay, so now we have two reports. That's great, but how do we make sure that the application's performance is as expected?
You'll find out in the [next chapter](./first_test_comparison.md).
book/src/libs/wasp/benchspy/getting_started.md

Lines changed: 14 additions & 0 deletions (new file)

# BenchSpy - Getting started

All of the following examples assume that you have access to the following applications:
* Grafana
* Loki
* Prometheus

> [!NOTE]
> The easiest way to run them locally is by using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).
> Just remember to install the `CTF CLI` first, as described in the [CTFv2 Getting Started](../../../framework/getting_started.md) chapter.

Since BenchSpy is tightly coupled with WASP, it's highly recommended that you [get familiar with WASP first](../overview.md), if you haven't yet.

Ready? [Let's go!](./first_test.md)
book/src/libs/wasp/benchspy/loki_custom.md

Lines changed: 38 additions & 0 deletions (new file)
# BenchSpy - Custom Loki metrics

In this chapter we will see how to use custom LogQL queries in the performance report. For this more advanced use case,
we will need to compose the performance report manually.

The load-generation part is the same as in the standard Loki metrics example and thus will be skipped.

Let's define two illustrative metrics now:
* `vu_over_time`: the rate of virtual users generated by WASP, over a 10-second window
* `responses_over_time`: the number of the AUT's responses, over a 1-second window

```go
lokiQueryExecutor := benchspy.NewLokiQueryExecutor(
	map[string]string{
		"vu_over_time":        fmt.Sprintf("max_over_time({branch=~\"%s\", commit=~\"%s\", go_test_name=~\"%s\", test_data_type=~\"stats\", gen_name=~\"%s\"} | json | unwrap current_instances [10s]) by (node_id, go_test_name, gen_name)", label, label, t.Name(), gen.Cfg.GenName),
		"responses_over_time": fmt.Sprintf("sum(count_over_time({branch=~\"%s\", commit=~\"%s\", go_test_name=~\"%s\", test_data_type=~\"responses\", gen_name=~\"%s\"} [1s])) by (node_id, go_test_name, gen_name)", label, label, t.Name(), gen.Cfg.GenName),
	},
	gen.Cfg.LokiConfig,
)
```

> [!NOTE]
> These LogQL queries use the standard labels that `WASP` attaches when sending data to Loki.

Now let's create a `StandardReport` using our custom queries:
```go
baseLineReport, err := benchspy.NewStandardReport(
	"2d1fa3532656c51991c0212afce5f80d2914e34e",
	// notice the different functional option used to pass custom executors
	benchspy.WithQueryExecutors(lokiQueryExecutor),
	benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
```

The rest of the code remains basically unchanged, apart from the names of the metrics we assert on, as sketched below. You can find the full example [here](...).
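
For illustration, asserting on one of our custom metrics could look roughly like this, reusing the convenience helpers shown in the standard Loki example (the 1% threshold is arbitrary):

```go
currentAsStringSlice := benchspy.MustAllLokiResults(currentReport)
previousAsStringSlice := benchspy.MustAllLokiResults(previousReport)

// convert the raw Loki results for our custom metric to floats
currentFloats, err := benchspy.StringSliceToFloat64Slice(currentAsStringSlice["responses_over_time"])
require.NoError(t, err, "failed to convert responses_over_time to float64 slice")
previousFloats, err := benchspy.StringSliceToFloat64Slice(previousAsStringSlice["responses_over_time"])
require.NoError(t, err, "failed to convert responses_over_time to float64 slice")

// compare medians of the two runs; 1% is an illustrative threshold
currentMedian := benchspy.CalculatePercentile(currentFloats, 0.5)
previousMedian := benchspy.CalculatePercentile(previousFloats, 0.5)
assert.InDelta(t, previousMedian, currentMedian, math.Abs(previousMedian)*0.01, "responses_over_time medians differ by more than 1%")
```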

Now it's time to look at the last of the bundled `QueryExecutors`. Proceed to the [next chapter to read about Prometheus](./prometheus_std.md).
book/src/libs/wasp/benchspy/loki_std.md

Lines changed: 104 additions & 0 deletions (new file)
# BenchSpy - Standard Loki metrics

> [!NOTE]
> This example assumes you have access to Loki and Grafana instances. If you don't,
> find out how to launch them using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).

Our Loki example will differ from the previous one in just a couple of details:
* the generator will have a Loki config
* the standard query executor type will be `benchspy.StandardQueryExecutor_Loki`
* we will cast all results to `[]string`
* and calculate medians for all metrics

Ready?

Let's define the new load generation first:
```go
label := "benchspy-std"

gen, err := wasp.NewGenerator(&wasp.Config{
	T: t,
	// read Loki config from environment
	LokiConfig: wasp.NewEnvLokiConfig(),
	GenName:    "vu",
	// set unique labels
	Labels: map[string]string{
		"branch": label,
		"commit": label,
	},
	CallTimeout: 100 * time.Millisecond,
	LoadType:    wasp.VU,
	Schedule:    wasp.Plain(10, 15*time.Second),
	VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
		CallSleep: 50 * time.Millisecond,
	}),
})
require.NoError(t, err)
```

Now let's run the generator and save the baseline report:
```go
gen.Run(true)

fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

baseLineReport, err := benchspy.NewStandardReport(
	"c2cf545d733eef8bad51d685fcb302e277d7ca14",
	// notice the different standard executor type
	benchspy.WithStandardQueryExecutorType(benchspy.StandardQueryExecutor_Loki),
	benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create original report")

fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch data for original report")

path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store current report", path)
```

Since the next steps are very similar to the ones used in the first test, we will skip them and jump straight
to the metrics comparison.

By default, `LokiQueryExecutor` returns results as `[]string`, so let's use the dedicated convenience function
to cast them from `interface{}` to string slices:
```go
currentAsStringSlice := benchspy.MustAllLokiResults(currentReport)
previousAsStringSlice := benchspy.MustAllLokiResults(previousReport)
```

And finally, it's time to compare metrics. Since we have `[]string` values, we will first convert them to `[]float64`,
then calculate the median and assume it hasn't changed by more than 1%. Again, remember that this is just an illustration;
you should decide for yourself what the best way to assert the metrics is.

```go
var compareMedian = func(metricName string) {
	require.NotEmpty(t, currentAsStringSlice[metricName], "%s results were missing from current report", metricName)
	require.NotEmpty(t, previousAsStringSlice[metricName], "%s results were missing from previous report", metricName)

	currentFloatSlice, err := benchspy.StringSliceToFloat64Slice(currentAsStringSlice[metricName])
	require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)
	currentMedian := benchspy.CalculatePercentile(currentFloatSlice, 0.5)

	previousFloatSlice, err := benchspy.StringSliceToFloat64Slice(previousAsStringSlice[metricName])
	require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)
	previousMedian := benchspy.CalculatePercentile(previousFloatSlice, 0.5)

	var diffPercentage float64
	if previousMedian != 0 {
		diffPercentage = (currentMedian - previousMedian) / previousMedian * 100
	} else {
		diffPercentage = currentMedian * 100
	}
	assert.LessOrEqual(t, math.Abs(diffPercentage), 1.0, "%s medians are more than 1%% different (%.4f%%)", metricName, diffPercentage)
}

compareMedian(string(benchspy.MedianLatency))
compareMedian(string(benchspy.Percentile95Latency))
compareMedian(string(benchspy.ErrorRate))
```

We have used standard metrics, which are the same as in the first test. Now let's see how you can use your own custom LogQL queries.

You can find the full example [here](...).
book/src/libs/wasp/benchspy/overview.md

Lines changed: 15 additions & 0 deletions (new file)

# BenchSpy

BenchSpy (short for benchmark spy) is a WASP-coupled tool that allows for easy comparison of various performance metrics.
It supports three types of data sources:
* `Loki`
* `Prometheus`
* `WASP generators`

It can also be easily extended to support additional ones.

Since its main goal is comparing performance between various releases or versions of an application (for example, to catch performance degradation),
it is `Git`-aware and able to automatically find the latest relevant performance report.

It doesn't come with any comparison logic beyond making sure that performance reports are comparable (e.g. that they measure the same metrics in the same way),
leaving total freedom to the user.
book/src/libs/wasp/benchspy/prometheus_custom.md

Lines changed: 81 additions & 0 deletions (new file)
# BenchSpy - Custom Prometheus metrics

Similarly to what we did with Loki, we can use custom metrics with Prometheus.

Most of the code is the same as in the previous example. The differences start with the need to manually
create a `PrometheusQueryExecutor` with our custom queries:

```go
// no need to pass a name regexp pattern,
// since we provide the names directly in the custom queries
promConfig := benchspy.NewPrometheusConfig()

customPrometheus, err := benchspy.NewPrometheusQueryExecutor(
	map[string]string{
		// scalar value
		"95p_cpu_all_containers": "scalar(quantile(0.95, rate(container_cpu_usage_seconds_total{name=~\"node[^0]\"}[5m])) * 100)",
		// matrix value
		"cpu_rate_by_container": "rate(container_cpu_usage_seconds_total{name=~\"node[^0]\"}[1m])[30m:1m]",
	},
	*promConfig,
)
require.NoError(t, err, "failed to create Prometheus query executor")
```

Then we pass it to the report as a custom query executor:
```go
baseLineReport, err := benchspy.NewStandardReport(
	"91ee9e3c903d52de12f3d0c1a07ac3c2a6d141fb",
	benchspy.WithQueryExecutors(customPrometheus),
	benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
```

> [!NOTE]
> Notice that when using custom Prometheus queries we don't need to pass the `PrometheusConfig`
> to `NewStandardReport()`, because we have already set it when creating the `PrometheusQueryExecutor`.

Fetching the current and previous reports remains unchanged, and so does casting the Prometheus metrics
to their specific type:
```go
currentAsValues := benchspy.MustAllPrometheusResults(currentReport)
previousAsValues := benchspy.MustAllPrometheusResults(previousReport)

assert.Equal(t, len(currentAsValues), len(previousAsValues), "number of metrics in results should be the same")
```

But now comes another difference. All standard query results were instances of `model.Vector`, while our two custom queries
introduce two new types:
* `model.Matrix`
* `*model.Scalar`

These differences are reflected in the further casting we do before getting the final metrics:
```go
current95CPUUsage := currentAsValues["95p_cpu_all_containers"]
previous95CPUUsage := previousAsValues["95p_cpu_all_containers"]

assert.Equal(t, current95CPUUsage.Type(), previous95CPUUsage.Type(), "types of metrics should be the same")
assert.IsType(t, &model.Scalar{}, current95CPUUsage, "current metric should be a scalar")

currentCPUByContainer := currentAsValues["cpu_rate_by_container"]
previousCPUByContainer := previousAsValues["cpu_rate_by_container"]

assert.Equal(t, currentCPUByContainer.Type(), previousCPUByContainer.Type(), "types of metrics should be the same")
assert.IsType(t, model.Matrix{}, currentCPUByContainer, "current metric should be a matrix")

currentCPUByContainerAsMatrix := currentCPUByContainer.(model.Matrix)
previousCPUByContainerAsMatrix := previousCPUByContainer.(model.Matrix)

assert.Equal(t, len(currentCPUByContainerAsMatrix), len(previousCPUByContainerAsMatrix), "number of series in matrices should be the same")
```
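
To then assert on the scalar metric itself, we could, for example, cast both values to `*model.Scalar` and compare them numerically. This is just a sketch, and the 10-percentage-point threshold is purely illustrative:

```go
currentScalar := current95CPUUsage.(*model.Scalar)
previousScalar := previous95CPUUsage.(*model.Scalar)

// model.SampleValue is a float64 under the hood, so we can compare numerically;
// the threshold here is purely illustrative
diff := math.Abs(float64(currentScalar.Value) - float64(previousScalar.Value))
assert.LessOrEqual(t, diff, 10.0, "95th percentile CPU usage differs by more than 10 percentage points")
```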

> [!NOTE]
> When casting to Prometheus' concrete types, it's crucial to remember that two of them implement `model.Value` with pointer receivers
> and the other two with value receivers, so type assertions must use the pointer form for the former and the value form for the latter.
>
> Pointer receivers:
> * `*model.String`
> * `*model.Scalar`
>
> Value receivers:
> * `model.Vector`
> * `model.Matrix`
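
If you'd rather not memorize which is which, a type switch over `model.Value` handles all four variants explicitly. A minimal sketch:

```go
// describeValue returns a short, human-readable summary of any Prometheus query result
func describeValue(v model.Value) string {
	switch val := v.(type) {
	case model.Vector:
		return fmt.Sprintf("vector with %d samples", len(val))
	case model.Matrix:
		return fmt.Sprintf("matrix with %d series", len(val))
	case *model.Scalar:
		return fmt.Sprintf("scalar: %f", float64(val.Value))
	case *model.String:
		return fmt.Sprintf("string: %s", val.Value)
	default:
		return fmt.Sprintf("unknown result type: %T", v)
	}
}
```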
