# Your First Test

Let's start with the simplest case, which doesn't require any part of the observability stack: only `WASP` and the application you are testing.

`BenchSpy` comes with built-in `QueryExecutors`, each of which also has predefined metrics that you can use. One of these executors is the `DirectQueryExecutor`, which fetches metrics directly from `WASP` generators, which means you can run it without Loki.
> [!NOTE]
> Not sure whether to use `Loki` or `Direct` query executors? [Read this!](./loki_dillema.md)

## Test Overview

Our first test will follow this logic:

- Run a simple load test.
- Generate a performance report and store it.
- Run the load test again.
- Generate a new report and compare it to the previous one.

We'll use very simplified assertions for this example and expect the performance to remain unchanged.
### Step 1: Define and Run a Generator

Let's start by defining and running a generator that uses a mocked service:
```go
gen, err := wasp.NewGenerator(&wasp.Config{
    T: t,
    // ... (rest of the generator configuration omitted in this excerpt) ...
})
require.NoError(t, err)

gen.Run(true)
```
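
Since the snippet above is abridged, here is a hedged sketch of what a complete mock-service generator config could look like. The field names (`GenName`, `CallTimeout`, `LoadType`, `Schedule`, `VU`) and helpers (`wasp.Plain`, `wasp.NewMockVU`) reflect the WASP API as commonly used in its examples, and the values are purely illustrative, so verify both against the WASP docs:

```go
// Illustrative values only; see the WASP documentation for all Config fields.
gen, err := wasp.NewGenerator(&wasp.Config{
    T:           t,
    GenName:     "vu",
    CallTimeout: 100 * time.Millisecond,
    LoadType:    wasp.VU,                        // virtual-user based load
    Schedule:    wasp.Plain(10, 15*time.Second), // 10 VUs for 15 seconds
    VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
        CallSleep: 50 * time.Millisecond, // each mocked call sleeps for ~50ms
    }),
})
require.NoError(t, err)

gen.Run(true)
```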

### Step 2: Generate a Baseline Performance Report

Now that we have load data, let's generate a baseline performance report and store it in local storage:
```go
// ... (baseline report creation and data fetching omitted in this excerpt) ...
require.NoError(t, fetchErr, "failed to fetch data for baseline report")

path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)
```
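
The report-creation part is elided above. As a hedged sketch (the `benchspy.WithStandardQueries` option and the `FetchData` method are assumptions inferred from the surrounding example, not verbatim API, so check them against the linked docs), creating and fetching the baseline report could look roughly like this:

```go
// A sketch, not the verbatim example: option and method names are assumptions.
baseLineReport, err := benchspy.NewStandardReport(
    "v1.0.0", // an identifier for this run, e.g. a commit SHA or tag
    // use the predefined metrics that come with the Direct executor
    benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
    benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")

fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch data for baseline report")
```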

> [!NOTE]
> There's a lot to unpack here, and you're encouraged to read more about the built-in `QueryExecutors` and the standard metrics they provide, as well as about the `StandardReport`, [here](./reports/standard_report.md).
>
> For now, it's enough to know that the standard metrics provided by `StandardQueryExecutor_Direct` include:
>
> - Median latency
> - P95 latency (95th percentile)
> - Error rate

### Step 3: Run the Test Again and Compare Reports

With the baseline report ready, let's run the load test again. This time, we'll use a wrapper function to automatically load the previous report, generate a new one, and ensure they are comparable.
```go
// define a new generator using the same config values
newGen, err := wasp.NewGenerator(&wasp.Config{
    // ... (same configuration as the original generator) ...
})
// ... (generator run and report fetching omitted in this excerpt) ...
defer cancelFn()

// currentReport is the report that we just created (baseLineReport)
// ... (report comparison omitted in this excerpt) ...
```

> [!NOTE]
> In a real-world case, once you've generated the first report, you should only need to use the `benchspy.FetchNewStandardReportAndLoadLatestPrevious` function.
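
To make that concrete, here is a hedged sketch of calling that wrapper. Only the function name comes from the note above; the argument list and return values are assumptions to be checked against the BenchSpy API:

```go
// Illustrative signature; verify against the BenchSpy documentation.
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

currentReport, previousReport, err := benchspy.FetchNewStandardReportAndLoadLatestPrevious(
    fetchCtx,
    "v2.0.0", // identifier of the new run
    benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct), // assumed option, as above
    benchspy.WithGenerators(newGen),
)
require.NoError(t, err, "failed to fetch current report or load the previous one")
```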

### What's Next?

Now that we have two reports, how do we ensure that the application's performance meets expectations?
Find out in the [next chapter](./simplest_metrics.md).

# Custom LogQL Queries

In this chapter, we'll explore how to use custom `LogQL` queries in the performance report. For this more advanced use case, we'll manually compose the performance report.

The load generation part is the same as in the standard Loki metrics example and will be skipped.
## Defining Custom Metrics

Let's define two illustrative metrics:

- **`vu_over_time`**: The rate of virtual users generated by WASP, using a 10-second window.
- **`responses_over_time`**: The number of responses from the application under test (AUT), using a 1-second window.
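
The query definitions themselves are not shown in this excerpt. As a hedged sketch of how such metrics could be wired up, assuming a `benchspy.NewLokiQueryExecutor` constructor, WASP's standard Loki labels, and a `gen.Cfg.LokiConfig` field (all of which should be verified against the BenchSpy and WASP docs):

```go
// A sketch: the constructor, label names, and gen.Cfg.LokiConfig are assumptions
// used to illustrate custom LogQL queries keyed by metric name.
lokiQueryExecutor := benchspy.NewLokiQueryExecutor(
    map[string]string{
        // rate of WASP virtual users, over a 10-second window
        "vu_over_time": `max_over_time({go_test_name=~"my_test", test_data_type=~"stats", gen_name=~"my_gen"} | json | unwrap current_instances [10s]) by (go_test_name, gen_name)`,
        // number of AUT responses, over a 1-second window
        "responses_over_time": `sum(count_over_time({go_test_name=~"my_test", test_data_type=~"responses", gen_name=~"my_gen"} [1s])) by (go_test_name, gen_name)`,
    },
    gen.Cfg.LokiConfig,
)
```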

> [!NOTE]
> These `LogQL` queries use the standard labels that `WASP` applies when sending data to Loki.

## Creating a `StandardReport` with Custom Queries

Now, let's create a `StandardReport` using our custom queries:
```go
baseLineReport, err := benchspy.NewStandardReport(
    "2d1fa3532656c51991c0212afce5f80d2914e34e",
    // notice the different functional option used to pass the Loki executor with custom queries
    benchspy.WithQueryExecutors(lokiQueryExecutor),
    benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
```

## Wrapping Up

The rest of the code remains unchanged, except for the names of the metrics being asserted, as shown below. You can find the full example [here](...).
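
For instance, if you assert on medians with a helper like the `compareMedian` function from the standard Loki metrics example, only the metric names change (illustrative, assuming that helper is in scope):

```go
// illustrative: reuses a compareMedian-style helper with the custom metric names
compareMedian("vu_over_time")
compareMedian("responses_over_time")
```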

Now it's time to look at the last of the bundled `QueryExecutors`. Proceed to the [next chapter to read about Prometheus](./prometheus_std.md).

> [!NOTE]
> You can find the full example [here](https://github.com/smartcontractkit/chainlink-testing-framework/tree/main/wasp/examples/benchspy/loki_query_executor/loki_query_executor_test.go).

# To Loki or Not to Loki?

You might be wondering whether to use the `Loki` or `Direct` query executor if all you need are basic latency metrics.
## Rule of Thumb

If all you need is a single number, such as the median latency or error rate, and you're not interested in:

- Comparing time series directly,
- Examining minimum or maximum values, or
- Performing advanced calculations on raw data,

then you should opt for the `Direct` query executor.

## Why Choose `Direct`?

The `Direct` executor returns a single value for each standard metric using the same raw data that Loki would use. It accesses data stored in the `WASP` generator, which is later pushed to Loki.

This means you can:

- Run your load test without a Loki instance.
- Avoid calculating metrics like the median, 95th percentile latency, or error ratio yourself.

By using `Direct`, you save resources and simplify the process when advanced analysis isn't required.

# Standard Loki Metrics

> [!WARNING]
> This example assumes you have access to Loki and Grafana instances. If you don't, learn how to launch them using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).

In this example, our Loki workflow will differ from the previous one in just a few details:

- The generator will include a Loki configuration (see the sketch below).
- The standard query executor type will be `benchspy.StandardQueryExecutor_Loki`.
- We will cast all results to `[]string`.
- We will calculate medians for all metrics.
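
For the first difference, here is a minimal sketch of attaching a Loki configuration to the generator; `wasp.NewEnvLokiConfig` (a helper that reads Loki settings from environment variables) is an assumption here and should be checked against the WASP docs:

```go
// A sketch: only the LokiConfig line differs from the first test's generator.
gen, err := wasp.NewGenerator(&wasp.Config{
    T: t,
    // ... (rest of the generator configuration as in the first test) ...
    LokiConfig: wasp.NewEnvLokiConfig(), // assumed helper reading Loki settings from env vars
})
require.NoError(t, err)
```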

The baseline report is then fetched and stored just as in the first test:

```go
// ... (report creation with benchspy.StandardQueryExecutor_Loki omitted in this excerpt) ...
require.NoError(t, fetchErr, "failed to fetch data for baseline report")

path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)
```

## Step 3: Skip to Metrics Comparison

Since the next steps are very similar to those in the first test, we'll skip them and go straight to metrics comparison.

By default, the `LokiQueryExecutor` returns results as the `[]string` data type. Let's use dedicated convenience functions to cast them from `interface{}` to string slices:
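
A hedged sketch of that casting step, assuming `currentReport` and `previousReport` were fetched as in the first test; the helper name `benchspy.MustAllLokiResults` is an assumption (the text above only says dedicated convenience functions exist), so verify it against the BenchSpy API:

```go
// Assumed helpers: cast each metric's result from interface{} to []string,
// failing the test (hence "Must") if the cast is not possible.
currentAsStringSlice := benchspy.MustAllLokiResults(currentReport)
previousAsStringSlice := benchspy.MustAllLokiResults(previousReport)
```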

## Step 4: Compare Metrics

Now, let's compare metrics. Since we have `[]string`, we'll first convert it to `[]float64`, calculate the median, and ensure the difference between the medians is less than 1%. Again, this is just an example; you should decide the best way to validate your metrics.
```go
var compareMedian = func(metricName string) {
    // ... (casting the []string results to []float64 omitted in this excerpt) ...
    require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)
    // ... (median calculation and 1% assertion omitted in this excerpt) ...
}
```
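
For completeness, here is a self-contained sketch of such a median comparison. The local helpers below are written purely for illustration and are not part of the BenchSpy API; it assumes the string slices from the casting step above and imports of `strconv`, `slices`, and `math`:

```go
// Local helpers for illustration only.
stringsToFloats := func(in []string) ([]float64, error) {
    out := make([]float64, 0, len(in))
    for _, s := range in {
        f, err := strconv.ParseFloat(s, 64)
        if err != nil {
            return nil, err
        }
        out = append(out, f)
    }
    return out, nil
}

median := func(in []float64) float64 {
    if len(in) == 0 {
        return 0
    }
    sorted := slices.Clone(in)
    slices.Sort(sorted)
    mid := len(sorted) / 2
    if len(sorted)%2 == 0 {
        return (sorted[mid-1] + sorted[mid]) / 2
    }
    return sorted[mid]
}

compareMedian := func(metricName string) {
    currentFloats, err := stringsToFloats(currentAsStringSlice[metricName])
    require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)

    previousFloats, err := stringsToFloats(previousAsStringSlice[metricName])
    require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)

    currentMedian := median(currentFloats)
    previousMedian := median(previousFloats)

    // allow at most a 1% difference between the medians
    var diffPct float64
    switch {
    case previousMedian != 0:
        diffPct = math.Abs(currentMedian-previousMedian) / previousMedian * 100
    case currentMedian != 0:
        diffPct = 100 // previous median was zero, current is not: treat as a full change
    }
    require.LessOrEqual(t, diffPct, 1.0, "%s medians differ by more than 1%%", metricName)
}
```

You would then invoke `compareMedian` once for each metric name present in the report.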

## What's Next?

In this example, we used standard metrics, which are the same as in the first test. Now, [let's explore how to use your custom LogQL queries](./loki_custom.md).

> [!NOTE]
> You can find the full example [here](https://github.com/smartcontractkit/chainlink-testing-framework/tree/main/wasp/examples/benchspy/loki_query_executor/loki_query_executor_test.go).

# BenchSpy

BenchSpy (short for Benchmark Spy) is a [WASP](../overview.md)-coupled tool designed for easy comparison of various performance metrics.

## Key Features

- **Three built-in data sources**:
  - `Loki`
  - `Prometheus`
  - `Direct`
- **Standard/pre-defined metrics** for each data source.
- **Ease of extensibility** with custom metrics.
- **Ability to load the latest performance report** based on Git history.
- **88% unit test coverage**.

BenchSpy does not include any built-in comparison logic beyond ensuring that performance reports are comparable (e.g., they measure the same metrics in the same way), offering complete freedom to the user for interpretation and analysis.