Commit 5a1c3e7 ("smoother docs"), 1 parent: 358108a

File tree: 20 files changed, +441 -352 lines
Lines changed: 45 additions & 31 deletions
Original file line number | Diff line number | Diff line change
@@ -1,18 +1,26 @@
1-
# BenchSpy - Your first test
1+
# BenchSpy - Your First Test
22

3-
Let's start with a simplest case, which doesn't require you to have any of the observability stack, but only `WASP` and the application you are testing.
4-
`BenchSpy` comes with some built-in `QueryExecutors` each of which additionaly has predefined metrics that you can use. One of these executors is the
5-
`DirectQueryExecutor` that fetches metrics directly from `WASP` generators.
3+
Let's start with the simplest case, which doesn't require any part of the observability stack, only `WASP` and the application you are testing.
4+
`BenchSpy` comes with built-in `QueryExecutors`, each of which also has predefined metrics that you can use. One of these executors is the `DirectQueryExecutor`, which fetches metrics directly from `WASP` generators,
5+
which means you can run it without Loki.
66

7-
Our first test will follow the following logic:
8-
* Run a simple load test
9-
* Generate the performance report and store it
10-
* Run the load again
11-
* Generate a new report and compare it to the previous one
7+
> [!NOTE]
8+
> Not sure whether to use `Loki` or `Direct` query executors? [Read this!](./loki_dillema.md)
9+
10+
## Test Overview
11+
12+
Our first test will follow this logic:
13+
- Run a simple load test.
14+
- Generate a performance report and store it.
15+
- Run the load test again.
16+
- Generate a new report and compare it to the previous one.
17+
18+
We'll use very simplified assertions for this example and expect the performance to remain unchanged.
1219

13-
We will use some very simplified assertions, used only for the sake of example, and expect the performance to remain unchanged.
20+
### Step 1: Define and Run a Generator
21+
22+
Let's start by defining and running a generator that uses a mocked service:
1423

15-
Let's start by defining and running a generator that will use a mocked service:
1624
```go
1725
gen, err := wasp.NewGenerator(&wasp.Config{
1826
T: t,
@@ -28,39 +36,43 @@ require.NoError(t, err)
2836
gen.Run(true)
2937
```
3038

31-
Now that we have load data, let's generate a baseline performance report and store it in the local storage:
32-
```go
33-
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
34-
defer cancelFn()
39+
### Step 2: Generate a Baseline Performance Report
3540

41+
With load data available, let's generate a baseline performance report and store it in local storage:
42+
43+
```go
3644
baseLineReport, err := benchspy.NewStandardReport(
37-
// random hash, this should be commit or hash of the Application Under Test (AUT)
45+
// random hash, this should be the commit or hash of the Application Under Test (AUT)
3846
"e7fc5826a572c09f8b93df3b9f674113372ce924",
3947
// use built-in queries for an executor that fetches data directly from the WASP generator
4048
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
4149
// WASP generators
4250
benchspy.WithGenerators(gen),
4351
)
44-
require.NoError(t, err, "failed to create original report")
52+
require.NoError(t, err, "failed to create baseline report")
53+
54+
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
55+
defer cancelFn()
4556

4657
fetchErr := baseLineReport.FetchData(fetchCtx)
47-
require.NoError(t, fetchErr, "failed to fetch data for original report")
58+
require.NoError(t, fetchErr, "failed to fetch data for baseline report")
4859

4960
path, storeErr := baseLineReport.Store()
50-
require.NoError(t, storeErr, "failed to store current report", path)
61+
require.NoError(t, storeErr, "failed to store baseline report", path)
5162
```
5263

5364
> [!NOTE]
54-
> There's quite a lot to unpack here and you are enouraged to read more about build-in `QueryExecutors` and
55-
> standard metrics each comes with [here](./built_in_query_executors.md) and about the `StandardReport` [here](./standard_report.md).
65+
> There's a lot to unpack here, and you're encouraged to read more about the built-in `QueryExecutors` and the standard metrics they provide as well as about the `StandardReport` [here](./reports/standard_report.md).
5666
>
57-
> For now, it's enough for you to know that standard metrics that `StandardQueryExecutor_Generator` comes with are following:
58-
> * median latency
59-
> * p95 latency (95th percentile)
60-
> * error rate
67+
> For now, it's enough to know that the standard metrics provided by `StandardQueryExecutor_Direct` include:
68+
> - Median latency
69+
> - P95 latency (95th percentile)
70+
> - Error rate
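To make these three metrics concrete, here is a minimal, self-contained Go sketch of how a median, a 95th percentile, and an error rate can be computed from raw request data. This illustrates the idea only; it is not BenchSpy's actual implementation:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0.0-1.0) of values using the
// nearest-rank method on a sorted copy. Illustrative only.
func percentile(values []float64, p float64) float64 {
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	idx := int(math.Ceil(p*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	return sorted[idx]
}

// errorRate is the fraction of failed requests out of all requests.
func errorRate(failed, total int) float64 {
	if total == 0 {
		return 0
	}
	return float64(failed) / float64(total)
}

func main() {
	latenciesMs := []float64{12, 15, 11, 90, 14, 13, 16, 12, 15, 13}
	fmt.Printf("median latency: %.1f ms\n", percentile(latenciesMs, 0.5))
	fmt.Printf("p95 latency:    %.1f ms\n", percentile(latenciesMs, 0.95))
	fmt.Printf("error rate:     %.2f\n", errorRate(1, len(latenciesMs)))
}
```

The `DirectQueryExecutor` spares you from writing this kind of post-processing yourself.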
71+
72+
### Step 3: Run the Test Again and Compare Reports
73+
74+
With the baseline report ready, let's run the load test again. This time, we'll use a wrapper function to automatically load the previous report, generate a new one, and ensure they are comparable.
6175

62-
With baseline report ready let's run the load test again, but this time let's use a wrapper function
63-
that will automatically load the previous report, generate a new one and make sure that they are actually comparable.
6476
```go
6577
// define a new generator using the same config values
6678
newGen, err := wasp.NewGenerator(&wasp.Config{
@@ -84,6 +96,7 @@ defer cancelFn()
8496
// currentReport is the report that we just created (baseLineReport)
8597
currentReport, previousReport, err := benchspy.FetchNewStandardReportAndLoadLatestPrevious(
8698
fetchCtx,
99+
// commit or tag of the new application version
87100
"e7fc5826a572c09f8b93df3b9f674113372ce925",
88101
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
89102
benchspy.WithGenerators(newGen),
@@ -92,8 +105,9 @@ require.NoError(t, err, "failed to fetch current report or load the previous one
92105
```
93106

94107
> [!NOTE]
95-
> In real-world case, once you have the first report generated you should only need to use
96-
> `benchspy.FetchNewStandardReportAndLoadLatestPrevious` function.
108+
> In a real-world case, once you've generated the first report, you should only need to use the `benchspy.FetchNewStandardReportAndLoadLatestPrevious` function.
109+
110+
### What's Next?
97111

98-
Okay, so we have two reports now, that's great, but how do we make sure that application's performance is as expected?
99-
You'll find out in the [next chapter](./first_test_comparison.md).
112+
Now that we have two reports, how do we ensure that the application's performance meets expectations?
113+
Find out in the [next chapter](./simplest_metrics.md).
Lines changed: 9 additions & 9 deletions
@@ -1,14 +1,14 @@
1-
# BenchSpy - Getting started
1+
# BenchSpy - Getting Started
22

3-
All of the following examples assume that you have access to following applications:
4-
* Grafana
5-
* Loki
6-
* Prometheus
3+
The following examples assume you have access to the following applications:
4+
- Grafana
5+
- Loki
6+
- Prometheus
77

88
> [!NOTE]
9-
> The easiest way to run them locally is by using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).
10-
> Just remember to first install the `CTF CLI` as described in [CTFv2 Getting Started](../../../framework/getting_started.md) chapter.
9+
> The easiest way to run these locally is by using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).
10+
> Be sure to install the `CTF CLI` first, as described in the [CTFv2 Getting Started](../../../framework/getting_started.md) guide.
1111
12-
Since BenchSpy is tightly couplesd with WASP it's highly recommended that you [get familiar with it first](../overview.md), if you haven't yet.
12+
Since BenchSpy is tightly coupled with WASP, we highly recommend that you [get familiar with it first](../overview.md) if you haven't already.
1313

14-
Ready? [Let's go!](./first_test.md)
14+
Ready? [Let's get started!](./first_test.md)
Lines changed: 21 additions & 12 deletions
@@ -1,13 +1,14 @@
1-
# BenchSpy - Custom Loki metrics
1+
# BenchSpy - Custom Loki Metrics
22

3-
In this chapter we will see how to use custom LogQl queries in the performance report. For this more advanced use case
4-
we will need to compose the performance report manually.
3+
In this chapter, we’ll explore how to use custom `LogQL` queries in the performance report. For this more advanced use case, we’ll manually compose the performance report.
54

6-
Load-generation part is the same as in the standard Loki metrics example and thus will be skipped.
5+
The load generation part is the same as in the standard Loki metrics example and will be skipped.
76

8-
Let's define two illustrative metrics now:
9-
* `vu_over_time` - rate of virtual users generated by WASP, 10 seconds window
10-
* `responses_over_time` - number of AUT's responses, 1 second window
7+
## Defining Custom Metrics
8+
9+
Let’s define two illustrative metrics:
10+
- **`vu_over_time`**: The rate of virtual users generated by WASP, using a 10-second window.
11+
- **`responses_over_time`**: The number of AUT's responses, using a 1-second window.
1112

1213
```go
1314
lokiQueryExecutor := benchspy.NewLokiQueryExecutor(
@@ -20,19 +21,27 @@ lokiQueryExecutor := benchspy.NewLokiQueryExecutor(
2021
```
2122

2223
> [!NOTE]
23-
> These LogQl queries are using standard labels that `WASP` uses when sending data to Loki.
24+
> These `LogQL` queries use the standard labels that `WASP` applies when sending data to Loki.
25+
26+
## Creating a `StandardReport` with Custom Queries
27+
28+
Now, let’s create a `StandardReport` using our custom queries:
2429

25-
And create a `StandardReport` using our custom queries:
2630
```go
2731
baseLineReport, err := benchspy.NewStandardReport(
2832
"2d1fa3532656c51991c0212afce5f80d2914e34e",
29-
// notice the different functional option used to pass custom executors
33+
// notice the different functional option used to pass Loki executor with custom queries
3034
benchspy.WithQueryExecutors(lokiQueryExecutor),
3135
benchspy.WithGenerators(gen),
3236
)
3337
require.NoError(t, err, "failed to create baseline report")
3438
```
3539

36-
The rest of the code remains basically unchanged (apart from the name of metrics we are asserting on). You can find the full example [here](...).
40+
## Wrapping Up
3741

38-
Now it's time to look at the last of the bundled `QueryExecutors`. Proceed to the [next chapter to read about Prometheus](./prometheus.md).
42+
The rest of the code remains unchanged, except for the names of the metrics being asserted. You can find the full example [here](...).
43+
44+
Now it’s time to look at the last of the bundled `QueryExecutors`. Proceed to the [next chapter to read about Prometheus](./prometheus_std.md).
45+
46+
> [!NOTE]
47+
> You can find the full example [here](https://github.com/smartcontractkit/chainlink-testing-framework/tree/main/wasp/examples/benchspy/loki_query_executor/loki_query_executor_test.go).
Lines changed: 18 additions & 11 deletions
@@ -1,15 +1,22 @@
1-
# BenchSpy - To Loki or not to Loki?
1+
# BenchSpy - To Loki or Not to Loki?
22

3-
You might be asking yourself whether you should use `Loki` or `Direct` query executor if all you
4-
need are basic latency metrics.
3+
You might be wondering whether to use the `Loki` or `Direct` query executor if all you need are basic latency metrics.
54

6-
As a rule of thumb, if all you need is a single number that describes the median latency or error rate
7-
and you are not interested in directly comparing time series, minimum or maximum values or any kinds
8-
of more advanced calculation on raw data, then you should go with the `Direct`.
5+
## Rule of Thumb
96

10-
Why?
7+
If all you need is a single number, such as the median latency or error rate, and you're not interested in:
8+
- Comparing time series directly,
9+
- Examining minimum or maximum values, or
10+
- Performing advanced calculations on raw data,
1111

12-
Because it returns a single value for each of standard metrics using the same raw data that Loki would use
13-
(it accesses the data stored in the `WASP`'s generator that would later be pushed to Loki).
14-
This way you can run your load test without a Loki instance and save yourself the need of calculating the
15-
median and 95th percentile latency or the error ratio.
12+
then you should opt for the `Direct` query executor.
13+
14+
## Why Choose `Direct`?
15+
16+
The `Direct` executor returns a single value for each standard metric using the same raw data that Loki would use. It accesses data stored in the `WASP` generator, which is later pushed to Loki.
17+
18+
This means you can:
19+
- Run your load test without a Loki instance.
20+
- Avoid calculating metrics like the median, 95th percentile latency, or error ratio yourself.
21+
22+
By using `Direct`, you save resources and simplify the process when advanced analysis isn't required.
Lines changed: 37 additions & 33 deletions
@@ -1,18 +1,20 @@
1-
# BenchSpy - Standard Loki metrics
1+
# BenchSpy - Standard Loki Metrics
22

3-
> [!NOTE]
4-
> This example assumes you have access to Loki and Grafana instances. If you don't
5-
> find out how to launch them using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).
3+
> [!WARNING]
4+
> This example assumes you have access to Loki and Grafana instances. If you don't, learn how to launch them using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).
65
7-
Our Loki example, will vary from the previous one in just a couple of details:
8-
* generator will have Loki config
9-
* standard query executor type will be `benchspy.StandardQueryExecutor_Loki`
10-
* we will cast all results to `[]string`
11-
* and calculate medians for all metrics
6+
In this example, our Loki workflow will differ from the previous one in just a few details:
7+
- The generator will include a Loki configuration.
8+
- The standard query executor type will be `benchspy.StandardQueryExecutor_Loki`.
9+
- All results will be cast to `[]string`.
10+
- We'll calculate medians for all metrics.
1211

1312
Ready?
1413

15-
Let's define new load generation first:
14+
## Step 1: Define a New Load Generator
15+
16+
Let's start by defining a new load generator:
17+
1618
```go
1719
label := "benchspy-std"
1820

@@ -36,41 +38,43 @@ gen, err := wasp.NewGenerator(&wasp.Config{
3638
require.NoError(t, err)
3739
```
3840

39-
Now let's run the generator and save baseline report:
41+
## Step 2: Run the Generator and Save the Baseline Report
42+
4043
```go
4144
gen.Run(true)
4245

43-
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
44-
defer cancelFn()
45-
4646
baseLineReport, err := benchspy.NewStandardReport(
4747
"c2cf545d733eef8bad51d685fcb302e277d7ca14",
48-
// notice the different standard executor type
48+
// notice the different standard query executor type
4949
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Loki),
5050
benchspy.WithGenerators(gen),
5151
)
52-
require.NoError(t, err, "failed to create original report")
52+
require.NoError(t, err, "failed to create baseline report")
53+
54+
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
55+
defer cancelFn()
5356

5457
fetchErr := baseLineReport.FetchData(fetchCtx)
55-
require.NoError(t, fetchErr, "failed to fetch data for original report")
58+
require.NoError(t, fetchErr, "failed to fetch data for baseline report")
5659

5760
path, storeErr := baseLineReport.Store()
58-
require.NoError(t, storeErr, "failed to store current report", path)
61+
require.NoError(t, storeErr, "failed to store baseline report", path)
5962
```
6063

61-
Since next steps are very similar to the ones used in the first test we will skip them and jump straight
62-
to metrics comparison.
64+
## Step 3: Skip to Metrics Comparison
65+
66+
Since the next steps are very similar to those in the first test, we’ll skip them and go straight to metrics comparison.
67+
68+
By default, the `LokiQueryExecutor` returns results as the `[]string` data type. Let’s use dedicated convenience functions to cast them from `interface{}` to string slices:
6369

64-
By default, `LokiQueryExecutor` returns `[]string` data type, so let's use dedicated convenience functions
65-
to cast them from `interface{}` to string slice:
6670
```go
6771
currentAsStringSlice := benchspy.MustAllLokiResults(currentReport)
6872
previousAsStringSlice := benchspy.MustAllLokiResults(previousReport)
6973
```
7074

71-
And finally, time to compare metrics. Since we have a `[]string` we will first convert it to `[]float64` and
72-
then calculate a median and assume it hasn't changed by more than 1%. Again, remember that this is just an illustration.
73-
You should decide yourself what's the best way to assert the metrics.
75+
## Step 4: Compare Metrics
76+
77+
Now, let’s compare metrics. Since we have `[]string`, we’ll first convert it to `[]float64`, calculate the median, and ensure the difference between the medians is less than 1%. Again, this is just an example—you should decide the best way to validate your metrics.
7478

7579
```go
7680
var compareMedian = func(metricName string) {
@@ -85,23 +89,23 @@ var compareMedian = func(metricName string) {
8589
require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)
8690
previousMedian := benchspy.CalculatePercentile(previousFloatSlice, 0.5)
8791

88-
var diffPrecentage float64
92+
var diffPercentage float64
8993
if previousMedian != 0 {
90-
diffPrecentage = (currentMedian - previousMedian) / previousMedian * 100
94+
diffPercentage = (currentMedian - previousMedian) / previousMedian * 100
9195
} else {
92-
diffPrecentage = currentMedian * 100
96+
diffPercentage = 100 // treat any change from a zero baseline as a 100% difference
9397
}
94-
assert.LessOrEqual(t, math.Abs(diffPrecentage), 1.0, "%s medians are more than 1% different", metricName, fmt.Sprintf("%.4f", diffPrecentage))
98+
assert.LessOrEqual(t, math.Abs(diffPercentage), 1.0, "%s medians are more than 1% different", metricName, fmt.Sprintf("%.4f", diffPercentage))
9599
}
96100

97101
compareMedian(string(benchspy.MedianLatency))
98102
compareMedian(string(benchspy.Percentile95Latency))
99103
compareMedian(string(benchspy.ErrorRate))
100104
```
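The same comparison logic can be exercised outside a test harness. The helpers below (`toFloats`, `median`, `withinPercent`) are hypothetical names used for illustration, not BenchSpy APIs; note that this sketch handles the zero-baseline case by requiring both medians to be zero, a choice you may want to adjust:

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strconv"
)

// toFloats converts string samples (the shape a Loki query returns) to float64s.
func toFloats(samples []string) ([]float64, error) {
	out := make([]float64, 0, len(samples))
	for _, s := range samples {
		f, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, err
		}
		out = append(out, f)
	}
	return out, nil
}

// median returns the middle value of a sorted copy of values.
func median(values []float64) float64 {
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	mid := len(sorted) / 2
	if len(sorted)%2 == 0 {
		return (sorted[mid-1] + sorted[mid]) / 2
	}
	return sorted[mid]
}

// withinPercent reports whether current deviates from previous by at most
// maxDiff percent. A zero baseline is only "within" if current is also zero.
func withinPercent(previous, current, maxDiff float64) bool {
	if previous == 0 {
		return current == 0
	}
	diff := math.Abs((current - previous) / previous * 100)
	return diff <= maxDiff
}

func main() {
	prev, _ := toFloats([]string{"101", "99", "100"})
	curr, _ := toFloats([]string{"100", "102", "100"})
	fmt.Println(withinPercent(median(prev), median(curr), 1.0)) // → true
}
```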
101105

102-
We have used standard metrics, which are the same as in the first test, now let's see how you can use your custom LogQl queries.
106+
## What’s Next?
103107

104-
> [!NOTE]
105-
> Don't know whether to use `Loki` or `Direct` query executors? [Read this!](./loki_dillema.md)
108+
In this example, we used standard metrics, which are the same as in the first test. Now, [let’s explore how to use your custom LogQL queries](./loki_custom.md).
106109

107-
You can find the full example [here](...).
110+
> [!NOTE]
111+
> You can find the full example [here](https://github.com/smartcontractkit/chainlink-testing-framework/tree/main/wasp/examples/benchspy/loki_query_executor/loki_query_executor_test.go).
Lines changed: 11 additions & 12 deletions
@@ -1,16 +1,15 @@
11
# BenchSpy
22

3-
BenchSpy (short for benchmark spy) is a WASP-coupled tool that allows for easy comparison of various performance metrics.
3+
BenchSpy (short for Benchmark Spy) is a [WASP](../overview.md)-coupled tool designed for easy comparison of various performance metrics.
44

5-
It's main characteristics are:
6-
* three built-in data sources:
7-
* `Loki`
8-
* `Prometheus`
9-
* `Direct`
10-
* standard/pre-defined metrics for each data source
11-
* ease of extensibility with custom metrics
12-
* ability to load latest performance report based on Git history
13-
* 88% unit test coverage
5+
## Key Features
6+
- **Three built-in data sources**:
7+
- `Loki`
8+
- `Prometheus`
9+
- `Direct`
10+
- **Standard/pre-defined metrics** for each data source.
11+
- **Ease of extensibility** with custom metrics.
12+
- **Ability to load the latest performance report** based on Git history.
13+
- **88% unit test coverage**.
1414

15-
It doesn't come with any comparation logic, other than making sure that performance reports are comparable (e.g. they mesure the same metrics in the same way),
16-
leaving total freedom to the user.
15+
BenchSpy does not include any built-in comparison logic beyond ensuring that performance reports are comparable (e.g., they measure the same metrics in the same way), offering complete freedom to the user for interpretation and analysis.
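Conceptually, the comparability guarantee boils down to verifying that both reports expose the same metric names before any user-defined assertions run. A hedged illustration in plain Go (not BenchSpy's actual check):

```go
package main

import (
	"fmt"
	"sort"
)

// sameMetricSet reports whether two reports cover exactly the same metric
// names, which is the minimal precondition for comparing them, and returns
// any names present in only one of the two.
func sameMetricSet(previous, current map[string]float64) (bool, []string) {
	var mismatched []string
	for name := range previous {
		if _, ok := current[name]; !ok {
			mismatched = append(mismatched, name)
		}
	}
	for name := range current {
		if _, ok := previous[name]; !ok {
			mismatched = append(mismatched, name)
		}
	}
	sort.Strings(mismatched)
	return len(mismatched) == 0, mismatched
}

func main() {
	prev := map[string]float64{"median_latency": 101, "error_rate": 0}
	curr := map[string]float64{"median_latency": 99, "95th_latency": 120}
	ok, mismatched := sameMetricSet(prev, curr)
	fmt.Println(ok, mismatched) // → false [95th_latency error_rate]
}
```

What to do once the sets match, e.g. asserting on medians or raw time series, is entirely up to you.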
