Commit c33c8c8

Merge pull request #325 from harriscr/ch_wip_documentation
Documentation for the post processing tools
2 parents 48d3f1f + f20b8b9 commit c33c8c8

4 files changed

+188
-0
lines changed

post_processing/README.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# Post Processing of CBT results

## Description
A set of tools for post processing the data from any CBT run. It produces a report in GitHub markdown,
and optionally PDF, format that contains a set of hockey-stick curves generated from the run.
The tool set consists of three separate tools that can each be run stand-alone. The eventual aim is to integrate
the post processing into CBT once more benchmark types are supported.

The three components of the post processing are:

* [formatter](formatter/README.md)
* [plotter](plotter/README.md)
* [reports](reports/README.md)

## Supported benchmark tools
This list will grow as support for further benchmark tools is added.
* fio

## Dependencies
The post processing tools need some additional dependencies to run correctly.

### Python dependencies
The following Python modules are required:
* matplotlib
* mdutils

Both have been added to the requirements.txt file in the CBT project.

### Dependencies for PDF report generation
Generating a report in PDF format has two additional requirements:

* A working install of TeX on the base operating system, which can be installed using the package manager.
  For Red Hat based OSes this can be achieved by running `yum install texlive`
* [Pandoc](https://pandoc.org/), which can be installed on most Linux distributions using the included package manager.
  For Red Hat based OSes use `yum install pandoc`

The minimum pandoc level tested is `2.14.0.3`, which is available for RHEL 9.
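
As a quick pre-flight check, the following minimal sketch reports which of the dependencies listed above appear to be missing. The helper names are hypothetical and not part of CBT, and the sketch assumes that `pdflatex` (pandoc's default PDF engine) is the TeX binary that matters.

```python
import importlib.util
import shutil

# Module and tool names come from the lists above. "pdflatex" is an
# assumption: it is pandoc's default PDF engine and ships with TeX Live.
PYTHON_MODULES = ["matplotlib", "mdutils"]
PDF_EXECUTABLES = ["pandoc", "pdflatex"]
MINIMUM_PANDOC = (2, 14, 0, 3)  # minimum pandoc level quoted above


def missing_modules(names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_executables(names):
    """Return the executables that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]


def version_tuple(text):
    """Turn a dotted version such as '2.14.0.3' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


if __name__ == "__main__":
    print("Missing Python modules:", missing_modules(PYTHON_MODULES))
    print("Missing PDF tools:", missing_executables(PDF_EXECUTABLES))
```

The version reported by `pandoc --version` can then be compared with `version_tuple(reported) >= MINIMUM_PANDOC`.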
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
# Formatter

The formatter converts CBT output JSON files into the format expected by the rest of the post processing. The output
is a JSON file of the form:

```
{
  <queue_depth>: {
    bandwidth_bytes: <value>
    blocksize: <value>
    io_bytes: <value>
    iops: <value>
    latency: <value>
    number_of_jobs: <value>
    percentage_reads: <value>
    percentage_writes: <value>
    runtime_seconds: <value>
    std_deviation: <value>
    total_ios: <value>
  }
  ...
  <queue_depth_n>: {
    ...
  }
  maximum_bandwidth: <value>
  latency_at_max_bandwidth: <value>
  maximum_iops: <value>
  latency_at_max_iops: <value>
}
```
A single file will be produced per block size used for the benchmark run.

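
To make the layout above concrete, here is a minimal sketch of reading one formatted file and separating the per-queue-depth records from the run-wide summary values. It assumes that queue-depth keys are integer strings and that values are stored as JSON numbers; the sample data is invented, shows only a subset of the fields, and `split_formatted` is a hypothetical helper rather than part of CBT.

```python
import json

# Hypothetical sample in the shape described above; every value is invented
# for illustration and only a few of the per-queue-depth fields are shown.
SAMPLE = """
{
  "1": {"iops": 1000, "latency": 0.9, "blocksize": 4096},
  "2": {"iops": 1900, "latency": 1.1, "blocksize": 4096},
  "maximum_iops": 1900,
  "latency_at_max_iops": 1.1
}
"""


def split_formatted(data):
    """Separate per-queue-depth records from the run-wide summary values.

    Assumes queue-depth keys are integer strings whose values are objects,
    while summary entries are plain values.
    """
    per_depth = {int(k): v for k, v in data.items() if isinstance(v, dict)}
    summary = {k: v for k, v in data.items() if not isinstance(v, dict)}
    return per_depth, summary


per_depth, summary = split_formatted(json.loads(SAMPLE))
for depth in sorted(per_depth):
    print(depth, per_depth[depth]["iops"], per_depth[depth]["latency"])
print(summary)
```

Telling the two kinds of entry apart by value type avoids hard-coding the list of summary keys.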
## Standalone script
A wrapper script is provided for the formatter:
```
fio_common_output_wrapper.py --archive=<archive_directory>
                             --results_file_root=<file_root>
```
where
- `--archive` Required. The archive directory given to CBT for the benchmark run.
- `--results_file_root` Optional. The name of the results file to process, without the extension. If not specified, this
  defaults to `json_output`, which is the default for CBT runs.

Full help text is available by running the script with `--help`.

## Output
A directory called `visualisation`, containing all the processed files, will be created in the directory specified by `--archive`.
There will be one file per block size used for the benchmark run.

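
A minimal sketch for checking what the formatter produced is shown below. `formatted_files` is a hypothetical helper, not part of CBT, and it makes no assumption about how the files are named; it simply lists whatever is in the `visualisation` directory.

```python
from pathlib import Path


def formatted_files(archive):
    """List the files in <archive>/visualisation, sorted by name.

    Returns an empty list when the directory does not exist, for example
    because the formatter has not been run on this archive yet.
    """
    directory = Path(archive) / "visualisation"
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.is_file())


# Archive path taken from the example in this README.
for path in formatted_files("/tmp/ch_cbt_run"):
    print(path.name)
```

An empty result is a useful hint that the formatter still needs to be run before plotting or reporting.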
## Example

```bash
PYTHONPATH=/cbt /cbt/tools/fio_common_output_wrapper.py --archive="/tmp/ch_cbt_run" --results_file_root="ch_json_result"
```

post_processing/plotter/README.md

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
# Plotter
Draws the hockey stick plots for a benchmark run from the data produced by the formatter. The plots are PNG files, with
one plot produced per block size used.

There is also a Python class that produces comparison plots of two or more different CBT runs for one or more block
sizes.
Due to the tools used there are only 6 unique colours available for the plot lines, so it is recommended to limit the
comparison to 6 or fewer files or directories.

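
For readers unfamiliar with the term, a hockey-stick curve is taken here to mean latency plotted against throughput as queue depth increases: latency stays roughly flat until the system saturates, then climbs steeply. The sketch below builds the points for one such curve from formatter-style data. It assumes IOPS on the x-axis and latency on the y-axis (the axes the plotter actually uses are not stated here), the numbers are invented, and `curve_points` is a hypothetical helper.

```python
# Invented per-queue-depth records in the formatter's shape (subset of fields).
SAMPLE = {
    1: {"iops": 1000, "latency": 1.0},
    2: {"iops": 1900, "latency": 1.1},
    4: {"iops": 3500, "latency": 1.2},
    8: {"iops": 4000, "latency": 2.0},
    16: {"iops": 4100, "latency": 3.9},
}


def curve_points(per_depth):
    """Return (iops, latency) pairs ordered by increasing queue depth."""
    return [
        (float(per_depth[depth]["iops"]), float(per_depth[depth]["latency"]))
        for depth in sorted(per_depth)
    ]


points = curve_points(SAMPLE)
for iops, latency in points:
    print(f"{iops:8.0f} IOPS  {latency:5.2f} latency")
```

The resulting pairs could be handed to matplotlib's `Axes.plot` to draw one line per run.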
## Standalone script
A wrapper script is provided only for producing comparison plots.
```
plot_comparison.py --files=<comma_separated_list_of_files_to_compare>
                   --directories=<comma_separated_list_of_directories_to_compare>
                   --output_directory=<full_path_to_directory_to_store_plot>
                   --labels="<comma_separated_list_of_labels>"
```
where
- `--output_directory` Required. The full path to a directory to store the plots. Will be created if it doesn't exist
- `--files` Optional. A comma separated list of files to plot on a single axis
- `--directories` Optional. A comma separated list of directories to plot. A single plot will be produced per block size
- `--labels` Optional. Comma separated list of labels to use for the lines on the comparison plot, in the same order as
  `--files` or `--directories`.

One of `--files` or `--directories` must be provided.

Full help text is available by running the script with `--help`.

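
The ordering rule for `--labels` can be pictured with the minimal sketch below, which pairs each path with its label by position. `pair_labels` is a hypothetical helper, not the script's own code; falling back to the path when no labels are given is an assumption, as the script may choose its default labels differently.

```python
MAX_DISTINCT_COLOURS = 6  # limit quoted in the plotter description


def pair_labels(paths_arg, labels_arg=None):
    """Pair each comma separated path with its label, by position.

    Falls back to the path itself when no labels are given (an assumption
    made for this sketch only).
    """
    paths = [path.strip() for path in paths_arg.split(",") if path.strip()]
    if labels_arg is None:
        labels = list(paths)
    else:
        labels = [label.strip() for label in labels_arg.split(",")]
    if len(labels) != len(paths):
        raise ValueError("need exactly one label per file or directory")
    if len(paths) > MAX_DISTINCT_COLOURS:
        print(f"warning: {len(paths)} lines but only {MAX_DISTINCT_COLOURS} unique colours")
    return list(zip(paths, labels))


print(pair_labels("/tmp/ch_cbt_main_run,/tmp/ch_cbt_sandbox_run", "main,sandbox"))
```

Because the pairing is purely positional, reordering the paths without reordering the labels would silently mislabel the lines.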
## Example

```bash
PYTHONPATH=/cbt /cbt/tools/plot_comparison.py --directories="/tmp/ch_cbt_main_run,/tmp/ch_cbt_sandbox_run" --output_directory="/tmp/main_sb_comparisons"
```

post_processing/reports/README.md

Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
# Reports

Produces a report in GitHub markdown, and optionally PDF, format that includes a summary table and the relevant
plots from the CBT run.

## Output
A report in GitHub markdown format with a plots directory containing the required plots. The report and plots directory
can be uploaded directly to GitHub as-is and the links will be maintained.

Optionally a report in PDF format can also be created.

Due to the tools used there are only 6 unique colours available for the plot lines, so it is recommended to limit the
comparison to 6 or fewer files or directories. During testing we found that comparing more than four directories can
make the PDF report unreadable, so creating a PDF report that compares data from more than four benchmark runs is
not recommended.

## Standalone scripts
Two wrapper scripts are provided for report generation:
* generate_performance_report.py
* generate_comparison_performance_report.py

### generate_performance_report.py
Creates a performance report for a single benchmark run. The formatter must already have been run on the results.

```
generate_performance_report.py --archive=<full_path_to_results_directory>
                               --output_directory=<full_path_to_directory_to_store_report>
                               --create_pdf
```

where:
- `--archive` Required. The archive directory containing the files from the formatter
- `--output_directory` Required. The directory to store the markdown report file and relevant plots.
- `--create_pdf` Optional. Create a PDF report

Full help text is available by running the script with `--help`.

#### Example
```bash
PYTHONPATH=/cbt /cbt/tools/generate_performance_report.py --archive="/tmp/ch_cbt_main_run" --output_directory="/tmp/reports/main" --create_pdf
```

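
Because the formatter has to run before the report generator, the two steps are often chained. The sketch below only builds the two command lines, in that order, using the flags documented in these READMEs; the function names are hypothetical and the `/cbt/tools` location is taken from the examples rather than guaranteed.

```python
import shlex

# Script locations follow the examples in these READMEs; adjust as needed.
FORMATTER = "/cbt/tools/fio_common_output_wrapper.py"
REPORTER = "/cbt/tools/generate_performance_report.py"


def formatter_command(archive, results_file_root=None):
    """Build the argument list for the formatter wrapper."""
    argv = [FORMATTER, f"--archive={archive}"]
    if results_file_root is not None:
        argv.append(f"--results_file_root={results_file_root}")
    return argv


def report_command(archive, output_directory, create_pdf=False):
    """Build the argument list for the single-run report generator."""
    argv = [REPORTER, f"--archive={archive}", f"--output_directory={output_directory}"]
    if create_pdf:
        argv.append("--create_pdf")
    return argv


# Formatter first, then the report.
commands = [
    formatter_command("/tmp/ch_cbt_main_run"),
    report_command("/tmp/ch_cbt_main_run", "/tmp/reports/main", create_pdf=True),
]
for argv in commands:
    print(shlex.join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` with `PYTHONPATH` set as in the example, so that a formatter failure stops the chain before the report step.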
### generate_comparison_performance_report.py
Creates a report comparing 2 or more benchmark runs. The report will only include plots and results for formatted files
that are common to all the directories.

```
generate_comparison_performance_report.py --baseline=<full_path_to_archive_directory_to_use_as_baseline>
                                          --archives=<full_path_to_results_directories_to_compare>
                                          --output_directory=<full_path_to_directory_to_store_report>
                                          --create_pdf
```
where
- `--baseline` Required. The full path to the baseline results for the comparison
- `--archives` Required. A comma-separated list of directories containing results to compare to the baseline
- `--output_directory` Required. The directory to store the markdown report file and relevant plots.
- `--create_pdf` Optional. Create a PDF report

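
The restriction to files common to every directory amounts to a set intersection. The sketch below illustrates the idea under two assumptions: that each archive keeps its formatted files in a `visualisation` sub-directory, as the formatter README describes, and that files are matched by name. `common_formatted_files` is a hypothetical helper, not the script's actual implementation.

```python
from pathlib import Path


def common_formatted_files(directories):
    """Return the file names present in every directory's visualisation folder."""
    name_sets = []
    for directory in directories:
        vis = Path(directory) / "visualisation"
        if vis.is_dir():
            names = {p.name for p in vis.iterdir() if p.is_file()}
        else:
            names = set()  # an unformatted archive shares nothing
        name_sets.append(names)
    if not name_sets:
        return []
    return sorted(set.intersection(*name_sets))


# Baseline and comparison paths taken from the example in this README.
print(common_formatted_files(["/tmp/ch_cbt_main_run", "/tmp/ch_sandbox/"]))
```

Running this before generating a comparison report shows which block sizes will actually appear in it.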
#### Example
```bash
PYTHONPATH=/cbt /cbt/tools/generate_comparison_performance_report.py --baseline="/tmp/ch_cbt_main_run" --archives="/tmp/ch_sandbox/" --output_directory="/tmp/reports/main" --create_pdf
```
