Commit 2fa4516

docs: add sysbench TPS variability documentation
Add some guidance on what the heck this sysbench TPS variability thing is all about with some sample images of results you can get. Signed-off-by: Luis Chamberlain <[email protected]>
1 parent 67167d4 commit 2fa4516

6 files changed

+157
-0
lines changed

README.md

Lines changed: 8 additions & 0 deletions
@@ -9,6 +9,7 @@ Table of Contents
99
* [Start testing NFS with in 2 commands](#start-testing-nfs-with-in-2-commands)
1010
* [Runs some kernel selftests in a parallel manner](#runs-some-kernel-selftests-in-a-parallel-manner)
1111
* [CXL](#cxl)
12+
* [sysbench](#sysbench)
1213
* [kdevops chats](#kdevops-chats)
1314
* [kdevops on discord](#kdevops-on-discord)
1415
* [kdevops IRC](#kdevops-irc)
@@ -170,6 +171,13 @@ to guests and create custom topologies. kdevops let you build and install
170171
the latest CXL enabled qemu version as well for you. For more details
171172
refer to [kdevops cxl docs](docs/cxl.md)
172173

174+
### sysbench
175+
176+
kdevops supports automation of sysbench tests on VMs with or without
177+
[PCIe passthrough](docs/libvirt-pcie-passthrough.md) and different cloud
178+
providers. For details refer to the
179+
[kdevops sysbench documentation](docs/sysbench/sysbench.md).
180+
173181
## kdevops chats
174182

175183
We use discord and IRC. Right now we have more folks on discord than on IRC.

docs/sysbench/outliers_plot.png

289 KB

docs/sysbench/sysbench.md

Lines changed: 149 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,149 @@
1+
Sysbench workflow on LBS
2+
========================
3+
4+
[Sysbench](https://github.com/akopytov/sysbench)
5+
([FOSDEM 2017 PDF slides](https://archive.fosdem.org/2017/schedule/event/sysbench/attachments/slides/1519/export/events/attachments/sysbench/slides/1519/sysbench.pdf))
6+
is a scriptable multi-threaded benchmark tool used for benchmarking databases.
7+
8+
kdevops supports automating tests with sysbench, and the first test it supports
9+
is one focused on quantifying TPS variability when one disables only the
10+
MySQL double write buffer, across a series of different filesystems. kdevops
11+
makes adding support for testing TPS variability on different filesystems
12+
easy, and allows you to easily run these tests on bare metal, virtualized
13+
guests (optionally with
14+
[kdevops PCIe passthrough](../libvirt-pcie-passthrough.md))
15+
and on different cloud providers which kdevops supports.
16+
17+
Support is based on porting an initial simple shell and docker proof of
18+
concept
19+
[plot-sysbench](https://github.com/mcgrof/plot-sysbench)
20+
into kdevops.
21+
The plots below were collected with
22+
[plot-sysbench](https://github.com/mcgrof/plot-sysbench)
23+
on an AWS i4i.4xlarge instance using a debian-12 image and docker MySQL and
24+
sysbench images during a 12 hour run. They demonstrate what you should
25+
expect with automation from kdevops. Initial testing of the TPS
26+
variability workflow on kdevops has been done with VMs only, but enough work is
27+
in place to easily verify / extend it with
28+
[kdevops PCIe passthrough](../libvirt-pcie-passthrough.md)
29+
and cloud support. This should just be a matter of a few Kconfig changes and,
30+
for cloud, moving some folders such as the docker folders onto partitions with
31+
more space, as some cloud instances use a small root disk. For
32+
demonstration purposes image results from
33+
[plot-sysbench](https://github.com/mcgrof/plot-sysbench) are used for now but
34+
the same should easily be possible with kdevops on the cloud as well.
35+
36+
# Running TPS variability tests
37+
38+
Just use:
39+
40+
```bash
make defconfig-sysbench-mysql-atomic-tps-variability
make -j$(nproc)
make bringup
make sysbench
make sysbench-test
```
47+
48+
The results will be placed in workflows/sysbench/results/
49+
50+
## Bringup tests
51+
52+
The bringup methods below have been tested so far and should work well:
53+
54+
* VMs with guestfs
55+
56+
## TODO
57+
58+
* Test VMs with PCI passthrough
59+
* Test at least one cloud provider
60+
* Add PostgreSQL support
61+
62+
# Definitions
63+
64+
## TPS variability
65+
66+
We define the TPS variability as the square of the standard deviation, that is, the variance.
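
As a minimal sketch of this definition, the snippet below squares the standard
deviation of a list of TPS samples. The function name `tps_variability` and the
sample values are made up for illustration, and the use of the population
variance (rather than the sample variance) is an assumption; this is not
necessarily how kdevops or plot-sysbench compute it.

```python
import math
import statistics


def tps_variability(tps_samples):
    """Return the variance of the TPS samples (standard deviation squared).

    Assumption: the samples are treated as the whole population, so the
    population variance is used. Use statistics.variance() instead if the
    samples should be treated as a sample of a larger population.
    """
    return statistics.pvariance(tps_samples)


# Hypothetical TPS samples, for illustration only.
samples = [100.0, 102.0, 98.0, 101.0, 99.0]
variability = tps_variability(samples)

# The variance matches the square of the population standard deviation.
assert math.isclose(variability, statistics.pstdev(samples) ** 2)
print(variability)
```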
67+
68+
## TPS outliers
69+
70+
Outliers are TPS values which fall more than 1.5 times the
71+
[IQR](https://en.wikipedia.org/wiki/Interquartile_range) below the first quartile or above the third quartile.
72+
There is likely a better value than 1.5; a database expert should provide
73+
input here.
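
The outlier rule can be sketched as follows, assuming the usual reading of
"1.5 outside IQR": a value is an outlier if it lies more than 1.5 times the IQR
below the first quartile or above the third quartile. The function name
`tps_outliers`, the parameter `k`, the choice of the "inclusive" quartile
method and the sample values are all assumptions made for illustration.

```python
import statistics


def tps_outliers(tps_samples, k=1.5):
    """Return the samples lying more than k * IQR outside the quartiles."""
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(tps_samples, n=4, method="inclusive")
    iqr = q3 - q1
    low = q1 - k * iqr
    high = q3 + k * iqr
    return [x for x in tps_samples if x < low or x > high]


# Hypothetical TPS samples with one obvious dip, for illustration only.
samples = [100, 101, 99, 102, 98, 100, 40]
print(tps_outliers(samples))
```

Keeping the multiplier as a parameter makes it easy to try values other than
1.5.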
74+
75+
# Example image results from kdevops
76+
77+
## TPS changes
78+
79+
### xfs 16k vs ext4 bigalloc 16k - 12 hour MySQL run
80+
81+
This compares an XFS filesystem with a 16k block size against ext4 using a 4k
82+
block size and 16k bigalloc cluster size.
83+
84+
<img src="ext4-bigalloc-16k-Vs-xfs-16k-reflink-24-tables-512-threads-aws-i4i-4xlarge.png" align=center alt="xfs 16k vs ext4 bigalloc">
85+
86+
### xfs 16k the effects of disabling the double write buffer
87+
88+
This compares running an XFS filesystem with a 16k block size on two nodes, with
89+
one node having the double write buffer enabled and the other node having the double
90+
write buffer disabled.
91+
92+
TPS results:
93+
94+
<img src="xfs-16k-Vs-xfs-16k-doublewrite-vs-nodoublewrite.png" align=center alt="xfs 16k double write buffer enabled vs disabled">
95+
96+
Visualizing TPS variability: since TPS variability is defined as the square of
97+
the standard deviation, this image shows us what the standard deviation is
98+
using a bell curve. The standard deviation is the distance from the center
99+
peak of the curve to the vertical dotted line of the same color on its right
100+
or left.
101+
102+
<img src="combined_hist_bell_curve.png" align=center alt="combined TPS histogram and bell curve">
103+
104+
TPS variability factor change: this just squares the standard deviation, but
105+
it visualizes the difference in TPS variability between the two nodes as a
106+
factor.
107+
108+
<img src="variance_bar.png" align=center alt="TPS variability factor change">
109+
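
One way to express the factor change is the ratio between the two variances.
The sketch below assumes that reading; the function name `variability_factor`,
the direction of the ratio and the sample values are hypothetical, and the
plot above may have been produced differently.

```python
import statistics


def variability_factor(tps_a, tps_b):
    """Return how many times larger the variance of tps_a is than tps_b."""
    var_a = statistics.pvariance(tps_a)
    var_b = statistics.pvariance(tps_b)
    if var_b == 0:
        raise ValueError("tps_b has zero variance, the factor is undefined")
    return var_a / var_b


# Hypothetical samples for two nodes, for illustration only.
node_a = [90.0, 110.0, 90.0, 110.0]  # variance 100
node_b = [95.0, 105.0, 95.0, 105.0]  # variance 25
print(variability_factor(node_a, node_b))
```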
110+
Visualizing TPS outliers: using the definition of an outlier above, this
111+
visualizes the outliers, those outside the box.
112+
113+
<img src="outliers_plot.png" align=center alt="TPS outliers plot">
114+
115+
kdevops has a way to do a factor analysis.
116+
117+
# Hacking
118+
119+
Things to know if you're going to hack on this.
120+
121+
## Use of Kconfig for support
122+
123+
Everything is defined through Kconfig, especially now that
124+
kdevops supports an extension of kconfig which lets us be selective over
125+
which Kconfig symbols we want to propagate onto kdevops extra_vars.yaml.
126+
Look for "output yaml" in Kconfig files.
127+
128+
Adding different filesystems and filesystem configurations should mostly be
129+
a matter of Kconfig edits.
130+
131+
To review support for PCI passthrough or cloud support all you need to do is
132+
extend or modify SYSBENCH_DEVICE so the correct drive is used in
133+
workflows/sysbench/Kconfig.fs. For example right now:
134+
135+
* if libvirt is used with virtio, /dev/disk/by-id/virtio-kdevops1 is used
136+
* if libvirt is used with nvme, /dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops1 is used
137+
* if the AWS m5ad_4xlarge instance is used, /dev/nvme2n1 is used
138+
* if OCI is used, the sparse volume defined in TERRAFORM_OCI_SPARSE_VOLUME_DEVICE_FILE_NAME
139+
is used.
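
The selection rules above can be modelled as a small lookup. This is only an
illustrative Python sketch of the logic, not the Kconfig implementation: the
function `sysbench_device`, its parameters and the bringup labels are
hypothetical, and the OCI device is passed in because its value comes from
TERRAFORM_OCI_SPARSE_VOLUME_DEVICE_FILE_NAME.

```python
def sysbench_device(bringup, drive=None, oci_sparse_volume=None):
    """Return the device SYSBENCH_DEVICE would point to for a given setup.

    bringup: hypothetical label, one of "libvirt", "aws-m5ad_4xlarge", "oci".
    drive: for libvirt only, "virtio" or "nvme".
    oci_sparse_volume: for OCI only, the configured sparse volume device.
    """
    if bringup == "libvirt" and drive == "virtio":
        return "/dev/disk/by-id/virtio-kdevops1"
    if bringup == "libvirt" and drive == "nvme":
        return "/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops1"
    if bringup == "aws-m5ad_4xlarge":
        return "/dev/nvme2n1"
    if bringup == "oci":
        if not oci_sparse_volume:
            raise ValueError("OCI needs the sparse volume device to be set")
        return oci_sparse_volume
    # Anything else is not covered by the rules listed above.
    raise ValueError(f"no device rule for bringup={bringup!r} drive={drive!r}")


print(sysbench_device("libvirt", drive="nvme"))
```

Supporting a new bringup method or instance type then amounts to adding one
more rule, which mirrors the Kconfig edit described above.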
140+
141+
Experience shows that at least AWS also needs some pre-run work to ensure all
142+
extra data for docker is on a partition which won't fill /. Future work to
143+
kdevops should be done for cloud providers per type of target instance to
144+
adjust data.
145+
146+
Since we already have support for testing fstests with real NVMe drives with
147+
[kdevops PCIe passthrough](../libvirt-pcie-passthrough.md) support, it should
148+
easily be possible to leverage that as a way to also support
149+
[kdevops PCIe passthrough](../libvirt-pcie-passthrough.md) for sysbench testing.

docs/sysbench/variance_bar.png

37.3 KB
