
Commit dc4b278

Merge pull request #74 from ekuric/fio_final
final fio update
2 parents c4d8b8a + 14894bc commit dc4b278

File tree

7 files changed: +730 −0 lines changed

docs/README.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,7 @@
 | [Conformance](conformance.md) | OCP/Kubernetes e2e tests | None |
 | [Namespaces per cluster](namespaces-per-cluster.md) | Maximum Namespaces | None |
 | [Services per namespace](services-per-namespace.md) | Maximum services per namespace | None |
+| [FIO I/O test](fio.md) | FIO I/O test - stress storage backend | Privileged Containers, Working storage class |

 * Baseline job without a tooled cluster just idles a cluster. The goal is to capture resource consumption over a period of time to characterize resource requirements thus tooling is required. (For now)

@@ -47,3 +48,4 @@ Each workload will implement a form of pass/fail criteria in order to flag if th
 | [Conformance](conformance.md) | No |
 | [Namespaces per cluster](namespaces-per-cluster.md) | Yes: Exit code, Test Duration |
 | [Services per namespace](services-per-namespace.md) | Yes: Exit code, Test Duration |
+| [FIO I/O test](fio.md) | No |

docs/fio.md

Lines changed: 273 additions & 0 deletions
@@ -0,0 +1,273 @@
# FIO I/O Workload

The FIO I/O workload playbook is `workloads/fio.yml`; it runs the FIO I/O workload on your cluster.
The FIO I/O workload test is designed to stress the storage backend used by application pods. It supports all FIO test types: sequential read/write, random read/write, randrw, and rw.

Requirements:

* Working storage backend and working storage class

  The FIO I/O test uses dynamic provisioning to allocate a PVC and attach it to a Pod, so a working storage class is required before starting this test.

* `PBENCH_SERVER` variable set to a valid pbench server

  How to set up a pbench server is described in the pbench [documentation](https://distributed-system-analysis.github.io/pbench/doc/server/installation.html)

Running from CLI:

```sh
$ cp workloads/inventory.example inventory
$ # Add orchestration host to inventory
$ # Edit vars in workloads/vars/fio.yml or define Environment vars (See below)
$ time ansible-playbook -vv -i inventory workloads/fio.yml
```
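Instead of editing `workloads/vars/fio.yml`, the variables documented below can be exported in the shell before invoking the playbook. A minimal sketch (all values here are illustrative placeholders, not defaults):

```shell
# Export workload variables before running the playbook.
# These values are placeholders for illustration only.
export PBENCH_SERVER=pbench.example.com   # placeholder DNS name
export FIOTEST_STORAGECLASS=gp2           # must exist on the cluster
export FIOTEST_MAXPODS=2
# The playbook is then invoked as in the CLI example above:
echo "time ansible-playbook -vv -i inventory workloads/fio.yml"
```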
## Environment variables

### PUBLIC_KEY

Default: `~/.ssh/id_rsa.pub`

Public ssh key file for Ansible.

### PRIVATE_KEY

Default: `~/.ssh/id_rsa`

Private ssh key file for Ansible.

### ORCHESTRATION_USER

Default: `root`

User for Ansible to log in as. Must authenticate with PUBLIC_KEY/PRIVATE_KEY.

### WORKLOAD_IMAGE

Default: `quay.io/openshift-scale/scale-ci-workload`

Container image that runs the workload script.

### WORKLOAD_JOB_NODE_SELECTOR

Default: `false`

Enables/disables the node selector that places the workload job on the `workload` node.

### WORKLOAD_JOB_TAINT

Default: `false`

Enables/disables the toleration on the workload job to permit the `workload` taint.

### WORKLOAD_JOB_PRIVILEGED

Default: `false`

Enables/disables running the workload pod as privileged.

### KUBECONFIG_FILE

Default: `~/.kube/config`

Location of kubeconfig on orchestration host.

### ENABLE_PBENCH_AGENTS

Default: `false`

Enables/disables the collection of pbench data on the pbench agent Pods. These Pods are deployed by the tooling playbook, so if this option is enabled the tooling playbook needs to be executed prior to this test.

### ENABLE_PBENCH_COPY

Default: `true`

Enables/disables the copying of pbench data to a remote results server for further analysis. As of now, this test requires a valid `PBENCH_SERVER` to which it will copy results at the end of the test.

### PBENCH_SSH_PRIVATE_KEY_FILE

Default: `~/.ssh/id_rsa`

Location of ssh private key to authenticate to the pbench results server.

### PBENCH_SSH_PUBLIC_KEY_FILE

Default: `~/.ssh/id_rsa.pub`

Location of the ssh public key to authenticate to the pbench results server.

### PBENCH_SERVER

Default: There is no public default.

DNS address of the pbench results server.

### SCALE_CI_RESULTS_TOKEN

Default: There is no public default.

Future use for pbench and prometheus scraper to place results into the git repo that holds results data.

### JOB_COMPLETION_POLL_ATTEMPTS

Default: `10000`

Number of retries for Ansible to poll if the workload job has completed. Poll attempts delay 10s between polls, with some additional time taken for each polling action depending on the orchestration host setup. An FIO I/O test with many pods and large file sizes can run for hours, so either raise `JOB_COMPLETION_POLL_ATTEMPTS` to a high value or remove the `JOB_COMPLETION_POLL_ATTEMPTS` check entirely for the FIO I/O test.
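The documented 10s delay between polls gives a quick lower bound on the polling window:

```shell
# Lower bound on how long Ansible keeps polling before giving up, assuming
# the documented 10s delay between polls (per-poll overhead is extra).
JOB_COMPLETION_POLL_ATTEMPTS=${JOB_COMPLETION_POLL_ATTEMPTS:-10000}
POLL_DELAY_SECONDS=10
total=$(( JOB_COMPLETION_POLL_ATTEMPTS * POLL_DELAY_SECONDS ))
echo "polling window: at least ${total}s (~$(( total / 3600 ))h)"
```

With the default of 10000 attempts this is at least 100000 seconds, roughly 27 hours.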
### FIOTEST_PREFIX

Default: `fiotest`

Prefix to use for the FIO I/O test.

### FIOTEST_CLEANUP

Default: `true`

If set to `true`, the test project will be removed at the end of the test.

### FIOTEST_BASENAME

Default: `fiotest`

Basename used by cluster loader for the project(s) it creates.

### FIOTEST_MAXPODS

Default: `1`

Maximum number of Pods that the FIO I/O test will create.

### FIOTEST_POD_IMAGE

Default: `quay.io/openshift-scale/scale-ci-fio:latest`

Container image to use for the FIO Pods.

### FIOTEST_STEPSIZE

Default: `1`

Number of Pods cluster loader will create before waiting for the Pods to become running.

### FIOTEST_PAUSE

Default: `0`

Period of time (in seconds) for cluster loader to pause after creating Pods and waiting for them to reach the "Running" state. When `FIOTEST_PAUSE` is zero, cluster loader will create pods as fast as possible.

### FIOTEST_STORAGE_SIZE

Default: `2Gi`

`FIOTEST_STORAGE_SIZE` defines the size of the PVC which will be created and mounted to the Pod. Note that it cannot be smaller than `FIOTEST_FILESIZE`.

### FIOTEST_STORAGECLASS

Default: ``

This parameter defines which storage class to use to dynamically allocate the PVC. The storage class must be present and functional for the test to work. The storage class name depends on the environment; common storage class names are:

* AWS - gp2
* Azure - managed-premium
* OSP - standard
* CEPH - csi-cephfs/csi-rbd
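A small helper (hypothetical, for illustration only) showing how a platform name could be mapped to the common storage class names above; real clusters may use different names, so verify with `oc get storageclass`:

```shell
# Hypothetical helper: map a platform name to the common storage class
# names listed above. Not part of the playbook; check your cluster's
# actual classes with `oc get storageclass` (or `kubectl get storageclass`).
default_storageclass() {
  case "$1" in
    aws)   echo gp2 ;;
    azure) echo managed-premium ;;
    osp)   echo standard ;;
    ceph)  echo csi-cephfs ;;
    *)     echo "" ;;
  esac
}
FIOTEST_STORAGECLASS=$(default_storageclass aws)
echo "FIOTEST_STORAGECLASS=${FIOTEST_STORAGECLASS}"
```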
### FIOTEST_ACCESS_MODES

Default: `ReadWriteOnce`

`FIOTEST_ACCESS_MODES` controls the PVC access mode. This parameter accepts one of `ReadWriteOnce`, `ReadWriteMany`, or `ReadOnlyMany`. Note that the chosen access mode must be supported by the storage used for the test.
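A sketch of how the three accepted values could be validated before starting a run:

```shell
# Sketch: reject anything other than the three documented access modes.
FIOTEST_ACCESS_MODES=${FIOTEST_ACCESS_MODES:-ReadWriteOnce}
case "$FIOTEST_ACCESS_MODES" in
  ReadWriteOnce|ReadWriteMany|ReadOnlyMany)
    echo "access mode ok: ${FIOTEST_ACCESS_MODES}" ;;
  *)
    echo "unsupported access mode: ${FIOTEST_ACCESS_MODES}" >&2
    exit 1 ;;
esac
```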
### FIOTEST_BS

Default: `4`

FIO block size.

### FIOTEST_FILENAME

Default: `/mnt/pvcmount/f2`

FIO file to write. The PVC is mounted inside the FIO pod at `/mnt/pvcmount`, and the FIO file is created inside this mount point. This ensures that I/O operations are executed against the PVC.

### FIOTEST_FILESIZE

Default: `1GB`

FIO file size; it cannot exceed `FIOTEST_STORAGE_SIZE`.
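That constraint can be sanity-checked up front. A rough sketch (it only handles `Gi`/`GB`/`Mi`/`MB` suffixes and approximates both binary and decimal units as MiB, which is close enough for a coarse check):

```shell
# Sketch: verify FIOTEST_FILESIZE fits inside FIOTEST_STORAGE_SIZE.
# Only Gi/GB/Mi/MB suffixes are handled; Gi vs GB (2^30 vs 10^9 bytes)
# are both approximated as 1024 MiB, which is fine for this coarse check.
to_mib() {
  case "$1" in
    *Gi|*GB) echo $(( ${1%??} * 1024 )) ;;
    *Mi|*MB) echo "${1%??}" ;;
    *)       echo "$1" ;;
  esac
}
FIOTEST_STORAGE_SIZE=${FIOTEST_STORAGE_SIZE:-2Gi}
FIOTEST_FILESIZE=${FIOTEST_FILESIZE:-1GB}
if [ "$(to_mib "$FIOTEST_FILESIZE")" -gt "$(to_mib "$FIOTEST_STORAGE_SIZE")" ]; then
  echo "FIOTEST_FILESIZE exceeds FIOTEST_STORAGE_SIZE" >&2
  exit 1
fi
echo "ok: ${FIOTEST_FILESIZE} fits in ${FIOTEST_STORAGE_SIZE}"
```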
### FIOTEST_RUNTIME

Default: `60`

FIO test runtime (seconds).

### FIOTEST_DIRECT

Default: `1`

From `man fio`: if the value is true, use non-buffered I/O. This is usually O_DIRECT.

### FIOTEST_IODEPTH

Default: `1`

Number of I/O units to keep in flight against the file.

### FIOTEST_TESTTYPE

Default: `read`

FIO test type to execute. Default is `read`; supported types are: read, write, randread, randwrite, randrw, rw.
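Taken together, these variables map naturally onto a fio invocation roughly like the one sketched below. The flag names are fio's own, but the exact wrapper script inside the FIO pod image may assemble the command differently, and the `k` suffix on the block size is an assumption (`FIOTEST_BS=4` presumably means 4 KiB):

```shell
# Hedged sketch of the fio command the FIOTEST_* variables plausibly
# translate to inside the pod; the real wrapper script may differ.
FIOTEST_BS=${FIOTEST_BS:-4}
FIOTEST_FILENAME=${FIOTEST_FILENAME:-/mnt/pvcmount/f2}
FIOTEST_FILESIZE=${FIOTEST_FILESIZE:-1GB}
FIOTEST_RUNTIME=${FIOTEST_RUNTIME:-60}
FIOTEST_DIRECT=${FIOTEST_DIRECT:-1}
FIOTEST_IODEPTH=${FIOTEST_IODEPTH:-1}
FIOTEST_TESTTYPE=${FIOTEST_TESTTYPE:-read}
fio_cmd="fio --name=fiotest --rw=${FIOTEST_TESTTYPE} --bs=${FIOTEST_BS}k \
--filename=${FIOTEST_FILENAME} --size=${FIOTEST_FILESIZE} \
--runtime=${FIOTEST_RUNTIME} --direct=${FIOTEST_DIRECT} \
--iodepth=${FIOTEST_IODEPTH}"
echo "$fio_cmd"
```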
### FIOTEST_SAMPLES

Default: `1`

Running one iteration of the test can give misleading results; it is recommended to run multiple iterations to catch deviations and anomalies. The test result will show the best iteration.

### FIOTEST_NODESELECTOR

Default: ""

For cases where the FIO pods must be assigned to nodes that already carry a specific label, `FIOTEST_NODESELECTOR` allows specifying the desired label. The FIO I/O test does not label nodes; it expects the labels to already be assigned.

### Smoke test variables

```
FIOTEST_PREFIX=fiotest
FIOTEST_CLEANUP=true
FIOTEST_BASENAME=fiotest
ENABLE_PBENCH_COPY=true
FIOTEST_MAXPODS=1
FIOTEST_POD_IMAGE="quay.io/openshift-scale/scale-ci-fio"
FIOTEST_STEPSIZE=1
FIOTEST_PAUSE=0
FIOTEST_STORAGE_SIZE="2Gi"
FIOTEST_STORAGECLASS=gp2
FIOTEST_ACCESS_MODES="ReadWriteOnce"
FIOTEST_BS=4
FIOTEST_FILENAME="/mnt/pvcmount/f2"
FIOTEST_FILESIZE="1GB"
FIOTEST_RUNTIME=600
FIOTEST_DIRECT=1
FIOTEST_IODEPTH=1
FIOTEST_TESTTYPE=write
FIOTEST_SAMPLES=1
```

### Additional observations

With many test pods and FIO file sizes in the range of GBs, the FIO I/O test can take a long time to finish, and `JOB_COMPLETION_POLL_ATTEMPTS` can expire before the test is done. `JOB_COMPLETION_POLL_ATTEMPTS` has been increased to 10000, which does not affect test duration. It remains to be decided whether `JOB_COMPLETION_POLL_ATTEMPTS` is relevant for the FIO I/O test and whether it can be fully ignored.