Commit 1039e24

committed
updates for BZ2065695
1 parent b28ffe9 commit 1039e24

15 files changed

+919
-1195
lines changed

_topic_maps/_topic_map.yml

Lines changed: 2 additions & 0 deletions
@@ -2338,6 +2338,8 @@ Topics:
2338 2338
- Name: Low latency tuning
2339 2339
File: cnf-low-latency-tuning
2340 2340
Distros: openshift-origin,openshift-enterprise
2341+
- Name: Performing latency tests for platform verification
2342+
File: cnf-performing-platform-verification-latency-tests
2341 2343
- Name: Improving cluster stability in high latency environments using worker latency profiles
2342 2344
File: scaling-worker-latency-profiles
2343 2345
- Name: Topology Aware Lifecycle Manager for cluster updates
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * scalability_and_performance/cnf-performing-platform-verification-latency-tests.adoc
4+
5+
:_content-type: CONCEPT
6+
[id="discovery-mode_{context}"]
7+
= About discovery mode for latency tests
8+
9+
Use discovery mode to validate the functionality of a cluster without altering its configuration. Existing environment configurations are used for the tests. The tests can find the configuration items needed and use those items to execute the tests. If resources needed to run a specific test are not found, the test is skipped, providing an appropriate message to the user. After the tests are finished, no cleanup of the pre-configured configuration items is done, and the test environment can be immediately used for another test run.
10+
11+
[IMPORTANT]
12+
====
13+
When running the latency tests, **always** run the tests with `-e DISCOVERY_MODE=true` and `-ginkgo.focus` set to the appropriate latency test. If you do not run the latency tests in discovery mode, your existing live cluster performance profile configuration will be modified by the test run.
14+
====
15+
16+
[discrete]
17+
=== Limiting the nodes used during tests
18+
19+
The nodes on which the tests are executed can be limited by specifying a `NODES_SELECTOR` environment variable, for example, `-e NODES_SELECTOR=node-role.kubernetes.io/worker-cnf`. Any resources created by the test are limited to nodes with matching labels.
20+
21+
[NOTE]
22+
====
23+
If you want to override the default worker pool, pass the `-e ROLE_WORKER_CNF=<custom_worker_pool>` variable to the command specifying an appropriate label.
24+
====
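The environment variables described in this module can be assembled as in the following minimal sketch. It only prints the `-e` flags and does not run `podman`. The function name `discovery_env_flags` is hypothetical and is not part of the `cnf-tests` image.

```shell
#!/bin/sh
# Hypothetical helper: prints the -e flags for a discovery-mode run.
# $1 (optional) is a node label for NODES_SELECTOR.
# $2 (optional) is a custom worker pool label for ROLE_WORKER_CNF.
discovery_env_flags() {
    flags="-e DISCOVERY_MODE=true"
    if [ -n "${1:-}" ]; then
        flags="$flags -e NODES_SELECTOR=$1"
    fi
    if [ -n "${2:-}" ]; then
        flags="$flags -e ROLE_WORKER_CNF=$2"
    fi
    printf '%s\n' "$flags"
}

# Example: limit the tests to nodes labelled as worker-cnf.
discovery_env_flags "node-role.kubernetes.io/worker-cnf"
```

The printed flags would be placed between `podman run` and the image name, alongside the `-ginkgo.focus` argument for the latency test that you want to run.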

modules/cnf-measuring-latency.adoc

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * scalability_and_performance/cnf-performing-platform-verification-latency-tests.adoc
4+
5+
:_content-type: CONCEPT
6+
[id="cnf-measuring-latency_{context}"]
7+
= Measuring latency
8+
9+
The `cnf-tests` image uses three tools to measure the latency of the system:
10+
11+
* `hwlatdetect`
12+
* `cyclictest`
13+
* `oslat`
14+
15+
Each tool has a specific use. Use the tools in sequence to achieve reliable test results.
16+
17+
hwlatdetect:: Measures the baseline that the bare-metal hardware can achieve. Before proceeding with the next latency test, ensure that the latency reported by `hwlatdetect` meets the required threshold because you cannot fix hardware latency spikes by operating system tuning.
18+
19+
cyclictest:: Verifies the real-time kernel scheduler latency after `hwlatdetect` passes validation. The `cyclictest` tool schedules a repeated timer and measures the difference between the desired and the actual trigger times. The difference can uncover basic issues with the tuning caused by interrupts or process priorities. The tool must run on a real-time kernel.
20+
21+
oslat:: Behaves similarly to a CPU-intensive DPDK application and measures all the interruptions and disruptions to the busy loop that simulates CPU heavy data processing.
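The recommended ordering of the three tools can be sketched as a loop that stops at the first failing tool. This is an illustration only: `run_latency_tool` is a hypothetical placeholder that you would replace with the real test invocation, and `run_latency_sequence` is likewise not part of the `cnf-tests` image.

```shell
#!/bin/sh
# Hypothetical placeholder: replace the body with the real invocation
# for the named tool. It must return non-zero when the tool fails.
run_latency_tool() {
    echo "running $1"
}

# Runs the tools in the documented order and stops at the first failure,
# because later results are not meaningful if an earlier layer fails.
run_latency_sequence() {
    for tool in hwlatdetect cyclictest oslat; do
        if ! run_latency_tool "$tool"; then
            echo "stopping: $tool did not pass" >&2
            return 1
        fi
    done
}

run_latency_sequence
```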
22+
23+
The tests introduce the following environment variables:
24+
25+
.Latency test environment variables
26+
[cols="1,3", options="header"]
27+
|====
28+
|Environment variables
29+
|Description
30+
31+
|`LATENCY_TEST_DELAY`
32+
|Specifies the amount of time in seconds after which the test starts running. You can use the variable to allow the CPU manager reconcile loop to update the default CPU pool. The default value is 0.
33+
34+
|`LATENCY_TEST_CPUS`
35+
|Specifies the number of CPUs that the pod running the latency tests uses. If you do not set the variable, the default configuration includes all isolated CPUs.
36+
37+
|`LATENCY_TEST_RUNTIME`
38+
|Specifies the amount of time in seconds that the latency test must run. The default value is 300 seconds.
39+
40+
|`HWLATDETECT_MAXIMUM_LATENCY`
41+
|Specifies the maximum acceptable hardware latency in microseconds for the workload and operating system. If you do not set the value of `HWLATDETECT_MAXIMUM_LATENCY` or `MAXIMUM_LATENCY`, the tool compares the actual maximum latency that it measures with the default expected threshold (20μs). The test then fails or succeeds accordingly.
42+
43+
|`CYCLICTEST_MAXIMUM_LATENCY`
44+
|Specifies the maximum latency in microseconds that all threads expect before waking up during the `cyclictest` run. If you do not set the value of `CYCLICTEST_MAXIMUM_LATENCY` or `MAXIMUM_LATENCY`, the tool skips the comparison of the expected and the actual maximum latency.
45+
46+
|`OSLAT_MAXIMUM_LATENCY`
47+
|Specifies the maximum acceptable latency in microseconds for the `oslat` test results. If you do not set the value of `OSLAT_MAXIMUM_LATENCY` or `MAXIMUM_LATENCY`, the tool skips the comparison of the expected and the actual maximum latency.
48+
49+
|`MAXIMUM_LATENCY`
50+
|Unified variable that specifies the maximum acceptable latency in microseconds. Applicable for all available latency tools.
51+
52+
|====
53+
54+
[NOTE]
55+
====
56+
Variables that are specific to a latency tool take precedence over unified variables. For example, if `OSLAT_MAXIMUM_LATENCY` is set to 30 microseconds and `MAXIMUM_LATENCY` is set to 10 microseconds, the `oslat` test runs with a maximum acceptable latency of 30 microseconds.
57+
====
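The precedence rule can be illustrated with the following minimal sketch. It is not the test suite's own logic: `effective_max_latency` is a hypothetical helper that mirrors the behavior described in the table, including the 20 microsecond default for `hwlatdetect` and the skipped comparison (empty output) for the other tools.

```shell
#!/bin/sh
# Hypothetical helper: prints the threshold that applies to one tool.
# $1 is the variable prefix: HWLATDETECT, CYCLICTEST, or OSLAT.
effective_max_latency() {
    tool_var="${1}_MAXIMUM_LATENCY"
    # Indirect lookup of the tool-specific variable (POSIX sh has no ${!var}).
    eval "tool_value=\${$tool_var:-}"
    if [ -n "$tool_value" ]; then
        printf '%s\n' "$tool_value"          # tool-specific value wins
    elif [ -n "${MAXIMUM_LATENCY:-}" ]; then
        printf '%s\n' "$MAXIMUM_LATENCY"     # unified fallback
    elif [ "$1" = "HWLATDETECT" ]; then
        printf '20\n'                        # documented hwlatdetect default
    else
        printf '\n'                          # no threshold: comparison skipped
    fi
}

OSLAT_MAXIMUM_LATENCY=30
MAXIMUM_LATENCY=10
effective_max_latency OSLAT   # prints 30
```

Only pass one of the three known prefixes to the helper, because the argument is used inside `eval`.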
Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * scalability_and_performance/cnf-performing-platform-verification-latency-tests.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="cnf-performing-end-to-end-tests-disconnected-mode_{context}"]
7+
= Running latency tests in a disconnected cluster
8+
9+
The CNF tests image can run tests in a disconnected cluster that is not able to reach external registries. This requires two steps:
10+
11+
. Mirroring the `cnf-tests` image to the custom disconnected registry.
12+
13+
. Instructing the tests to consume the images from the custom disconnected registry.
14+
15+
[discrete]
16+
[id="cnf-performing-end-to-end-tests-mirroring-images-to-custom-registry_{context}"]
17+
== Mirroring the images to a custom registry accessible from the cluster
18+
19+
A `mirror` executable is shipped in the image to provide the input required by `oc` to mirror the test image to a local registry.
20+
21+
. Run this command from an intermediate machine that has access to the cluster and link:https://catalog.redhat.com/software/containers/explore[registry.redhat.io]:
22+
+
23+
[source,terminal,subs="attributes+"]
24+
----
25+
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
26+
registry.redhat.io/openshift4/cnf-tests-rhel8:v{product-version} \
27+
/usr/bin/mirror -registry <disconnected_registry> | oc image mirror -f -
28+
----
29+
+
30+
where:
31+
+
32+
--
33+
<disconnected_registry> :: is the disconnected mirror registry that you configured, for example, `my.local.registry:5000/`.
34+
--
35+
36+
. When you have mirrored the `cnf-tests` image into the disconnected registry, you must override the original registry used to fetch the images when running the tests, for example:
37+
+
38+
[source,terminal,subs="attributes+"]
39+
----
40+
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
41+
-e DISCOVERY_MODE=true -e IMAGE_REGISTRY="<disconnected_registry>" \
42+
-e CNF_TESTS_IMAGE="cnf-tests-rhel8:v{product-version}" \
43+
/usr/bin/test-run.sh -ginkgo.focus="\[performance\]\ Latency\ Test"
44+
----
45+
46+
[discrete]
47+
[id="cnf-performing-end-to-end-tests-image-parameters_{context}"]
48+
== Configuring the tests to consume images from a custom registry
49+
50+
You can run the latency tests using a custom test image and image registry using `CNF_TESTS_IMAGE` and `IMAGE_REGISTRY` variables.
51+
52+
* To configure the latency tests to use a custom test image and image registry, run the following command:
53+
+
54+
[source,terminal,subs="attributes+"]
55+
----
56+
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
57+
-e IMAGE_REGISTRY="<custom_image_registry>" \
58+
-e CNF_TESTS_IMAGE="<custom_cnf-tests_image>" \
59+
registry.redhat.io/openshift4/cnf-tests-rhel8:v{product-version} /usr/bin/test-run.sh
60+
----
61+
+
62+
where:
63+
+
64+
--
65+
<custom_image_registry> :: is the custom image registry, for example, `custom.registry:5000/`.
66+
<custom_cnf-tests_image> :: is the custom cnf-tests image, for example, `custom-cnf-tests-image:latest`.
67+
--
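The example values above suggest that the registry and image name are joined into a single image reference. The following sketch shows that joining under this assumption, which is not confirmed by this module. The helper name `full_image_ref` is hypothetical, and it tolerates a registry value with or without a trailing slash.

```shell
#!/bin/sh
# Hypothetical helper: joins a registry and an image name into one reference.
# Assumption: the tests combine IMAGE_REGISTRY and CNF_TESTS_IMAGE this way.
full_image_ref() {
    registry="${1%/}"    # drop one trailing slash, if present
    printf '%s/%s\n' "$registry" "$2"
}

full_image_ref "custom.registry:5000/" "custom-cnf-tests-image:latest"
```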
68+
69+
[discrete]
70+
[id="cnf-performing-end-to-end-tests-mirroring-to-cluster-internal-registry_{context}"]
71+
== Mirroring images to the cluster internal registry
72+
73+
{product-title} provides a built-in container image registry, which runs as a standard workload on the cluster.
74+
75+
.Procedure
76+
77+
. Gain external access to the registry by exposing it with a route:
78+
+
79+
[source,terminal]
80+
----
81+
$ oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge
82+
----
83+
84+
. Fetch the registry endpoint by running the following command:
85+
+
86+
[source,terminal]
87+
----
88+
$ REGISTRY=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}')
89+
----
90+
91+
. Create a namespace for exposing the images:
92+
+
93+
[source,terminal]
94+
----
95+
$ oc create ns cnftests
96+
----
97+
98+
. Make the image stream available to all the namespaces used for tests. This is required to allow the test namespaces to fetch the images from the `cnf-tests` image stream. Run the following commands:
99+
+
100+
[source,terminal]
101+
----
102+
$ oc policy add-role-to-user system:image-puller system:serviceaccount:cnf-features-testing:default --namespace=cnftests
103+
----
104+
+
105+
[source,terminal]
106+
----
107+
$ oc policy add-role-to-user system:image-puller system:serviceaccount:performance-addon-operators-testing:default --namespace=cnftests
108+
----
109+
110+
. Retrieve the docker secret name and auth token by running the following commands:
111+
+
112+
[source,terminal]
113+
----
114+
$ SECRET=$(oc -n cnftests get secret | grep builder-docker | awk '{print $1}')
115+
----
116+
+
117+
[source,terminal]
118+
----
119+
$ TOKEN=$(oc -n cnftests get secret $SECRET -o jsonpath="{.data['\.dockercfg']}" | base64 --decode | jq '.["image-registry.openshift-image-registry.svc:5000"].auth')
120+
----
121+
122+
. Create a `dockerauth.json` file, for example:
123+
+
124+
[source,bash]
125+
----
126+
$ echo "{\"auths\": { \"$REGISTRY\": { \"auth\": $TOKEN } }}" > dockerauth.json
127+
----
128+
129+
. Mirror the images:
130+
+
131+
[source,terminal,subs="attributes+"]
132+
----
133+
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
134+
registry.redhat.io/openshift4/cnf-tests-rhel8:v{product-version} \
135+
/usr/bin/mirror -registry $REGISTRY/cnftests | oc image mirror --insecure=true \
136+
-a=$(pwd)/dockerauth.json -f -
137+
----
138+
139+
. Run the tests:
140+
+
141+
[source,terminal,subs="attributes+"]
142+
----
143+
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
144+
-e DISCOVERY_MODE=true -e IMAGE_REGISTRY=image-registry.openshift-image-registry.svc:5000/cnftests \
145+
cnf-tests-local:latest /usr/bin/test-run.sh -ginkgo.focus="\[performance\]\ Latency\ Test"
146+
----
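The `dockerauth.json` step in the procedure above relies on `$TOKEN` already containing its surrounding double quotes, because `jq` is called without `-r`. The following sketch renders the same JSON shape with placeholder values so that you can check the result before mirroring. The function name `render_dockerauth`, the registry host, and the token value are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the auth file content for one registry.
# $1 is the registry host; $2 is the token INCLUDING its double quotes,
# which is how jq prints a string value when -r is not used.
render_dockerauth() {
    printf '{"auths": { "%s": { "auth": %s } }}\n' "$1" "$2"
}

# Placeholder values for illustration only.
REGISTRY="default-route-openshift-image-registry.apps.example.com"
TOKEN='"dXNlcjpwYXNzd29yZA=="'

render_dockerauth "$REGISTRY" "$TOKEN"
```

Redirect the output to `dockerauth.json` to produce the file that is passed to `oc image mirror` with the `-a` option.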
147+
148+
[discrete]
149+
[id="mirroring-different-set-of-images_{context}"]
150+
== Mirroring a different set of test images
151+
152+
You can optionally change the default upstream images that are mirrored for the latency tests.
153+
154+
.Procedure
155+
156+
. The `mirror` command tries to mirror the upstream images by default. This can be overridden by passing a file with the following format to the image:
157+
+
158+
159+
[source,json,subs="attributes+"]
160+
----
161+
[
162+
{
163+
"registry": "public.registry.io:5000",
164+
"image": "imageforcnftests:{product-version}"
165+
}
166+
]
167+
----
168+
169+
. Save the file locally, for example as `images.json`, and pass it to the `mirror` command. The following command mounts the current directory at `/kubeconfig` inside the container, so the file is available to the `mirror` command as `/kubeconfig/images.json`:
170+
+
171+
[source,terminal,subs="attributes+"]
172+
----
173+
$ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
174+
registry.redhat.io/openshift4/cnf-tests-rhel8:v{product-version} /usr/bin/mirror \
175+
--registry "my.local.registry:5000/" --images "/kubeconfig/images.json" \
176+
| oc image mirror -f -
177+
----
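The image list file described in this procedure can be generated with a small script such as the following sketch. The helper name `write_images_json` is hypothetical, and the registry and image values are the placeholders from the example format, with `<tag>` standing in for the version tag.

```shell
#!/bin/sh
# Hypothetical helper: writes an image list in the format shown above.
# $1 is the output file, $2 the registry, $3 the image name with tag.
write_images_json() {
    out_file="$1"
    registry="$2"
    image="$3"
    cat > "$out_file" <<EOF
[
    {
        "registry": "$registry",
        "image": "$image"
    }
]
EOF
}

# Writes images.json to the current directory, which the podman command
# in this procedure mounts at /kubeconfig inside the container.
write_images_json images.json "public.registry.io:5000" "imageforcnftests:<tag>"
```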
