Merge pull request #45299 from amolnar-rh/TELCODOCS-425

jab-rh · web-flow · commit 80c70aa6ede1 · 2022-06-23T11:10:08.000-04:00
TELCODOCS-425: Add capability to take CPUs offline via NTO
diff --git a/modules/cnf-provisioning-real-time-and-low-latency-workloads.adoc b/modules/cnf-provisioning-real-time-and-low-latency-workloads.adoc
@@ -7,15 +7,20 @@
 [id="cnf-provisioning-real-time-and-low-latency-workloads_{context}"]
 = Provisioning real-time and low latency workloads
 
-Many industries and organizations need extremely high performance computing and might require low and predictable latency, especially in the financial and telecommunications industries. For these industries, with their unique requirements, {product-title} provides a Performance Addon Operator to implement automatic tuning to achieve low latency performance and consistent response time for {product-title} applications.
+Many industries and organizations need extremely high-performance computing and might require low and predictable latency, especially in the financial and telecommunications industries. For these industries, with their unique requirements, {product-title} provides the Node Tuning Operator to implement automatic tuning that achieves low latency performance and consistent response times for {product-title} applications.
 
-The cluster administrator can use this performance profile configuration to make these changes in a more reliable way. The administrator can specify whether to update the kernel to kernel-rt (real-time), reserve CPUs for cluster and operating system housekeeping duties, including pod infra containers, and isolate CPUs for application containers to run the workloads.
+The cluster administrator can use this performance profile configuration to make these changes in a more reliable way. The administrator can specify whether to update the kernel to kernel-rt (real-time), reserve CPUs for cluster and operating system housekeeping duties, including pod infra containers, isolate CPUs for application containers to run the workloads, and disable unused CPUs to reduce power consumption.
 
 [WARNING]
 ====
 The usage of execution probes in conjunction with applications that require guaranteed CPUs can cause latency spikes. It is recommended to use other probes, such as a properly configured set of network probes, as an alternative.
 ====
 
+[NOTE]
+====
+In earlier versions of {product-title}, the Performance Addon Operator was used to implement automatic tuning to achieve low latency performance for OpenShift applications. In {product-title} 4.11 and later, these functions are part of the Node Tuning Operator.
+====
+
 [id="performance-addon-operator-known-limitations-for-real-time_{context}"]
 == Known limitations for real-time
 
@@ -39,8 +44,6 @@ Establishing the right performance expectations refers to the fact that the real
 [id="performance-addon-operator-provisioning-worker-with-real-time-capabilities_{context}"]
 == Provisioning a worker with real-time capabilities
 
-. Install Performance Addon Operator to the cluster.
-
 . Optional: Add a node to the {product-title} cluster.
 See link:https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/8/html-single/tuning_guide/index#Setting_BIOS_parameters[Setting BIOS parameters].
 
@@ -75,7 +78,7 @@ Note that a machine config pool worker-rt is created for group of nodes that hav
 +
 [NOTE]
 ====
-You must decide which nodes are configured with real-time workloads. You could configure all of the nodes in the cluster, or a subset of the nodes. The Performance Addon Operator that expects all of the nodes are part of a dedicated machine config pool. If you use all of the nodes, you must point the Performance Addon Operator to the worker node role label. If you use a subset, you must group the nodes into a new machine config pool.
+You must decide which nodes are configured with real-time workloads. You could configure all of the nodes in the cluster, or a subset of the nodes. The Node Tuning Operator expects all of the nodes to be part of a dedicated machine config pool. If you use all of the nodes, you must point the Node Tuning Operator to the worker node role label. If you use a subset, you must group the nodes into a new machine config pool.
 ====
 . Create the `PerformanceProfile` with the proper set of housekeeping cores and `realTimeKernel: enabled: true`.
 
@@ -230,7 +233,7 @@ status:
 
 * The pod must have the `cpu-load-balancing.crio.io: true` annotation.
 
-The Performance Addon Operator is responsible for the creation of the high-performance runtime handler config snippet under relevant nodes and for creation of the high-performance runtime class under the cluster. It will have the same content as default runtime handler except it enables the CPU load balancing configuration functionality.
+The Node Tuning Operator is responsible for creating the high-performance runtime handler config snippet under the relevant nodes and for creating the high-performance runtime class under the cluster. The handler has the same content as the default runtime handler, except that it enables the CPU load balancing configuration functionality.
 
 To disable the CPU load balancing for the pod, the `Pod` specification must include the following fields:
 
@@ -278,4 +281,48 @@ For more information, see link:https://access.redhat.com/documentation/en-us/ope
 [id="performance-addon-operator-scheduling-workload-onto-worker-with-real-time-capabilities_{context}"]
 == Scheduling a workload onto a worker with real-time capabilities
 
-Use label selectors that match the nodes attached to the machine config pool that was configured for low latency by the Performance Addon Operator. For more information, see link:https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/[Assigning pods to nodes].
+Use label selectors that match the nodes attached to the machine config pool that was configured for low latency by the Node Tuning Operator. For more information, see link:https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/[Assigning pods to nodes].
+
+[id="node-tuning-operator-disabling-CPUs-for-power-consumption_{context}"]
+== Reducing power consumption by taking CPUs offline
+
+You can generally anticipate telecommunication workloads. When not all of the CPU resources are required, the Node Tuning Operator allows you to take unused CPUs offline to reduce power consumption by manually updating the performance profile.
+
+To take unused CPUs offline, you must perform the following tasks:
+
+. Set the offline CPUs in the performance profile and save the contents of the YAML file:
++
+.Example performance profile with offlined CPUs
+[source,yaml]
+----
+apiVersion: performance.openshift.io/v2
+kind: PerformanceProfile
+metadata:
+  name: performance
+spec:
+  additionalKernelArgs:
+  - nmi_watchdog=0
+  - audit=0
+  - mce=off
+  - processor.max_cstate=1
+  - intel_idle.max_cstate=0
+  - idle=poll
+  cpu:
+    isolated: "2-23,26-47"
+    reserved: "0,1,24,25"
+    offlined: "48-59" <1>
+  nodeSelector:
+    node-role.kubernetes.io/worker-cnf: ""
+  numa:
+    topologyPolicy: single-numa-node
+  realTimeKernel:
+    enabled: true
+----
+<1> Optional. You can list CPUs in the `offlined` field to take the specified CPUs offline.
+
+. Apply the updated profile by running the following command:
++
+[source,terminal]
+----
+$ oc apply -f my-performance-profile.yaml
+----
diff --git a/modules/cnf-running-the-performance-creator-profile.adoc b/modules/cnf-running-the-performance-creator-profile.adoc
@@ -4,14 +4,14 @@
 
 :_content-type: PROCEDURE
 [id="running-the-performance-profile-profile-cluster-using-podman_{context}"]
-= Running the Performance Profile Creator using `podman`
+= Running the Performance Profile Creator using podman
 
 As a cluster administrator, you can run `podman` and the Performance Profile Creator to create a performance profile.
 
 .Prerequisites
 
 * Access to the cluster as a user with the `cluster-admin` role.
-* A cluster installed on bare metal hardware.
+* A cluster installed on bare-metal hardware.
 * A node with `podman` and OpenShift CLI (`oc`) installed.
 * Access to the Node Tuning Operator image.
 
@@ -131,7 +131,7 @@ $ cat my-performance-profile.yaml
 ----
 .Example output
 +
-[source,terminal]
+[source,yaml]
 ----
 apiVersion: performance.openshift.io/v2
 kind: PerformanceProfile
@@ -146,8 +146,9 @@ spec:
   - intel_idle.max_cstate=0
   - idle=poll
   cpu:
-    isolated: 1,3,5,7,9,11,13,15,17,19-39,41,43,45,47,49,51,53,55,57,59-79
-    reserved: 0,2,4,6,8,10,12,14,16,18,40,42,44,46,48,50,52,54,56,58
+    isolated: "1,3,5,7,9,11,13,15,17,19-39,41,43,45,47,49,51,53,55,57"
+    reserved: "0,2,4,6,8,10,12,14,16,18,40,42,44,46,48,50,52,54,56,58"
+    offlined: "59-79"
   nodeSelector:
     node-role.kubernetes.io/worker-cnf: ""
   numa:
@@ -158,11 +159,6 @@ spec:
 
 . Apply the generated profile:
 +
-[NOTE]
-====
-Install the Performance Addon Operator before applying the profile.
-====
-+
 [source,terminal]
 ----
 $ oc apply -f my-performance-profile.yaml
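
Review note on the `cpu` fields in the example profiles: a minimal sketch of a pre-apply sanity check, assuming that the `isolated`, `reserved`, and `offlined` CPU lists are meant to be disjoint. The helper names `parse_cpu_list` and `find_overlaps` are hypothetical and are not part of the Node Tuning Operator or `oc`; the sketch only expands the comma-and-range notation used in the examples and reports any CPU ID that appears in more than one field.

```python
from __future__ import annotations


def parse_cpu_list(spec: str) -> set[int]:
    """Expand a CPU list string such as "0,2,4-6" into a set of CPU IDs."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid CPU range: {part!r}")
            cpus.update(range(start, end + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return cpus


def find_overlaps(cpu_fields: dict[str, str]) -> dict[tuple[str, str], set[int]]:
    """Return the CPU IDs shared by each pair of fields; empty if all are disjoint."""
    parsed = {name: parse_cpu_list(spec) for name, spec in cpu_fields.items()}
    names = sorted(parsed)
    overlaps: dict[tuple[str, str], set[int]] = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            common = parsed[first] & parsed[second]
            if common:
                overlaps[(first, second)] = common
    return overlaps


if __name__ == "__main__":
    # Values copied from the "Example performance profile with offlined CPUs".
    profile_cpu = {
        "isolated": "2-23,26-47",
        "reserved": "0,1,24,25",
        "offlined": "48-59",
    }
    print(find_overlaps(profile_cpu))
```

An empty result means that no CPU ID is listed in more than one field. The check does not confirm that the listed CPUs exist on the target node.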