[id="cnf-cpu-infra-container_{context}"]
= Restricting CPUs for infra and application containers

Generic housekeeping and workload tasks use CPUs in a way that can impact latency-sensitive processes. By default, the container runtime uses all online CPUs to run all containers together, which can result in context switches and spikes in latency. Partitioning the CPUs prevents noisy processes from interfering with latency-sensitive processes by separating them from each other. The following table describes how processes run on a CPU after you have tuned the node by using the Performance Add-On Operator:

.Process CPU assignments
[%header,cols=2*]
|===
|Process type
|Details

|Burstable and best-effort pods
|Run on any CPU except where the low latency workload is running

|Infrastructure pods
|Run on any CPU except where the low latency workload is running

|Interrupts
|Redirected to reserved CPUs (optional in {product-title} {product-version} and later)

|Kernel processes
|Pinned to reserved CPUs

|Latency-sensitive workload pods
|Pinned to a specific set of exclusive CPUs from the isolated pool

|OS processes/systemd services
|Pinned to reserved CPUs
|===

The exact partitioning pattern to use depends on many factors, such as the hardware, the workload characteristics, and the expected system load. Some sample use cases are as follows:

* If the latency-sensitive workload uses specific hardware, such as a network interface card (NIC), ensure that the CPUs in the isolated pool are as close as possible to this hardware. At a minimum, place the workload in the same Non-Uniform Memory Access (NUMA) node.

* The reserved pool is used for handling all interrupts. If your workload depends on system networking, allocate a sufficiently sized reserved pool to handle all the incoming packet interrupts. In {product-version} and later versions, workloads can optionally be labeled as sensitive.
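
For example, to confirm how close a NIC is to a candidate isolated pool, you can query sysfs on the node. The interface name `ens1f0` here is a placeholder; substitute the actual device:

[source,terminal]
----
$ cat /sys/class/net/ens1f0/device/numa_node
----

A value of `-1` indicates that the device has no NUMA affinity. You can then compare the result against the per-NUMA-node CPU lists shown by `lscpu -e` when choosing the isolated CPUs.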
The decision regarding which specific CPUs to use for the reserved and isolated partitions requires detailed analysis and measurements. Factors such as the NUMA affinity of devices and memory play a role. The selection also depends on the workload architecture and the specific use case.

[IMPORTANT]
====
The reserved and isolated CPU pools must not overlap, and together they must span all available cores in the worker node.
====
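
As an illustration, on a hypothetical worker node with 10 online CPUs (0-9), the following `spec.cpu` fragment satisfies this rule: the two pools do not overlap, and together they cover all 10 cores. The values are examples only; choose CPUs based on your own analysis:

[source,yaml]
----
spec:
  cpu:
    reserved: "0-3"
    isolated: "4-9"
----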

To ensure that housekeeping tasks and workloads do not interfere with each other, specify two groups of CPUs in the `spec` section of the performance profile:

* `isolated` - Specifies the CPUs for the application container workloads. These CPUs have the lowest latency. Processes in this group have no interruptions and can, for example, reach much higher DPDK zero packet loss bandwidth.

* `reserved` - Specifies the CPUs for the cluster and operating system housekeeping duties. Threads in the `reserved` group are often busy. Do not run latency-sensitive applications in the `reserved` group; run them in the `isolated` group instead.
.Procedure

. Create a performance profile that is appropriate for the environment's hardware and topology.

. Add the `reserved` and `isolated` parameters with the CPUs you want reserved and isolated for the infra and application containers:
+
[source,yaml]
----
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: infra-cpus
spec:
  cpu:
    reserved: "0-4,9" <1>
    isolated: "5-8" <2>
  nodeSelector: <3>
    node-role.kubernetes.io/worker: ""
----
<1> Specify which CPUs are for infra containers to perform cluster and operating system housekeeping duties.
<2> Specify which CPUs are for application containers to run workloads.
<3> Optional: Specify a node selector to apply the performance profile to specific nodes.
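
After saving the manifest, apply it to the cluster. The filename `performance-profile.yaml` is a placeholder for wherever you saved the profile:

[source,terminal]
----
$ oc apply -f performance-profile.yaml
----

Applying or changing a performance profile triggers a configuration rollout to the matching nodes, which can involve a node reboot, so schedule the change accordingly.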