# v0.9.3
This is a new minor release of NRI Reference Plugins. It brings several new features, a number of bug fixes, new end-to-end tests, and improved test coverage.
## What's New

### Balloons Policy
- Cluster-level visibility into CPU affinity. The configuration option `agent.nodeResourceTopology: true` enables observing balloons as zones in NodeResourceTopology custom resources. Furthermore, if `showContainersInNrt: true` is set, information on each container, including its CPU affinity, is shown as a subzone of its balloon. Example configuration:

  ```yaml
  showContainersInNrt: true
  agent:
    nodeResourceTopology: true
  ```

  This enables listing balloons and their cpusets on node K8SNODE with

  ```console
  kubectl get noderesourcetopology K8SNODE -o json | jq '.zones[] | select(.type=="balloon") | {"balloon":.name, "cpuset":(.attributes[]|select(.name=="cpuset").value)}'
  ```

  and containers with their cpusets on the same node with

  ```console
  kubectl get noderesourcetopology K8SNODE -o json | jq '.zones[] | select(.type=="allocation for container") | {"container":.name, "cpuset":(.attributes[]|select(.name=="cpuset").value)}'
  ```
- System load balancing. Even if two containers run on disjoint sets of logical CPUs, they may nevertheless affect each other's performance. This happens, for instance, if two memory-intensive containers share the same level 2 cache, or if two compute-intensive containers run on two hyperthreads of the same physical CPU core and compete for its compute resources.

  The new system load balancing in the balloons policy is based on classifying the loads generated by containers using the new `loadClasses` configuration option. Based on the load classes associated with `balloonTypes` via `loads`, the policy allocates CPUs to new and existing balloons so that it avoids overloading level 2 caches or physical CPU cores.

  Example: the policy prefers selecting CPUs for all "inference-engine" and "computational-fluid-dynamics" balloons within separate level 2 cache blocks, to prevent cache thrashing by any two containers in these balloons.

  ```yaml
  balloonTypes:
    - name: inference-engine
      loads:
        - memory-intensive
      ...
    - name: computational-fluid-dynamics
      loads:
        - memory-intensive
      ...
  loadClasses:
    - name: memory-intensive
      level: l2cache
  ```
### Topology Aware Policy
- Improved topology hint control: the `topologyhints.resource-policy.nri.io` annotation key can be used to enable or disable topology hint generation for one or more containers altogether, or selectively for mount, device, and pod resource hint types. For example:

  ```yaml
  metadata:
    annotations:
      # disable topology hint generation for all containers by default
      topologyhints.resource-policy.nri.io/pod: none
      # disable other than mount-based hints for the 'diskwriter' container
      topologyhints.resource-policy.nri.io/container.diskwriter: mounts
      # disable other than device-based hints for the 'videoencoder' container
      topologyhints.resource-policy.nri.io/container.videoencoder: devices
      # disable other than pod resource-based hints for the 'dpdk' container
      topologyhints.resource-policy.nri.io/container.dpdk: pod-resources
      # enable device and pod resource-based hints for the 'networkpump' container
      topologyhints.resource-policy.nri.io/container.networkpump: devices,pod-resources
  ```

  It is also possible to enable and disable topology hint generation based on mount or device path, using allow and deny lists. See the updated documentation for more details.
- Relaxed system topology restrictions: the policy no longer refuses to start up if a NUMA node is shared by more than one pool at the same topology hierarchy level. In particular, a single NUMA node shared by all sockets no longer prevents startup.
- Improved Burstable QoS class container handling: the policy now allocates memory to Burstable QoS class containers based on memory request estimates. This should lower the probability of unexpected allocation failures when Burstable containers are used on a node allocated close to its full capacity.
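For reference, a container falls into the Burstable QoS class when it has resource requests that are lower than its limits. A minimal sketch of such a spec (the pod and container names and the image are made up for illustration); under the new behavior the policy sizes this container's memory allocation from the 256Mi request rather than the 1Gi limit:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burstable-example        # hypothetical pod name
spec:
  containers:
    - name: app                  # hypothetical container name
      image: example.registry/app:latest   # placeholder image
      resources:
        requests:
          cpu: 500m
          memory: 256Mi          # memory allocation is now based on this request
        limits:
          cpu: "1"
          memory: 1Gi            # ...instead of this limit
```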
- Better global shared allocation preference: the `preferSharedCPUs: true` global configuration option now applies to all containers, unless they are annotated to opt out using the `prefer-shared-cpus.resource-policy.nri.io` annotation.
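As a sketch of how the global option and the opt-out annotation compose (the container name is made up, and the per-container `/container.NAME` suffix and boolean value are assumed to follow the same annotation conventions as the other annotations in these notes; see the policy documentation for the exact syntax):

```yaml
# global policy configuration fragment: prefer shared CPUs for all containers
preferSharedCPUs: true
```

```yaml
# pod spec fragment: opt one container out of the global preference
metadata:
  annotations:
    prefer-shared-cpus.resource-policy.nri.io/container.latency-critical: "false"
```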
### Common Policy Improvements
- Container cache and memory bandwidth allocation enables class-based management of system L2 and L3 cache and memory bandwidth. These are modeled as class-based, uncountable, shareable resources. Containers can be assigned to predefined classes of service (CLOSes), or RDT classes for short. Each class defines a specific cache and memory bandwidth allocation configuration, which is applied to all containers within that class. The assigned container class is resolved and mapped to a CLOS in the runtime using the goresctrl library.

  RDT control must be enabled in the runtime, and the assigned classes must be defined in the runtime configuration. Otherwise the runtime might fail to create containers that are assigned to an RDT class. Refer to the containerd, CRI-O, and goresctrl documentation for more details about configuration.
  A container can be assigned either to an RDT class matching its pod's QoS class (BestEffort, Burstable, or Guaranteed), or to an arbitrary class using the `rdtclass.resource-policy.nri.io` annotation. To enable QoS-class based default assignment, you can use a configuration fragment similar to this:
  ```yaml
  apiVersion: config.nri/v1alpha1
  kind: TopologyAwarePolicy # or 'BalloonsPolicy' for the 'balloons' policy
  metadata:
    name: default
  spec:
    ...
    control:
      rdt:
        enable: true
        usePodQoSAsDefaultClass: true
  ```

  RDT class assignment is also possible using annotations, for instance to assign the packetpump container to the highprio class and the scheduler container to the midprio class. Any other container in the pod will be assigned to the class matching its pod's QoS class:
  ```yaml
  metadata:
    annotations:
      rdtclass.resource-policy.nri.io/container.packetpump: highprio
      rdtclass.resource-policy.nri.io/container.scheduler: midprio
  ```

- Container block I/O prioritization allows class-based control of block I/O prioritization and throttling. Containers can be assigned to predefined block I/O classes. Each class defines a specific configuration of prioritization and throttling parameters, which is applied to all containers assigned to the class. The assigned container class is resolved and mapped to actual parameters in the runtime using the goresctrl library.

  Block I/O control must be enabled in the runtime, and the classes must be defined in the runtime configuration. Otherwise the runtime fails to create containers that are assigned to a block I/O class. Refer to the containerd, CRI-O, and goresctrl documentation for more details about configuration.
  A container can be assigned either to a block I/O class matching its pod's QoS class (BestEffort, Burstable, or Guaranteed), or to an arbitrary class using the `blockioclass.resource-policy.nri.io` annotation. To enable QoS-class based default assignment, you can use a configuration fragment similar to this:
  ```yaml
  apiVersion: config.nri/v1alpha1
  kind: TopologyAwarePolicy
  metadata:
    name: default
  spec:
    ...
    control:
      blockio:
        enable: true
        usePodQoSAsDefaultClass: true
  ```

  Class assignment is also possible using annotations, for instance to assign the database container to the highprio class and the logger container to the lowprio class. Any other container in the pod will be assigned to the class matching its pod's QoS class:
  ```yaml
  metadata:
    annotations:
      blockioclass.resource-policy.nri.io/container.database: highprio
      blockioclass.resource-policy.nri.io/container.logger: lowprio
  ```

## What's Changed
- balloons: do not require minFreq and maxFreq in CPU classes by @askervin in #455
- balloons: expose balloons and optionally containers with affinity in NRT by @askervin in #469
- balloons: introduce loadClasses for avoiding unwanted overloading in critical locations by @askervin in #493
- topology-aware: exclude isolated CPUs from policy-picked reserved cpusets. by @klihub in #474
- topology-aware: rework building the topology pool tree. by @klihub in #477
- topology-aware: allocate burstable container memory by requests. by @klihub in #491
- topology-aware: better semantics for globally configured shared CPU preference. by @klihub in #498
- topology-aware: more consistent setup error handling. by @klihub in #502
- memtierd: allow overriding go version for image build. by @klihub in #456
- resmgr: improve annotated topology hint control. by @klihub in #499
- resmgr: eliminate extra container state 'overlay'. by @klihub in #480
- resmgr: eliminate extra RDT class 'overlay'. by @klihub in #481
- resmgr: eliminate extra block I/O class 'overlay'. by @klihub in #482
- resmgr: configurable RDT and block I/O class control. by @klihub in #483
- system: add a helper for finding CPUs sharing caches by @askervin in #492
- sysfs: only discover topology of online cpus by @marquiz in #494
- pkg/udev: implement udev event reading/monitoring. by @klihub in #449
- workflow: sign Helm packages and upload provenance files by @fmuyassarov in #468
- [1/2] OLM workflow: add automatic OLM bundle submission. by @fmuyassarov in #460
- [2/2] OLM workflow: allow both test and real submissions. by @klihub in #464
- e2e: test topology-aware policy nodeResourceTopology exporting by @askervin in #465
- e2e: add pure go stateful fuzz test generator by @askervin in #463
Full Changelog: v0.8.0...v0.9.3