v0.12.0

@github-actions github-actions released this 28 Jan 12:57

This new release of NRI Reference Plugins brings a few new features to resource policy plugins.

What's New

Balloons Policy

  • Load Balancing for Composite Balloons
    This feature enables creating equal numbers of balloons from two or more balloon types for the same set of containers. For example, if balloon types B0, ..., B3 are each local to similar but separate hardware resources that accelerate HPC applications, the feature balances the number of balloons created from each of these types for HPC containers. To use it, assign the containers to a composite balloon type BC whose components are B0, ..., B3, and set the componentCreation strategy of the BC balloon type to balance-balloons.
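
    The idea of balanced creation can be illustrated with a small shell sketch. It is not the plugin's implementation, which these notes do not describe. It assumes a simple greedy rule (create the next balloon from the component type that currently has the fewest balloons), and the function name pick_least_used and the name=count argument format are hypothetical.

```shell
# Illustration only: greedy "least used first" selection.
# Arguments are hypothetical name=count pairs, e.g. B0=2 B1=1.
pick_least_used() {
    local best="" best_n="" pair name n
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first '='
        n=${pair##*=}      # text after the last '='
        # Keep the first type seen with the strictly smallest count.
        if [ -z "$best" ] || [ "$n" -lt "$best_n" ]; then
            best=$name
            best_n=$n
        fi
    done
    printf '%s\n' "$best"
}
```

    Under this assumed rule, repeatedly creating a balloon from the printed type and incrementing its count keeps the per-type counts within one of each other.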

Common Policy Improvements

  • Resource Annotation
    Exact resource requirements for a container can now be annotated on the container's pod. This is useful in clusters with high-CPU-count nodes when one wants to allocate more than 256 CPUs to a single container, which is otherwise not detected properly. The newly added nri-resource-annotator webhook can automate this annotation. You can install the webhook using the provided Helm chart with the following commands:
$ NS=kube-system; SVC=resource-annotator; CERT=~/webhook-cert
$ mkdir -p $CERT
$ openssl req -x509 -newkey rsa:2048 -sha256 -days 365 -nodes \
      -keyout $CERT/server-key.pem -out $CERT/server-crt.pem \
      -subj "/CN=$SVC.$NS.svc" -addext "subjectAltName=DNS:$SVC,DNS:$SVC.$NS,DNS:$SVC.$NS.svc"
$ helm repo add nri-plugins https://containers.github.io/nri-plugins
$ helm repo update
$ helm -n $NS install webhook nri-plugins/nri-resource-annotator \
       --set image.name=ghcr.io/containers/nri-plugins/nri-resource-annotator \
       --set service.base64Crt=$(base64 -w0 < $CERT/server-crt.pem) \
       --set service.base64Key=$(base64 -w0 < $CERT/server-key.pem)

The policies should automatically take resource annotations into account when they are present.
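
The Kubernetes API server reaches a webhook service through the DNS name <service>.<namespace>.svc, so the self-signed certificate generated above needs that name in its subjectAltName. A minimal sketch for checking this before running helm install follows. The helper name san_has_dns is hypothetical, and it assumes an OpenSSL version that supports the x509 -ext option (1.1.1 or later, the same series that provides the -addext option used above).

```shell
# san_has_dns NAME
# Reads the output of `openssl x509 -noout -ext subjectAltName` on stdin
# and succeeds only if NAME appears as an exact DNS entry.
san_has_dns() {
    # Put each SAN entry on its own line, trim surrounding whitespace,
    # then match the whole line literally (dots are not wildcards).
    tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -qxF "DNS:$1"
}

# Usage, with the variables from the commands above:
#   openssl x509 -in "$CERT/server-crt.pem" -noout -ext subjectAltName \
#       | san_has_dns "$SVC.$NS.svc" && echo "certificate covers $SVC.$NS.svc"
```

If the check fails, regenerating the certificate with matching NS and SVC values is likely simpler than debugging TLS errors from the webhook later.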

Other Changes

  • OpenTelemetry metrics collection
    Metrics collection has been updated to use OpenTelemetry for metrics instrumentation. In connection
    with this change, the names of some available metrics have changed when collected using Prometheus.
    In particular, metric names from the Balloons and Topology-Aware policies have changed.

What's Changed

  • balloons: extend composite balloons with a kind of load balancing by @askervin in #617
  • metrics: switch to OpenTelemetry based metrics collection. by @klihub in #600
  • resource-annotator: add resource annotator mutating webhook. by @klihub in #619
  • e2e: add global burstability limit test case. by @klihub in #607
  • scripts: fix Helm release artifact checker. by @klihub in #609
  • .github: fix unstable chart publishing with Helm v4.x. by @klihub in #610
  • e2e: run_tests.sh accepts user overrides for vars read from files by @askervin in #613
  • e2e: report skipped tests with SKIP instead of PASS in the summary by @askervin in #615
  • e2e: bump default test distro to fedora/43. by @klihub in #616
  • e2e: enable creating VMs up to 4096 CPUs and CPU hot-plugging/removing from test scripts by @askervin in #587
  • e2e: add 8-socket 4k-CPU e2e vm and related test for balloons policy by @askervin in #591
  • e2e: add 8-socket 4k-CPU test for topology-aware by @askervin in #599
  • e2e: support "cxl" in hardware topology (CXL-1) by @askervin in #611
  • e2e: enable custom kernel building, caching and installing (CXL-2) by @askervin in #612
  • e2e: add a test that hotplugs and hotremoves CXL memory (CXL-3) by @askervin in #614
  • build: ignore operator build failures for non releases by @klihub in #620
  • e2e/balloons: update expected metrics pattern. by @klihub in #621

Full Changelog: v0.11.0...v0.12.0