@@ -540,6 +540,68 @@ locality to a NUMA node is advertised by the API. Annotated allow and deny
540540lists can be used to selectively disable or enable per-resource hints, using
541541`podresapi:$RESOURCE_NAME` as the path for the resource.
542542
543+ ### Picking CPU And Memory By Topology Hints
544+
545+ Normally, topology hints are only used to pick the pool assigned to a workload.
546+ Once a pool is selected, all available resources within the pool are considered
547+ equally good for satisfying the topology hints. When the policy allocates
548+ exclusive CPUs and picks pinned memory for the workload, it chooses the
549+ individual resources by other criteria and attributes, not by the hints.
550+
551+ When multiple devices are allocated to a single container, this default
552+ assumption that all resources within the pool are topologically equal may not
553+ hold. If a container is allocated misaligned devices, in other words devices
554+ with different memory or CPU locality, it is possible that only some of the CPU
555+ and memory in the selected pool satisfy the device hints and therefore have the
556+ desired locality.
557+
558+ For instance, in a two-socket system with socket #0 having NUMA nodes #0,#1
559+ and socket #1 having NUMA nodes #2,#3, if a container is allocated two devices,
560+ one with locality to node #0 and another with locality to node #3, the only pool
561+ fulfilling topology hints for both devices is the root node. However, half of the
562+ resources in the pool (nodes #0 and #3) are optimal for only one of the devices
563+ and the other half (nodes #1 and #2) are not optimal for either.
564+
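The two-socket example above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `SOCKETS`, `DEVICE_NODE`, and `smallest_pool` are hypothetical, and the pool tree is reduced to three levels (NUMA node, socket, root). It is not the policy's actual data model.

```python
# Hypothetical model of the two-socket example; not the policy's real data structures.
SOCKETS = {0: [0, 1], 1: [2, 3]}        # socket -> NUMA nodes it contains
DEVICE_NODE = {"dev-a": 0, "dev-b": 3}  # device -> NUMA node it is local to


def smallest_pool(device_nodes):
    """Return the NUMA nodes of the smallest pool covering all device nodes."""
    wanted = set(device_nodes)
    if len(wanted) == 1:
        return sorted(wanted)            # a single NUMA node pool is enough
    for nodes in SOCKETS.values():
        if wanted <= set(nodes):
            return list(nodes)           # a single socket pool is enough
    # Otherwise only the root pool (all nodes) satisfies every hint.
    return sorted(n for nodes in SOCKETS.values() for n in nodes)


pool = smallest_pool(DEVICE_NODE.values())
# For each node in the selected pool, list the devices it is optimal for.
aligned = {n: [d for d, dn in DEVICE_NODE.items() if dn == n] for n in pool}

print(pool)     # [0, 1, 2, 3]
print(aligned)  # {0: ['dev-a'], 1: [], 2: [], 3: ['dev-b']}
```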
565+ A container can be annotated to prefer hint-based selection and pinning of CPU
566+ and memory resources using the `pick-resources-by-hints.resource-policy.nri.io`
567+ annotation. For example:
568+
569+ ```yaml
570+ apiVersion: v1
571+ kind: Pod
572+ metadata:
573+   name: data-pump
574+   annotations:
575+     k8s.v1.cni.cncf.io/networks: sriov-net1
576+     prefer-isolated-cpus.resource-policy.nri.io/container.ctr0: "true"
577+     pick-resources-by-hints.resource-policy.nri.io/container.ctr0: "true"
578+ spec:
579+   containers:
580+   - name: ctr0
581+     image: dpdk-pump
582+     imagePullPolicy: Always
583+     resources:
584+       requests:
585+         cpu: 2
586+         memory: 100M
587+         vendor.com/sriov_netdevice_A: '1'
588+         vendor.com/sriov_netdevice_B: '1'
589+       limits:
590+         vendor.com/sriov_netdevice_A: '1'
591+         vendor.com/sriov_netdevice_B: '1'
592+         cpu: 2
593+         memory: 100M
594+ ```
595+
596+ When annotated like this, the policy will try to pick one exclusive isolated
597+ CPU with locality to one device and another with locality to the other. It will
598+ also try to pick and pin to memory aligned with these devices. If this succeeds
599+ for all devices, the effective resources for the container will be the union of
600+ the individually picked resources. If picking resources by hints fails for any
601+ of the devices, the policy falls back to picking resources from the pool without
602+ considering device hints.
603+
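The pick, union, and fallback behavior described above could be modelled roughly as follows. This is a minimal sketch, not the policy's implementation: `pick_resources` and its arguments are hypothetical, it assumes one CPU is picked per device, and it assumes the fallback simply takes the lowest-numbered free CPUs and pins memory to every node in the pool.

```python
def pick_resources(free_cpus, device_nodes):
    """Pick one CPU and one memory node per device, by device locality.

    free_cpus:    dict mapping NUMA node -> list of free CPU ids in the pool
    device_nodes: list with the NUMA node each allocated device is local to
    Returns (cpus, memory_nodes, picked_by_hints).
    """
    # Work on a copy so a failed attempt leaves the caller's state untouched.
    remaining = {node: list(cpus) for node, cpus in free_cpus.items()}
    cpus, mem_nodes = [], []
    for node in device_nodes:
        if not remaining.get(node):
            # No aligned CPU left for this device: ignore all device hints
            # and pick from the whole pool instead (assumed fallback order).
            all_free = sorted(c for cs in free_cpus.values() for c in cs)
            return all_free[:len(device_nodes)], sorted(free_cpus), False
        cpus.append(remaining[node].pop(0))
        if node not in mem_nodes:
            mem_nodes.append(node)
    # Every device got aligned resources: the result is their union.
    return sorted(cpus), sorted(mem_nodes), True


free = {0: [0, 1], 1: [8, 9], 2: [16, 17], 3: [24, 25]}
print(pick_resources(free, [0, 3]))  # ([0, 24], [0, 3], True)

free[3] = []                         # nothing free next to the second device
print(pick_resources(free, [0, 3]))  # ([0, 1], [0, 1, 2, 3], False)
```

In the first call each device gets a CPU and memory from its own NUMA node. In the second call the hint-based pick fails for one device, so the sketch discards the partial result rather than mixing aligned and unaligned picks.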
604+
543605## Container Affinity and Anti-Affinity
544606
545607### Introduction