Commit 2b861e2

node: topologymgr: address PRR review comments (2)

Signed-off-by: Swati Sehgal <[email protected]>

1 parent: 82138a4

File tree: 1 file changed (+19 −11)

keps/sig-node/693-topology-manager/README.md

Lines changed: 19 additions & 11 deletions
@@ -710,7 +710,7 @@ This feature is kubelet specific, so version skew strategy is N/A.
 
 - [X] Feature gate (also fill in values in `kep.yaml`)
   - Feature gate name: TopologyManager
-  - Components depending on the feature gate: Topology Manager
+  - Components depending on the feature gate: kubelet
 
 Kubelet Flag for the Topology Manager Policy, which is described above. The `none` policy will be the default policy.
 
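Since the gate is locked on, what an operator actually observes is the configured policy in the kubelet configuration. As a minimal sketch (not part of the KEP), the snippet below extracts the policy from a `/configz`-style response; the sample JSON is hypothetical, and the `topologyManagerPolicy` / `topologyManagerScope` fields are the standard `KubeletConfiguration` field names.

```python
import json

# Hypothetical sample of a kubelet /configz response body; on a real cluster
# it could be fetched with:
#   kubectl get --raw "/api/v1/nodes/<node>/proxy/configz"
configz = '''
{
  "kubeletconfig": {
    "topologyManagerPolicy": "single-numa-node",
    "topologyManagerScope": "container"
  }
}
'''

cfg = json.loads(configz)["kubeletconfig"]
# "none" is the default policy when the field is unset.
policy = cfg.get("topologyManagerPolicy", "none")
print(policy)  # -> single-numa-node
```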

@@ -743,15 +743,7 @@ Memory Manager and Device Manager to either admit a pod to the node or reject it
 
 ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
 
-Yes, this feature can be disabled by specifying the `TopologyManager` feature gate
-in the kubelet configuration. Note that disabling the feature gate requires a
-kubelet restart for the changes to take effect. In case no pods consuming
-resources aligned by Topology Manager are running on the node, disabling the
-feature gate won't cause any issue.
-
-If the feature gate is being disabled on a node where such pods are running,
-it is the responsibility of the cluster admin to ensure that the node is
-appropriately drained.
+Since going to stable in 1.27, the feature gate is locked on, as is the standard practice in Kubernetes.
 
 ###### What happens if we reenable the feature if it was previously rolled back?
 

@@ -809,13 +801,22 @@ configured.
 ###### How can someone using this feature know that it is working for their instance?
 
 - [X] Other (treat as last resort)
-  - Details: check the kubelet metric `topology_manager_admission_requests_total` or "topology_manager_admission_duration_seconds"
+  - Details:
+
+By design, NUMA information is hidden from end users and is known only to the kubelet running on the node. To validate that the allocated resources are NUMA-aligned, this information needs to be exposed. The only practical way is with the help of external tools that inspect the resource topology information and either expose it outside the node (e.g. the [NFD topology updater](https://github.com/kubernetes-sigs/node-feature-discovery/blob/master/docs/get-started/introduction.md#nfd-topology-updater)) or use it to perform the validation themselves (e.g. [numaalign](https://github.com/ffromani/numalign)). Here are a few possible options (with external help):
+
+1. If Topology Manager is configured with the `single-numa-node` policy and CPU Manager with the `static` policy, we can use the NFD topology updater to learn how many CPUs are allocatable on a single NUMA node and then deploy a pod requesting more CPUs than that. The pod then fails admission with a `TopologyAffinityError`, which is visible to the end user.
+2. Alternatively, we can use a tool like [numaalign](https://github.com/ffromani/numalign) and run it within a pod to determine whether a set of resources is aligned on the same NUMA node.
+

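The core of the second validation option can be sketched in a few lines: check whether every CPU a container is allowed to run on belongs to a single NUMA node's cpulist. This is a simplified illustration, not the actual numaalign implementation; the two-node topology below is hypothetical.

```python
def parse_cpulist(s):
    """Parse a kernel cpulist string like '0-3,8' into a set of CPU ids."""
    cpus = set()
    for part in s.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def numa_aligned(container_cpus, node_cpulists):
    """Return the id of the NUMA node whose cpulist contains every CPU in
    container_cpus, or None if the CPUs span more than one node."""
    allowed = parse_cpulist(container_cpus)
    for node, cpulist in node_cpulists.items():
        if allowed <= parse_cpulist(cpulist):
            return node
    return None

# Hypothetical two-socket topology; on a real node the per-node cpulists live
# under /sys/devices/system/node/node*/cpulist, and the container's allowed
# CPUs come from its cpuset cgroup.
nodes = {0: "0-7", 1: "8-15"}
print(numa_aligned("2-3", nodes))  # -> 0 (aligned on node 0)
print(numa_aligned("6-9", nodes))  # -> None (spans both nodes)
```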
 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
 
 "topology_manager_admission_duration_seconds" (which will be added in this release) can be used to determine
 if the resource alignment logic performed at pod admission time is taking longer than expected.
 
+Measurements haven't been performed to determine the latency, as this metric will be introduced in the 1.27
+development cycle, but the duration is expected to be very short, most likely in the ballpark of 50-100 ms.
+
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
 
 - [X] Metrics
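An operator could derive the mean admission latency from such a duration metric using the usual Prometheus `_sum`/`_count` histogram pair. A minimal sketch, assuming the metric names given above follow the standard Prometheus exposition conventions; the scrape excerpt below is made up.

```python
# Hypothetical excerpt of a kubelet /metrics scrape; on a real cluster it
# could be fetched with:
#   kubectl get --raw "/api/v1/nodes/<node>/proxy/metrics"
scrape = """\
topology_manager_admission_requests_total 42
topology_manager_admission_duration_seconds_sum 0.084
topology_manager_admission_duration_seconds_count 42
"""

# Parse the plain-text exposition format into a name -> value map,
# skipping comments and blank lines.
metrics = {}
for line in scrape.splitlines():
    if line.startswith("#") or not line.strip():
        continue
    name, value = line.rsplit(" ", 1)
    metrics[name] = float(value)

# Mean admission latency in milliseconds from the histogram's sum/count pair.
mean_ms = 1000 * (metrics["topology_manager_admission_duration_seconds_sum"]
                  / metrics["topology_manager_admission_duration_seconds_count"])
print(f"{mean_ms:.1f} ms")  # -> 2.0 ms
```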
@@ -871,6 +872,13 @@ Also, the resource alignment logic is executed at pod admission time which is pr
 
 No reported or known increase in resource usage.
 
+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+No.
+
+The feature is only responsible for the alignment of resources. It does not use node resources like PIDs, sockets, inodes, etc.,
+for running its alignment algorithm.
+
 ### Troubleshooting
 
 ###### How does this feature react if the API server and/or etcd is unavailable?
