Commit 4069142

Gather pod information to check IRQ load balancing
Some CPU-intensive workloads can be negatively affected when interrupt service routines (ISRs) run on their CPUs. We would like to check that IRQ affinities are properly configured to avoid this, meaning that no exclusively assigned CPU is listed in any IRQ affinity list.

In Kubernetes >= 1.23, the PodResourcesAPI provides the exclusively assigned CPUs for each container. In earlier versions it reports all the CPUs used by the container, which may in fact be the default cpuset. So for Kubernetes < 1.23 we need extra information to determine the exclusively assigned CPUs: the list of pods with `qosClass == Guaranteed`, whose CPUs we can then look up in the PodResourcesAPI.

The method needed for Kubernetes < 1.23 also works for Kubernetes >= 1.23, and a single, homogeneous method is preferable, at least until Kubernetes versions < 1.23 go out of support. We have therefore decided to use it in both cases.

We fill the information gap with the bare minimum of pod information needed for the calculation:

- pod namespace
- pod name
- pod qosClass

We gather this information on a per-node basis so that it is easy to cross-reference with the PodResourcesAPI info, which is also gathered per node.
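The condition described above (no exclusively assigned CPU appears in any IRQ affinity list) can be sketched in shell. This is only an illustration and not part of the commit: the helper names `expand_cpulist` and `cpus_in_both` are hypothetical, and the sketch assumes both inputs use the kernel CPU-list syntax (for example `0-2,5`), which is the format of `/proc/irq/<N>/smp_affinity_list`.

```shell
# Hypothetical helpers, not part of the commit.

# Expand a kernel-style CPU list such as "0-2,5" into "0 1 2 5".
# The body runs in a subshell so the IFS change does not leak to the caller.
expand_cpulist() (
  IFS=','
  out=""
  for part in $1; do
    start=${part%-*}   # "0-2" -> "0"; a single CPU "5" stays "5"
    end=${part#*-}     # "0-2" -> "2"; a single CPU "5" stays "5"
    cpu=$start
    while [ "$cpu" -le "$end" ]; do
      out="$out${out:+ }$cpu"
      cpu=$((cpu + 1))
    done
  done
  echo "$out"
)

# Print every CPU that appears in both CPU lists, one per line.
# Any output means an exclusively assigned CPU can still receive that IRQ.
cpus_in_both() (
  for a in $(expand_cpulist "$1"); do
    for b in $(expand_cpulist "$2"); do
      if [ "$a" = "$b" ]; then
        echo "$a"
      fi
    done
  done
)
```

For instance, `cpus_in_both "2-3" "$(cat /proc/irq/24/smp_affinity_list)"` would list the CPUs from the exclusive set `2-3` that IRQ 24 is still allowed to run on; the IRQ number here is an arbitrary example.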
1 parent bfffc10 commit 4069142

1 file changed: +2 -0 lines changed

must-gather/collection-scripts/gather_nodes

Lines changed: 2 additions & 0 deletions
@@ -69,6 +69,8 @@ do
 oc exec $pod -n perf-node-gather -- gather_sysinfo --json podres --socket-path=unix:///host/podresources/kubelet.sock > $NODE_PATH/podresources.json
 
 oc exec $pod -n perf-node-gather -- gather_sysinfo snapshot --debug --root=/host --output=- > $NODE_PATH/sysinfo.tgz 2> $NODE_PATH/sysinfo.log
+
+oc get pods -A --field-selector spec.nodeName=$node,status.phase=Running -o go-template='[{{range $idx, $item := .items}} {{if (ne $idx 0)}},{{end}}{"namespace":"{{.metadata.namespace}}", "name":"{{.metadata.name}}", "nodeName":"{{.spec.nodeName}}", "qosClass": "{{.status.qosClass}}" }{{"\n"}}{{end}}]' > $NODE_PATH/pods_info.json
 done
 
 # Collect journal logs for specified units for all nodes
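
As a rough illustration of how the gathered `pods_info.json` might be consumed, the sketch below lists the pods whose `qosClass` is `Guaranteed`. It is not part of the commit: the function name `guaranteed_pods` is hypothetical, and the sketch relies on the layout the go-template above produces (one JSON object per line, with `"namespace"` directly followed by `"name"`) rather than on a real JSON parser.

```shell
# Hypothetical helper, not part of the commit.
# Print "namespace/name" for every Guaranteed pod in a pods_info.json file.
# Assumes the one-object-per-line layout emitted by the go-template above.
guaranteed_pods() {
  grep '"qosClass": "Guaranteed"' "$1" |
    sed 's|.*"namespace":"\([^"]*\)", "name":"\([^"]*\)".*|\1/\2|'
}
```

Calling `guaranteed_pods "$NODE_PATH/pods_info.json"` would then give the pod list to look up in `podresources.json`. A JSON-aware tool would be more robust if one is available on the analysis host; the line-based approach is used here only to keep the sketch dependency-free.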
