Commit acb9ea1

New: support for PERFMON capability, silent mode and some extra env
debug variables
1 parent de3e7c6 commit acb9ea1

4 files changed

+21
-5
lines changed

deployment/pcm/README.md

Lines changed: 3 additions & 3 deletions
@@ -8,8 +8,9 @@ Helm chart instructions
 - Support for bare-metal and VM host configurations (files: [values-metal.yaml](values-metal.yaml), [values-vm.yaml](values-vm.yaml)),
 - Ability to deploy multiple releases alongside configured differently to handle different kinds of machines (bare-metal, VM) at the [same time](#heterogeneous-mixed-vmmetal-instances-cluster),
 - Linux Watchdog handling (controlled with `PCM_KEEP_NMI_WATCHDOG`, `PCM_NO_AWS_WORKAROUND`, `nmiWatchdogMount` values).
-- Deploy to own namespace with "helm install ... **-n pcm --create-namespace**"
-- Silent mode (value: `silent=false`, default)
+- Deploy to own namespace with "helm install ... **-n pcm --create-namespace**".
+- Silent mode (value: `silent=false`, default).
+- Backward compatbile with older Linux kernels (<5.8) - (value: cap_perfmon).

 Here are available methods in this chart of metrics collection w.r.t interfaces and required access:

@@ -87,7 +88,6 @@ More information here: https://kubernetes.io/docs/tutorials/security/ns-level-ps
 #### 1) (Optionally) mount resctrl filesystem (for RDT metrics) to unload "msr" kernel module for validation

 ```
-echo 0 > /proc/sys/kernel/perf_event_paranoid
 mount -t resctrl resctrl /sys/fs/resctrl
 ```

deployment/pcm/templates/_helpers.tpl

Lines changed: 1 addition & 2 deletions
@@ -60,9 +60,8 @@ securityContext:
 */}}
 capabilities:
 add:
-- SYS_ADMIN
+- {{ if .Values.cap_perfmon }}PERFMON{{ else }}SYS_ADMIN{{ end }}
 - SYS_RAWIO
-#- PERFMON
 {{- end }}
 {{- end }}

deployment/pcm/templates/daemonset.yaml

Lines changed: 7 additions & 0 deletions
@@ -110,6 +110,13 @@ spec:
 value: {{ .Values.PCM_KEEP_NMI_WATCHDOG | quote }}
 - name: PCM_NO_AWS_WORKAROUND
 value: {{ .Values.PCM_NO_AWS_WORKAROUND | quote }}
+- name: PCM_NO_UNCORE_PMU_DISCOVERY
+value: {{ .Values.PCM_NO_UNCORE_PMU_DISCOVERY | quote }}
+- name: PCM_PRINT_UNCORE_PMU_DISCOVERY
+value: {{ .Values.PCM_PRINT_UNCORE_PMU_DISCOVERY | quote }}
+- name: PCM_PRINT_TOPOLOGY
+value: {{ .Values.PCM_PRINT_TOPOLOGY | quote }}
+
 {{- with .Values.probes }}
 livenessProbe:
 {{- include "pcm.probe" . | nindent 12 }}

deployment/pcm/values.yaml

Lines changed: 10 additions & 0 deletions
@@ -18,6 +18,11 @@ imagePullSecrets: {}
 # Configures SecurityContext to not privileged (by default) so SYS_ADMIN/SYS_RAWIO capabilietes are required for running pod
 privileged: false

+# Use new kernel 5.8+ PERFMON (least privileged) instead of generic SYS_ADMIN capability
+# !Warning requires kernel 5.8+
+# more info here: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html#perf-events-access-control
+cap_perfmon: true
+
 # Run pcm in silent mode (additional -silent argument to pcm-sensor-server binary)
 # Removes some of debug outputs (like warnings about unability to open some /sys... /proc... files)
 silent: false
@@ -72,6 +77,11 @@ PCM_NO_AWS_WORKAROUND: 0
 # mounting watchdog is recommened when PCM_KEEP_NMI_WATCHDOG=0 or we expect AWS workaround to be applied
 nmiWatchdogMount: true

+### -------------- Other (Debugging options for uncore pmu discovery)
+PCM_NO_UNCORE_PMU_DISCOVERY: 0 # skip 1: this is not required for direct privileged access and with 0 ends with WARNING enumaration failed
+PCM_PRINT_UNCORE_PMU_DISCOVERY: 1 # show: discovered pmu
+PCM_PRINT_TOPOLOGY: 0 # show individual CPU topology for each core (plenty of lines)
+
 ### =============================== Optional POD fields no related to PCM ===============================
 # Pod level
 podAnnotations: {}
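
Note on choosing `cap_perfmon`: the new value defaults to `true`, and the values.yaml comment warns that PERFMON requires kernel 5.8+. A minimal sketch of how one might pick the value for a node from its kernel release string is shown below. The function names `kernel_supports_perfmon` and `cap_perfmon_value` are hypothetical helpers written for illustration and are not part of the chart; the sketch assumes a release string of the usual `major.minor...` form, as printed by `uname -r`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): succeeds when the kernel
# release string is 5.8 or newer, the version the values.yaml comment names
# as the minimum for the PERFMON capability.
kernel_supports_perfmon() {
    release="$1"                  # e.g. "5.15.0-91-generic"
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -ge 8 ]; }
}

# Hypothetical helper: prints the value to use for cap_perfmon.
cap_perfmon_value() {
    if kernel_supports_perfmon "$1"; then
        echo "true"
    else
        echo "false"
    fi
}

# Print a flag that could be appended to a helm install command for this node.
echo "--set cap_perfmon=$(cap_perfmon_value "$(uname -r)")"
```

On a node with an older kernel this prints `--set cap_perfmon=false`, which makes the helper template fall back to `SYS_ADMIN`. In a mixed cluster the check would have to run per node, or per release when several releases are deployed side by side.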
