@@ -59,9 +59,8 @@ kubectl logs ds/pcm
### Requirements

- - Full set of metrics requires bare-metal or .metal instance (uncore metrics, RDT, energy, UPI),
- - Core metrics (instructions, cycles are also available) on VM instances,
- - /sys/fs/resctrl has to be mounted on host OS,
+ - Full set of metrics (uncore/UPI, RDT, energy) requires a bare-metal or .metal cloud instance,
+ - /sys/fs/resctrl has to be mounted on the host OS (for the default indirect deployment method),
- the pod is allowed to run with privileged capabilities (SYS_ADMIN, SYS_RAWIO) in the given namespace; in other words, Pod Security Standards allow it to run at the privileged level.

```
@@ -77,74 +76,40 @@ More information here: https://kubernetes.io/docs/tutorials/security/ns-level-ps

### Defaults

- - Use Linux abstraction to access event counters (Linux Perf, resctrl) and run container in un-privileged mode.
- - hostPort 9738 is exposed on host, (TODO: security review)
- - Prometheus podMonitor is disabled
-
- #### Metric availability and requirements (devices/mounts/permissions)
-
- | Method | Used interfaces | default | Notes |
- | ---------------| ------------------------------------------------------------| -------- | ------------------------------------------------------------------------------------- |
- | indirect | perf, resctrl | v | missing energy metrics, |
- | direct | msr | | requires msr module and access to /dev/cpu (non trivial) or privileged access |
-
-
- | Metrics | Available on Hardware | Available through interface | Available through method |
- | --------------------- | ----------------------------- | ---------------------------- | ------------------------ |
- | core | bare-metal, VM (any) | msr or perf | any |
- | uncore (UPI) | bare-metal, VM (all sockets) | msr or perf | any |
- | RDT (MBW,L3OCCUP) | bare-metal, VM (all sockets) | msr or resctrl | any |
- | energy, temp | bare-metal (only) | msr | direct |
- | perf-topdown | | perf only | indirect |
-
-
- | Interface | Requirements | Controlled by (env/helm value) | default helm | Used by source code | Notes |
- | ---------------| ------------------------------------------------------------| ---------------------------------| -----------------------| ----------------------------------------------------------| -----------------------------------------------------|
- | perf | sys_perf_open() perf_paranoid<=0/privileged/CAP_ADMIN | PCM_NO_PERF | use perf | programPerfEvent(), PerfVirtualControlRegister() | |
- | perf-uncore | sys_perf_open() perf_paranoid<=0/privileged/CAP_ADMIN | PCM_USE_UNCORE_PERF | use perf for uncore | programPerfEvent(), PerfVirtualControlRegister() | |
- | perf-topdown | /sys/bus/event_source/devices/cpu/events | sysMount | yes | cpucounters.cpp: perfSupportsTopDown () | TODO: conflicts with sys/fs/resctrl |
- | RDT | uses "msr" or "resctrl" interface | PCM_NO_RDT | yes | cpucounters.cpp: isRDTDisabled ()/QOSMetricAvailable() | |
- | resctrl | RW: /sys/fs/resctrl | PCM_USE_RESCTRL | yes | resctrl.cpp | resctrlHostMount |
- | watchdog | RO/RW: /proc/sys/kernel/nmi_watchdog | PCM_KEEP_NMI_WATCHDOG | yes (tries to disable)| src/cpucounters.cpp: disableNMIWatchdog () | |
- | msr | RW: /dev/cpu/X/msr + privileged or CAP_ADMIN/CAP_RAWIO | PCM_NO_MSR | msr is disabled | msr.cpp: MsrHandle () | privileged or some method to access /dev/cpu |
- | | RW: /dev/mem | ? | msr is disabled | cpucounters.cpp: initUncoreObjects , pci.cpp: PCIHandleM () | privileged or some method to access /dev/cpu |
- | | RO/RW: /sys/module/msr/parameters | PCM_NO_MSR | msr is disabled | msr.cpp: MsrHandle () | sysMount |
- | | RW: /proc/bus/pci | PCM_USE_UNCORE_PERF | msr is disabled | pci.cpp: PCIHandle () | pciMount |
- | | RO: /sys/firmware/acpi/tables/MCFG | PCM_USE_UNCORE_PERF | msr is disabled | pci.cpp:PciHandle::openMcfgTable() | mcfgMount |
- | | energy | | | cpucounters.cpp initEnergyMonitoring() | |
+ - The indirect method uses Linux abstractions (Linux perf, resctrl) to access event counters and runs the container in non-privileged mode.
+ - hostPort 9738 is exposed on the host (TODO: security review; consider TLS together with Prometheus scraping).
+ - Prometheus podMonitor is disabled (enable it with `--set podMonitor=true`).

### Validation on local kind cluster

-
#### Requirements

- - kubectl/kind/helm/jq binaries available in PATH
- - docker service up and running
+ - kubectl/kind/helm/jq binaries available in PATH,
+ - docker service up and running,
+ - full set of metrics is available only on a bare-metal instance or a cloud .metal instance.

- #### 1) Optionally mount resctrl filesystem
+ #### 1) (Optionally) mount resctrl filesystem (for RDT metrics)

```
mount -t resctrl resctrl /sys/fs/resctrl
```

#### 2) Create a kind-based Kubernetes cluster

-
```
kind create cluster
```

- **Note** to be able to collect and test resctrl RDT metrics, kind cluster have to be created with additional mounts:
-
+ **Note** to collect and test RDT metrics through the resctrl filesystem, the kind cluster has to be created with additional mounts:
```
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /sys/fs/resctrl
    containerPath: /sys/fs/resctrl
```
- or (optionally), create kind cluster with local registry with [this script](https://kind.sigs.k8s.io/docs/user/local-registry/).
- and apply the patch using sed :
+ e.g. create a kind cluster with a local registry using [this script](https://kind.sigs.k8s.io/docs/user/local-registry/)
+ and apply the patch to enable resctrl in the following way:

```
wget https://kind.sigs.k8s.io/examples/kind-with-registry.sh
@@ -156,7 +121,10 @@ nodes:\
- hostPath: /sys/fs/resctrl\
containerPath: /sys/fs/resctrl\
' kind-with-registry.sh
+ ```

+ Then create the cluster using the patched script above:
+ ```
bash kind-with-registry.sh
```

@@ -170,8 +138,7 @@ Export kind kubeconfig as default for further kubectl commands:
kind export kubeconfig
```
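
To confirm that the resctrl mount from steps 1 and 2 is actually in place before deploying, a small check along these lines may help. It is only a sketch: `resctrl_mounted` is a hypothetical helper name (not part of PCM or kind), and it assumes the usual `/proc/mounts` layout in which the third field of each line is the filesystem type.

```shell
# Hypothetical helper (not part of PCM or kind): succeed if a resctrl
# filesystem appears in a mounts table.
# $1: mounts file to inspect; defaults to /proc/mounts, "-" reads stdin.
resctrl_mounted() {
    # Field 3 of each /proc/mounts line is the filesystem type.
    awk '$3 == "resctrl" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}
```

On the host, `resctrl_mounted || echo "resctrl is not mounted"` reports a missing mount. For the kind node, the same check can be fed from the node container, e.g. `docker exec kind-control-plane cat /proc/mounts | resctrl_mounted -`, assuming the default cluster name (for which kind names the node container `kind-control-plane`).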

-
- #### 3) (Optionally) Deploy Node feature discovery
+ #### 3) (Optionally) Deploy Node Feature Discovery (NFD)

```
# I.a. Using Kustomize:
@@ -196,27 +163,23 @@ kubectl get sts prometheus-prometheus-kube-prometheus-prometheus

#### 5) Deploy PCM helm chart

- Deploy with defaults:
```
- # Deploy to current namespace with defaults
+ # a) Deploy to current namespace with defaults
helm install pcm .

- # Alternatively deploy with NFD and with Prometheus enabled
+ # b) Alternatively deploy with NFD and/or with Prometheus enabled
helm install pcm . --set podMonitor=true
- kubectl get podmonitor pcm
helm install pcm . --set nfd=true

- # Alternatively deploy with NFD and with Prometheus enabled into own "pcm" namespace
+ # c) Alternatively deploy into its own "pcm" namespace (combine with the --set flags from b) as needed)
helm install pcm . --namespace pcm
```

- #### 6) Check metrics
+ #### 6) Check that metrics are exported

Run the proxy in the background:
```
kubectl proxy &
- # for access from another host TODO to be remove (unsecure!!!)
- kubectl proxy --address 0.0.0.0 &
```

Access PCM metrics directly:
@@ -232,7 +195,7 @@ curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/met
curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics | grep DRAM_Joules_Consumed # source: energy
```
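
A plain `grep` on the metric name also matches comment lines and any metric whose name merely contains the pattern. A stricter filter could look like the sketch below; `pcm_metric` is a hypothetical helper name (not part of PCM), and it assumes the endpoint returns the standard Prometheus text exposition format.

```shell
# Hypothetical helper (not part of PCM): print the sample lines of exactly one
# metric from Prometheus text exposition read on stdin.
# $1: metric name, e.g. DRAM_Joules_Consumed
pcm_metric() {
    awk -v name="$1" '
        /^#/ { next }                    # skip "# HELP" / "# TYPE" comment lines
        {
            metric = $1
            sub(/[{].*/, "", metric)     # strip the label set, if any
            if (metric == name) print    # exact name match only
        }'
}
```

It would be used in place of `grep`, for example `curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics | pcm_metric DRAM_Joules_Consumed`.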

- or through Prometheus UI/prom tool:
+ or through the Prometheus UI/promtool (requires the Prometheus operator to be deployed and helm install run with `--set podMonitor=true`):
```
http://127.0.0.1:8001/api/v1/namespaces/default/services/prometheus-kube-prometheus-prometheus:http-web/proxy/graph
promtool query range --step 1m http://127.0.0.1:8001/api/v1/namespaces/default/services/prometheus-kube-prometheus-prometheus:http-web/proxy 'rate(DRAM_Writes{aggregate="system"}[5m])/1e9'
@@ -265,7 +228,7 @@ helm install pcm-vm . -f values-vm.yaml
helm install pcm-metal . -f values-metal.yaml
```

- #### Direct as non-privileged container
+ #### Direct method as non-privileged container (not recommended)

**Note** PCM requires access to the /dev/cpu device in read-write mode (MSR access), but it is currently not possible to mount devices in Kubernetes pods/containers in vanilla Kubernetes. Please read this issue for more information: https://github.com/kubernetes/kubernetes/issues/5607.

@@ -350,7 +313,39 @@ docker push localhost:5001/pcm-local
helm install pcm . -f values-local-image.yaml
```

- ##### Troubleshooting
+ #### Troubleshooting
+
+ ##### Metric availability and requirements (devices/mounts/permissions)
+
+ | Method | Used interfaces | Default | Notes |
+ | --- | --- | --- | --- |
+ | indirect | perf, resctrl | yes | missing energy metrics |
+ | direct | msr | | requires the msr kernel module and access to /dev/cpu (non-trivial) or privileged access |
+
+
+ | Metrics | Available on hardware | Available through interface | Available through method |
+ | --- | --- | --- | --- |
+ | core | bare-metal, VM (any) | msr or perf | any |
+ | uncore (UPI) | bare-metal, VM (all sockets) | msr or perf | any |
+ | RDT (MBW, L3OCCUP) | bare-metal, VM (all sockets) | msr or resctrl | any |
+ | energy, temp | bare-metal (only) | msr | direct |
+ | perf-topdown | | perf only | indirect |
+
+
+ | Interface | Requirements | Controlled by (env/helm value) | Default (helm) | Used by source code | Notes |
+ | --- | --- | --- | --- | --- | --- |
+ | perf | perf_event_open(), perf_event_paranoid<=0/privileged/CAP_SYS_ADMIN | PCM_NO_PERF | use perf | programPerfEvent(), PerfVirtualControlRegister() | |
+ | perf-uncore | perf_event_open(), perf_event_paranoid<=0/privileged/CAP_SYS_ADMIN | PCM_USE_UNCORE_PERF | use perf for uncore | programPerfEvent(), PerfVirtualControlRegister() | |
+ | perf-topdown | /sys/bus/event_source/devices/cpu/events | sysMount | yes | cpucounters.cpp: perfSupportsTopDown() | TODO: conflicts with /sys/fs/resctrl |
+ | RDT | uses "msr" or "resctrl" interface | PCM_NO_RDT | yes | cpucounters.cpp: isRDTDisabled()/QOSMetricAvailable() | |
+ | resctrl | RW: /sys/fs/resctrl | PCM_USE_RESCTRL | yes | resctrl.cpp | resctrlHostMount |
+ | watchdog | RO/RW: /proc/sys/kernel/nmi_watchdog | PCM_KEEP_NMI_WATCHDOG | yes (tries to disable) | src/cpucounters.cpp: disableNMIWatchdog() | |
+ | msr | RW: /dev/cpu/X/msr + privileged or CAP_SYS_ADMIN/CAP_SYS_RAWIO | PCM_NO_MSR | msr is disabled | msr.cpp: MsrHandle() | privileged or some method to access /dev/cpu |
+ | | RW: /dev/mem | ? | msr is disabled | cpucounters.cpp: initUncoreObjects, pci.cpp: PCIHandleM() | privileged or some method to access /dev/cpu |
+ | | RO/RW: /sys/module/msr/parameters | PCM_NO_MSR | msr is disabled | msr.cpp: MsrHandle() | sysMount |
+ | | RW: /proc/bus/pci | PCM_USE_UNCORE_PERF | msr is disabled | pci.cpp: PCIHandle() | pciMount |
+ | | RO: /sys/firmware/acpi/tables/MCFG | PCM_USE_UNCORE_PERF | msr is disabled | pci.cpp: PciHandle::openMcfgTable() | mcfgMount |
+ | | energy | | | cpucounters.cpp: initEnergyMonitoring() | |

To investigate an issue, one can replace the pcm-sensor-server command and run pcm or sleep instead by adding the following arguments when installing the helm chart:
```