@@ -85,31 +85,33 @@ More information here: https://kubernetes.io/docs/tutorials/security/ns-level-ps

| Method | Used interfaces | default | Notes |
| ---------------| ------------------------------------------------------------| -------- | ------------------------------------------------------------------------------------- |
- | indirect | perf, resctrl | v | missing energy metrics, requires fix for /pcm/resctrl mount |
- | direct | msr | | requires msr module and access to /dev/cpu (non trivial) |
+ | indirect | perf, resctrl | v | missing energy metrics |
+ | direct | msr | | requires msr module and access to /dev/cpu (non-trivial) or privileged access |


| Metrics | Available on Hardware | Available through interface | Available through method |
| --------------------- | ----------------------------- | ---------------------------- | ------------------------ |
| core | bare-metal, VM (any) | msr or perf | any |
| uncore (UPI) | bare-metal, VM (all sockets) | msr or perf | any |
| RDT (MBW,L3OCCUP) | bare-metal, VM (all sockets) | msr or resctrl | any |
- | energy, temp | bare-metal (only) | msr | msr only! |
+ | energy, temp | bare-metal (only) | msr | direct |
+ | perf-topdown | | perf only | indirect |


- | Interface | Requirements | Controlled by (env/helm value) | default pcm/ helm | Used by source code | Helm Value |
+ | Interface | Requirements | Controlled by (env/helm value) | default helm | Used by source code | Notes |
| ---------------| ------------------------------------------------------------| ---------------------------------| -----------------------| ----------------------------------------------------------| -----------------------------------------------------|
- | perf | sys_perf_open() perf_paranoid<=0/privileged/CAP_ADMIN | PCM_NO_PERF | 0 / 1 (!) | programPerfEvent(), PerfVirtualControlRegister() | |
- | perf-uncore | sys_perf_open() perf_paranoid<=0/privileged/CAP_ADMIN | PCM_USE_UNCORE_PERF | 0 / 0 | programPerfEvent(), PerfVirtualControlRegister() | |
- | perf-topdown | /sys/bus/event_source/devices/cpu/events | | yes | cpucounters.cpp: perfSupportsTopDown () | sysMount ( TODO: conflicts with sys/fs) |
- | RDT | uses "msr" or "resctrl" interface | PCM_NO_RDT | 0 / 0 | cpucounters.cpp: isRDTDisabled ()/QOSMetricAvailable() | PCM_NO_RDT |
- | resctrl | RW: /sys/fs/resctrl | PCM_USE_RESCTRL | 0 / 0 | resctrl.cpp | resctrlHostMount |
+ | perf | perf_event_open(), perf_event_paranoid<=0/privileged/CAP_SYS_ADMIN | PCM_NO_PERF | use perf | programPerfEvent(), PerfVirtualControlRegister() | |
+ | perf-uncore | perf_event_open(), perf_event_paranoid<=0/privileged/CAP_SYS_ADMIN | PCM_USE_UNCORE_PERF | use perf for uncore | programPerfEvent(), PerfVirtualControlRegister() | |
+ | perf-topdown | /sys/bus/event_source/devices/cpu/events | sysMount | yes | cpucounters.cpp: perfSupportsTopDown () | TODO: conflicts with sys/fs/resctrl |
+ | RDT | uses "msr" or "resctrl" interface | PCM_NO_RDT | yes | cpucounters.cpp: isRDTDisabled ()/QOSMetricAvailable() | |
+ | resctrl | RW: /sys/fs/resctrl | PCM_USE_RESCTRL | yes | resctrl.cpp | resctrlHostMount |
| watchdog | RO/RW: /proc/sys/kernel/nmi_watchdog | PCM_KEEP_NMI_WATCHDOG | yes (tries to disable)| src/cpucounters.cpp: disableNMIWatchdog () | |
- | msr | RW: /dev/cpu/X/msr + privileged or CAP_ADMIN/CAP_RAWIO | PCM_NO_MSR | 0 / 0 | msr.cpp: MsrHandle () | privileged or values-device-injector.yaml |
- | | RW: /dev/mem | ? | 0 / 0 | cpucounters.cpp: initUncoreObjects , pci.cpp: PCIHandleM () | privileged or values-device-injector.yaml |
- | | RO/RW: /sys/module/msr/parameters | PCM_NO_MSR | 0 / 0 | msr.cpp: MsrHandle () | sysMount |
- | | RW: /proc/bus/pci | PCM_USE_UNCORE_PERF ??? | 0 / 0 | pci.cpp: PCIHandle () | pciMount |
- | | RO: /sys/firmware/acpi/tables/MCFG | PCM_USE_UNCORE_PERF ??? | 0 / 0 | pci.cpp:PciHandle::openMcfgTable() | mcfgMount |
+ | msr | RW: /dev/cpu/X/msr + privileged or CAP_SYS_ADMIN/CAP_SYS_RAWIO | PCM_NO_MSR | msr is disabled | msr.cpp: MsrHandle () | privileged or some method to access /dev/cpu |
+ | | RW: /dev/mem | ? | msr is disabled | cpucounters.cpp: initUncoreObjects , pci.cpp: PCIHandleM () | privileged or some method to access /dev/cpu |
+ | | RO/RW: /sys/module/msr/parameters | PCM_NO_MSR | msr is disabled | msr.cpp: MsrHandle () | sysMount |
+ | | RW: /proc/bus/pci | PCM_USE_UNCORE_PERF | msr is disabled | pci.cpp: PCIHandle () | pciMount |
+ | | RO: /sys/firmware/acpi/tables/MCFG | PCM_USE_UNCORE_PERF | msr is disabled | pci.cpp:PciHandle::openMcfgTable() | mcfgMount |
+ | | energy | | | cpucounters.cpp initEnergyMonitoring() | |
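
The requirements in the interface table can be checked on a node before choosing between the indirect and direct methods. The following is only a sketch: `check_pcm_interfaces` is a hypothetical helper (not part of PCM or the helm chart), it assumes the perf requirement maps to `/proc/sys/kernel/perf_event_paranoid`, it treats the presence of a writable `schemata` file as a sign that resctrl is mounted, and it only looks at CPU 0 for the msr device.

```shell
#!/bin/sh
# Hypothetical helper: report which interfaces from the table look usable.
# Optional argument: a root prefix (e.g. a host mount inside a container).
check_pcm_interfaces() {
    root="${1:-}"

    # perf: needs perf_event_paranoid <= 0 unless the pod is privileged
    paranoid_file="$root/proc/sys/kernel/perf_event_paranoid"
    if [ -r "$paranoid_file" ] && [ "$(cat "$paranoid_file")" -le 0 ]; then
        echo "perf: ok"
    else
        echo "perf: needs privileged access or perf_event_paranoid<=0"
    fi

    # resctrl: needs read-write access to a mounted /sys/fs/resctrl
    # (assumption: a writable "schemata" file means it is mounted)
    if [ -w "$root/sys/fs/resctrl/schemata" ]; then
        echo "resctrl: ok"
    else
        echo "resctrl: not mounted or not writable"
    fi

    # msr: needs /dev/cpu/X/msr, created by the msr kernel module
    if [ -e "$root/dev/cpu/0/msr" ]; then
        echo "msr: device present"
    else
        echo "msr: /dev/cpu/0/msr missing"
    fi
}

# Usage on the node itself:
#   check_pcm_interfaces
```

A "device present" result for msr does not guarantee access: the table still requires privileged mode or the capabilities listed there.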
### Validation on local kind cluster
115
117
@@ -200,18 +202,20 @@ Deploy with defaults:
helm install pcm .

# Alternatively deploy with NFD and with Prometheus enabled
- helm install pcm . --set nfd=true --set podMonitor=true
+ helm install pcm . --set podMonitor=true
+ kubectl get podmonitor pcm
+ helm install pcm . --set nfd=true

# Alternatively deploy with NFD and with Prometheus enabled into own "pcm" namespace
- helm install pcm . -n pcm --set nfd=true --set podMonitor=true
+ helm install pcm . --namespace pcm
```

#### 6) Check metrics

Run proxy in background:
```
kubectl proxy &
- # for access from another host TODO to be remove
+ # for access from another host; TODO: to be removed (insecure!)
kubectl proxy --address 0.0.0.0 &
```
@@ -222,7 +226,10 @@ kubectl get pods
podname=`kubectl get pod -l app.kubernetes.io/component=pcm-sensor-server -ojsonpath='{.items[0].metadata.name}'`

curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics
- curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics | grep DRAM_Writes
+ curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics | grep L3_Cache_Misses # source: core
+ curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics | grep DRAM_Writes # source: uncore
+ curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics | grep 'Local_Memory_Bandwidth{socket="1",aggregate="socket",source="core"}' # source: RDT
+ curl -Ls http://127.0.0.1:8001/api/v1/namespaces/default/pods/$podname/proxy/metrics | grep DRAM_Joules_Consumed # source: energy
```
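
The four greps above can be folded into one pass over the metrics output. This is a rough sketch rather than part of the chart: `metrics_url` and `check_sources` are hypothetical helper names, and the metric-to-source mapping is taken only from the comments on the curl lines above.

```shell
#!/bin/sh
# metrics_url NAMESPACE POD [PROXY]: build the kubectl-proxy URL used above.
metrics_url() {
    printf 'http://%s/api/v1/namespaces/%s/pods/%s/proxy/metrics\n' \
        "${3:-127.0.0.1:8001}" "$1" "$2"
}

# check_sources: read Prometheus text from stdin and print one line per
# metric source, based on whether a sample line for its metric exists.
check_sources() {
    metrics=$(cat)
    for pair in core:L3_Cache_Misses uncore:DRAM_Writes \
                RDT:Local_Memory_Bandwidth energy:DRAM_Joules_Consumed; do
        src=${pair%%:*}      # text before the first colon
        metric=${pair#*:}    # text after the first colon
        # Sample lines start with the metric name; "# HELP"/"# TYPE" do not.
        if printf '%s\n' "$metrics" | grep -q "^$metric"; then
            echo "$src: present ($metric)"
        else
            echo "$src: missing ($metric)"
        fi
    done
}

# Usage, with kubectl proxy running and $podname set as above:
#   curl -Ls "$(metrics_url default "$podname")" | check_sources
```

With the default (indirect) method the energy line is expected to report `missing`, since the tables above list energy metrics as available through msr only.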
or through Prometheus UI/prom tool: