Commit 85796f1

Merge pull request #2927 from eero-t/prometheus-metrics-doc
Add missing items to Prometheus container metrics table
2 parents 19df107 + fda2a62 commit 85796f1

1 file changed: +43 -37 lines changed

docs/storage/prometheus.md

Lines changed: 43 additions & 37 deletions
@@ -22,16 +22,16 @@ Metric name | Type | Description | Unit (where applicable) | option parameter |
 `container_accelerator_memory_total_bytes` | Gauge | Total accelerator memory | bytes | accelerator |
 `container_accelerator_memory_used_bytes` | Gauge | Total accelerator memory allocated | bytes | accelerator |
 `container_blkio_device_usage_total` | Counter | Blkio device bytes usage | bytes | diskIO |
-`container_cpu_cfs_periods_total` | Counter | Number of elapsed enforcement period intervals | | |
-`container_cpu_cfs_throttled_periods_total` | Counter | Number of throttled period intervals | | |
-`container_cpu_cfs_throttled_seconds_total` | Counter | Total time duration the container has been throttled | seconds | |
-`container_cpu_load_average_10s` | Gauge | Value of container cpu load average over the last 10 seconds | | |
+`container_cpu_cfs_periods_total` | Counter | Number of elapsed enforcement period intervals | | cpu |
+`container_cpu_cfs_throttled_periods_total` | Counter | Number of throttled period intervals | | cpu |
+`container_cpu_cfs_throttled_seconds_total` | Counter | Total time duration the container has been throttled | seconds | cpu |
+`container_cpu_load_average_10s` | Gauge | Value of container cpu load average over the last 10 seconds | | cpuLoad |
 `container_cpu_schedstat_run_periods_total` | Counter | Number of times processes of the cgroup have run on the cpu | | sched |
-`container_cpu_schedstat_run_seconds_total` | Counter | Time duration the processes of the container have run on the CPU | seconds | sched |
 `container_cpu_schedstat_runqueue_seconds_total` | Counter | Time duration processes of the container have been waiting on a runqueue | seconds | sched |
-`container_cpu_system_seconds_total` | Counter | Cumulative system cpu time consumed | seconds | |
-`container_cpu_usage_seconds_total` | Counter | Cumulative cpu time consumed | seconds | |
-`container_cpu_user_seconds_total` | Counter | Cumulative user cpu time consumed | seconds | |
+`container_cpu_schedstat_run_seconds_total` | Counter | Time duration the processes of the container have run on the CPU | seconds | sched |
+`container_cpu_system_seconds_total` | Counter | Cumulative system cpu time consumed | seconds | cpu |
+`container_cpu_usage_seconds_total` | Counter | Cumulative cpu time consumed | seconds | cpu |
+`container_cpu_user_seconds_total` | Counter | Cumulative user cpu time consumed | seconds | cpu |
 `container_file_descriptors` | Gauge | Number of open file descriptors for the container | | process |
 `container_fs_inodes_free` | Gauge | Number of available Inodes | | disk |
 `container_fs_inodes_total` | Gauge | Total number of Inodes | | disk |
@@ -40,60 +40,66 @@ Metric name | Type | Description | Unit (where applicable) | option parameter |
 `container_fs_io_time_weighted_seconds_total` | Counter | Cumulative weighted I/O time | seconds | diskIO |
 `container_fs_limit_bytes` | Gauge | Number of bytes that can be consumed by the container on this filesystem | bytes | disk |
 `container_fs_reads_bytes_total` | Counter | Cumulative count of bytes read | bytes | diskIO |
-`container_fs_reads_total` | Counter | Cumulative count of reads completed | | diskIO |
 `container_fs_read_seconds_total` | Counter | Cumulative count of seconds spent reading | | diskIO |
 `container_fs_reads_merged_total` | Counter | Cumulative count of reads merged | | diskIO |
+`container_fs_reads_total` | Counter | Cumulative count of reads completed | | diskIO |
 `container_fs_sector_reads_total` | Counter | Cumulative count of sector reads completed | | diskIO |
 `container_fs_sector_writes_total` | Counter | Cumulative count of sector writes completed | | diskIO |
 `container_fs_usage_bytes` | Gauge | Number of bytes that are consumed by the container on this filesystem | bytes | disk |
-`container_fs_write_seconds_total` | Counter | Cumulative count of seconds spent writing | seconds | diskIO |
 `container_fs_writes_bytes_total` | Counter | Cumulative count of bytes written | bytes | diskIO |
+`container_fs_write_seconds_total` | Counter | Cumulative count of seconds spent writing | seconds | diskIO |
 `container_fs_writes_merged_total` | Counter | Cumulative count of writes merged | | diskIO |
 `container_fs_writes_total` | Counter | Cumulative count of writes completed | | diskIO |
 `container_hugetlb_failcnt` | Counter | Number of hugepage usage hits limits | | hugetlb |
 `container_hugetlb_max_usage_bytes` | Gauge | Maximum hugepage usages recorded | bytes | hugetlb |
 `container_hugetlb_usage_bytes` | Gauge | Current hugepage usage | bytes | hugetlb |
-`container_last_seen` | Gauge | Last time a container was seen by the exporter | timestamp | |
+`container_last_seen` | Gauge | Last time a container was seen by the exporter | timestamp | - |
 `container_llc_occupancy_bytes` | Gauge | Last level cache usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl |
 `container_memory_bandwidth_bytes` | Gauge | Total memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl |
 `container_memory_bandwidth_local_bytes` | Gauge | Local memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl |
-`container_memory_cache` | Gauge | Total page cache memory | bytes | |
-`container_memory_failcnt` | Counter | Number of memory usage hits limits | | |
-`container_memory_failures_total` | Counter | Cumulative count of memory allocation failures | | |
-`container_memory_numa_pages` | Gauge | Number of used pages per NUMA node | | memory_numa |
-`container_memory_max_usage_bytes` | Gauge | Maximum memory usage recorded | bytes | |
-`container_memory_rss` | Gauge | Size of RSS | bytes | |
-`container_memory_swap` | Gauge | Container swap usage | bytes | |
-`container_memory_mapped_file` | Gauge | Size of memory mapped files | bytes | |
+`container_memory_cache` | Gauge | Total page cache memory | bytes | memory |
+`container_memory_failcnt` | Counter | Number of memory usage hits limits | | memory |
+`container_memory_failures_total` | Counter | Cumulative count of memory allocation failures | | memory |
+`container_memory_mapped_file` | Gauge | Size of memory mapped files | bytes | memory |
+`container_memory_max_usage_bytes` | Gauge | Maximum memory usage recorded | bytes | memory |
 `container_memory_migrate` | Gauge | Memory migrate status | | cpuset |
-`container_memory_usage_bytes` | Gauge | Current memory usage, including all memory regardless of when it was accessed | bytes | |
-`container_memory_working_set_bytes` | Gauge | Current working set | bytes | |
+`container_memory_numa_pages` | Gauge | Number of used pages per NUMA node | | memory_numa |
+`container_memory_rss` | Gauge | Size of RSS | bytes | memory |
+`container_memory_swap` | Gauge | Container swap usage | bytes | memory |
+`container_memory_usage_bytes` | Gauge | Current memory usage, including all memory regardless of when it was accessed | bytes | memory |
+`container_memory_working_set_bytes` | Gauge | Current working set | bytes | memory |
+`container_network_advance_tcp_stats_total` | Gauge | advanced tcp connections statistic for container | | advtcp |
 `container_network_receive_bytes_total` | Counter | Cumulative count of bytes received | bytes | network |
+`container_network_receive_errors_total` | Counter | Cumulative count of errors encountered while receiving | | network |
 `container_network_receive_packets_dropped_total` | Counter | Cumulative count of packets dropped while receiving | | network |
 `container_network_receive_packets_total` | Counter | Cumulative count of packets received | | network |
-`container_network_receive_errors_total` | Counter | Cumulative count of errors encountered while receiving | | network |
+`container_network_tcp6_usage_total` | Gauge | tcp6 connection usage statistic for container | | tcp |
+`container_network_tcp_usage_total` | Gauge | tcp connection usage statistic for container | | tcp |
 `container_network_transmit_bytes_total` | Counter | Cumulative count of bytes transmitted | bytes | network |
-`container_network_transmit_packets_total` | Counter | Cumulative count of packets transmitted | | network |
-`container_network_transmit_packets_dropped_total` | Counter | Cumulative count of packets dropped while transmitting | | network |
 `container_network_transmit_errors_total` | Counter | Cumulative count of errors encountered while transmitting | | network |
-`container_network_tcp_usage_total` | Gauge | tcp connection usage statistic for container | | tcp |
-`container_network_tcp6_usage_total` | Gauge | tcp6 connection usage statistic for container | | tcp |
-`container_network_udp_usage_total` | Gauge | udp connection usage statistic for container | | udp |
+`container_network_transmit_packets_dropped_total` | Counter | Cumulative count of packets dropped while transmitting | | network |
+`container_network_transmit_packets_total` | Counter | Cumulative count of packets transmitted | | network |
 `container_network_udp6_usage_total` | Gauge | udp6 connection usage statistic for container | | udp |
-`container_perf_events_total` | Counter | Scaled counter of perf core event (event can be identified by `event` label and `cpu` indicates the core for which event was measured). See [perf event configuration](../runtime_options.md#perf-events). | | | libpfm
-`container_perf_metric_scaling_ratio` | Gauge | Scaling ratio for perf event counter (event can be identified by `event` label and `cpu` indicates the core for which event was measured). See [perf event configuration](../runtime_options.md#perf-events). | | | libpfm
+`container_network_udp_usage_total` | Gauge | udp connection usage statistic for container | | udp |
+`container_oom_events_total` | Counter | Count of out of memory events observed for the container | | oom_event |
+`container_perf_events_scaling_ratio` | Gauge | Scaling ratio for perf event counter (event can be identified by `event` label and `cpu` indicates the core for which event was measured). See [perf event configuration](../runtime_options.md#perf-events). | | perf_event | libpfm
+`container_perf_events_total` | Counter | Scaled counter of perf core event (event can be identified by `event` label and `cpu` indicates the core for which event was measured). See [perf event configuration](../runtime_options.md#perf-events). | | perf_event | libpfm
+`container_perf_uncore_events_scaling_ratio` | Gauge | Scaling ratio for perf uncore event counter (event can be identified by `event` label, `pmu` and `socket` lables indicate the PMU and the CPU socket for which event was measured). See [perf event configuration](../runtime_options.md#perf-events). Metric exists only for main cgroup (id="/"). | | perf_event | libpfm
+`container_perf_uncore_events_total` | Counter | Scaled counter of perf uncore event (event can be identified by `event` label, `pmu` and `socket` lables indicate the PMU and the CPU socket for which event was measured). See [perf event configuration](../runtime_options.md#perf-events)). Metric exists only for main cgroup (id="/").| | perf_event | libpfm
 `container_processes` | Gauge | Number of processes running inside the container | | process |
 `container_referenced_bytes` | Gauge | Container referenced bytes during last measurements cycle based on Referenced field in /proc/smaps file, with /proc/PIDs/clear_refs set to 1 after defined number of cycles configured through `referenced_reset_interval` cAdvisor parameter.</br>Warning: this is intrusive collection because can influence kernel page reclaim policy and add latency. Refer to https://github.com/brendangregg/wss#wsspl-referenced-page-flag for more details. | bytes | referenced_memory |
-`container_spec_cpu_period` | Gauge | CPU period of the container | | |
-`container_spec_cpu_quota` | Gauge | CPU quota of the container | | |
-`container_spec_cpu_shares` | Gauge | CPU share of the container | | |
-`container_spec_memory_limit_bytes` | Gauge | Memory limit for the container | bytes | |
-`container_spec_memory_swap_limit_bytes` | Gauge | Memory swap limit for the container | bytes | |
+`container_sockets` | Gauge | Number of open sockets for the container | | process |
+`container_spec_cpu_period` | Gauge | CPU period of the container | | - |
+`container_spec_cpu_quota` | Gauge | CPU quota of the container | | - |
+`container_spec_cpu_shares` | Gauge | CPU share of the container | | - |
+`container_spec_memory_limit_bytes` | Gauge | Memory limit for the container | bytes | - |
 `container_spec_memory_reservation_limit_bytes` | Gauge | Memory reservation limit for the container | bytes | |
+`container_spec_memory_swap_limit_bytes` | Gauge | Memory swap limit for the container | bytes | |
 `container_start_time_seconds` | Gauge | Start time of the container since unix epoch | seconds | |
-`container_tasks_state` | Gauge | Number of tasks in given state (`sleeping`, `running`, `stopped`, `uninterruptible`, or `ioawaiting`) | | |
-`container_perf_uncore_events_total` | Counter | Scaled counter of perf uncore event (event can be identified by `event` label, `pmu` and `socket` lables indicate the PMU and the CPU socket for which event was measured). See [perf event configuration](../runtime_options.md#perf-events)). Metric exists only for main cgroup (id="/").| | | libpfm
-`container_perf_uncore_events_scaling_ratio` | Gauge | Scaling ratio for perf uncore event counter (event can be identified by `event` label, `pmu` and `socket` lables indicate the PMU and the CPU socket for which event was measured). See [perf event configuration](../runtime_options.md#perf-events). Metric exists only for main cgroup (id="/"). | | | libpfm
+`container_tasks_state` | Gauge | Number of tasks in given state (`sleeping`, `running`, `stopped`, `uninterruptible`, or `ioawaiting`) | | cpuLoad |
+`container_threads` | Gauge | Number of threads running inside the container | | process |
+`container_threads_max` | Gauge | Maximum number of threads allowed inside the container | | process |
+`container_ulimits_soft` | Gauge | Soft ulimit values for the container root process. Unlimited if -1, except priority and nice | | process |

 ## Prometheus hardware metrics

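As a reading aid for the `cpu` rows in the table above: `container_cpu_cfs_periods_total` and `container_cpu_cfs_throttled_periods_total` are both counters, so a throttling ratio is usually derived from the change between two scrapes rather than from the raw values. The following is a minimal sketch under that assumption; the function name `throttled_ratio` and the sample numbers are hypothetical and are not part of cAdvisor or this commit.

```python
from typing import Optional


def throttled_ratio(
    periods_start: float,
    periods_end: float,
    throttled_start: float,
    throttled_end: float,
) -> Optional[float]:
    """Fraction of CFS enforcement periods in which the container was throttled.

    Inputs are two samples each of container_cpu_cfs_periods_total and
    container_cpu_cfs_throttled_periods_total, taken at the same two instants.
    Returns None when no periods elapsed or a counter appears to have reset.
    """
    elapsed = periods_end - periods_start
    throttled = throttled_end - throttled_start
    if elapsed <= 0 or throttled < 0:
        # No enforcement periods in the window, or a counter went backwards
        # (for example after a container restart): the ratio is undefined.
        return None
    return throttled / elapsed


if __name__ == "__main__":
    # Hypothetical samples: 100 periods elapsed, 25 of them throttled.
    print(throttled_ratio(100, 200, 10, 35))
```

Returning `None` instead of raising keeps the helper usable in a loop over many containers, where some windows will legitimately contain no enforcement periods.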