Commit 7f6a8e0

refactor(cpu-usage): non-blocking behaviour (interval=None + manual deltas via SQLite DB) so we get both accuracy and faster runtime
1 parent 019a127 commit 7f6a8e0

4 files changed: +284 -82 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -64,6 +64,7 @@ Icinga Director:

 Monitoring Plugins:

+* cpu-usage: non-blocking behaviour (interval=None + manual deltas via SQLite DB) so we get both accuracy and faster runtime
 * gitlab-health: increase timeout from 3 to 8 secs
 * gitlab-liveness: increase timeout from 3 to 8 secs
 * gitlab-readiness: increase timeout from 3 to 8 secs
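
The changelog entry above describes replacing a blocking measurement with manual deltas against values kept in SQLite. The commit's actual implementation is not part of this excerpt, so the following is only a minimal sketch of the general idea, assuming Linux-style cumulative counters such as those returned by `psutil.cpu_times()`. The table name `cpu_sample`, the helper names, and the reduced field list are hypothetical, and the sketch ignores details such as guest time accounting.

```python
import sqlite3

# Hypothetical subset of cumulative CPU time counters (in seconds).
FIELDS = ("user", "nice", "system", "idle", "iowait")
COLS = ", ".join(f'"{f}"' for f in FIELDS)


def open_db(path=":memory:"):
    """Open (or create) the state DB that holds the previous sample."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cpu_sample ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), "
        + ", ".join(f'"{f}" REAL NOT NULL' for f in FIELDS)
        + ")"
    )
    return conn


def load_previous(conn):
    """Return the sample stored by the previous run, or None on first run."""
    row = conn.execute(f"SELECT {COLS} FROM cpu_sample WHERE id = 1").fetchone()
    return dict(zip(FIELDS, row)) if row else None


def save_current(conn, sample):
    """Overwrite the single stored sample with the current counters."""
    placeholders = ", ".join("?" for _ in FIELDS)
    conn.execute(
        f"INSERT OR REPLACE INTO cpu_sample (id, {COLS}) VALUES (1, {placeholders})",
        [sample[f] for f in FIELDS],
    )
    conn.commit()


def percentages(prev, curr):
    """Turn two cumulative samples into per-field percentages of elapsed CPU time."""
    deltas = {f: max(curr[f] - prev[f], 0.0) for f in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {f: 0.0 for f in FIELDS}
    return {f: round(100.0 * d / total, 1) for f, d in deltas.items()}


def check(conn, curr):
    """Compare against the stored sample without sleeping, then store curr."""
    prev = load_previous(conn)
    save_current(conn, curr)
    if prev is None:
        return None  # first run: nothing to compare against yet
    pct = percentages(prev, curr)
    # Overall usage as described in the README: 100 - idle - nice.
    pct["cpu-usage"] = round(100.0 - pct["idle"] - pct["nice"], 1)
    return pct
```

In a real check, `curr` could be built from `psutil.cpu_times()._asdict()`. Because the comparison interval is the time between two plugin runs, the check never has to sleep to take a second sample.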

check-plugins/cpu-usage/README.md

Lines changed: 22 additions & 9 deletions
@@ -2,13 +2,18 @@

 ## Overview

-Returns a bunch of numbers representing the current system-wide CPU utilization as a percentage. Outputs the cpu times having \> 0% in the first line, sorted by value. In addition, the top 5 processes which consumed the most CPU time are listed. Warns only if any of `user`, `system`, `iowait` or overall `cpu-usage` is above a certain threshold within the last n checks (default: 5).
+Reports CPU utilization percentages for all available time categories (user, system, idle, nice, iowait, irq, softirq, steal, guest, guest_nice) plus the overall cpu-usage (100 − idle − nice).
+
+Thresholds (WARN/CRIT) are checked against user, system, iowait, and cpu-usage. An alert is raised only if the threshold is exceeded for COUNT consecutive runs, suppressing short spikes and focusing on sustained load.
+
+Perfdata is emitted for every field to enable full graphing. Extended stats (context switches, interrupts, etc.) are included if supported on this platform. With `--top`, the most CPU-intensive processes are also listed for quick diagnosis.
+
+This check is cross-platform and works on Linux, Windows, and all psutil-supported systems.

 Hints and Recommendations:

 * We check system-wide CPU stats, not per-CPU.
 * `--count=5` (the default) while checking every minute means that the check reports a warning if any of `user`, `system`, `iowait` or overall `cpu-usage` was above a threshold in the last 5 minutes.
-* Check needs at least 250ms to run.


 ## Fact Sheet
## Fact Sheet
@@ -30,9 +35,16 @@ Hints and Recommendations:
 usage: cpu-usage [-h] [-V] [--always-ok] [--count COUNT] [-c CRIT] [--top TOP]
 [-w WARN]

-Mainly provides utilization percentages for each specific CPU time. Takes a
-time period into account: the cpu usage within a certain amount of time has to
-be equal or above given thresholds before a warning is raised.
+Reports CPU utilization percentages for all available time categories (user,
+system, idle, nice, iowait, irq, softirq, steal, guest, guest_nice) plus the
+overall cpu-usage (100 − idle − nice). Thresholds (WARN/CRIT) are checked
+against user, system, iowait, and cpu-usage. An alert is raised only if the
+threshold is exceeded for COUNT consecutive runs, suppressing short spikes and
+focusing on sustained load. Perfdata is emitted for every field to enable full
+graphing. Extended stats (context switches, interrupts, etc.) are included if
+supported on this platform. With `--top`, the most CPU-intensive processes are
+also listed for quick diagnosis. This check is cross-platform and works on
+Linux, Windows, and all psutil-supported systems.

 options:
 -h, --help show this help message and exit
@@ -42,8 +54,9 @@ options:
 thresholds before alerting. Default: 5
 -c, --critical CRIT Set the critical threshold CPU Usage Percentage.
 Default: 90
---top TOP List x "Top processes using the most cpu time".
-Default: 5
+--top TOP List x "Top processes using the most cpu time". Use
+`--top=0` to disable this feature. Default: 5 on Linux,
+0 on Windows
 -w, --warning WARN Set the warning threshold CPU Usage Percentage.
 Default: 80
```
@@ -52,7 +65,7 @@ options:
 ## Usage Examples

 ```bash
-./cpu-usage --count=15 --warning=50 --critical=70
+./cpu-usage --count=15 --warning=50 --critical=70 --top=3
 ```

 Output:
@@ -62,7 +75,7 @@ Output:
 guest: 0.0%, iowait: 0.0%, guest_nice: 0.0%, steal: 0.0%, nice: 0.0%
 interrupts: 582.9M, soft_interrupts: 343.6M, ctx_switches: 1.1G

-Top3 processes using the most cpu time:
+Top 3 processes using the most cpu time:
 1. Xorg: 2h 13m
 2. gnome-shell: 2h 1m
 3. firefox: 1h 24m