pipeline/inputs/gpu-metrics.md (24 additions & 24 deletions)
@@ -1,35 +1,35 @@
-# GPU Metrics
+# GPU metrics

-The **gpu_metrics** input plugin collects GPU performance metrics from graphics cards on Linux systems. It provides real-time monitoring of GPU utilization, memory usage (VRAM), clock frequencies, power consumption, temperature, and fan speeds.
+The **gpu_metrics** input plugin collects graphics processing unit (GPU) performance metrics from graphics cards on Linux systems. It provides real-time monitoring of GPU utilization, memory usage (VRAM), clock frequencies, power consumption, temperature, and fan speeds.

-The plugin reads metrics directly from the Linux sysfs filesystem (`/sys/class/drm/`) without requiring external tools or libraries. Currently, **only AMD GPUs are supported** through the amdgpu kernel driver. NVIDIA and Intel GPUs are not supported at this time.
+The plugin reads metrics directly from the Linux sysfs filesystem (`/sys/class/drm/`) without requiring external tools or libraries. Currently, **only AMD GPUs are supported** through the amdgpu kernel driver. NVIDIA and Intel GPUs aren't supported at this time.

-## Metrics Collected
+## Metrics collected

The plugin collects the following metrics for each detected GPU:
|`gpu_utilization_percent`| GPU core utilization as a percentage (0-100). Indicates how busy the GPU is processing workloads. |
+|`gpu_memory_used_bytes`| Amount of video RAM (VRAM) currently in use, measured in bytes.|
+|`gpu_memory_total_bytes`| Total video RAM (VRAM) capacity available on the GPU, measured in bytes.|
+|`gpu_clock_mhz`| Current GPU clock frequency in MHz. This metric has multiple instances with different type labels (see [Clock metrics](#clock-metrics)). |
+|`gpu_power_watts`| Current power consumption in watts. Can be disabled with enable_power false. |
+|`gpu_temperature_celsius`| GPU die temperature in degrees Celsius. Can be disabled with enable_temperature false. |
+|`gpu_fan_speed_rpm`| Fan rotation speed in revolutions per minute (RPM). |
+|`gpu_fan_pwm_percent`| Fan PWM duty cycle as a percentage (0-100). Indicates fan intensity. |

-### Clock Metrics
+### Clock metrics

-The gpu_clock_mhz metric is reported separately for three clock domains:
+The `gpu_clock_mhz` metric is reported separately for three clock domains:
The plugin supports the following configuration parameters:
@@ -39,10 +39,10 @@ The plugin supports the following configuration parameters:
|`path_sysfs`| Path to the sysfs root directory. Typically used for testing or non-standard systems. |`/sys`|
|`cards_include`| Pattern specifying which GPU cards to monitor. Supports wildcards (*), ranges (0-3), and comma-separated lists (0,2,4). |`*`|
|`cards_exclude`| Pattern specifying which GPU cards to exclude from monitoring. Uses the same syntax as cards_include. |_none_|
-|`enable_power`| Enable collection of power consumption metrics (gpu_power_watts). |`true`|
-|`enable_temperature`| Enable collection of temperature metrics (gpu_temperature_celsius). |`true`|
+|`enable_power`| Enable collection of power consumption metrics (`gpu_power_watts`).|`true`|
+|`enable_temperature`| Enable collection of temperature metrics (`gpu_temperature_celsius`).|`true`|
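
Reviewer note: the `cards_include` / `cards_exclude` syntax in the table above (wildcard, ranges, comma-separated lists) may be easier to follow with a concrete model. The following Python sketch shows one plausible way to interpret such patterns. It is not the plugin's actual parser: the function names `parse_card_pattern` and `card_selected` are hypothetical, and the rule that an exclusion takes precedence over an inclusion is an assumption, not something this diff states.

```python
from typing import Optional, Set


def parse_card_pattern(pattern: str) -> Optional[Set[int]]:
    """Return the card indices named by `pattern`, or None when it means "all cards".

    Hypothetical helper: models the documented syntax ("*", "0-3", "0,2,4").
    """
    pattern = pattern.strip()
    if pattern == "*":
        return None  # wildcard: every card matches
    cards: Set[int] = set()
    for part in pattern.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty items such as a trailing comma
        if "-" in part:
            low, high = part.split("-", 1)
            cards.update(range(int(low), int(high) + 1))  # ranges are inclusive
        else:
            cards.add(int(part))
    return cards


def card_selected(card: int, include: str = "*", exclude: str = "") -> bool:
    """Decide whether a card index passes the include/exclude filters.

    Assumption: an excluded card is skipped even if it is also included.
    """
    if exclude.strip():
        excluded = parse_card_pattern(exclude)
        if excluded is None or card in excluded:
            return False
    included = parse_card_pattern(include)
    return included is None or card in included
```

Under these assumptions, `card_selected(1, "0-3", "1")` is false while `card_selected(2, "0-3", "1")` is true.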

-## GPU Detection
+## GPU detection

The GPU metrics plugin will automatically scan for supported **AMD GPUs** that are using the `amdgpu` kernel driver. GPUs using legacy drivers will be ignored.

In systems with multiple GPUs, the GPU metrics plugin will detect all AMD cards by default. You can control which GPUs you want to monitor with the `cards_include` and `cards_exclude` parameters.
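
Reviewer note: to illustrate the detection described above, here is a minimal Python sketch of scanning sysfs for AMD cards bound to the `amdgpu` driver. It is an approximation written for this review, not the plugin's code. The function name `find_amd_cards` is hypothetical, the `sysfs_root` argument mirrors the `path_sysfs` parameter, and the sketch assumes the usual sysfs layout in which `card*/device/vendor` holds the PCI vendor ID and `card*/device/driver` is a symlink named after the bound driver.

```python
from pathlib import Path
from typing import List

AMD_VENDOR_ID = "0x1002"  # PCI vendor ID assigned to AMD


def find_amd_cards(sysfs_root: str = "/sys") -> List[int]:
    """Return indices of DRM cards that report AMD's vendor ID and use amdgpu."""
    cards: List[int] = []
    drm_dir = Path(sysfs_root) / "class" / "drm"
    for vendor_file in drm_dir.glob("card*/device/vendor"):
        device_dir = vendor_file.parent
        index = device_dir.parent.name[len("card"):]
        if not index.isdigit():
            continue  # skip non-card entries such as connector directories
        try:
            vendor = vendor_file.read_text().strip().lower()
        except OSError:
            continue  # unreadable entry: ignore it rather than fail the scan
        # The driver entry is a symlink; its target's name is the driver name.
        driver = (device_dir / "driver").resolve().name
        if vendor == AMD_VENDOR_ID and driver == "amdgpu":
            cards.append(int(index))
    return sorted(cards)
```

Passing a temporary directory as `sysfs_root` makes the scan easy to exercise on machines without an AMD GPU, which is the same use case the `path_sysfs` parameter is documented for.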
@@ -76,11 +76,11 @@ Example output:
/sys/class/drm/card1/device/vendor
```

-## Getting Started
+## Getting started

To get GPU metrics from your system, you can run the plugin from either the command line or through the configuration file: