Commit e6fc2f6

Merge pull request #41475 from lmandavi/CNV-8788-expose-vm-metrics
CNV-8788: Exposing custom metrics for virtual machines
2 parents 9b66dd8 + 166a4c5 commit e6fc2f6

8 files changed, +376 -0 lines changed

_topic_maps/_topic_map.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3220,6 +3220,8 @@ Topics:
32203220
File: virt-openshift-cluster-monitoring
32213221
- Name: Prometheus queries for virtual resources
32223222
File: virt-prometheus-queries
3223+
- Name: Exposing custom metrics for virtual machines
3224+
File: virt-exposing-custom-metrics-for-vms
32233225
- Name: Collecting OpenShift Virtualization data for Red Hat Support
32243226
File: virt-collecting-virt-data
32253227
Distros: openshift-enterprise
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * virt/logging_events-monitoring/virt-exposing-custom-metrics-for-vms.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="virt-accessing-node-exporter-outside-cluster_{context}"]
7+
= Accessing the node exporter service outside the cluster
8+
9+
You can access the node-exporter service outside the cluster and view the exposed metrics.
10+
11+
.Prerequisites
12+
* You have access to the cluster as a user with `cluster-admin` privileges or the `monitoring-edit` role.
13+
* You have enabled monitoring for the user-defined project by configuring the node-exporter service.
14+
15+
.Procedure
16+
17+
. Expose the node-exporter service.
18+
+
19+
[source,terminal]
20+
----
21+
$ oc expose service -n <namespace> <node_exporter_service_name>
22+
----
23+
. Obtain the FQDN (Fully Qualified Domain Name) for the route.
24+
+
25+
[source,terminal]
26+
----
27+
$ oc get route -o=custom-columns=NAME:.metadata.name,DNS:.spec.host
28+
----
29+
+
30+
.Example output
31+
[source,terminal]
32+
----
33+
NAME DNS
34+
node-exporter-service node-exporter-service-dynamation.apps.cluster.example.org
35+
----
36+
. Use the `curl` command to display metrics for the node-exporter service.
37+
+
38+
[source,terminal]
39+
----
40+
$ curl -s http://node-exporter-service-dynamation.apps.cluster.example.org/metrics
41+
----
42+
+
43+
.Example output
44+
[source,terminal]
45+
----
46+
go_gc_duration_seconds{quantile="0"} 1.5382e-05
47+
go_gc_duration_seconds{quantile="0.25"} 3.1163e-05
48+
go_gc_duration_seconds{quantile="0.5"} 3.8546e-05
49+
go_gc_duration_seconds{quantile="0.75"} 4.9139e-05
50+
go_gc_duration_seconds{quantile="1"} 0.000189423
51+
----
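
The route lookup and the `curl` request in this procedure can be combined in a small shell helper. The following is a minimal sketch rather than part of the documented procedure: the function name `route_metrics_url` is hypothetical, and it assumes that the route was created by `oc expose service` and serves plain HTTP.

```shell
# Hypothetical helper (not part of oc): print the /metrics URL for a route.
# Assumes the route serves plain HTTP, as in the procedure above.
route_metrics_url() {
  local namespace="$1" route="$2" host
  # Read the FQDN from the .spec.host field of the route.
  host="$(oc get route "$route" -n "$namespace" -o jsonpath='{.spec.host}')" || return 1
  # Fail if the route has no host assigned.
  [ -n "$host" ] || return 1
  printf 'http://%s/metrics\n' "$host"
}

# Example usage, with the names from this procedure:
#   curl -s "$(route_metrics_url dynamation node-exporter-service)"
```

Returning a non-zero status when the lookup fails keeps `curl` from being called with an incomplete URL.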
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * virt/logging_events-monitoring/virt-exposing-custom-metrics-for-vms.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="virt-configuring-node-exporter-service_{context}"]
7+
= Configuring the node exporter service
8+
9+
The node-exporter agent is deployed on every virtual machine in the cluster from which you want to collect metrics. Configure the node-exporter agent as a service to expose internal metrics and processes that are associated with virtual machines.
10+
11+
.Prerequisites
12+
13+
* Install the {product-title} CLI `oc`.
14+
* Log in to the cluster as a user with `cluster-admin` privileges.
15+
* Create the `cluster-monitoring-config` `ConfigMap` object in the `openshift-monitoring` project, and enable monitoring for user-defined projects by setting `enableUserWorkload` to `true` in that object.
15+
* Optional: Configure the `user-workload-monitoring-config` `ConfigMap` object in the `openshift-user-workload-monitoring` project.
17+
18+
.Procedure
19+
20+
. Create the `Service` YAML file. In the following example, the file is called `node-exporter-service.yaml`.
21+
+
22+
[source,yaml]
23+
----
24+
kind: Service
25+
apiVersion: v1
26+
metadata:
27+
name: node-exporter-service <1>
28+
namespace: dynamation <2>
29+
labels:
30+
servicetype: metrics <3>
31+
spec:
32+
ports:
33+
- name: exmet <4>
34+
protocol: TCP
35+
port: 9100 <5>
36+
targetPort: 9100 <6>
37+
type: ClusterIP
38+
selector:
39+
monitor: metrics <7>
40+
----
41+
<1> The node-exporter service that exposes the metrics from the virtual machines.
42+
<2> The namespace where the service is created.
43+
<3> The label for the service. The `ServiceMonitor` uses this label to match this service.
44+
<4> The name given to the port that exposes metrics on port 9100 for the `ClusterIP` service.
45+
<5> The port on which `node-exporter-service` listens for requests.
46+
<6> The TCP port on the virtual machine, selected by the `monitor` label, to which the service forwards requests.
47+
<7> The label used to match the virtual machine's pods. In this example, any virtual machine's pod with the label `monitor` and a value of `metrics` will be matched.
48+
49+
. Create the node-exporter service:
50+
+
51+
[source,terminal]
52+
----
53+
$ oc create -f node-exporter-service.yaml
54+
----
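
After you create the service, you can check whether its selector matches any virtual machine pods. The following is a minimal sketch under stated assumptions: the function name `service_endpoint_ips` is hypothetical, and it assumes that the cluster still populates `Endpoints` objects for services. Empty output suggests that no running pod carries the `monitor: metrics` label yet.

```shell
# Hypothetical helper (not part of oc): list the pod IP addresses behind a service.
# Empty output means that the service selector currently matches no ready pods.
service_endpoint_ips() {
  local namespace="$1" service="$2"
  oc get endpoints "$service" -n "$namespace" \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Example usage, with the names from this procedure:
#   service_endpoint_ips dynamation node-exporter-service
```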
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * virt/logging_events-monitoring/virt-exposing-custom-metrics-for-vms.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="virt-configuring-vm-with-node-exporter-service_{context}"]
7+
= Configuring a virtual machine with the node exporter service
8+
9+
Download the `node-exporter` file onto the virtual machine. Then, create a `systemd` service that runs the node-exporter service when the virtual machine boots.
10+
11+
.Prerequisites
12+
* The pods for the user workload monitoring components are running in the `openshift-user-workload-monitoring` project.
13+
* Grant the `monitoring-edit` role to users who need to monitor this user-defined project.
14+
15+
.Procedure
16+
17+
. Log on to the virtual machine.
18+
19+
. Download the `node-exporter` file onto the virtual machine by using the download URL that applies to the version of `node-exporter` that you want to install.
20+
+
21+
[source,terminal]
22+
----
23+
$ wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
24+
----
25+
26+
. Extract the executable and place it in the `/usr/bin` directory.
27+
+
28+
[source,terminal]
29+
----
30+
$ sudo tar xvf node_exporter-1.3.1.linux-amd64.tar.gz \
31+
--directory /usr/bin --strip 1 "*/node_exporter"
32+
----
33+
34+
. Create a `node_exporter.service` file in the `/etc/systemd/system` directory. This `systemd` service file runs the node-exporter service when the virtual machine boots.
35+
+
36+
[source,terminal]
37+
----
38+
[Unit]
39+
Description=Prometheus Metrics Exporter
40+
After=network.target
41+
StartLimitIntervalSec=0
42+
43+
[Service]
44+
Type=simple
45+
Restart=always
46+
RestartSec=1
47+
User=root
48+
ExecStart=/usr/bin/node_exporter
49+
50+
[Install]
51+
WantedBy=multi-user.target
52+
----
53+
54+
. Enable and start the `systemd` service.
55+
+
56+
[source,terminal]
57+
----
58+
$ sudo systemctl enable node_exporter.service
59+
$ sudo systemctl start node_exporter.service
60+
----
61+
62+
.Verification
63+
* Verify that the node-exporter agent is reporting metrics from the virtual machine.
64+
+
65+
[source,terminal]
66+
----
67+
$ curl http://localhost:9100/metrics
68+
----
69+
+
70+
.Example output
71+
[source,terminal]
72+
----
73+
go_gc_duration_seconds{quantile="0"} 1.5244e-05
74+
go_gc_duration_seconds{quantile="0.25"} 3.0449e-05
75+
go_gc_duration_seconds{quantile="0.5"} 3.7913e-05
76+
----
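
The verification can also be scripted. The following is a minimal sketch, assuming that the exporter listens on the default port 9100 as shown above; the function name `count_node_metrics` is hypothetical. It counts the `node_*` samples in the exporter output, so a result of `0` suggests that the exporter is running but reporting no host metrics.

```shell
# Hypothetical helper: count node_* samples in Prometheus text format read from stdin.
count_node_metrics() {
  # grep -c prints the number of matching lines; it exits non-zero when the count is 0.
  grep -c '^node_'
}

# Example usage on the virtual machine:
#   systemctl is-active node_exporter.service
#   curl -s http://localhost:9100/metrics | count_node_metrics
```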
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * virt/logging_events-monitoring/virt-exposing-custom-metrics-for-vms.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="virt-creating-custom-monitoring-label-for-vms_{context}"]
7+
= Creating a custom monitoring label for virtual machines
8+
9+
To enable queries to multiple virtual machines from a single service, add a custom label in the virtual machine's YAML file.
10+
11+
.Prerequisites
12+
13+
* Install the {product-title} CLI `oc`.
14+
* Log in as a user with `cluster-admin` privileges.
15+
* You have access to the web console to stop and restart a virtual machine.
16+
17+
.Procedure
18+
. Edit the `template` spec of your virtual machine configuration file. In this example, the label `monitor` has the value `metrics`.
19+
+
20+
[source,yaml]
21+
----
22+
spec:
23+
template:
24+
metadata:
25+
labels:
26+
monitor: metrics
27+
----
28+
29+
. Stop and restart the virtual machine so that its new pod is created with the `monitor` label.
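
After the restart, you can confirm that the new pod carries the label. The following is a minimal sketch, assuming the `monitor: metrics` label from this example; the function name `labeled_vm_pods` is hypothetical.

```shell
# Hypothetical helper (not part of oc): list pods that carry the monitor=metrics label.
labeled_vm_pods() {
  local namespace="$1"
  oc get pods -n "$namespace" -l monitor=metrics -o name
}

# Example usage:
#   labeled_vm_pods <namespace>
```

If the command prints nothing, the virtual machine was probably not restarted after the label was added.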
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * virt/logging_events-monitoring/virt-exposing-custom-metrics-for-vms.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="virt-creating-servicemonitor-resource-for-node-exporter_{context}"]
7+
= Creating a ServiceMonitor resource for the node exporter service
8+
9+
You can use a Prometheus client library and scrape metrics from the `/metrics` endpoint to access and view the metrics exposed by the node-exporter service. Use a `ServiceMonitor` custom resource (CR) to monitor the node exporter service.
10+
11+
.Prerequisites
12+
13+
* You have access to the cluster as a user with `cluster-admin` privileges or the `monitoring-edit` role.
14+
* You have enabled monitoring for the user-defined project by configuring the node-exporter service.
15+
16+
.Procedure
17+
. Create a YAML file for the `ServiceMonitor` resource configuration. In this example, the file is named `node-exporter-metrics-monitor.yaml`, and the service monitor matches any service with the label `servicetype: metrics` and queries the `exmet` port every 30 seconds.
18+
19+
+
20+
[source,yaml]
21+
----
22+
apiVersion: monitoring.coreos.com/v1
23+
kind: ServiceMonitor
24+
metadata:
25+
labels:
26+
k8s-app: node-exporter-metrics-monitor
27+
name: node-exporter-metrics-monitor <1>
28+
namespace: dynamation <2>
29+
spec:
30+
endpoints:
31+
- interval: 30s <3>
32+
port: exmet <4>
33+
scheme: http
34+
selector:
35+
matchLabels:
36+
servicetype: metrics
37+
38+
----
39+
<1> The name of the `ServiceMonitor`.
40+
<2> The namespace where the `ServiceMonitor` is created.
41+
<3> The interval at which the port will be queried.
42+
<4> The name of the port that is queried every 30 seconds.
43+
44+
. Create the `ServiceMonitor` configuration for the node-exporter service.
45+
+
46+
[source,terminal]
47+
----
48+
$ oc create -f node-exporter-metrics-monitor.yaml
49+
----
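
To see which services the `ServiceMonitor` selects, you can list the services that carry the matching label. The following is a minimal sketch, assuming the `servicetype: metrics` label from this example and a `ServiceMonitor` without a `namespaceSelector`, so that only services in its own namespace are matched. The function name `services_for_monitor` is hypothetical.

```shell
# Hypothetical helper (not part of oc): list services with the servicetype=metrics label.
services_for_monitor() {
  local namespace="$1"
  oc get service -n "$namespace" -l servicetype=metrics -o name
}

# Example usage, with the namespace from this procedure:
#   services_for_monitor dynamation
```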
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * virt/logging_events-monitoring/virt-exposing-custom-metrics-for-vms.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="virt-querying-the-node-exporter-service-for-metrics-_{context}"]
7+
= Querying the node-exporter service for metrics
8+
9+
Metrics are exposed for virtual machines through an HTTP service endpoint under the `/metrics` canonical name. When you query for metrics, Prometheus directly scrapes the metrics from the metrics endpoint exposed by the virtual machines and presents these metrics for viewing.
10+
11+
.Prerequisites
12+
* You have access to the cluster as a user with `cluster-admin` privileges or the `monitoring-edit` role.
13+
* You have enabled monitoring for the user-defined project by configuring the node-exporter service.
14+
15+
.Procedure
16+
. Obtain the HTTP service endpoint by specifying the namespace for the service:
17+
+
18+
[source,terminal]
19+
----
20+
$ oc get service -n <namespace> <node-exporter-service>
21+
----
22+
23+
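. To list all available metrics for the node-exporter service, query the `/metrics` endpoint at the cluster IP address and port of the service. In this example, the cluster IP address is `172.30.226.162`.
24+
+
25+
[source,terminal]
26+
----
27+
$ curl http://172.30.226.162:9100/metrics | grep -vE "^#|^$"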
28+
----
29+
+
30+
.Example output
31+
[source,terminal]
32+
----
33+
node_arp_entries{device="eth0"} 1
34+
node_boot_time_seconds 1.643153218e+09
35+
node_context_switches_total 4.4938158e+07
36+
node_cooling_device_cur_state{name="0",type="Processor"} 0
37+
node_cooling_device_max_state{name="0",type="Processor"} 0
38+
node_cpu_guest_seconds_total{cpu="0",mode="nice"} 0
39+
node_cpu_guest_seconds_total{cpu="0",mode="user"} 0
40+
node_cpu_seconds_total{cpu="0",mode="idle"} 1.10586485e+06
41+
node_cpu_seconds_total{cpu="0",mode="iowait"} 37.61
42+
node_cpu_seconds_total{cpu="0",mode="irq"} 233.91
43+
node_cpu_seconds_total{cpu="0",mode="nice"} 551.47
44+
node_cpu_seconds_total{cpu="0",mode="softirq"} 87.3
45+
node_cpu_seconds_total{cpu="0",mode="steal"} 86.12
46+
node_cpu_seconds_total{cpu="0",mode="system"} 464.15
47+
node_cpu_seconds_total{cpu="0",mode="user"} 1075.2
48+
node_disk_discard_time_seconds_total{device="vda"} 0
49+
node_disk_discard_time_seconds_total{device="vdb"} 0
50+
node_disk_discarded_sectors_total{device="vda"} 0
51+
node_disk_discarded_sectors_total{device="vdb"} 0
52+
node_disk_discards_completed_total{device="vda"} 0
53+
node_disk_discards_completed_total{device="vdb"} 0
54+
node_disk_discards_merged_total{device="vda"} 0
55+
node_disk_discards_merged_total{device="vdb"} 0
56+
node_disk_info{device="vda",major="252",minor="0"} 1
57+
node_disk_info{device="vdb",major="252",minor="16"} 1
58+
node_disk_io_now{device="vda"} 0
59+
node_disk_io_now{device="vdb"} 0
60+
node_disk_io_time_seconds_total{device="vda"} 174
61+
node_disk_io_time_seconds_total{device="vdb"} 0.054
62+
node_disk_io_time_weighted_seconds_total{device="vda"} 259.79200000000003
63+
node_disk_io_time_weighted_seconds_total{device="vdb"} 0.039
64+
node_disk_read_bytes_total{device="vda"} 3.71867136e+08
65+
node_disk_read_bytes_total{device="vdb"} 366592
66+
node_disk_read_time_seconds_total{device="vda"} 19.128
67+
node_disk_read_time_seconds_total{device="vdb"} 0.039
68+
node_disk_reads_completed_total{device="vda"} 5619
69+
node_disk_reads_completed_total{device="vdb"} 96
70+
node_disk_reads_merged_total{device="vda"} 5
71+
node_disk_reads_merged_total{device="vdb"} 0
72+
node_disk_write_time_seconds_total{device="vda"} 240.66400000000002
73+
node_disk_write_time_seconds_total{device="vdb"} 0
74+
node_disk_writes_completed_total{device="vda"} 71584
75+
node_disk_writes_completed_total{device="vdb"} 0
76+
node_disk_writes_merged_total{device="vda"} 19761
77+
node_disk_writes_merged_total{device="vdb"} 0
78+
node_disk_written_bytes_total{device="vda"} 2.007924224e+09
79+
node_disk_written_bytes_total{device="vdb"} 0
80+
----
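
Because each metric appears once per label set, for example once per disk device, it can help to aggregate the raw output. The following is a minimal sketch, assuming Prometheus text format without timestamps, so that the last field of each line is the sample value. The function name `sum_metric` is hypothetical.

```shell
# Hypothetical helper: sum all samples of one metric across its label sets.
# Reads Prometheus text format from stdin and assumes that lines carry no timestamps.
sum_metric() {
  awk -v name="$1" '
    /^#/ || NF < 2 { next }            # skip comments and blank lines
    {
      metric = $1
      sub(/[{].*/, "", metric)         # strip the {label="value"} part
      if (metric == name) sum += $NF   # the last field is the sample value
    }
    END { print sum + 0 }
  '
}

# Example usage, with the address from this procedure:
#   curl -s http://172.30.226.162:9100/metrics | sum_metric node_disk_reads_completed_total
```

With the example output above, the two `node_disk_reads_completed_total` samples (5619 and 96) sum to 5715.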
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
:_content-type: ASSEMBLY
2+
[id="virt-exposing-custom-metrics-for-vms"]
3+
= Exposing custom metrics for virtual machines
4+
include::modules/virt-document-attributes.adoc[]
5+
:context: virt-exposing-custom-metrics-for-vms
6+
7+
toc::[]
8+
9+
{product-title} includes a pre-configured, pre-installed, and self-updating monitoring stack that provides monitoring for core platform components. This monitoring stack is based on the Prometheus monitoring system. Prometheus is a time-series database and a rule evaluation engine for metrics.
10+
11+
In addition to using the {product-title} monitoring stack, you can enable monitoring for user-defined projects by using the CLI and query custom metrics that are exposed for virtual machines through the `node-exporter` service.
12+
13+
include::modules/virt-configuring-node-exporter-service.adoc[leveloffset=+1]
14+
include::modules/virt-configuring-vm-with-node-exporter-service.adoc[leveloffset=+1]
15+
include::modules/virt-creating-custom-monitoring-label-for-vms.adoc[leveloffset=+1]
16+
include::modules/virt-querying-the-node-exporter-service-for-metrics.adoc[leveloffset=+2]
17+
include::modules/virt-creating-servicemonitor-resource-for-node-exporter.adoc[leveloffset=+1]
18+
include::modules/virt-accessing-node-exporter-outside-cluster.adoc[leveloffset=+2]
19+
20+
[role="_additional-resources"]
21+
[id="additional-resources_virt-exposing-custom-metrics-for-vms"]
22+
== Additional resources
23+
* xref:../../monitoring/configuring-the-monitoring-stack.adoc#configuring-the-monitoring-stack[Configuring the monitoring stack]
24+
25+
* xref:../../monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[Enabling monitoring for user-defined projects]
26+
27+
* xref:../../monitoring/managing-metrics.adoc#managing-metrics[Managing metrics]
28+
29+
* xref:../../monitoring/reviewing-monitoring-dashboards.adoc#reviewing-monitoring-dashboards[Reviewing monitoring dashboards]
30+
31+
* xref:../../applications/application-health.adoc#application-health[Monitoring application health by using health checks]
32+
33+
* xref:../../nodes/pods/nodes-pods-configmaps.adoc#nodes-pods-configmaps[Creating and using config maps]
34+
35+
* xref:../../virt/virtual_machines/virt-controlling-vm-states.adoc#virt-controlling-vm-states[Controlling virtual machine states]
