Commit a88ed23

Add metrics docs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
1 parent 932c378 commit a88ed23

3 files changed: +120 -2 lines changed

docs/reference/metrics.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
---
title: Metrics
---

import Label from '@site/src/components/Label';

K3s provides metrics for monitoring the health and performance of the cluster.

Most metrics are provided by individual components. See the following component-specific documentation for more information:
* [CoreDNS metrics](https://coredns.io/plugins/metrics/)
* [etcd metrics](https://etcd.io/docs/v3.5/metrics/)

Additional metrics may be provided by other components. Consult the upstream project documentation for any components not listed above.

## Supervisor Metrics

When K3s is started with `supervisor-metrics: true`, metrics are exposed by the K3s process and can be accessed via the `/metrics` endpoint on each node at port `6443`:

```sh
kubectl get --server https://NODENAME:6443 --raw /metrics
```
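
The endpoint returns metrics in the Prometheus text exposition format, one sample per line. As a minimal sketch, assuming you only want the `k3s_`-prefixed cluster management metrics, the output can be filtered by prefix. The helper name `filter_k3s_metrics` is illustrative and not part of K3s:

```sh
# Hypothetical helper: keep only samples and HELP/TYPE comments for k3s_* metrics.
# Reads Prometheus text-format metrics on stdin.
filter_k3s_metrics() {
  grep -E '^(# (HELP|TYPE) )?k3s_'
}

# Usage against a live node (NODENAME is a placeholder, as in the command above):
#   kubectl get --server https://NODENAME:6443 --raw /metrics | filter_k3s_metrics
```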

Metrics exposed by the K3s supervisor process include:
* K3s Cluster Management Metrics
* [Lasso controller metrics](https://github.com/rancher/lasso/blob/main/README.md#lasso-controller)
* [Kubernetes client and workqueue metrics](https://github.com/kubernetes/client-go/blob/master/README.md)
* [Kubernetes Node Metrics](https://kubernetes.io/docs/reference/instrumentation/node-metrics/)
* [Kubernetes Component Metrics](https://kubernetes.io/docs/reference/instrumentation/metrics/)
* [Go runtime metrics](https://pkg.go.dev/runtime/metrics#hdr-Supported_metrics)
* If the K3s embedded registry is enabled, [Spegel metrics](https://spegel.dev/docs/metrics/) and [libp2p metrics](https://github.com/libp2p/go-libp2p/blob/master/README.md)

K3s runs all Kubernetes components in the main K3s process.
Since Kubernetes uses a single Prometheus metric registry per process, metrics for all components are available via all exposed metrics endpoints.
If you scrape all the individual metrics endpoints, you may find that you are collecting duplicate metrics.
It is only necessary to scrape a single K3s metric endpoint in order to get metrics for all embedded Kubernetes components.
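
One way to check for overlap before adding a second scrape target is to compare the metric family names returned by two endpoints. The sketch below assumes the output of each endpoint has already been saved to a file; `metric_names` and `common_metric_names` are illustrative helper names, not K3s tooling:

```sh
# Print the sorted, unique metric family names declared in a saved scrape.
# Relies on the "# TYPE <name> <type>" comment lines of the Prometheus text format.
metric_names() {
  awk '$1 == "#" && $2 == "TYPE" { print $3 }' "$1" | sort -u
}

# Print the metric family names that appear in both saved scrapes.
common_metric_names() {
  a=$(mktemp) && b=$(mktemp) || return 1
  metric_names "$1" > "$a"
  metric_names "$2" > "$b"
  comm -12 "$a" "$b"   # comm needs sorted input, which metric_names provides
  rm -f "$a" "$b"
}
```

A long list of shared names suggests that both endpoints serve the same registry, so scraping only one of them is enough.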

## K3s Cluster Management Metrics

### k3s_certificate_expiration_seconds

Remaining lifetime in seconds of the certificate, labeled by certificate subject and usages.
- Type: Gauge
- Labels: <Label>subject</Label> <Label>usage</Label>
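
Because the value is a remaining lifetime in seconds, a simple threshold check can flag certificates that are close to expiry. The following is a minimal sketch, assuming the metrics text is piped in on stdin; the helper name `expiring_certs` is hypothetical:

```sh
# Hypothetical helper: print certificate samples that expire within N days.
# Reads Prometheus text-format metrics on stdin; the sample value is the last field.
expiring_certs() {
  awk -v days="$1" '
    /^k3s_certificate_expiration_seconds[{ ]/ && ($NF + 0) < days * 86400 { print }
  '
}

# Usage (NODENAME is a placeholder):
#   kubectl get --server https://NODENAME:6443 --raw /metrics | expiring_certs 30
```

Certificates that have already expired report a negative value and are printed as well.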

### k3s_loadbalancer_server_connections

Count of current connections to each loadbalancer server, labeled by loadbalancer name and server address.
- Type: Gauge
- Labels: <Label>name</Label> <Label>server</Label>

### k3s_loadbalancer_server_health

Current health state of loadbalancer backend servers, labeled by loadbalancer name and server address.
The state is an enum: 0=INVALID, 1=FAILED, 2=STANDBY, 3=UNCHECKED, 4=RECOVERING, 5=HEALTHY, 6=PREFERRED, 7=ACTIVE.
- Type: Gauge
- Labels: <Label>name</Label> <Label>server</Label>
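
When reading raw samples, it can help to translate the numeric gauge value back into its state name. This sketch encodes the enum listed above; the function name `lb_health_name` is illustrative:

```sh
# Translate a k3s_loadbalancer_server_health value into its state name,
# using the enum from the metric description.
lb_health_name() {
  case "$1" in
    0) echo INVALID ;;
    1) echo FAILED ;;
    2) echo STANDBY ;;
    3) echo UNCHECKED ;;
    4) echo RECOVERING ;;
    5) echo HEALTHY ;;
    6) echo PREFERRED ;;
    7) echo ACTIVE ;;
    *) echo "unknown($1)"; return 1 ;;
  esac
}
```

For example, `lb_health_name 5` prints `HEALTHY`; any value outside 0 to 7 prints `unknown(...)` and returns a non-zero status.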

### k3s_loadbalancer_dial_duration_seconds

Time in seconds taken to dial a connection to a backend server, labeled by loadbalancer name and success/failure status.
- Type: Histogram
- Labels: <Label>name</Label> <Label>status</Label>
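
Prometheus histograms are exposed as `_bucket`, `_sum`, and `_count` series, so the mean observed value is the sum divided by the count. The sketch below computes that mean across all label sets from metrics text on stdin; the helper name `histogram_mean` is hypothetical:

```sh
# Hypothetical helper: mean observed value of a histogram, across all label sets.
# $1 is the histogram base name; Prometheus text-format metrics are read on stdin.
histogram_mean() {
  awk -v m="$1" '
    index($0, m "_sum") == 1   { sum += $NF }     # line starts with <name>_sum
    index($0, m "_count") == 1 { count += $NF }   # line starts with <name>_count
    END {
      if (count > 0) printf "%.3f\n", sum / count
      else print "no observations"
    }
  '
}

# Usage (NODENAME is a placeholder):
#   kubectl get --server https://NODENAME:6443 --raw /metrics \
#     | histogram_mean k3s_loadbalancer_dial_duration_seconds
```

This is a lifetime average since the process started. For a rate over a time window, a Prometheus query over the same series is the more usual tool.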

### k3s_etcd_snapshot_save_duration_seconds

Total time in seconds taken to complete the etcd snapshot process, labeled by success/failure status.
- Type: Histogram
- Labels: <Label>status</Label>

### k3s_etcd_snapshot_save_local_duration_seconds

Total time in seconds taken to save a local snapshot file, labeled by success/failure status.
- Type: Histogram
- Labels: <Label>status</Label>

### k3s_etcd_snapshot_save_s3_duration_seconds

Total time in seconds taken to upload a snapshot file to S3, labeled by success/failure status.
- Type: Histogram
- Labels: <Label>status</Label>

### k3s_etcd_snapshot_reconcile_duration_seconds

Total time in seconds taken to sync the list of etcd snapshots, labeled by success/failure status.
- Type: Histogram
- Labels: <Label>status</Label>

### k3s_etcd_snapshot_reconcile_local_duration_seconds

Total time in seconds taken to list local snapshot files, labeled by success/failure status.
- Type: Histogram
- Labels: <Label>status</Label>

### k3s_etcd_snapshot_reconcile_s3_duration_seconds

Total time in seconds taken to list S3 snapshot files, labeled by success/failure status.
- Type: Histogram
- Labels: <Label>status</Label>
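
Since each of these histograms carries a `status` label, the `_count` series gives the number of completed operations per outcome. The sketch below prints one line per status value; the helper name `histogram_counts_by_status` is hypothetical, and the actual `status` label values are not listed in this document, so inspect the output of your own cluster rather than assuming particular strings:

```sh
# Hypothetical helper: print "<status> <count>" for each status label of a histogram.
# $1 is the histogram base name; Prometheus text-format metrics are read on stdin.
histogram_counts_by_status() {
  awk -v m="$1" '
    index($0, m "_count{") == 1 && match($0, /status="[^"]*"/) {
      # RSTART/RLENGTH delimit status="..."; skip the 8-character prefix
      # and drop the closing quote to leave only the label value.
      print substr($0, RSTART + 8, RLENGTH - 9), $NF
    }
  '
}

# Usage (NODENAME is a placeholder):
#   kubectl get --server https://NODENAME:6443 --raw /metrics \
#     | histogram_counts_by_status k3s_etcd_snapshot_save_duration_seconds
```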

src/components/Label.js

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
import React from 'react';

export default function Label({children}) {
  return (
    <span
      style={{
        backgroundColor: 'var(--label-color)',
        color: 'var(--ifm-color-content)',
        padding: '0.2rem 0.5rem',
        margin: '0 0.2rem',
        borderRadius: '3px',
        border: '1px solid #d0d7de',
        whiteSpace: 'nowrap',
      }}>
      {children}
    </span>
  );
}

src/css/custom.css

Lines changed: 4 additions & 2 deletions
@@ -47,6 +47,7 @@
 --ifm-color-primary-lightest: #086b9f;
 --ifm-color-secondary: #ffc61c;
 --ifm-color-secondary-light: #ffcd38;
+--label-color : #d5d5d5;
 --dark : #33313b;
 --light : #F3F3F3;
 }
@@ -62,8 +63,9 @@
 --ifm-color-secondary-dark: #054a6e;
 --ifm-color-secondary: #06527a;
 --ifm-color-secondary-light: #075a86;
+--label-color : #434343;
 --light : #33313b;
---dark : #F3F3F3;
+--dark : #F3F3F3;
 }
 
 [data-theme='dark'] .footer--dark {
@@ -121,4 +123,4 @@ hr {
 .navbar-sidebar__back {
 display: none;
 }
-}
+}
