Commit d4e39a9

Merge pull request #277 from lunarwhite/doc-metrics
CM-574: Enhance operand metrics doc based on latest builds
2 parents 6cfc5d7 + 52b0de6 commit d4e39a9

docs/operand_metrics.md

Lines changed: 128 additions & 58 deletions
@@ -1,39 +1,17 @@
1-
## Enabling metrics and monitoring for `cert-manager`
1+
## Monitoring cert-manager Metrics with OpenShift Monitoring
22

3-
Cert-Manager exposes controller metrics in the format expected by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator).
3+
cert-manager exposes metrics in the format expected by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator) for all three of its core components: controller, cainjector, and webhook.
44

5-
ServiceMonitor resource needs to be created to scrape metrics from cert-manager operand, make sure Prometheus Operator is configured with required selectors.
5+
You can configure OpenShift Monitoring to collect metrics from cert-manager operands by enabling the built-in user workload monitoring stack. This allows you to monitor user-defined projects in addition to the default platform monitoring.
66

7-
`.spec.serviceMonitorNamespaceSelector` and `.spec.serviceMonitorSelector` fields of prometheus resource should contain corresponding `matchLabels: openshift.io/cluster-monitoring:true`. To verify it, we can run the following commands.
7+
### Enable User Workload Monitoring
88

9-
```sh
10-
kubectl -n monitoring get prometheus k8s --template='{{.spec.serviceMonitorNamespaceSelector}}{{"\n"}}{{.spec.serviceMonitorSelector}}{{"\n"}}'
11-
map[matchLabels:map[openshift.io/cluster-monitoring:true]]
12-
map[]
13-
```
14-
For OpenShift:
15-
```sh
16-
oc -n openshift-monitoring get prometheus k8s --template='{{.spec.serviceMonitorNamespaceSelector}}{{"\n"}}{{.spec.serviceMonitorSelector}}{{"\n"}}'
17-
map[matchLabels:map[openshift.io/cluster-monitoring:true]]
18-
map[]
19-
```
20-
Label the Operand's namespace to enable cluster monitoring in it's namespace.
21-
22-
`
23-
$ oc label namespace cert-manager openshift.io/cluster-monitoring=true
24-
`
25-
26-
Please follow the steps below to `enable the monitoring for user-defined projects` in Openshift:
9+
Cluster administrators can enable monitoring for user-defined projects by setting the `enableUserWorkload: true` field in the cluster monitoring ConfigMap object. For more details, see [Configuring user workload monitoring](https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/monitoring/configuring-user-workload-monitoring).
2710

28-
Cluster administrators can enable monitoring for user-defined projects by setting the `enableUserWorkload: true` field in the cluster monitoring ConfigMap object.
29-
30-
1. Edit the cluster-monitoring-config ConfigMap object:
31-
32-
`$ oc -n openshift-monitoring edit configmap cluster-monitoring-config`
33-
34-
2. Add `enableUserWorkload: true` under data/config.yaml:
11+
1. Create or edit the ConfigMap `cluster-monitoring-config` in namespace `openshift-monitoring`.
3512

3613
```
14+
$ oc apply -f - <<EOF
3715
apiVersion: v1
3816
kind: ConfigMap
3917
metadata:
@@ -42,59 +20,151 @@ metadata:
4220
data:
4321
  config.yaml: |
4422
    enableUserWorkload: true
23+
EOF
4524
```
4625

47-
3. Check that the prometheus-operator, prometheus-user-workload and thanos-ruler-user-workload pods are running in the openshift-user-workload-monitoring project. It might take a short while for the pods to start:
26+
2. Wait and check that the monitoring components for user workloads are up and running in the `openshift-user-workload-monitoring` namespace.
4827

49-
`$ oc -n openshift-user-workload-monitoring get pod`
5028
```
51-
Example output
52-
53-
NAME READY STATUS RESTARTS AGE
54-
prometheus-operator-6f7b748d5b-t7nbg 2/2 Running 0 3h
55-
prometheus-user-workload-0 4/4 Running 1 3h
56-
prometheus-user-workload-1 4/4 Running 1 3h
57-
thanos-ruler-user-workload-0 3/3 Running 0 3h
58-
thanos-ruler-user-workload-1 3/3 Running 0 3h
29+
$ oc -n openshift-user-workload-monitoring get pod
30+
NAME READY STATUS RESTARTS AGE
31+
prometheus-operator-6cb6bd9588-dtzxq 2/2 Running 0 50s
32+
prometheus-user-workload-0 6/6 Running 0 48s
33+
prometheus-user-workload-1 6/6 Running 0 48s
34+
thanos-ruler-user-workload-0 4/4 Running 0 42s
35+
thanos-ruler-user-workload-1 4/4 Running 0 42s
5936
```
60-
When set to true, the enableUserWorkload parameter enables monitoring for user-defined projects in a cluster.
6137

62-
For more details, Please look at the detailed documentation to [enable the monitoring for user-defined projects in Openshift](https://docs.openshift.com/container-platform/4.11/monitoring/enabling-monitoring-for-user-defined-projects.html):
38+
You should see the `prometheus-operator`, `prometheus-user-workload`, and `thanos-ruler-user-workload` pods in the `Running` status.
6339

64-
4. Apply the Service Monitor in your openshift cluster.
40+
### Configure Metric Scraping for cert-manager
41+
42+
cert-manager operands (controller, webhook, and cainjector) expose Prometheus metrics on port 9402 by default, at the `/metrics` path of each service. To collect these metrics, you need to tell Prometheus how to scrape the endpoints. This is typically done with a ServiceMonitor or PodMonitor custom resource. The following example uses a ServiceMonitor.
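As a rough illustration of what a scraper reads from a `/metrics` endpoint, the sketch below parses a few lines of the Prometheus text exposition format. It is a minimal sketch, not a full parser: it assumes samples carry no trailing timestamp, the function name `parse_metrics_text` is hypothetical, and the sample text (including the `example_requests_total` series) is invented for illustration rather than captured from a real cert-manager scrape.

```python
def parse_metrics_text(text: str) -> dict[str, float]:
    """Map each series (name plus optional label set) to its sample value.

    Assumption: every sample line has the form `<series> <value>` with no
    trailing timestamp. Comment lines (starting with '#') are skipped.
    """
    samples: dict[str, float] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field; everything before it
        # is the series name, possibly followed by {label="value"} pairs.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Illustrative input only; the second series is a hypothetical metric.
SAMPLE = """\
# Illustrative sample only, not captured from a real scrape.
certmanager_clock_time_seconds 1747897156
example_requests_total{code="200"} 3
"""

if __name__ == "__main__":
    for series, value in parse_metrics_text(SAMPLE).items():
        print(series, value)
```

In practice Prometheus does this parsing itself; the sketch only shows the shape of the data behind the endpoint.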
43+
44+
1. Check the cert-manager services in the `cert-manager` namespace.
6545

66-
`service-monitor.yaml`
6746
```
47+
$ oc -n cert-manager get service
48+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
49+
cert-manager ClusterIP 172.30.199.12 <none> 9402/TCP 54s
50+
cert-manager-cainjector ClusterIP 172.30.148.41 <none> 9402/TCP 63s
51+
cert-manager-webhook ClusterIP 172.30.100.46 <none> 443/TCP,9402/TCP 62s
52+
```
53+
54+
2. Apply a YAML manifest for the ServiceMonitor to look for services matching the specified labels within the `cert-manager` namespace and scrape metrics from their `/metrics` path on port 9402.
55+
56+
```
57+
$ oc apply -f - <<EOF
6858
apiVersion: monitoring.coreos.com/v1
6959
kind: ServiceMonitor
7060
metadata:
7161
  labels:
7262
    app: cert-manager
73-
    app.kubernetes.io/component: controller
7463
    app.kubernetes.io/instance: cert-manager
7564
    app.kubernetes.io/name: cert-manager
7665
  name: cert-manager
7766
  namespace: cert-manager
7867
spec:
7968
  endpoints:
80-
  - interval: 30s
81-
    port: tcp-prometheus-servicemonitor
82-
    scheme: http
69+
  - honorLabels: false
70+
    interval: 60s
71+
    path: /metrics
72+
    scrapeTimeout: 30s
73+
    targetPort: 9402
8374
  selector:
84-
    matchLabels:
85-
      app.kubernetes.io/component: controller
86-
      app.kubernetes.io/instance: cert-manager
87-
      app.kubernetes.io/name: cert-manager
75+
    matchExpressions:
76+
    - key: app.kubernetes.io/name
77+
      operator: In
78+
      values:
79+
      - cainjector
80+
      - cert-manager
81+
      - webhook
82+
    - key: app.kubernetes.io/instance
83+
      operator: In
84+
      values:
85+
      - cert-manager
86+
    - key: app.kubernetes.io/component
87+
      operator: In
88+
      values:
89+
      - cainjector
90+
      - controller
91+
      - webhook
92+
EOF
93+
```
94+
95+
Once the ServiceMonitor is in place and user workload monitoring is enabled, the Prometheus instance for user workloads will start collecting metrics from the cert-manager operands. The scraped metrics will be labeled with `job="cert-manager"`, `job="cert-manager-cainjector"`, or `job="cert-manager-webhook"` respectively.
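The three `job` labels map directly onto PromQL selectors. The sketch below builds one selector per operand and URL-encodes it as a `query` form parameter, which is the form the Prometheus HTTP API accepts for instant queries. The helper name `encoded_job_queries` is hypothetical, and the sketch only builds strings; it does not contact a cluster.

```python
from urllib.parse import urlencode

# Job labels as listed in the paragraph above.
JOBS = ("cert-manager", "cert-manager-cainjector", "cert-manager-webhook")


def encoded_job_queries(jobs=JOBS) -> dict[str, str]:
    """Return a form-encoded `query=<selector>` string for each job label."""
    encoded = {}
    for job in jobs:
        selector = f'{{job="{job}"}}'  # e.g. {job="cert-manager"}
        encoded[job] = urlencode({"query": selector})
    return encoded


if __name__ == "__main__":
    for job, body in encoded_job_queries().items():
        print(f"{job}: {body}")
```

Each resulting string can be sent as the request body or query string of an instant query against the monitoring API.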
96+
97+
You can select and view these Prometheus Targets via the OpenShift web console, by navigating to the "Observe" -> "Targets" page.
98+
99+
### Query Metrics
100+
101+
As a cluster administrator or as a user with view permissions for all projects, you can access these metrics using the command line or the OpenShift web console. For more details, see [Accessing metrics](https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/monitoring/accessing-metrics).
102+
103+
1. Retrieve a bearer token. You can use the following command to get a token for a specific service account.
104+
```
105+
$ TOKEN=$(oc create token prometheus-k8s -n openshift-monitoring)
88106
```
89-
`$ oc apply -f service-monitor.yaml -n cert-manager`
90107

91-
The 'Service Monitor' will be collecting the metrics through the cert-manager `service` and will be using the port name of the service as its endpoints port.
92-
Following [Template](https://github.com/cert-manager/cert-manager/blob/master/deploy/charts/cert-manager/templates/servicemonitor.yaml) can be used for the helm configurations.
108+
Alternatively, if you have cluster-admin access or view permissions for all projects, you might be able to use `$(oc whoami -t)` to get your own user token.
93109

94-
### Quering Metrics
110+
2. Get the OpenShift API route for Thanos Querier.
95111

96-
As a cluster administrator or as a user with view permissions for all projects, you can access metrics for all default OpenShift Container Platform and user-defined projects in the Metrics UI by using the endpoints of the `cert-manager service`.
112+
```
113+
$ URL=$(oc get route thanos-querier -n openshift-monitoring -o=jsonpath='{.status.ingress[0].host}')
114+
```
97115

98-
`$ oc describe service cert-manager -n cert-manager`
116+
3. Query the metrics using `curl`, authenticating with the bearer token. The query uses the `/api/v1/query` endpoint. The output is JSON; piping it through `jq` pretty-prints it.
117+
118+
```
119+
$ curl -s -k -H "Authorization: Bearer $TOKEN" https://$URL/api/v1/query --data-urlencode 'query={job="cert-manager"}' | jq
120+
```
121+
122+
Example output:
123+
124+
```
125+
{
126+
"status": "success",
127+
"data": {
128+
"resultType": "vector",
129+
"result": [
130+
{
131+
"metric": {
132+
"__name__": "certmanager_clock_time_seconds",
133+
"container": "cert-manager-controller",
134+
"endpoint": "9402",
135+
"instance": "10.131.0.65:9402",
136+
"job": "cert-manager",
137+
"namespace": "cert-manager",
138+
"pod": "cert-manager-b687bdddc-sv4xt",
139+
"prometheus": "openshift-user-workload-monitoring/user-workload",
140+
"service": "cert-manager"
141+
},
142+
"value": [
143+
1747897178.158,
144+
"1747897156"
145+
]
146+
},
147+
...
148+
{
149+
"metric": {
150+
"__name__": "up",
151+
"container": "cert-manager-controller",
152+
"endpoint": "9402",
153+
"instance": "10.131.0.65:9402",
154+
"job": "cert-manager",
155+
"namespace": "cert-manager",
156+
"pod": "cert-manager-b687bdddc-sv4xt",
157+
"prometheus": "openshift-user-workload-monitoring/user-workload",
158+
"service": "cert-manager"
159+
},
160+
"value": [
161+
1747897178.158,
162+
"1"
163+
]
164+
}
165+
]
166+
}
167+
}
168+
```
99169

100-
To query cert-manager controller metrics, select `ObserveMetrics` and filter the metrics of the cert-manager controller with `{instance="<Endpoints>"}` or `{endpoint="tcp-prometheus-servicemonitor"}`.
170+
In the OpenShift web console, you can also view these metrics by navigating to the "Observe" -> "Metrics" page and filtering the metrics of each operand with `{job="<JobLabel>"}`, `{instance="<Endpoints>"}`, or other advanced query expressions.
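For scripted checks, the JSON returned by `/api/v1/query` can be summarized without `jq`. The following is a minimal sketch, assuming the response has the instant-query shape shown in the example output (`status`, `data.result[].metric`). The function name `summarize_query_result` is hypothetical, and `SAMPLE` is a trimmed copy of that example output, not fresh data.

```python
import json
from collections import defaultdict


def summarize_query_result(payload: str) -> dict[str, list[str]]:
    """Group metric names by their `job` label from an instant-query response."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    names_by_job: dict[str, set[str]] = defaultdict(set)
    for sample in body["data"]["result"]:
        labels = sample["metric"]
        names_by_job[labels.get("job", "<none>")].add(labels.get("__name__", "<unnamed>"))
    return {job: sorted(names) for job, names in names_by_job.items()}


# Trimmed from the example output: two samples, most labels omitted.
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "certmanager_clock_time_seconds", "job": "cert-manager"},
             "value": [1747897178.158, "1747897156"]},
            {"metric": {"__name__": "up", "job": "cert-manager"},
             "value": [1747897178.158, "1"]},
        ],
    },
})

if __name__ == "__main__":
    print(summarize_query_result(SAMPLE))
```

To use it with the `curl` command, save the response to a file and pass the file contents to the function.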
