Merge pull request #69270 from bburt-rh/OBSDOCS-573-add-code-sample-to-set-up-limits-and-requests-for-mon-components

bburt-rh · web-flow · commit 33611e1bf439 · 2023-12-15T11:24:58.000-05:00
OBSDOCS#573: add docs for configuring resource limits and requests for monitoring components
diff --git a/modules/monitoring-about-specifying-limits-and-requests-for-monitoring-components.adoc b/modules/monitoring-about-specifying-limits-and-requests-for-monitoring-components.adoc
@@ -0,0 +1,27 @@
+// Module included in the following assemblies:
+//
+// * monitoring/configuring-the-monitoring-stack.adoc
+
+:_mod-docs-content-type: CONCEPT
+[id="about-specifying-limits-and-requests-for-monitoring-components_{context}"]
+= About specifying limits and requests for monitoring components
+
+You can configure resource limits and requests for core platform monitoring components and for the components that monitor user-defined projects, including the following components:
+
+* Alertmanager (for core platform monitoring and for user-defined projects)
+* kube-state-metrics
+* monitoring-plugin
+* node-exporter
+* openshift-state-metrics
+* Prometheus (for core platform monitoring and for user-defined projects)
+* Prometheus Adapter
+* Prometheus Operator and its admission webhook service
+* Telemeter Client
+* Thanos Querier
+* Thanos Ruler
+
+By defining resource limits, you cap a container's resource usage so that the container cannot exceed the specified maximum values for CPU and memory.
+
+By defining resource requests, you specify that a container can be scheduled only on a node that has enough CPU and memory resources available to match the requested resources.
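+
+As a minimal sketch of how the two settings relate, the following fragment shows a `resources` stanza for a single, arbitrarily chosen component. The values are illustrative assumptions, not recommended sizes.
+
+[source,yaml]
+----
+prometheusK8s:
+  resources:
+    limits: # maximum values that the container cannot exceed
+      cpu: 500m # 500 millicores, that is, half of one CPU core
+      memory: 3Gi
+    requests: # amount that a node must have available for the container to be scheduled there
+      cpu: 200m
+      memory: 500Mi
+----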
+
+
diff --git a/modules/monitoring-specifying-limits-and-requests-for-monitoring-components.adoc b/modules/monitoring-specifying-limits-and-requests-for-monitoring-components.adoc
@@ -0,0 +1,148 @@
+// Module included in the following assemblies:
+//
+// * monitoring/configuring-the-monitoring-stack.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="specifying-limits-and-resource-requests-for-monitoring-components_{context}"]
+= Specifying limits and requests for monitoring components
+
+To configure CPU and memory resources, specify values for resource limits and requests in the appropriate `ConfigMap` object for the namespace in which the monitoring component is located:
+
+* The `cluster-monitoring-config` config map in the `openshift-monitoring` namespace for core platform monitoring
+* The `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` namespace for components that monitor user-defined projects
+
+.Prerequisites
+
+* *If you are configuring core platform monitoring components*:
+** You have access to the cluster as a user with the `cluster-admin` cluster role.
+** You have created a `ConfigMap` object named `cluster-monitoring-config`.
+* *If you are configuring components that monitor user-defined projects*:
+** You have access to the cluster as a user with the `cluster-admin` cluster role, or as a user with the `user-workload-monitoring-config-edit` role in the `openshift-user-workload-monitoring` project.
+* You have installed the OpenShift CLI (`oc`).
+
+.Procedure
+
+. To configure core platform monitoring components, edit the `cluster-monitoring-config` config map object in the `openshift-monitoring` namespace:
++
+[source,terminal]
+----
+$ oc -n openshift-monitoring edit configmap cluster-monitoring-config
+----
+
+. Add values to define resource limits and requests for each core platform monitoring component you want to configure.
++
+[IMPORTANT]
+====
+Make sure that the value set for a limit is never lower than the value set for a request.
+Otherwise, an error will occur, and the container will not run.
+====
++
+.Example
++
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: cluster-monitoring-config
+  namespace: openshift-monitoring
+data:
+  config.yaml: |
+    alertmanagerMain:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    prometheusK8s:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 3Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    prometheusOperator:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    k8sPrometheusAdapter:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    kubeStateMetrics:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    telemeterClient:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    openshiftStateMetrics:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    thanosQuerier:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    nodeExporter:
+      resources:
+        limits:
+          cpu: 50m
+          memory: 150Mi
+        requests:
+          cpu: 20m
+          memory: 50Mi
+    monitoringPlugin:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    prometheusOperatorAdmissionWebhook:
+      resources:
+        limits:
+          cpu: 50m
+          memory: 100Mi
+        requests:
+          cpu: 20m
+          memory: 50Mi
+----
+
+. Save the file to apply the changes automatically.
++
+[IMPORTANT]
+====
+When you save changes to the `cluster-monitoring-config` config map, the pods and other resources in the `openshift-monitoring` project might be redeployed.
+The running monitoring processes in that project might also restart.
+====
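+
+For components that monitor user-defined projects, the same `resources` structure goes in the `user-workload-monitoring-config` config map, which you can open with `oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config`. The following is a minimal sketch rather than a verified configuration: the component keys `alertmanager`, `prometheus`, and `thanosRuler` and all values are assumptions that you should check against the config map reference for your cluster version.
+
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: user-workload-monitoring-config
+  namespace: openshift-user-workload-monitoring
+data:
+  config.yaml: |
+    # Component keys below are assumed; verify them for your cluster version.
+    alertmanager:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    prometheus:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 3Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+    thanosRuler:
+      resources:
+        limits:
+          cpu: 500m
+          memory: 1Gi
+        requests:
+          cpu: 200m
+          memory: 500Mi
+----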
+
diff --git a/monitoring/configuring-the-monitoring-stack.adoc b/monitoring/configuring-the-monitoring-stack.adoc
@@ -119,6 +119,22 @@ include::modules/monitoring-setting-the-body-size-limit-for-metrics-scraping.ado
 * link:https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config[Prometheus scrape configuration documentation]
 endif::openshift-dedicated,openshift-rosa[]
 
+// Configuring limits and resource requests for monitoring components
+
+[id="managing-cpu-and-memory-resources-for-monitoring-components"]
+== Managing CPU and memory resources for monitoring components
+
+You can ensure that the containers that run monitoring components have enough CPU and memory resources by specifying values for resource limits and requests for those components.
+
+You can configure these limits and requests for core platform monitoring components in the `openshift-monitoring` namespace and for the components that monitor user-defined projects in the `openshift-user-workload-monitoring` namespace.
+
+include::modules/monitoring-about-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2]
+include::modules/monitoring-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2]
+
+[role="_additional-resources"]
+.Additional resources
+* link:https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits[Kubernetes requests and limits documentation]
+
 // Enabling a dedicated service monitor
 include::modules/monitoring-configuring-dedicated-service-monitors.adoc[leveloffset=+1]
 include::modules/monitoring-enabling-a-dedicated-service-monitor.adoc[leveloffset=+2]