
Commit 6801b91

Merge pull request #52118 from bburt-rh/RHDEVDOCS-4249-cmo-config-for-topologyspreadconstraints
RHDEVDOCS-4249 - CMO config settings for topology spread constraints
2 parents e230cfa + 740b26d commit 6801b91

5 files changed: +233 -0 lines changed
modules/monitoring-configuring-pod-topology-spread-constraints-for-monitoring.adoc

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
// Module included in the following assemblies:
//
// * monitoring/configuring-the-monitoring-stack.adoc

:_content-type: CONCEPT
[id="configuring-pod-topology-spread-constraints-for-monitoring_{context}"]
= Configuring pod topology spread constraints for monitoring

You can use pod topology spread constraints to control how Prometheus, Thanos Ruler, and Alertmanager pods are spread across a network topology when {product-title} pods are deployed in multiple availability zones.

Pod topology spread constraints are suitable for controlling pod scheduling within hierarchical topologies in which nodes are spread across different infrastructure levels, such as regions and zones within those regions.
Additionally, by being able to schedule pods in different zones, you can reduce network latency in certain scenarios.
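
For example, a minimal pod `spec` fragment that spreads pods evenly across zones might look like the following sketch. It assumes that the nodes carry the standard `topology.kubernetes.io/zone` label; the label selector value is illustrative only:

[source,yaml]
----
topologySpreadConstraints:
- maxSkew: 1                               # allow at most one pod of difference between zones
  topologyKey: topology.kubernetes.io/zone # nodes with the same value for this label form one domain
  whenUnsatisfiable: DoNotSchedule         # keep the pod pending rather than violate the constraint
  labelSelector:
    matchLabels:
      app.kubernetes.io/name: prometheus   # count only pods that carry this label
----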
modules/monitoring-setting-up-pod-topology-spread-constraints-for-alertmanager.adoc

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
// Module included in the following assemblies:
//
// * monitoring/configuring-the-monitoring-stack.adoc

:_content-type: PROCEDURE
[id="setting-up-pod-topology-spread-constraints-for-alertmanager_{context}"]
= Setting up pod topology spread constraints for Alertmanager

For core {product-title} platform monitoring, you can set up pod topology spread constraints for Alertmanager to fine-tune how pod replicas are scheduled to nodes across zones.
Doing so helps ensure that Alertmanager pods are highly available and run more efficiently, because workloads are spread across nodes in different data centers or hierarchical infrastructure zones.

You configure pod topology spread constraints for Alertmanager in the `cluster-monitoring-config` config map.

.Prerequisites

* You have installed the OpenShift CLI (`oc`).
* You have access to the cluster as a user with the `cluster-admin` role.
* You have created the `cluster-monitoring-config` `ConfigMap` object.

.Procedure

. Edit the `cluster-monitoring-config` `ConfigMap` object in the `openshift-monitoring` namespace:
+
[source,terminal]
----
$ oc -n openshift-monitoring edit configmap cluster-monitoring-config
----

. Add values for the following settings under `data/config.yaml/alertmanagerMain` to configure pod topology spread constraints:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      topologySpreadConstraints:
      - maxSkew: 1 <1>
        topologyKey: monitoring <2>
        whenUnsatisfiable: DoNotSchedule <3>
        labelSelector:
          matchLabels: <4>
            app.kubernetes.io/name: alertmanager
----
<1> Specify a numeric value for `maxSkew`, which defines the degree to which pods are allowed to be unevenly distributed.
This field is required, and the value must be greater than zero.
The value specified has a different effect depending on what value you specify for `whenUnsatisfiable`.
<2> Specify a key of node labels for `topologyKey`.
This field is required.
Nodes that have a label with this key and identical values are considered to be in the same topology.
The scheduler tries to place a balanced number of pods into each domain.
<3> Specify a value for `whenUnsatisfiable`.
This field is required.
Available options are `DoNotSchedule` and `ScheduleAnyway`.
Specify `DoNotSchedule` if you want the `maxSkew` value to define the maximum difference allowed between the number of matching pods in the target topology and the global minimum.
Specify `ScheduleAnyway` if you want the scheduler to still schedule the pod but to give higher priority to nodes that might reduce the skew.
<4> Specify labels for `matchLabels`. These labels identify the set of matching pods to which the constraint applies.

. Save the file to apply the changes automatically.
+
[WARNING]
====
When you save changes to the `cluster-monitoring-config` config map, the pods and other resources in the `openshift-monitoring` project might be redeployed.
The running monitoring processes in that project might also restart.
====
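
After the changes roll out, you can optionally check where the Alertmanager pods were scheduled by listing them with their node placement. This is a verification sketch; the label selector matches the example above:

[source,terminal]
----
$ oc -n openshift-monitoring get pods -l app.kubernetes.io/name=alertmanager -o wide
----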
modules/monitoring-setting-up-pod-topology-spread-constraints-for-prometheus.adoc

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
// Module included in the following assemblies:
//
// * monitoring/configuring-the-monitoring-stack.adoc

:_content-type: PROCEDURE
[id="setting-up-pod-topology-spread-constraints-for-prometheus_{context}"]
= Setting up pod topology spread constraints for Prometheus

For core {product-title} platform monitoring, you can set up pod topology spread constraints for Prometheus to fine-tune how pod replicas are scheduled to nodes across zones.
Doing so helps ensure that Prometheus pods are highly available and run more efficiently, because workloads are spread across nodes in different data centers or hierarchical infrastructure zones.

You configure pod topology spread constraints for Prometheus in the `cluster-monitoring-config` config map.

.Prerequisites

* You have installed the OpenShift CLI (`oc`).
* You have access to the cluster as a user with the `cluster-admin` role.
* You have created the `cluster-monitoring-config` `ConfigMap` object.

.Procedure

. Edit the `cluster-monitoring-config` `ConfigMap` object in the `openshift-monitoring` namespace:
+
[source,terminal]
----
$ oc -n openshift-monitoring edit configmap cluster-monitoring-config
----

. Add values for the following settings under `data/config.yaml/prometheusK8s` to configure pod topology spread constraints:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      topologySpreadConstraints:
      - maxSkew: 1 <1>
        topologyKey: monitoring <2>
        whenUnsatisfiable: DoNotSchedule <3>
        labelSelector:
          matchLabels: <4>
            app.kubernetes.io/name: prometheus
----
<1> Specify a numeric value for `maxSkew`, which defines the degree to which pods are allowed to be unevenly distributed.
This field is required, and the value must be greater than zero.
The value specified has a different effect depending on what value you specify for `whenUnsatisfiable`.
<2> Specify a key of node labels for `topologyKey`.
This field is required.
Nodes that have a label with this key and identical values are considered to be in the same topology.
The scheduler tries to place a balanced number of pods into each domain.
<3> Specify a value for `whenUnsatisfiable`.
This field is required.
Available options are `DoNotSchedule` and `ScheduleAnyway`.
Specify `DoNotSchedule` if you want the `maxSkew` value to define the maximum difference allowed between the number of matching pods in the target topology and the global minimum.
Specify `ScheduleAnyway` if you want the scheduler to still schedule the pod but to give higher priority to nodes that might reduce the skew.
<4> Specify labels for `matchLabels`. These labels identify the set of matching pods to which the constraint applies.

. Save the file to apply the changes automatically.
+
[WARNING]
====
When you save changes to the `cluster-monitoring-config` config map, the pods and other resources in the `openshift-monitoring` project might be redeployed.
The running monitoring processes in that project might also restart.
====
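
To confirm that the constraint was added to the Prometheus pods, you can optionally inspect their pod specs. This is a verification sketch that assumes the `app.kubernetes.io/name=prometheus` label shown in the example above:

[source,terminal]
----
$ oc -n openshift-monitoring get pods -l app.kubernetes.io/name=prometheus \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.topologySpreadConstraints}{"\n"}{end}'
----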
modules/monitoring-setting-up-pod-topology-spread-constraints-for-thanos-ruler.adoc

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
// Module included in the following assemblies:
//
// * monitoring/configuring-the-monitoring-stack.adoc

:_content-type: PROCEDURE
[id="setting-up-pod-topology-spread-constraints-for-thanos-ruler_{context}"]
= Setting up pod topology spread constraints for Thanos Ruler

For user-defined monitoring, you can set up pod topology spread constraints for Thanos Ruler to fine-tune how pod replicas are scheduled to nodes across zones.
Doing so helps ensure that Thanos Ruler pods are highly available and run more efficiently, because workloads are spread across nodes in different data centers or hierarchical infrastructure zones.

You configure pod topology spread constraints for Thanos Ruler in the `user-workload-monitoring-config` config map.

.Prerequisites

* You have installed the OpenShift CLI (`oc`).
* A cluster administrator has enabled monitoring for user-defined projects.
* You have access to the cluster as a user with the `cluster-admin` role, or as a user with the `user-workload-monitoring-config-edit` role in the `openshift-user-workload-monitoring` project.
* You have created the `user-workload-monitoring-config` `ConfigMap` object.

.Procedure

. Edit the `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` namespace:
+
[source,terminal]
----
$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
----

. Add values for the following settings under `data/config.yaml/thanosRuler` to configure pod topology spread constraints:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    thanosRuler:
      topologySpreadConstraints:
      - maxSkew: 1 <1>
        topologyKey: monitoring <2>
        whenUnsatisfiable: ScheduleAnyway <3>
        labelSelector:
          matchLabels: <4>
            app.kubernetes.io/name: thanos-ruler
----
<1> Specify a numeric value for `maxSkew`, which defines the degree to which pods are allowed to be unevenly distributed.
This field is required, and the value must be greater than zero.
The value specified has a different effect depending on what value you specify for `whenUnsatisfiable`.
<2> Specify a key of node labels for `topologyKey`.
This field is required.
Nodes that have a label with this key and identical values are considered to be in the same topology.
The scheduler tries to place a balanced number of pods into each domain.
<3> Specify a value for `whenUnsatisfiable`.
This field is required.
Available options are `DoNotSchedule` and `ScheduleAnyway`.
Specify `DoNotSchedule` if you want the `maxSkew` value to define the maximum difference allowed between the number of matching pods in the target topology and the global minimum.
Specify `ScheduleAnyway` if you want the scheduler to still schedule the pod but to give higher priority to nodes that might reduce the skew.
<4> Specify labels for `matchLabels`. These labels identify the set of matching pods to which the constraint applies.

. Save the file to apply the changes automatically.
+
[WARNING]
====
When you save changes to the `user-workload-monitoring-config` config map, the pods and other resources in the `openshift-user-workload-monitoring` project might be redeployed.
The running monitoring processes in that project might also restart.
====
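
After the changes roll out, you can optionally list the Thanos Ruler pods with their node placement to see how they were spread. This is a verification sketch; the label selector matches the example above:

[source,terminal]
----
$ oc -n openshift-user-workload-monitoring get pods -l app.kubernetes.io/name=thanos-ruler -o wide
----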

monitoring/configuring-the-monitoring-stack.adoc

Lines changed: 13 additions & 0 deletions
@@ -139,6 +139,19 @@ include::modules/monitoring-attaching-additional-labels-to-your-time-series-and-
* See xref:../monitoring/configuring-the-monitoring-stack.adoc#preparing-to-configure-the-monitoring-stack[Preparing to configure the monitoring stack] for steps to create monitoring config maps.
* xref:../monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[Enabling monitoring for user-defined projects]

// Configuring topology spread constraints for monitoring components
include::modules/monitoring-configuring-pod-topology-spread-constraints-for-monitoring.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources

* xref:../nodes/scheduling/nodes-scheduler-pod-topology-spread-constraints.adoc#nodes-scheduler-pod-topology-spread-constraints-about[Controlling pod placement by using pod topology spread constraints]
* link:https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/[Kubernetes Pod Topology Spread Constraints documentation]

include::modules/monitoring-setting-up-pod-topology-spread-constraints-for-prometheus.adoc[leveloffset=+2]
include::modules/monitoring-setting-up-pod-topology-spread-constraints-for-alertmanager.adoc[leveloffset=+2]
include::modules/monitoring-setting-up-pod-topology-spread-constraints-for-thanos-ruler.adoc[leveloffset=+2]

// Setting log levels for monitoring components
include::modules/monitoring-setting-log-levels-for-monitoring-components.adoc[leveloffset=+1]