Merge pull request #46359 from pneedle-rh/osdocs-3303-updating-rosa-planning-considerations

pneedle-rh · web-flow · commit e6d821758817 · 2022-06-13T12:23:34.000+01:00
OSDOCS-3303 - Updating the ROSA planning considerations
diff --git a/modules/rosa-planning-cluster-maximums-environment.adoc b/modules/rosa-planning-cluster-maximums-environment.adoc
@@ -39,7 +39,7 @@ The following table lists the OpenShift Container Platform environment and confi
 |3
 |us-west-2
 
-|Worker nodes
+|Compute nodes
 |m5.2xlarge
 |8
 |32
diff --git a/modules/rosa-planning-cluster-maximums.adoc b/modules/rosa-planning-cluster-maximums.adoc
@@ -6,7 +6,14 @@
 [id="tested-cluster-maximums_{context}"]
 = ROSA tested cluster maximums
 
-The following table specifies the maximum limits for each tested type in a {product-title} cluster.
+Consider the following tested object maximums when you plan a {product-title} (ROSA) cluster installation. The table specifies the maximum limits for each tested type in a ROSA cluster.
+
+These guidelines are based on a cluster of 102 compute (also known as worker) nodes in a multiple availability zone configuration. For smaller clusters, the maximums are lower.
+
+[NOTE]
+====
+The OpenShift Container Platform version used in all of the tests is OCP 4.8.0.
+====
 
 .Tested cluster maximums
 [options="header",cols="50,50"]
diff --git a/modules/rosa-planning-considerations.adoc b/modules/rosa-planning-considerations.adoc
@@ -1,22 +1,25 @@
 
 // Module included in the following assemblies:
 //
-// rosa_planning/rosa-planning-environment.adoc
+// rosa_planning/rosa-limits-scalability.adoc
 
-[id="initial-planning-considerations_{context}"]
-= Initial planning considerations
+[id="control-plane-and-infra-node-sizing-and-scaling_{context}"]
+= Control plane and infrastructure node sizing and scaling
 
-Consider the following tested object maximums when you plan your {product-title} cluster.
+When you install a {product-title} (ROSA) cluster, the sizing of the control plane and infrastructure nodes is automatically determined by the compute node count.
 
-These guidelines are based on a cluster of 102 workers in a multi-availability zone configuration. For smaller clusters, the maximums are lower.
+If you change the number of compute nodes in your cluster after installation, the Red Hat Site Reliability Engineering (SRE) team scales the control plane and infrastructure nodes as required to maintain cluster stability.
 
-The sizing of the control plane and infrastructure nodes is dynamically calculated during the installation process, based on the number of worker nodes. If you change the number of worker nodes after the installation, control plane and infra nodes must be resized manually. Infra nodes are resized by the Red Hat SRE team, and you can link:https://access.redhat.com/[open a ticket in the Customer Portal] to request the infra node resizing.
+[id="node-sizing-during-installation_{context}"]
+== Node sizing during installation
 
-The following table lists the size of control plane and infrastructure nodes that are assigned during installation.
+During the installation process, the sizing of the control plane and infrastructure nodes is dynamically calculated. The sizing calculation is based on the number of compute nodes in a cluster.
+
+The following table lists the control plane and infrastructure node sizing that is applied during installation.
 
 [options="header",cols="3*"]
 |===
-| Number of worker nodes |Control plane size |Infrastructure node size
+| Number of compute nodes |Control plane size |Infrastructure node size
 
 |1 to 25
 |m5.2xlarge
@@ -32,11 +35,47 @@ The following table lists the size of control plane and infrastructure nodes tha
 |===
 [.small]
 --
-1. The maximum number of worker nodes on ROSA is 180
+1. The maximum number of compute nodes on ROSA is 180.
 --
 
-For larger clusters, infrastructure node sizing can become a large impacting factor to scalability. There are many factors that influence the stated thresholds, including the etcd version or storage data format.
+[id="node-scaling-after-installation_{context}"]
+== Node scaling after installation
 
-Exceeding these limits does not necessarily mean that the cluster will fail. In most cases, exceeding these numbers results in lower overall performance.
+If you change the number of compute nodes after installation, the control plane and infrastructure nodes are scaled by the Red Hat Site Reliability Engineering (SRE) team as required. The nodes are scaled to maintain platform stability.
+
+Post-installation scaling requirements for control plane and infrastructure nodes are assessed on a case-by-case basis. Node resource consumption and received alerts are taken into consideration.
+
+.Rules for control plane node resizing alerts
+
+Resizing alerts are triggered for the control plane nodes in a cluster when either of the following scenarios is true:
+
+* Each control plane node has less than 16 GiB of RAM, and there are more than 25 and fewer than 101 compute nodes.
+* Each control plane node has less than 32 GiB of RAM, and there are more than 100 compute nodes.
++
+[NOTE]
+====
+The maximum number of compute nodes on ROSA is 180.
+====
+
+.Rules for infrastructure node resizing alerts
 
-The OpenShift Container Platform version used in all of the tests is OCP 4.8.0.
+Resizing alerts are triggered for the infrastructure nodes in a cluster when either of the following scenarios is true:
+
+* Each infrastructure node has less than 16 GiB of RAM or fewer than 5 CPUs, and there are more than 25 and fewer than 101 compute nodes.
+* Each infrastructure node has less than 32 GiB of RAM or fewer than 9 CPUs, and there are more than 100 compute nodes.
++
+[NOTE]
+====
+The maximum number of compute nodes on ROSA is 180.
+====
+
+The SRE team might scale the control plane and infrastructure nodes for additional reasons, for example, to manage an increase in resource consumption on the nodes.
+
+When scaling is applied, the customer is notified through a service log entry.
+
+[id="sizing-considerations-for-larger-clusters_{context}"]
+== Sizing considerations for larger clusters
+
+For larger clusters, infrastructure node sizing can have a significant impact on scalability. Many factors influence the stated thresholds, including the etcd version and the storage data format.
+
+Exceeding these limits does not necessarily mean that the cluster will fail. In most cases, exceeding these numbers results in lower overall performance.
diff --git a/rosa_install_access_delete_clusters/rosa_getting_started_iam/rosa-aws-prereqs.adoc b/rosa_install_access_delete_clusters/rosa_getting_started_iam/rosa-aws-prereqs.adoc
@@ -22,10 +22,10 @@ include::modules/rosa-aws-provisioned.adoc[leveloffset=+1]
 include::modules/osd-aws-privatelink-firewall-prerequisites.adoc[leveloffset=+1]
 
 == Next steps
-xref:../rosa_getting_started_iam/rosa-required-aws-service-quotas.adoc#rosa-required-aws-service-quotas[Review the required AWS service quotas]
+* xref:../rosa_getting_started_iam/rosa-required-aws-service-quotas.adoc#rosa-required-aws-service-quotas[Review the required AWS service quotas]
 
 [role="_additional-resources"]
 == Additional resources
-* See xref:../../rosa_planning/rosa-limits-scalability.adoc#initial-planning-considerations_rosa-limits-scalability[Intial Planning Considerations] for guidance on worker node count.
-* See xref:../../rosa_architecture/rosa_policy_service_definition/rosa-policy-process-security.adoc#rosa-policy-sre-access_rosa-policy-process-security[SRE access to all Red Hat OpenShift Service on AWS clusters] for information about how Red Hat site reliability engineering accesses ROSA clusters.
+* xref:../../rosa_planning/rosa-limits-scalability.adoc#rosa-limits-scalability[Limits and scalability]
+* xref:../../rosa_architecture/rosa_policy_service_definition/rosa-policy-process-security.adoc#rosa-policy-sre-access_rosa-policy-process-security[SRE access to all Red Hat OpenShift Service on AWS clusters]
 * xref:../rosa_getting_started_iam/rosa-getting-started-workflow.adoc#rosa-understanding-the-deployment-workflow[Understanding the ROSA deployment workflow]
diff --git a/rosa_planning/rosa-limits-scalability.adoc b/rosa_planning/rosa-limits-scalability.adoc
@@ -7,6 +7,13 @@ include::_attributes/attributes-openshift-dedicated.adoc[]
 
 toc::[]
 
-include::modules/rosa-planning-considerations.adoc[leveloffset=+1]
+This document details the tested cluster maximums for {product-title} (ROSA) clusters, along with information about the test environment and configuration used to test the maximums. Information about control plane and infrastructure node sizing and scaling is also provided.
+
 include::modules/rosa-planning-cluster-maximums.adoc[leveloffset=+1]
 include::modules/rosa-planning-cluster-maximums-environment.adoc[leveloffset=+1]
+include::modules/rosa-planning-considerations.adoc[leveloffset=+1]
+
+[id="next-steps_configuring-alert-notifications"]
+== Next steps
+
+* xref:../rosa_planning/rosa-planning-environment.adoc#rosa-planning-environment[Planning your environment]
diff --git a/rosa_planning/rosa-sts-aws-prereqs.adoc b/rosa_planning/rosa-sts-aws-prereqs.adoc
@@ -28,9 +28,9 @@ include::modules/rosa-aws-provisioned.adoc[leveloffset=+1]
 include::modules/osd-aws-privatelink-firewall-prerequisites.adoc[leveloffset=+1]
 
 == Next steps
-xref:../rosa_planning/rosa-sts-required-aws-service-quotas.adoc#rosa-sts-required-aws-service-quotas[Review the required AWS service quotas]
+* xref:../rosa_planning/rosa-sts-required-aws-service-quotas.adoc#rosa-sts-required-aws-service-quotas[Review the required AWS service quotas]
 
 [role="_additional-resources"]
 == Additional resources
-* See xref:../rosa_planning/rosa-limits-scalability.adoc#initial-planning-considerations_rosa-limits-scalability[Intial Planning Considerations] for guidance on worker node count.
-* See xref:../rosa_architecture/rosa_policy_service_definition/rosa-policy-process-security.adoc#rosa-policy-sre-access_rosa-policy-process-security[SRE access to all Red Hat OpenShift Service on AWS clusters] for information about how Red Hat site reliability engineering accesses ROSA clusters.
+* xref:../rosa_planning/rosa-limits-scalability.adoc#rosa-limits-scalability[Limits and scalability]
+* xref:../rosa_architecture/rosa_policy_service_definition/rosa-policy-process-security.adoc#rosa-policy-sre-access_rosa-policy-process-security[SRE access to all Red Hat OpenShift Service on AWS clusters]
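
The resizing alert rules added in `modules/rosa-planning-considerations.adoc` can be read as two small predicates over per-node capacity and the compute node count. The following is a minimal sketch of that reading, not the alerting logic that the SRE team actually runs. The names `NodeSpec`, `control_plane_resize_alert`, and `infra_resize_alert` are hypothetical, and rejecting counts above the documented maximum of 180 is an assumption made for illustration.

```python
from dataclasses import dataclass

# Documented maximum number of compute nodes on ROSA.
MAX_COMPUTE_NODES = 180


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical per-node capacity: RAM in GiB and CPU count."""
    ram_gib: float
    cpus: int


def _check_count(compute_nodes: int) -> None:
    # Assumption: counts outside 0..180 are treated as invalid input.
    if not 0 <= compute_nodes <= MAX_COMPUTE_NODES:
        raise ValueError(f"compute node count must be 0..{MAX_COMPUTE_NODES}")


def control_plane_resize_alert(node: NodeSpec, compute_nodes: int) -> bool:
    """Return True if the documented control plane rule would trigger."""
    _check_count(compute_nodes)
    if 25 < compute_nodes < 101:
        return node.ram_gib < 16
    if compute_nodes > 100:
        return node.ram_gib < 32
    return False  # 25 or fewer compute nodes: no rule is documented


def infra_resize_alert(node: NodeSpec, compute_nodes: int) -> bool:
    """Return True if the documented infrastructure node rule would trigger."""
    _check_count(compute_nodes)
    if 25 < compute_nodes < 101:
        return node.ram_gib < 16 or node.cpus < 5
    if compute_nodes > 100:
        return node.ram_gib < 32 or node.cpus < 9
    return False
```

For example, under this reading a node with 32 GiB of RAM and 8 CPUs would not trigger the control plane rule at 120 compute nodes, but it would trigger the infrastructure rule because it has fewer than 9 CPUs.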