address comments

fabriziopandini · fabriziopandini · commit 80610e9bb44f · 2023-03-23T12:44:00.000+01:00
diff --git a/docs/book/src/tasks/automated-machine-management/healthchecking.md b/docs/book/src/tasks/automated-machine-management/healthchecking.md
@@ -103,8 +103,7 @@ This feature is only available for KubeadmControlPlane.
 </aside>
 
 KubeadmControlPlane allows to control how remediation happen by defining an optional `remediationStrategy`;
-this feature can be used for preventing unnecessary load on infrastructure provider e.g. in case of quota problems,
-or for allowing the infrastructure provider to stabilize in case of temporary problems:
+this feature can be used to prevent unnecessary load on the infrastructure provider, e.g. in case of quota problems, or to allow the infrastructure provider to stabilize in case of temporary problems.
 
 ```yaml
 apiVersion: cluster.x-k8s.io/v1beta1
@@ -119,22 +118,30 @@ spec:
     minHealthyPeriod: 2h
 ```
 
-`maxRetry` is the Max number of retries while attempting to remediate an unhealthy machine.
+`maxRetry` is the maximum number of retries while attempting to remediate an unhealthy machine.
 A retry happens when a machine that was created as a replacement for an unhealthy machine also fails.
 For example, given a control plane with three machines M1, M2, M3:
 
 - M1 become unhealthy; remediation happens, and M1-1 is created as a replacement.
 - If M1-1 (replacement of M1) has problems while bootstrapping it will become unhealthy, and then be 
-  remediated; such operation is considered a retry, remediation-retry #1.
+  remediated. This operation is considered a retry - remediation-retry #1.
 - If M1-2 (replacement of M1-1) becomes unhealthy, remediation-retry #2 will happen, etc.
 
-A retry could happen only after `retryPeriod` from the previous retry; if `retryPeriod` is not set (default), 
-a retry will happen immediately.
+A retry will only happen after the `retryPeriod` from the previous retry has elapsed. If `retryPeriod` is not set (default), a retry will happen immediately.
 
-If a machine is marked as unhealthy after `minHealthyPeriod` (default 1h) from the previous remediation expired,
-this is not considered a retry anymore because the new issue is assumed unrelated from the previous one.
+If a machine is marked as unhealthy after `minHealthyPeriod` (default 1h) has passed since the previous remediation, this is no longer considered a retry because the new issue is assumed to be unrelated to the previous one.
 
-If `maxRetry` is not set (default), the remedation will be retried infinitely.
+If `maxRetry` is not set (default), remediation will be retried infinitely.
+
+<aside class="note">
+
+<h1>Retry again once maxRetry is exhausted</h1>
+
+If for some reason you want to remediate once `maxRetry` is exhausted, there are two options:
+- Temporarily increase `maxRetry` (recommended)
+- Remove the `controlplane.cluster.x-k8s.io/remediation-for` annotation from the unhealthy machine or decrease `retryCount` in the annotation value.
+
+</aside>
 
 ## Remediation Short-Circuiting
 
@@ -218,11 +225,11 @@ Before deploying a MachineHealthCheck, please familiarise yourself with the foll
 
 - Only Machines owned by a MachineSet or a KubeadmControlPlane can be remediated by a MachineHealthCheck (since a MachineDeployment uses a MachineSet, then this includes Machines that are part of a MachineDeployment)
 - Machines managed by a KubeadmControlPlane are remediated according to [the delete-and-recreate guidelines described in the KubeadmControlPlane proposal](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20191017-kubeadm-based-control-plane.md#remediation-using-delete-and-recreate)
-  - Following rules should be satisfied in order to start remediation of a control plane machine:
+  - The following rules should be satisfied in order to start remediation of a control plane machine:
     - One of the following apply:
         - The cluster MUST not be initialized yet (the failure happens before KCP reaches the initialized state)
         - The cluster MUST have at least two control plane machines, because this is the smallest cluster size that can be remediated.
-    - Previous remediation (delete and re-create) MUST have been completed. This rule prevents KCP to remediate more machines while the replacement for the previous machine is not yet created.
+    - Previous remediation (delete and re-create) MUST have been completed. This rule prevents KCP from remediating more machines while the replacement for the previous machine is not yet created.
     - The cluster MUST have no machines with a deletion timestamp. This rule prevents KCP taking actions while the cluster is in a transitional state.
     - Remediation MUST preserve etcd quorum. This rule ensures that we will not remove a member that would result in etcd losing a majority of members and thus become unable to field new requests (note: this rule applies only to CP already initialized and with managed etcd)
 - If the Node for a Machine is removed from the cluster, a MachineHealthCheck will consider this Machine unhealthy and remediate it immediately
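
Reviewer note on the `remediationStrategy` wording in the first hunk: the interaction between `maxRetry`, `retryPeriod` and `minHealthyPeriod` can be sketched as below. This is a minimal illustration of the rules as the text states them, not the KubeadmControlPlane controller code. The names `RemediationStrategy` and `decide` are hypothetical, a single timestamp stands in for both "previous retry" and "previous remediation", and the exact boundary at which `maxRetry` counts as exhausted is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class RemediationStrategy:
    """Hypothetical stand-in for the documented remediationStrategy fields."""
    max_retry: Optional[int] = None                      # None: retry without limit
    retry_period: timedelta = timedelta(0)               # zero: retry immediately
    min_healthy_period: timedelta = timedelta(hours=1)   # documented default: 1h


def decide(
    strategy: RemediationStrategy,
    now: datetime,
    last_remediation: Optional[datetime],
    retry_count: int,
) -> Tuple[str, int]:
    """Return (action, retry count after the action) for an unhealthy machine.

    action is "remediate", "wait" (retryPeriod not yet elapsed) or
    "exhausted" (maxRetry reached).
    """
    if last_remediation is None:
        # No earlier remediation in this chain: a first remediation, not a retry.
        return ("remediate", 0)

    elapsed = now - last_remediation
    if elapsed > strategy.min_healthy_period:
        # The failure is assumed unrelated to the previous one; counting restarts.
        return ("remediate", 0)

    # Assumption: retries stop once retry_count has reached max_retry.
    if strategy.max_retry is not None and retry_count >= strategy.max_retry:
        return ("exhausted", retry_count)

    if elapsed < strategy.retry_period:
        return ("wait", retry_count)

    return ("remediate", retry_count + 1)
```

Under these assumptions, a chain that has reached `maxRetry` stays in the "exhausted" state until `minHealthyPeriod` passes or `max_retry` is raised, which is consistent with the new aside recommending a temporary increase of `maxRetry`.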