Commit 0b76904

Merge pull request #245129 from tomvcassidy/primaryNodeResiliencyUpdate
Node standards for resiliency update
2 parents 7b46490 + 213e1a5 commit 0b76904

File tree

1 file changed: +13 −15 lines changed

articles/service-fabric/how-to-managed-cluster-availability-zones.md

Lines changed: 13 additions & 15 deletions
@@ -26,10 +26,10 @@ Sample templates are available: [Service Fabric cross availability zone template
 A Service Fabric cluster distributed across Availability Zones ensures high availability of the cluster state.
 
-The recommended topology for managed cluster requires the resources outlined below:
+The recommended topology for a managed cluster requires the following resources:
 
 * The cluster SKU must be Standard
-* Primary node type should have at least nine nodes for best resiliency, but supports minimum number of six.
+* The primary node type should have at least nine nodes (three in each AZ) for best resiliency, but supports a minimum of six (two in each AZ).
 * Secondary node type(s) should have at least six nodes for best resiliency, but support a minimum of three.
 
 >[!NOTE]
@@ -49,18 +49,16 @@ Sample node list depicting FD/UD formats in a virtual machine scale set spanning
 When a service is deployed on node types that span zones, the replicas are placed to ensure they land in separate zones. This separation is ensured because the fault domains of the nodes in these node types are configured with the zone information (for example, FD = fd:/zone1/1). For example, for five replicas or instances of a service, the distribution will be 2-2-1, and the runtime will try to ensure equal distribution across AZs.
 
 **User Service Replica Configuration**:
-Stateful user services deployed on the cross-availability zone node types should be configured with this configuration: replica count with target = 9, min = 5. This configuration helps the service to be working even when one zone goes down since 6 replicas will be still up in the other two zones. An application upgrade in such a scenario will also go through.
+Stateful user services deployed on the cross-availability zone node types should be configured with a target replica count of 9 and a minimum of 5. This configuration keeps the service working even when one zone goes down, since six replicas will still be up in the other two zones. An application upgrade in such a scenario will also go through.
 
 **Zone down scenario**:
-When a zone goes down, all the nodes in that zone appear as down. Service replicas on these nodes will also be down. Since there are replicas in the other zones, the service continues to be responsive with primary replicas failing over to the zones which are functioning. The services will appear in warning state as the target replica count is not met and the VM count is still more than the defined min target replica size. As a result, Service Fabric load balancer brings up replicas in the working zones to match the configured target replica count. At this point, the services should appear healthy. When the zone that was down comes back up, the load balancer will again spread all the service replicas evenly across all the zones.
+When a zone goes down, all the nodes in that zone appear as down, and the service replicas on those nodes are down as well. Since there are replicas in the other zones, the service remains responsive, with primary replicas failing over to the zones that are still functioning. The services appear in a warning state because the target replica count is not met and the VM count is still more than the defined minimum target replica size. As a result, the Service Fabric load balancer brings up replicas in the working zones to match the configured target replica count. At this point, the services should appear healthy. When the zone that was down comes back up, the load balancer again spreads all the service replicas evenly across all the zones.
 
 ## Networking Configuration
 For more information, see [Configure network settings for Service Fabric managed clusters](./how-to-managed-cluster-networking.md)
 
 ## Enabling a zone resilient Azure Service Fabric managed cluster
-To enable a zone resilient Azure Service Fabric managed cluster, you must include the following in the managed cluster resource definition.
-
-* The **ZonalResiliency** property, which specifies if the cluster is zone resilient or not.
+To enable a zone resilient Azure Service Fabric managed cluster, include the **ZonalResiliency** property, which specifies whether the cluster is zone resilient, in the managed cluster resource definition:
 
 ```json
 {
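The replica settings described in the hunk above (target = 9, min = 5) can be sketched as an ARM resource for a stateful service on the managed cluster. This is a minimal sketch, not the article's own template: the application and service names are hypothetical, and the property names (`targetReplicaSetSize`, `minReplicaSetSize`) are assumed to follow the `Microsoft.ServiceFabric/managedclusters/applications/services` schema.

```json
{
  "apiVersion": "2022-02-01-preview",
  "type": "Microsoft.ServiceFabric/managedclusters/applications/services",
  "name": "[concat(parameters('clusterName'), '/MyApp/MyStatefulService')]",
  "properties": {
    "serviceKind": "Stateful",
    "serviceTypeName": "MyStatefulServiceType",
    "hasPersistedState": true,
    "partitionDescription": { "partitionScheme": "Singleton" },
    "targetReplicaSetSize": 9,
    "minReplicaSetSize": 5
  }
}
```

With target = 9 across three zones, the runtime can keep six replicas (above the minimum of 5) even with an entire zone down.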
@@ -74,8 +72,8 @@ To enable a zone resilient Azure Service Fabric managed cluster, you must includ
 }
 ```
 
-## Migrate an existing non-zone resilient cluster to Zone Resilient (Preview)
-Existing Service Fabric managed clusters which are not spanned across availability zones can now be migrated in-place to span availability zones. Supported scenarios include clusters created in regions that have three availability zones as well as clusters in regions where three availability zones are made available post-deployment.
+## Migrate an existing nonzone resilient cluster to Zone Resilient (Preview)
+Existing Service Fabric managed clusters that don't span availability zones can now be migrated in place to span availability zones. Supported scenarios include clusters created in regions that have three availability zones and clusters in regions where three availability zones are made available post-deployment.
 
 Requirements:
 * Standard SKU cluster
@@ -118,16 +116,16 @@ Requirements:
 }
 ```
 
-If the Public IP resource is not zone resilient, migration of the cluster will cause a brief loss of external connectivity. This is due to the migration setting up new Public IP and updating the cluster FQDN to the new IP. If the Public IP resource is zone resilient, migration will not modify the Public IP resource or FQDN and there will be no external connectivity impact.
+If the Public IP resource is not zone resilient, migrating the cluster causes a brief loss of external connectivity. This connection loss happens because the migration sets up a new Public IP and updates the cluster FQDN to the new IP. If the Public IP resource is zone resilient, migration modifies neither the Public IP resource nor the FQDN, and there is no external connectivity impact.
 
 2) Initiate conversion of the underlying storage account created for the managed cluster from LRS to ZRS using [customer-initiated conversion](../storage/common/redundancy-migration.md#customer-initiated-conversion). The resource group of the storage account that needs to be migrated is of the form "SFC_ClusterId" (for example, SFC_9240df2f-71ab-4733-a641-53a8464d992d) under the same subscription as the managed cluster resource.
 
 3) Add a new primary node type that spans availability zones.
 
-This step will trigger the resource provider to perform the migration of the primary node type and Public IP along with a cluster FQDN DNS update, if needed, to become zone resilient. Use the above API to understand implication of this step.
+This step triggers the resource provider to perform the migration of the primary node type and Public IP, along with a cluster FQDN DNS update if needed, to become zone resilient. Use the above API to understand the implications of this step.
 
 * Use apiVersion 2022-02-01-preview or higher.
-* Add a new primary node type to the cluster with zones parameter set to ["1", "2", "3"] as show below:
+* Add a new primary node type to the cluster with the zones parameter set to ["1", "2", "3"] as shown below:
 ```json
 {
   "apiVersion": "2022-02-01-preview",
@@ -194,7 +192,7 @@ Requirements:
 ```http
 POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ServiceFabric/managedClusters/{clusterName}/getazresiliencystatus?api-version=2022-02-01-preview
 ```
-This should provide with response similar to:
+This API call should provide a response similar to:
 ```json
 {
   "baseResourceStatus" :[
@@ -220,9 +218,9 @@ Requirements:
 If you run into any problems, reach out to support for assistance.
 
 ## Enable FastZonalUpdate on Service Fabric managed clusters (preview)
-Service Fabric managed clusters support faster cluster and application upgrades by reducing the max upgrade domains per availability zone. The default configuration right now can have at most 15 UDs in multiple AZ nodetype. This huge number of UDs reduced the upgrade velocity. Using the new configuration, the max UDs are reduced, which results in faster updates, keeping the safety of the upgrades intact.
+Service Fabric managed clusters support faster cluster and application upgrades by reducing the maximum number of upgrade domains (UDs) per availability zone. The current default configuration can have at most 15 UDs in a multiple-AZ node type, and this large number of UDs reduces upgrade velocity. The new configuration reduces the maximum UDs, which results in faster updates while keeping the safety of the upgrades intact.
 
-The update should be done via ARM template by setting the zonalUpdateMode property to “fast” and then modifying a node type attribute, such as adding a node and then removing the node to each nodetype (see required steps 2 and 3 below). The Service Fabric managed cluster resource apiVersion should be 2022-10-01-preview or later.
+The update should be done via an ARM template by setting the zonalUpdateMode property to "fast" and then modifying a node type attribute, such as adding a node to and then removing it from each node type (see required steps 2 and 3). The Service Fabric managed cluster resource apiVersion should be 2022-10-01-preview or later.
 
 1. Modify the ARM template with the new property mentioned above.

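The template edit in step 1 can be sketched as follows. This is a sketch under assumptions, not the article's own template: the placement of `zonalUpdateMode` under the cluster resource's `properties` and the value `"Fast"` are assumptions taken from the surrounding description.

```json
{
  "apiVersion": "2022-10-01-preview",
  "type": "Microsoft.ServiceFabric/managedclusters",
  "name": "[parameters('clusterName')]",
  "location": "[resourceGroup().location]",
  "properties": {
    "zonalResiliency": true,
    "zonalUpdateMode": "Fast"
  }
}
```

Per the paragraph above, deploying this change alone is not sufficient; a node type attribute must then be modified (for example, adding and then removing a node on each node type) for the new upgrade-domain layout to take effect.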