
Commit 65bea8f

Author: Alfredo Santamaria Gomez (committed)
Commit message: improve Acrolinx score
1 parent a3b4813 commit 65bea8f

File tree

1 file changed, +40 −40 lines


articles/service-fabric/how-to-managed-cluster-availability-zones.md

Lines changed: 40 additions & 40 deletions
@@ -24,13 +24,13 @@ Sample templates are available: [Service Fabric cross availability zone template
 >[!NOTE]
 >The benefit of spanning the primary node type across availability zones is really only seen for three zones and not just two.
 
-A Service Fabric cluster distributed across Availability Zones ensures high availability of the cluster state.
+A Service Fabric cluster distributed across Availability Zones (AZ) ensures high availability of the cluster state.
 
 The recommended topology for managed cluster requires the following resources:
 
 * The cluster SKU must be Standard
 * Primary node type should have at least nine nodes (3 in each AZ) for best resiliency, but supports a minimum of six (2 in each AZ).
-* Secondary node type(s) should have at least six nodes for best resiliency, but supports minimum number of three.
+* Secondary node types should have at least six nodes for best resiliency, but support a minimum of three.
 
 >[!NOTE]
 >Only 3 Availability Zone deployments are supported.
@@ -46,13 +46,13 @@ Sample node list depicting FD/UD formats in a virtual machine scale set spanning
 ![Sample node list depicting FD/UD formats in a virtual machine scale set spanning zones.][sfmc-multi-az-nodes]
 
 **Distribution of Service replicas across zones**:
-When a service is deployed on the node types that are spanning zones, the replicas are placed to ensure they land up in separate zones. This separation is ensured as the fault domain’s on the nodes present in each of these node types are configured with the zone information (i.e FD = fd:/zone1/1 etc.). For example: for five replicas or instances of a service, the distribution will be 2-2-1 and runtime will try to ensure equal distribution across AZs.
+When a service is deployed on node types that span zones, the replicas are placed to ensure they land in separate zones. This separation is ensured because the fault domains on the nodes in each of these node types are configured with the zone information (for example, FD = fd:/zone1/1). For example, for five replicas or instances of a service, the distribution is 2-2-1 and the runtime tries to ensure equal distribution across AZs.
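The 2-2-1 spread described above amounts to simple integer division of replicas over zones. A minimal sketch (illustrative only; `spread_replicas` is a hypothetical helper and says nothing about Service Fabric's actual placement engine, which also weighs fault-domain and other constraints):

```python
# Illustrative sketch only: shows the even spread of replicas across three
# Availability Zones described above, not Service Fabric's placement algorithm.
def spread_replicas(replica_count, zones=3):
    base, extra = divmod(replica_count, zones)
    # The first `extra` zones receive one additional replica.
    return [base + (1 if i < extra else 0) for i in range(zones)]

print(spread_replicas(5))  # five replicas across three zones -> [2, 2, 1]
```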
 
 **User Service Replica Configuration**:
 Stateful user services deployed on the cross-availability zone node types should be configured with a replica count of target = 9, min = 5. This configuration keeps the service working even when one zone goes down, since six replicas are still up in the other two zones. An application upgrade in such a scenario will also go through.
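As an illustration of the target = 9, min = 5 guidance above, a stateful service defined through ARM might carry these settings (a hedged sketch: the cluster, application, service, and type names are placeholders, and the exact property set should be checked against the managed cluster service resource reference):

```json
{
  "type": "Microsoft.ServiceFabric/managedclusters/applications/services",
  "apiVersion": "2022-02-01-preview",
  "name": "sfmccluster1/myapp/mystatefulservice",
  "properties": {
    "serviceKind": "Stateful",
    "serviceTypeName": "MyStatefulServiceType",
    "targetReplicaSetSize": 9,
    "minReplicaSetSize": 5,
    "hasPersistedState": true,
    "partitionDescription": { "partitionScheme": "Singleton" }
  }
}
```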
 
 **Zone down scenario**:
-When a zone goes down, all the nodes in that zone appear as down. Service replicas on these nodes will also be down. Since there are replicas in the other zones, the service continues to be responsive with primary replicas failing over to the zones that are functioning. The services will appear in warning state as the target replica count is not met and the VM count is still more than the defined min target replica size. As a result, Service Fabric load balancer brings up replicas in the working zones to match the configured target replica count. At this point, the services should appear healthy. When the zone that was down comes back up, the load balancer will again spread all the service replicas evenly across all the zones.
+When a zone goes down, all the nodes in that zone appear as down. Service replicas on these nodes are also down. Since there are replicas in the other zones, the service continues to be responsive, with primary replicas failing over to the zones that are functioning. The services appear in a warning state because the target replica count is not met and the virtual machine (VM) count is still more than the defined minimum target replica size. As a result, the Service Fabric load balancer brings up replicas in the working zones to match the configured target replica count. At this point, the services should appear healthy. When the zone that was down comes back up, the load balancer again spreads all the service replicas evenly across all the zones.
 
 ## Networking Configuration
 For more information, see [Configure network settings for Service Fabric managed clusters](./how-to-managed-cluster-networking.md)
@@ -82,19 +82,19 @@ Requirements:
 >[!NOTE]
 >Migration to a zone resilient configuration can cause a brief loss of external connectivity through the load balancer, but will not affect cluster health. This occurs when a new Public IP needs to be created in order to make the networking resilient to Zone failures. Please plan the migration accordingly.
 
-1) Start with determining if there will be a new IP required and what resources need to be migrated to become zone resilient. To get the current Availability Zone resiliency state for the resources of the managed cluster use the following API call:
-
-```http
-POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ServiceFabric/managedClusters/{clusterName}/getazresiliencystatus?api-version=2022-02-01-preview
-```
-Or you can use the Az Module as follows:
-```
-Select-AzSubscription -SubscriptionId {subscriptionId}
-Invoke-AzResourceAction -ResourceId /subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ServiceFabric/managedClusters/{clusterName} -Action getazresiliencystatus -ApiVersion 2022-02-01-preview
-```
-This should provide with response similar to:
-```json
-{
+1) Start with determining if a new IP is required and what resources need to be migrated to become zone resilient. To get the current Availability Zone resiliency state for the resources of the managed cluster, use the following API call:
+
+```http
+POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ServiceFabric/managedClusters/{clusterName}/getazresiliencystatus?api-version=2022-02-01-preview
+```
+Or you can use the Az module as follows:
+```powershell
+Select-AzSubscription -SubscriptionId {subscriptionId}
+Invoke-AzResourceAction -ResourceId /subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ServiceFabric/managedClusters/{clusterName} -Action getazresiliencystatus -ApiVersion 2022-02-01-preview
+```
+The command should provide a response similar to:
+```json
+{
 "baseResourceStatus" :[
 {
 "resourceName": "sfmccluster1"
@@ -113,19 +113,19 @@ Requirements:
 }
 ],
 "isClusterZoneResilient": false
-}
-```
+}
+```
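Given a response like the one above, a small script can flag which resources still need migration (an illustrative sketch: the per-resource `isZoneResilient` flag is assumed from the API shape, since the sample response is truncated; obtaining the response itself with an authenticated call is out of scope here):

```python
# Illustrative helper for the getazresiliencystatus response shown above.
# Assumption: each entry under baseResourceStatus carries an isZoneResilient flag.
def non_resilient_resources(status):
    return [
        resource["resourceName"]
        for resource in status.get("baseResourceStatus", [])
        if not resource.get("isZoneResilient", False)
    ]

# Hypothetical sample mirroring the shape of the documented response.
sample = {
    "baseResourceStatus": [
        {"resourceName": "sfmccluster1", "isZoneResilient": False},
        {"resourceName": "pt2", "isZoneResilient": True},
    ],
    "isClusterZoneResilient": False,
}
print(non_resilient_resources(sample))  # -> ['sfmccluster1']
```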
 
-If the Public IP resource is not zone resilient, migration of the cluster will cause a brief loss of external connectivity. This connection loss is due to the migration setting up new Public IP and updating the cluster FQDN to the new IP. If the Public IP resource is zone resilient, migration will not modify the Public IP resource nor the FQDN, and there will be no external connectivity impact.
+If the Public IP resource is not zone resilient, migration of the cluster will cause a brief loss of external connectivity. This connection loss occurs because the migration sets up a new Public IP and updates the cluster fully qualified domain name (FQDN) to the new IP. If the Public IP resource is zone resilient, migration will not modify the Public IP resource or the FQDN, and there will be no external connectivity impact.
 
-2) Initiate conversion of the underlying storage account created for managed cluster from LRS to ZRS using [customer-initiated conversion](../storage/common/redundancy-migration.md#customer-initiated-conversion). The resource group of storage account that needs to be migrated would be of the form "SFC_ClusterId"(ex SFC_9240df2f-71ab-4733-a641-53a8464d992d) under the same subscription as the managed cluster resource.
+2) Initiate conversion of the underlying storage account created for the managed cluster from locally redundant storage (LRS) to zone-redundant storage (ZRS) using [customer-initiated conversion](../storage/common/redundancy-migration.md#customer-initiated-conversion). The resource group of the storage account that needs to be migrated is of the form "SFC_ClusterId" (for example, SFC_9240df2f-71ab-4733-a641-53a8464d992d) under the same subscription as the managed cluster resource.
 
 3) Add zones property to existing node types
 
-This step configures the managed Virtual Machine Scale Set associated with the node type as zone-resilient, ensuring that any new VMs added to it will be deployed across availability zones (Zonal VMs). If the specified node type is primary, the resource provider will perform the migration of the Public IP along with a cluster FQDN DNS update, if needed, to become zone resilient. Use the `getazresiliencystatus` API above to understand implication of this step.
+This step configures the managed Virtual Machine Scale Set associated with the node type as zone resilient, ensuring that any new VMs added to it are deployed across availability zones (zonal VMs). If the specified node type is primary, the resource provider performs the migration of the Public IP along with a cluster FQDN DNS update, if needed, to become zone resilient. Use the `getazresiliencystatus` API to understand the implication of this step.
 
 * Use apiVersion 2022-02-01-preview or higher.
-* Add the `zones` parameter set to `["1", "2", "3"]` to existing node types as show below:
+* Add the `zones` parameter set to `["1", "2", "3"]` to existing node types:
 
 ```json
 {
@@ -160,18 +160,18 @@ Requirements:
 }
 ```
 
-5) Scale Node types to add **Zonal** nodes and remove **Regional** nodes
+4) Scale node types to add **Zonal** nodes and remove **Regional** nodes
 
-At this stage, the VMSS is marked as zone-resilient. Consequently, when scaling up, newly added nodes will be zonal, and when scaling down, regional nodes will be removed. This provides the flexibility to scale in any order that aligns with your capacity requirements by adjusting the `vmInstanceCount` property on the node types.
+At this stage, the virtual machine scale set is marked as zone resilient. So, when scaling up, newly added nodes are zonal, and when scaling down, regional nodes are removed. This approach provides the flexibility to scale in any order that aligns with your capacity requirements by adjusting the `vmInstanceCount` property on the node types.
 
-For example, if the initial vmInstanceCount is set to 6 (indicating 6 regional nodes), you can perform 2 deployments:
+For example, if the initial vmInstanceCount is set to 6 (indicating 6 regional nodes), you can perform 2 deployments:
 - First deployment: Increase the vmInstanceCount to 12 to add 6 **Zonal** nodes.
 - Second deployment: Decrease the vmInstanceCount to 6 to remove all **Regional** nodes.
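The two deployments above reduce to simple count bookkeeping (an illustrative sketch of the intended end state, not an Azure API call):

```python
# Once the scale set is zone resilient: scaling up adds zonal nodes,
# scaling down removes regional nodes (per the behavior described above).
regional, zonal = 6, 0  # initial vmInstanceCount = 6, all regional

# First deployment: raise vmInstanceCount from 6 to 12 -> six zonal nodes added.
zonal += 12 - (regional + zonal)

# Second deployment: lower vmInstanceCount from 12 to 6 -> six regional nodes removed.
regional -= (regional + zonal) - 6

print(regional, zonal)  # -> 0 6
```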
 
-Throughout the process, you can check the `getazresiliencystatus` API to retrieve the progress status, as illustrated below. The process is considered complete once each node type has a minimum of 6 zonal nodes and 0 regional nodes.
+Throughout the process, you can check the `getazresiliencystatus` API to retrieve the progress status, as illustrated below. The process is considered complete once each node type has a minimum of 6 zonal nodes and 0 regional nodes.
 
-```json
-{
+```json
+{
 "baseResourceStatus" :[
 {
 "resourceName": "sfmccluster1"
@@ -197,14 +197,14 @@ Throughout the process, you can check the `getazresiliencystatus` API to retriev
 }
 ],
 "isClusterZoneResilient": false
-}
-```
->[!NOTE]
-> The scaling process for the primary node type will require additional time, as each addition or removal of a node will initiate a service fabric cluster upgrade.
+}
+```
+>[!NOTE]
+> The scaling process for the primary node type requires additional time, as each addition or removal of a node initiates a Service Fabric cluster upgrade.
 
-6) Mark the cluster resilient to zone failures
+5) Mark the cluster resilient to zone failures
 
-This step helps in future deployments, since it ensures all future deployments of node types span across availability zones and thus cluster remains tolerant to AZ failures. Set `zonalResiliency: true` in the cluster ARM template and do a deployment to mark cluster as zone resilient and ensure all new node type deployments span across availability zones. This will only be allowed if all node types have at least 6 zonal nodes and 0 regional nodes.
+This step helps future deployments, since it ensures all future deployments of node types span across availability zones and thus the cluster remains tolerant to AZ failures. Set `zonalResiliency: true` in the cluster ARM template and do a deployment to mark the cluster as zone resilient and ensure all new node type deployments span across availability zones. This update is only allowed if all node types have at least 6 zonal nodes and 0 regional nodes.
 
 ```json
 {
@@ -215,9 +215,9 @@ Throughout the process, you can check the `getazresiliencystatus` API to retriev
 ```
 You can also see the updated status in the portal under Overview -> Properties, similar to `Zonal resiliency True`, once complete.
 
-7) Validate all the resources are zone resilient
+6) Validate all the resources are zone resilient
 
-To validate the Availability Zone resiliency state for the resources of the managed cluster use the following GET API call:
+To validate the Availability Zone resiliency state for the resources of the managed cluster, use the following API call:
 
 ```http
 POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ServiceFabric/managedClusters/{clusterName}/getazresiliencystatus?api-version=2022-02-01-preview
@@ -252,14 +252,14 @@ Throughout the process, you can check the `getazresiliencystatus` API to retriev
 "isClusterZoneResilient": true
 }
 ```
-If you run in to any problems reach out to support for assistance.
+If you run into any problems, reach out to support for assistance.
 
 ## Enable FastZonalUpdate on Service Fabric managed clusters (preview)
-Service Fabric managed clusters support faster cluster and application upgrades by reducing the max upgrade domains per availability zone. The default configuration right now can have at most 15 UDs in multiple AZ nodetype. This huge number of UDs reduced the upgrade velocity. The new configuration reduces the max UDs, which results in faster updates, keeping the safety of the upgrades intact.
+Service Fabric managed clusters support faster cluster and application upgrades by reducing the max upgrade domains (UDs) per availability zone. The default configuration can have at most 15 UDs in a multiple-AZ node type. This large number of UDs reduces the upgrade velocity. The new configuration reduces the max UDs, which results in faster updates while keeping the safety of the upgrades intact.
 
 The update should be done via an ARM template by setting the zonalUpdateMode property to "fast" and then modifying a node type attribute, such as adding and then removing a node on each node type (see required steps 2 and 3). The Service Fabric managed cluster resource apiVersion should be 2022-10-01-preview or later.
 
-1. Modify the ARM template with the new property mentioned above.
+1. Modify the ARM template with the new property zonalUpdateMode.
 ```json
 "resources": [
 {
