Merge pull request #238635 from anantshankar17/master

prmerger-automator[bot] · web-flow · commit 35c76ad6f5e9 · 2023-05-25T12:18:13.000Z
Updating nodetype removal
diff --git a/articles/service-fabric/infrastructure-service-faq.md b/articles/service-fabric/infrastructure-service-faq.md
@@ -48,4 +48,10 @@ Platform and Tenant updates acknowledged by Service Fabric are performed by the
 All Tenant update operations in a Service Fabric cluster are approved only if determined to be safe by Service Fabric. Updates are blocked when Service Fabric can't ensure if the operations are safe. While this generally removes the need for customers to worry about if a given operation is safe or not, it's advised performing operations after understanding their impact. 
 
 ### I want to bypass Infrastructure Service and perform operations on my cluster. How do I do that?
-Bypassing Infrastructure Service for any infrastructure updates is a risky operation and isn't recommended. Engage [Service Fabric support](service-fabric-support.md) experts before deciding to perform these steps.
+Bypassing Infrastructure Service for any infrastructure updates is a risky operation and can result in stuck updates if the safety checks block the repairs from getting approved.
+In certain scenarios, when the default throttling blocks other updates because existing ones aren't making progress, you can manually allow more updates. After connecting to the Service Fabric cluster, run the following command:
+```powershell
+   Invoke-ServiceFabricInfrastructureCommand -ServiceName "fabric:/System/InfrastructureService/<nodetype name>" -Command AllowAction:<MR_Jobid_Guid>:*:Prepare
+```
+The `MR_Jobid_Guid` value used above is the JobId of the pending update, shown under the "Infrastructure Jobs" tab at the root of Service Fabric Explorer.
+Engage [Service Fabric support](service-fabric-support.md) experts if this doesn't resolve the issue.
diff --git a/articles/service-fabric/service-fabric-how-to-remove-node-type.md b/articles/service-fabric/service-fabric-how-to-remove-node-type.md
@@ -10,7 +10,7 @@ ms.date: 07/14/2022
 ---
 
 # How to remove a Service Fabric node type
-This article describes how to scale an Azure Service Fabric cluster by removing an existing node type from a cluster. A Service Fabric cluster is a network-connected set of virtual or physical machines into which your microservices are deployed and managed. A machine or VM that's part of a cluster is called a node. Virtual machine scale sets are an Azure compute resource that you use to deploy and manage a collection of virtual machines as a set. Every node type that is defined in an Azure cluster is [set up as a separate scale set](service-fabric-cluster-nodetypes.md). Each node type can then be managed separately. After creating a Service Fabric cluster, you can scale a cluster horizontally by removing a node type (virtual machine scale set) and all of it's nodes.  You can scale the cluster at any time, even when workloads are running on the cluster.  As the cluster scales, your applications automatically scale as well.
+This article describes how to scale an Azure Service Fabric cluster by removing an existing node type from a cluster. A Service Fabric cluster is a network-connected set of virtual or physical machines into which your microservices are deployed and managed. A machine or VM that's part of a cluster is called a node. Virtual machine scale sets are an Azure compute resource that you use to deploy and manage a collection of virtual machines as a set. Every node type that is defined in an Azure cluster is [set up as a separate scale set](service-fabric-cluster-nodetypes.md). Each node type can then be managed separately. After creating a Service Fabric cluster, you can scale a cluster horizontally by removing a node type (virtual machine scale set) and all of its nodes.  You can scale the cluster at any time, even when workloads are running on the cluster.  As the cluster scales, your applications automatically scale as well.
 
 > [!WARNING]
 > Using this approach to remove a node type from a production cluster is
@@ -31,10 +31,10 @@ When removing a node type that is Bronze, all the nodes in the node type go down
 
 ## Remove a node type
 
-1. Please take care of this pre-requisites before you start the process.
+1. Take care of these prerequisites before you start the process.
 
     - The cluster is healthy.
-    - There will still be sufficient capacity after the node type is removed, eg. number of nodes to place required replica count.
+    - There will still be sufficient capacity after the node type is removed, for example, enough nodes to place the required replica count.
 
 2. Move all services that have placement constraints to use node type off the node type.
 
@@ -77,7 +77,7 @@ When removing a node type that is Bronze, all the nodes in the node type go down
     ```
 
     - For bronze durability, wait for all nodes to get to disabled state
-    - For silver and gold durability, some nodes will go in to disabled and the rest will be in disabling state. Check the details tab of the nodes in disabling state, if they are all stuck on ensuring quorum for Infrastructure service partitions, then it is safe to continue.
+    - For silver and gold durability, some nodes go into disabled state and the rest remain in disabling state. Check the details tab of the nodes in disabling state. If they are all stuck on ensuring quorum for Infrastructure service partitions, it is safe to continue.
 
 5. Stop data for the node type.
 
@@ -168,7 +168,7 @@ When removing a node type that is Bronze, all the nodes in the node type go down
     },
     ```
 
-    - Deploy the modified Azure Resource Manager template. ** This step will take a while, usually up to two hours. This upgrade will change settings to the InfrastructureService, therefore a node restart is needed. In the this case `forceRestart` is ignored. 
+    - Deploy the modified Azure Resource Manager template. ** This step takes a while, usually up to two hours. This upgrade changes settings to the InfrastructureService, therefore a node restart is needed. In this case `forceRestart` is ignored.
     The parameter `upgradeReplicaSetCheckTimeout` specifies the maximum time that Service Fabric waits for a partition to be in a safe state, if not already in a safe state. Once safety checks pass for all partitions on a node, Service Fabric proceeds with the upgrade on that node.
     The value for the parameter `upgradeTimeout` can be reduced to 6 hours, but for maximal safety 12 hours should be used.
 
@@ -186,7 +186,11 @@ When removing a node type that is Bronze, all the nodes in the node type go down
     
 10. Remove resources relating to the node type that are no longer in use. Example Load Balancer, and Public IP. 
 
-    - To remove these resources you can use the same PowerShell command as used in step 6 specifying the specific resource type and API version. 
+    - To remove these resources, you can use the same PowerShell command as used in step 6 specifying the specific resource type and API version.
+    - For silver and gold durability, any repair task left in the cluster that targets nodes from the removed node type should be completed with the command:
+    ```powershell
+       Complete-ServiceFabricRepairTask -TaskId <repair task name>
+    ```
 
 > [!Note]
 > This step is optional if same Load Balancer, and IP is reused between node types.