# Automatic instance repairs for Azure virtual machine scale sets
Enabling automatic instance repairs for Azure virtual machine scale sets helps achieve high availability for applications by maintaining a set of healthy instances. If an instance in the scale set is found to be unhealthy as reported by [Application Health extension](./virtual-machine-scale-sets-health-extension.md) or [Load balancer health probes](../load-balancer/load-balancer-custom-probe-overview.md), then this feature automatically performs instance repair by deleting the unhealthy instance and creating a new one to replace it.
-
> [!NOTE]
-
> This preview feature is provided without a service level agreement, and it's not recommended for production workloads.
-
## Requirements for using automatic instance repairs
-
**Opt in for the automatic instance repairs preview**
-
-
Use either the REST API or Azure PowerShell to opt in for the automatic instance repairs preview. These steps will register your subscription for the preview feature. Note this is only a one-time setup required for using this feature. If your subscription is already registered for automatic instance repairs preview, then you do not need to register again.
-
-
Using REST API
-
-
1. Register for the feature using [Features - Register](/rest/api/resources/features/register)
-
-
```
-
POST on '/subscriptions/{subscriptionId}/providers/Microsoft.Features/providers/Microsoft.Compute/features/RepairVMScaleSetInstancesPreview/register?api-version=2015-12-01'
2. Wait for a few minutes for the *State* to change to *Registered*. You can use the following API to confirm this.
-
-
```
-
GET on '/subscriptions/{subscriptionId}/providers/Microsoft.Features/providers/Microsoft.Compute/features/RepairVMScaleSetInstancesPreview?api-version=2015-12-01'
3. Once the *State* has changed to *Registered*, then run the following.
-
-
```
-
POST on '/subscriptions/{subscriptionId}/providers/Microsoft.Compute/register?api-version=2015-12-01'
-
```
-
-
Using Azure PowerShell
-
-
1. Register for the feature using cmdlet [Register-AzureRmResourceProvider](/powershell/module/azurerm.resources/register-azurermresourceprovider) followed by [Register-AzureRmProviderFeature](/powershell/module/azurerm.resources/register-azurermproviderfeature)
-
-
```azurepowershell-interactive
-
Register-AzureRmResourceProvider `
-
-ProviderNamespace Microsoft.Compute
-
-
Register-AzureRmProviderFeature `
-
-ProviderNamespace Microsoft.Compute `
-
-FeatureName RepairVMScaleSetInstancesPreview
-
```
-
-
2. Wait for a few minutes for the *RegistrationState* to change to *Registered*. You can use the following cmdlet to confirm this.
3. Once the *RegistrationState* changes to *Registered*, run the following cmdlet.
-
-
```azurepowershell-interactive
-
Register-AzureRmResourceProvider `
-
-ProviderNamespace Microsoft.Compute
-
```
-
**Enable application health monitoring for scale set**
The scale set should have application health monitoring for instances enabled. This can be done using either [Application Health extension](./virtual-machine-scale-sets-health-extension.md) or [Load balancer health probes](../load-balancer/load-balancer-custom-probe-overview.md). Only one of these can be enabled at a time. The application health extension or the load balancer probes ping the application endpoint configured on virtual machine instances to determine the application health status. This health status is used by the scale set orchestrator to monitor instance health and perform repairs when required.
**Configure endpoint to provide health status**
-
Before enabling automatic instance repairs policy, ensure that the scale set instances have application endpoint configured to emit the application health status. When an instance returns status 200 (OK) on this application endpoint, then the instance is marked as “Healthy”. In all other cases, the instance is marked “Unhealthy”, including the following scenarios:
+
Before enabling the automatic instance repairs policy, ensure that the scale set instances have an application endpoint configured to emit the application health status. When an instance returns status 200 (OK) on this application endpoint, the instance is marked as "Healthy". In all other cases, the instance is marked "Unhealthy", including the following scenarios:
- When there is no application endpoint configured inside the virtual machine instances to provide application health status
- When the application endpoint is incorrectly configured
- When the application endpoint is not reachable
-
For instances marked as “Unhealthy”, automatic repairs are triggered by the scale set. Ensure the application endpoint is correctly configured before enabling the automatic repairs policy in order to avoid unintended instance repairs, while the endpoint is getting configured.
+
For instances marked as "Unhealthy", automatic repairs are triggered by the scale set. Ensure the application endpoint is correctly configured before enabling the automatic repairs policy, to avoid unintended instance repairs while the endpoint is being configured.
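As a rough illustration of such an endpoint, the following Python sketch serves a health check using only the standard library. The `/health` path, the port, and the `application_is_healthy` function are hypothetical names chosen for this example; they are not part of the scale set feature, and your probe configuration determines the real path and port.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATH = "/health"  # hypothetical path; match whatever your probe requests


def application_is_healthy() -> bool:
    """Placeholder check; replace with real readiness or dependency checks."""
    return True


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only a 200 (OK) response marks the instance "Healthy".
        # Any other status, or no response at all, marks it "Unhealthy".
        if self.path == HEALTH_PATH and application_is_healthy():
            status, body = 200, b"OK"
        elif self.path == HEALTH_PATH:
            status, body = 503, b"Unhealthy"
        else:
            status, body = 404, b"Not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of stderr


def make_server(port: int = 8080) -> HTTPServer:
    """Build the server; call serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)

# On an instance you would run: make_server().serve_forever()
```

Verifying that this endpoint returns 200 from outside the instance before turning on the policy helps avoid the unintended repairs described above.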
**Enable single placement group**
-
This preview is currently available only for scale sets deployed as single placement group. The property *singlePlacementGroup* should be set to *true* for your scale set to use automatic instance repairs feature. Learn more about [placement groups](./virtual-machine-scale-sets-placement-groups.md#placement-groups).
+
This feature is currently available only for scale sets deployed as a single placement group. The property *singlePlacementGroup* should be set to *true* for your scale set to use the automatic instance repairs feature. Learn more about [placement groups](./virtual-machine-scale-sets-placement-groups.md#placement-groups).
**API version**
Automatic repairs policy is supported for compute API version 2018-10-01 or higher.
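A small pre-flight check can catch an outdated API version before a request is sent. The sketch below assumes API versions are date strings of the form `YYYY-MM-DD`, optionally followed by a suffix such as `-preview`; the function name is hypothetical.

```python
from datetime import date

# Minimum compute API version stated in this article.
MIN_API_VERSION = date(2018, 10, 1)


def supports_automatic_repairs(api_version: str) -> bool:
    """Return True if the date part of api_version is 2018-10-01 or later."""
    # Compare only the leading YYYY-MM-DD part; any suffix is ignored.
    return date.fromisoformat(api_version[:10]) >= MIN_API_VERSION
```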
**Restrictions on resource or subscription moves**
-
As part of this preview, resource or subscription moves are currently not supported for scale sets when automatic repairs policy is enabled.
+
Resource or subscription moves are currently not supported for scale sets when automatic repairs policy is enabled.
**Restriction for Service Fabric scale sets**
-
This preview feature is currently not supported for service fabric scale sets.
+
This feature is currently not supported for Service Fabric scale sets.
## How do automatic instance repairs work?
@@ -147,7 +64,7 @@ When an instance goes through a state change operation because of a PUT, PATCH o
The automatic instance repairs process works as follows:
1. [Application Health extension](./virtual-machine-scale-sets-health-extension.md) or [Load balancer health probes](../load-balancer/load-balancer-custom-probe-overview.md) ping the application endpoint inside each virtual machine in the scale set to get application health status for each instance.
-
2. If the endpoint responds with a status 200 (OK), then the instance is marked as “Healthy”. In all the other cases (including if the endpoint is unreachable), the instance is marked “Unhealthy”.
+
2. If the endpoint responds with a status 200 (OK), then the instance is marked as "Healthy". In all the other cases (including if the endpoint is unreachable), the instance is marked "Unhealthy".
3. When an instance is found to be unhealthy, the scale set triggers a repair action by deleting the unhealthy instance and creating a new one to replace it.
4. Instance repairs are performed in batches. At any given time, no more than 5% of the total instances in the scale set are repaired. If a scale set has fewer than 20 instances, the repairs are done for one unhealthy instance at a time.
5. The above process continues until all unhealthy instances in the scale set are repaired.
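The batching rule in step 4 can be sketched as a small calculation. This is an illustration only, not the scale set orchestrator's implementation: the function name is hypothetical, and rounding the 5% limit down to a whole instance is an assumption.

```python
def repair_batch_size(total_instances: int, unhealthy_instances: int) -> int:
    """Estimate how many unhealthy instances may be repaired at once."""
    if total_instances <= 0 or unhealthy_instances <= 0:
        return 0
    if total_instances < 20:
        # Fewer than 20 instances: one unhealthy instance at a time.
        limit = 1
    else:
        # No more than 5% of all instances (assumed to round down).
        limit = total_instances * 5 // 100
    return min(unhealthy_instances, limit)
```

For example, under these assumptions a 100-instance scale set with 12 unhealthy instances would repair at most 5 at a time, while a 10-instance scale set would repair them one by one.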
@@ -160,6 +77,20 @@ If an instance in a scale set is protected by applying the *[Protect from scale-
To enable the automatic repairs policy while creating a new scale set, ensure that all the [requirements](#requirements-for-using-automatic-instance-repairs) for opting in to this feature are met. The application endpoint should be correctly configured for scale set instances to avoid triggering unintended repairs while the endpoint is being configured. For newly created scale sets, instance repairs are performed only after the grace period has elapsed. To enable automatic instance repairs in a scale set, use the *automaticRepairsPolicy* object in the virtual machine scale set model.
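As a minimal sketch, the fragment of the scale set model that carries this policy could be built as follows. The helper names are hypothetical, and the `enabled` and `gracePeriod` field names are assumptions not shown in this article, so check them against the REST API reference. The grace period uses the ISO 8601 duration form seen in the PowerShell examples (`PT30M`).

```python
import json


def grace_period_iso8601(minutes: int) -> str:
    """Format a grace period in minutes as an ISO 8601 duration, e.g. PT30M."""
    if minutes <= 0:
        raise ValueError("grace period must be a positive number of minutes")
    return f"PT{minutes}M"


def automatic_repairs_model_fragment(minutes: int, enabled: bool = True) -> dict:
    """Build the part of the scale set model that holds the repairs policy."""
    return {
        "properties": {
            "automaticRepairsPolicy": {
                # Field names assumed; verify against the REST API reference.
                "enabled": enabled,
                "gracePeriod": grace_period_iso8601(minutes),
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(automatic_repairs_model_fragment(30), indent=2))
```

The resulting dictionary would be merged into the request body of a create or update call that uses API version 2018-10-01 or higher.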
+
### Azure portal
+
+
The following steps enable the automatic repairs policy when creating a new scale set.
+
+
1. Go to **Virtual machine scale sets**.
+
1. Select **+ Add** to create a new scale set.
+
1. Go to the **Health** tab.
+
1. Locate the **Health** section.
+
1. Enable the **Monitor application health** option.
+
1. Locate the **Automatic repair policy** section.
+
1. Turn **On** the **Automatic repairs** option.
+
1. In **Grace period (min)**, enter the desired amount of time for which automatic repairs are suspended.
+
1. When you are done creating the new scale set, select the **Review + create** button.
+
### REST API
The following example shows how to enable automatic instance repair in a scale set model. Use API version 2018-10-01 or higher.
@@ -193,10 +124,43 @@ New-AzVmssConfig `
-AutomaticRepairGracePeriod "PT30M"
```
+
### Azure CLI 2.0
+
+
The following example enables the automatic repairs policy while creating a new scale set. First create a resource group, then create a new scale set with the automatic repairs policy grace period set to 30 minutes.
+
+
```azurecli-interactive
+
az group create --name <myResourceGroup> --location <VMSSLocation>
> When creating a new scale set using Azure CLI, you cannot enable the automatic instance repairs policy property using the app health extension. The above example uses the load balancer parameter instead.
+
## Enabling automatic repairs policy when updating an existing scale set
Before enabling the automatic repairs policy in an existing scale set, ensure that all the [requirements](#requirements-for-using-automatic-instance-repairs) for opting in to this feature are met. The application endpoint should be correctly configured for scale set instances to avoid triggering unintended repairs while the endpoint is being configured. To enable automatic instance repairs in a scale set, use the *automaticRepairsPolicy* object in the virtual machine scale set model.
+
### Azure portal
+
+
You can modify the automatic repairs policy of an existing scale set through the Azure portal.
+
+
1. Go to an existing virtual machine scale set.
+
1. Under **Settings** in the menu on the left, select **Health and repair**.
+
1. Enable the **Monitor application health** option.
+
1. Locate the **Automatic repair policy** section.
+
1. Turn **On** the **Automatic repairs** option.
+
1. In **Grace period (min)**, enter the desired amount of time for which automatic repairs are suspended.
+
1. When you are done, select **Save**.
+
### REST API
The following example enables the policy with a grace period of 40 minutes. Use API version 2018-10-01 or higher.
@@ -228,11 +192,23 @@ Update-AzVmss `
-AutomaticRepairGracePeriod "PT40M"
```
+
### Azure CLI 2.0
+
+
The following example updates the automatic instance repairs policy of an existing scale set.
+
+
```azurecli-interactive
+
az vmss update \
+
--resource-group <myResourceGroup> \
+
--name <myVMScaleSet> \
+
--enable-automatic-repairs true \
+
--automatic-repairs-grace-period 30
+
```
+
## Troubleshoot
**Failure to enable automatic repairs policy**
-
If you get a ‘BadRequest’ error with a message stating “Could not find member ‘automaticRepairsPolicy’ on object of type ‘properties’”, then check the API version used for virtual machine scale set. API version 2018-10-01 or higher is required for this feature.
+
If you get a 'BadRequest' error with a message stating "Could not find member 'automaticRepairsPolicy' on object of type 'properties'", then check the API version used for the virtual machine scale set. API version 2018-10-01 or higher is required for this feature.
**Instance not getting repaired even when policy is enabled**
@@ -242,6 +218,8 @@ The instance could be in grace period. This is the amount of time to wait after
You can use the [Get Instance View API](/rest/api/compute/virtualmachinescalesetvms/getinstanceview) for instances in a virtual machine scale set to view the application health status. With Azure PowerShell, you can use the cmdlet [Get-AzVmssVM](/powershell/module/az.compute/get-azvmssvm) with the *-InstanceView* flag. The application health status is provided under the property *vmHealth*.
+
In the Azure portal, you can see the health status as well. Go to an existing scale set, select **Instances** from the menu on the left, and look at the **Health state** column for the health status of each scale set instance.
+
## Next steps
Learn how to configure [Application Health extension](./virtual-machine-scale-sets-health-extension.md) or [Load balancer health probes](../load-balancer/load-balancer-custom-probe-overview.md) for your scale sets.