Commit 10688b2

Changed structure of Troubleshoot document. Updated overview page
1 parent 5f7e01f commit 10688b2

2 files changed

+93
-44
lines changed

articles/virtual-machines/maintenance-configurations.md

Lines changed: 6 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -57,21 +57,23 @@ This scope is integrated with [Update Manager](../update-center/overview.md), wh
5757
- The maximum maintenance window is 3 hours 55 minutes.
5858
- A minimum of 1 hour and 30 minutes is required for the maintenance window.
5959
- The value of **Repeat** should be at least 6 hours.
60-
- The start time for a schedule should be at least 10 minutes after the schedule's creation time.
60+
- The start time for a schedule should be at least 15 minutes after the schedule's creation time.
6161

6262
>[!NOTE]
6363
> 1. The minimum maintenance window has been increased from 1 hour 10 minutes to 1 hour 30 minutes, and the minimum repeat value has been set to 6 hours for new schedules. **Existing schedules aren't affected; however, we strongly recommend updating them to include these changes.**
6464
> 2. The combined character count of the resource group name and the Maintenance Configuration name should be less than 128 characters.
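
The limits listed above can be checked before a schedule is submitted. The following is a minimal sketch, not part of the Maintenance Configurations service or any Azure SDK: the function name `validate_schedule` and its parameters are hypothetical, and only the numeric limits come from the text above.

```python
from datetime import datetime, timedelta

# Limits quoted from the list and note above.
MIN_WINDOW = timedelta(hours=1, minutes=30)
MAX_WINDOW = timedelta(hours=3, minutes=55)
MIN_REPEAT = timedelta(hours=6)
MIN_START_LEAD = timedelta(minutes=15)
MAX_COMBINED_NAME_LENGTH = 128  # combined length should be less than this


def validate_schedule(created_at, start_at, window, repeat,
                      resource_group, config_name):
    """Hypothetical helper: return a list of violated limits (empty if none)."""
    problems = []
    if window < MIN_WINDOW:
        problems.append("maintenance window is shorter than 1 hour 30 minutes")
    if window > MAX_WINDOW:
        problems.append("maintenance window is longer than 3 hours 55 minutes")
    if repeat < MIN_REPEAT:
        problems.append("repeat value is less than 6 hours")
    if start_at - created_at < MIN_START_LEAD:
        problems.append("start time is less than 15 minutes after creation")
    if len(resource_group) + len(config_name) >= MAX_COMBINED_NAME_LENGTH:
        problems.append("combined name length is 128 characters or more")
    return problems


if __name__ == "__main__":
    created = datetime(2024, 1, 1, 12, 0)
    for problem in validate_schedule(created, created + timedelta(minutes=10),
                                     timedelta(hours=1), timedelta(hours=4),
                                     "my-resource-group", "my-config"):
        print(problem)
```

Returning every violation at once, rather than stopping at the first, lets a caller fix a schedule in a single pass.
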
6565
66-
In rare cases if platform catchup host update window happens to coincide with the guest (VM) patching window and if the guest patching window don't get sufficient time to execute after host update then the system would show **Schedule timeout, waiting for an ongoing update to complete the resource** error since only a single update is allowed by the platform at a time.
66+
Maintenance Configuration provides two scheduled patching modes for In-guest VMs: Static Mode and [Dynamic Scope](../update-manager/dynamic-scope-overview.md) Mode. By default, the system operates in Static Mode if no Dynamic Scope Mode is configured. To schedule or modify the maintenance configuration in either mode, a buffer of 15 minutes prior to the scheduled patch time is required. For instance, if the patch is scheduled for 3PM, all modifications, including adding or removing VMs, altering the dynamic scope etc., should be finalized by 2:45PM.
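
The 15-minute buffer described above amounts to a simple cutoff calculation. The sketch below is illustrative only; `modification_cutoff` and `can_modify` are hypothetical names, and the only value taken from the text is the 15-minute buffer.

```python
from datetime import datetime, timedelta

# Buffer required before the scheduled patch time, per the paragraph above.
BUFFER = timedelta(minutes=15)


def modification_cutoff(patch_start):
    """Latest time at which changes to the configuration should be finalized."""
    return patch_start - BUFFER


def can_modify(now, patch_start):
    """True if a change made at `now` still respects the buffer."""
    return now <= modification_cutoff(patch_start)


if __name__ == "__main__":
    patch_start = datetime(2024, 1, 1, 15, 0)  # patch scheduled for 3 PM
    print(modification_cutoff(patch_start).strftime("%H:%M"))
    print(can_modify(datetime(2024, 1, 1, 14, 50), patch_start))
```

For a patch scheduled at 3 PM, the cutoff works out to 2:45 PM, matching the example in the text.
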
6767

6868
To learn more about this topic, see [Update Manager and scheduled patching](../update-center/scheduled-patching.md).
6969

7070
> [!IMPORTANT]
7171
> If you move a resource to a different resource group or subscription, scheduled patching for the resource stops working because this scenario is currently unsupported. The team is working to provide this capability. In the meantime, use the following workaround for the resource you want to move (in static scope):
72+
>
7273
> 1. Remove its assignment.
7374
> 2. Move the resource to a different resource group or subscription.
7475
> 3. Recreate the assignment.
76+
>
7577
> In the dynamic scope, the steps are similar, but after removing the assignment in step 1, you simply need to initiate or wait for the next scheduled run. This action prompts the system to completely remove the assignment, enabling you to proceed with steps 2 and 3.
7678
> If you miss any of these steps, you can reassign the resource to the original assignment and repeat the steps in order.
7779
@@ -117,4 +119,5 @@ The following are the Dynamic Scope recommended limits for **each dynamic scope*
117119

118120
## Next steps
119121

120-
To learn more, see [Maintenance and updates](maintenance-and-updates.md).
122+
To troubleshoot issues, see [Troubleshoot Maintenance Configurations](troubleshoot-maintenance-configurations.md).
123+
To learn more, see [Maintenance and updates](maintenance-and-updates.md).

articles/virtual-machines/troubleshoot-maintenance-configurations.md

Lines changed: 87 additions & 41 deletions
Original file line number | Diff line number | Diff line change
@@ -11,87 +11,133 @@ ms.author: lnagpal
1111

1212
# Troubleshoot issues with Maintenance Configurations
1313

14-
This article describes the open and fixed issues that might occur when you use Maintenance Configurations, their scope and their mitigation steps.
14+
This article outlines common errors that might arise when you deploy or use Maintenance Configurations for scheduled patching, along with steps to resolve them.
1515

16-
## Fixed Issues
16+
### Shutdown and Unresponsive VM when using `dynamic` scope in Guest Maintenance
1717

18-
#### Shutdown and Unresponsive VM in Guest Maintenance Scope
18+
#### Issue
19+
Scheduled patching doesn't install the patches on the VMs and returns the error `ShutdownOrUnresponsive`.
1920

20-
##### Dynamic Scope
21+
#### Resolution
22+
The cleanup process for the maintenance configuration assignment takes 12 hours to complete, so wait at least 12 hours before recreating a VM with the same name.
23+
If a VM is recreated with the same name before the cleanup completes, the Maintenance Configuration service can't trigger the schedule.
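
The 12-hour buffer can be expressed as a small check before a VM name is reused. This is a sketch under one assumption that the text doesn't confirm: that the 12-hour cleanup period starts when the original VM is deleted. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# Cleanup duration for the maintenance configuration assignment, per the text.
CLEANUP_PERIOD = timedelta(hours=12)


def earliest_safe_recreation(deleted_at):
    """Earliest time to reuse the VM name.

    Assumption: the cleanup period is counted from the VM's deletion time.
    """
    return deleted_at + CLEANUP_PERIOD


def is_safe_to_recreate(deleted_at, now):
    """True once the assumed cleanup period has fully elapsed."""
    return now >= earliest_safe_recreation(deleted_at)


if __name__ == "__main__":
    deleted = datetime(2024, 1, 1, 8, 0)
    print(earliest_safe_recreation(deleted))
    print(is_safe_to_recreate(deleted, datetime(2024, 1, 1, 18, 0)))
```
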
2124

22-
It takes 12 hours to complete the cleanup process for the maintenance configuration assignment. If a new VM is recreated with the same name before the cleanup, the backend service is unable to trigger the schedule.
25+
### Shutdown and Unresponsive VM when using `static` scope in Guest Maintenance
2326

24-
##### Static Scope
27+
#### Issue
28+
Scheduled patching doesn't install the patches on the VMs and returns the error `ShutdownOrUnresponsive`.
2529

26-
Ensure that the VM is up and running. If the VM was indeed up and running, and the issue persists, verify whether the VM was recreated with the same name within a 12-hour window. If so, delete all configuration assignments associated with the recreated VM and then proceed to recreate the assignments.
30+
#### Resolution
31+
In a static scope, don't rely on configuration assignments made before the VM was recreated. Instead, reassign the configurations after you recreate the instance.
2732

28-
#### Failed to create dynamic scope due to RBAC
33+
### Scheduled patching stops working after the resource is moved
2934

35+
#### Issue
36+
If a resource is moved to a different resource group or subscription, then scheduled patching for the resource stops working.
37+
38+
#### Resolution
39+
Moving a resource or a Maintenance Configuration across resource groups or subscriptions is currently unsupported. The team is working to provide this capability. In the meantime, use the following workaround for the resource you want to move (in static scope):
40+
41+
1. Remove its assignment.
42+
2. Move the resource to a different resource group or subscription.
43+
3. Recreate the assignment.
44+
45+
In the dynamic scope, the steps are similar, but after removing the assignment in step 1, you simply need to initiate or wait for the next scheduled run. This action prompts the system to completely remove the assignment, enabling you to proceed with steps 2 and 3.
46+
47+
If you miss any of these steps, you can reassign the resource to the original assignment and repeat the steps in order.
48+
49+
### Dynamic Scope creation fails
50+
51+
#### Issue
52+
Creating a dynamic scope fails because of RBAC permissions.
53+
54+
#### Resolution
3055
To create a dynamic scope, the user must have permission at the subscription level or at the resource group level. Refer to the [list of permissions for different resources](../update-manager/overview.md#permissions) for more details.
3156

32-
#### Apply Update stuck and Update not progressing
57+
### Apply Update stuck and Update not progressing
58+
59+
#### Issue
3360
**Applies to:** :heavy_check_mark: Dedicated Hosts :heavy_check_mark: VMs
61+
A user-initiated update is stuck for a long time and isn't progressing.
62+
63+
#### Resolution
64+
If a resource is redeployed to a different cluster, and a pending update request is created using the old cluster value, the request becomes stuck indefinitely. If the status of the apply update operation is closed/not found, then retry after 120 hours. If the issue persists, contact the support team for further mitigation.
65+
66+
### Dedicated host updates even after Maintenance Configuration is attached
3467

35-
If a resource is redeployed to a different cluster, and a pending update request is created using the old cluster value, the request becomes stuck indefinitely. If a request is stuck for an extended period (more than 300 minutes), contact the support team for further mitigation.
68+
#### Issue
69+
A Dedicated Host update isn't blocked by the Maintenance Configuration, and the host is updated even after the Maintenance Configuration is attached.
3670

37-
#### Dedicated host update even after Maintenance Configuration is attached
71+
#### Resolution
72+
If a Dedicated Host is recreated with the same name, the Maintenance Configuration service retains the old Dedicated Host ID, which prevents it from blocking updates. To resolve this issue, remove the Maintenance Configuration and reassign it. If the issue persists, reach out to the support team for further assistance.
3873

39-
If a Dedicated Host is recreated with the same name, the backend retains the old Dedicated Host ID, preventing it from blocking updates. Customers can resolve this issue by removing the maintenance configuration and reassigning it for mitigation. If the issue persists, reach out to the support team for further assistance.
74+
### Install patch operation fails for invalid classification type
4075

41-
#### Install patch operation failed due to invalid classification type in Maintenance Configuration
76+
#### Issue
77+
The install patch operation failed due to an invalid classification type in the Maintenance Configuration.
4278

79+
#### Resolution
4380
Due to a previous bug, the system patch operation couldn't perform validation, and an invalid classification type was found in the Maintenance Configuration. The bug has been fixed and deployed. To address this issue, customers can update the Maintenance Configuration and set the correct classification type.
4481

45-
## Open Issues
82+
### Schedule didn't trigger
4683

47-
#### Schedule Patching stops working after the resource is moved
84+
#### Issue
85+
If a resource has two maintenance configurations with the same trigger time and an install patch configuration, and both are assigned to the same VM/resource, only one maintenance configuration triggers.
4886

49-
If you move a resource to a different resource group or subscription, then scheduled patching for the resource stops working as this scenario is currently unsupported by the system. The team is working to provide this capability but in the meantime, as a workaround, for the resource you want to move (in static scope)
50-
1. You need to remove the assignment of it
51-
2. Move the resource to a different resource group or subscription
52-
3. Recreate the assignment of it
53-
In the dynamic scope, the steps are similar, but after removing the assignment in step 1, you simply need to initiate or wait for the next scheduled run. This action prompts the system to completely remove the assignment, enabling you to proceed with steps 2 and 3.
87+
#### Resolution
88+
To mitigate the issue, modify the start time of one of the maintenance configurations. This is a known system limitation: the service can't identify which maintenance configuration to trigger. The team is working on removing this limitation.
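
Conflicting schedules of this kind can be spotted ahead of time by grouping assignments by resource and start time. The sketch below assumes the assignments are already available as plain tuples; the tuple layout and the function name are hypothetical, and no Azure API is called.

```python
from collections import defaultdict


def find_conflicting_configurations(assignments):
    """Group configurations that share a resource and a start time.

    `assignments` is an iterable of (resource_id, config_name, start_time)
    tuples, a hypothetical shape chosen for this illustration.
    """
    grouped = defaultdict(list)
    for resource_id, config_name, start_time in assignments:
        grouped[(resource_id, start_time)].append(config_name)
    # Keep only the groups where more than one configuration would trigger.
    return {key: names for key, names in grouped.items() if len(names) > 1}


if __name__ == "__main__":
    sample = [
        ("vm1", "config-a", "02:00"),
        ("vm1", "config-b", "02:00"),
        ("vm2", "config-a", "02:00"),
    ]
    for (resource, start), names in find_conflicting_configurations(sample).items():
        print(resource, start, names)
```

Any group this returns is a candidate for shifting one configuration's start time, as the resolution suggests.
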
5489

55-
If you forget/miss any one of the above mentioned steps, you can reassign the resource to original assignment and repeat the steps again sequentially.
90+
### Unable to create dynamic scope (at Resource Group Level)
5691

57-
#### Schedule didn't trigger
92+
#### Issue
93+
Dynamic scope validation fails due to a null value in the location.
5894

59-
If a resource has two maintenance configurations with the same trigger time and an install patch configuration, and both are assigned to the same VM/resource, only one policy triggers. This is a known bug, and it's rarely observed. To mitigate this issue, adjust the start time of the maintenance configuration.
95+
#### Resolution
96+
This issue causes a regression in the dynamic scope validation process. We recommend that customers provide the required set of locations for resource group-level dynamic scope.
6097

61-
#### Unable to create dynamic scope (at Resource Group Level)
98+
### Dynamic Scope not executed and no resources patched
6299

63-
Dynamic scope validation fails due to a null value in the location, resulting in a regression in the validation process. We recommend that customers provide the required set of locations for resource group-level dynamic scope.
100+
#### Issue
101+
Dynamic scope flattening failed due to throttling, and the service is unable to determine the list of associated VMs.
64102

65-
#### Dynamic Scope not executed
103+
#### Resolution
104+
This issue might occur when a dynamic scope includes too many subscriptions; the number of subscriptions per dynamic scope should be less than 30. Refer to this [page](../virtual-machines/maintenance-configurations.md#service-limits) for more details on the service limits of Dynamic Scoping.
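
The subscription limit can be checked locally before a dynamic scope is created. This is a minimal sketch: the mapping of scope names to subscription IDs is a hypothetical input shape, and only the limit of fewer than 30 subscriptions comes from the text.

```python
# Per the text, the number of subscriptions per dynamic scope should be
# less than this value.
MAX_SUBSCRIPTIONS_PER_SCOPE = 30


def scopes_over_limit(scopes):
    """Return names of scopes with 30 or more distinct subscriptions.

    `scopes` maps a scope name to a list of subscription IDs (hypothetical
    input shape for illustration).
    """
    return [
        name
        for name, subscription_ids in scopes.items()
        if len(set(subscription_ids)) >= MAX_SUBSCRIPTIONS_PER_SCOPE
    ]


if __name__ == "__main__":
    scopes = {
        "small-scope": [f"sub-{i}" for i in range(5)],
        "large-scope": [f"sub-{i}" for i in range(30)],
    }
    print(scopes_over_limit(scopes))
```
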
66105

67-
If in your maintenance schedule, dynamic schedule isn't evaluated and no machines are patched then this error might be occurring due to the number of subscriptions per dynamic scope that should be less than 30. Dynamic scope flattening failed due to throttling, and the service is unable to determine the list of VMs associated with VM. Refer to this [page](../virtual-machines/maintenance-configurations.md#service-limits) for more details on the service limits of Dynamic Scoping
106+
### Dedicated host configuration assignment not cleaned up after Dedicated Host removal
68107

69-
#### Dedicated host configuration assignment not cleaned up after Dedicated Host removal
108+
#### Issue
109+
After the dedicated hosts are deleted, the configuration assignments attached to them still exist.
70110

71-
Before deleting a dedicated host, make sure to delete the maintenance configuration associated with it. If the dedicated host is deleted but still appears on the portal, reach out to the support team. Cleanup processes are currently in place for dedicated hosts, ensuring no impact on customers as the dedicated host no longer exists.
111+
#### Resolution
112+
Before deleting a dedicated host, make sure to delete the maintenance configuration associated with it. If the dedicated host is deleted but still appears on the portal, reach out to the support team. Cleanup processes are currently in place for dedicated hosts, ensuring no impact on customers.
72113

73-
#### Maintenance Configuration recreated with the same name and old dynamic scope appeared on portal
114+
### Unable to provide Multiple tag values for dynamic scope
74115

75-
After deleting the maintenance configuration, the system performs cleanup of all associations (static as well as dynamic). However, due to a regression from the backend, the backend system is unable to delete the dynamic scope from ARG. The portal displays configurations using ARG, and old configurations may be visible. Stale configurations in ARG will automatically be purged after 60 hours. The backend doesn't utilize any stale dynamic scope.
116+
#### Issue
117+
Portal users might not be able to provide multiple tag values for dynamic scope.
76118

77-
#### Unable to provide Multiple tag values for dynamic scope
119+
#### Resolution
120+
This is a known limitation of the portal. The team is working on making this feature available in the portal. In the meantime, customers can use CLI/PowerShell to create the dynamic scope; the system accepts multiple tag values through the CLI/PowerShell option.
78121

79-
This is a currently know limitation on the portal. The team is working on making this feature accessible on the portal as well but in the meantime, customers can use CLI/PowerShell to create dynamic scope. The system accepts multiple values for tag using CLI/PowerShell option.
122+
### Maintenance Configuration triggered again with older trigger time
80123

81-
#### Unable to remove tag from maintenance configuration
124+
#### Issue
125+
After an update, the Maintenance Configuration executes again with the older trigger time.
82126

83-
This is a known bug in the backend system where the customer is unable to remove tag from Maintenance Configuration. The mitigation is to remove all tags and then update the maintenance configuration. Then you can add all the previous tags defined. Removal of a single tag isn't working due to regression.
127+
#### Resolution
128+
There's a known issue in Maintenance Schedule related to the caching of old maintenance policies. If an old policy is cached and the new policy processing is moved to a new instance, the old machine may trigger the schedule with the outdated start time. We recommend updating the Maintenance Configuration at least 1 hour before the scheduled start time. If the issue persists, reach out to the support team for further assistance.
84129

85-
#### Maintenance Configuration executes twice after policy updates (Policy trigger with old trigger time)
130+
### Schedule timeout, waiting for an ongoing update to complete the resource
86131

87-
There's a known issue in Maintenance Schedule related to the caching of old maintenance policies. If an old policy is cached and the new policy processing is moved to a new instance, the old machine may trigger the schedule with the outdated start time.
88-
It's recommended to update the Maintenance Configuration at least 1 hour before. If the issue persists, reach out to support team for further assistance.
132+
#### Issue
133+
The maintenance configuration times out because the host update window coincides with the guest (VM) patching window.
89134

90-
## Unsupported
135+
#### Resolution
136+
In rare cases, if the platform catch-up host update window coincides with the guest (VM) patching window and the guest patching window doesn't get sufficient time to run after the host update, the system shows the **Schedule timeout, waiting for an ongoing update to complete the resource** error, because the platform allows only a single update at a time.
91137

92-
#### Unimplemented APIs
138+
### Unimplemented APIs
93139

94-
Following is the list of APIs that aren't yet implemented and we are in the process of implementing it in the next few days
140+
The following APIs aren't yet supported:
95141
+ Get Apply Update at Subscription Level
96142
+ Get Apply Update at Resource Group Level
97143
+ Get Pending Update at Subscription Level
