Commit 5ae6828

committed
new article according to ci 117532
1 parent c071e45 commit 5ae6828

File tree

4 files changed

+172
-0
lines changed


articles/virtual-machines/troubleshooting/toc.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,8 @@
9292
href: troubleshoot-guide-windows-boot-manager-menu.md
9393
- name: VM is unresponsive due to updating
9494
href: unresponsive-vm-apply-windows-update.md
95+
- name: VM is unresponsive when applying group policy
96+
href: unresponsive-vm-apply-group-policy.md
9597
- name: Cannot connect to my VM
9698
items:
9799
- name: RDP
Lines changed: 170 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,170 @@
1+
---
2+
title: VM is unresponsive while applying policy
3+
description: This article provides steps to resolve issues where the load screen is stuck when applying a policy during boot in an Azure VM.
4+
services: virtual-machines-windows
5+
documentationcenter: ''
6+
author: TobyTu
7+
manager: dcscontentpm
8+
editor: ''
9+
tags: azure-resource-manager
10+
ms.assetid: a97393c3-351d-4324-867d-9329e31b5628
11+
ms.service: virtual-machines-windows
12+
ms.workload: infrastructure-services
13+
ms.tgt_pltfrm: na
14+
ms.topic: troubleshooting
15+
ms.date: 05/07/2020
16+
ms.author: v-mibufo
17+
---
18+
19+
# VM is unresponsive while applying ‘Group Policy Local Users & Groups’ policy
20+
21+
This article provides steps to resolve issues where the load screen is stuck when applying a policy during boot in an Azure VM.
22+
23+
## Symptoms
24+
25+
When you use [Boot diagnostics](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/boot-diagnostics) to view a screenshot of the VM, the screen is stuck loading and displays the message: ‘Applying Group Policy Local Users and Groups policy’.
26+
27+
![Screen showing Applying Group Policy Local Users and Groups policy loading (Windows Server 2012 R2)](media/unresponsive-vm-apply-group-policy/Applying-Group-Policy.png)
28+
29+
![Screen showing Applying Group Policy Local Users and Groups policy loading (Windows Server 2012).](media/unresponsive-vm-apply-group-policy/Applying-Group-Policy2.png)
30+
31+
## Cause
32+
33+
This issue is caused by a code defect in the Windows Profile Service Dynamic Link Library (*profsvc.dll*).
34+
35+
> [!NOTE]
36+
> This applies only to Windows Server 2012 and Windows Server 2012 R2.
37+
38+
The policy that is being applied and never finishes processing is:
39+
40+
`Computer Configuration\Policies\Administrative Templates\System\User Profiles\Delete user profiles older than a specified number of days on system restart`
41+
42+
It will only get stuck if all six of the following conditions are true:
43+
44+
1. The **Delete user profiles older than a specified number of days on system restart** policy is enabled.
45+
2. You have profiles that have met the age requirements to require cleanup.
46+
3. You have one or more components that have registered for profile delete notifications.
47+
4. The components make any calls (direct or indirect) that need to acquire data from the Service Control Manager (SCM) components of Windows, for example, Start, Stop, or Query information about a service.
48+
5. You have a service that is configured to start automatically.
49+
6. This same service is set to run under the context of a domain account (as opposed to using a built-in account, for example, local system).
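
The "all six" requirement can be read as a simple conjunction. The following is a minimal sketch only: the class and field names are hypothetical labels for the conditions listed above, not settings or APIs that Windows exposes.

```python
from dataclasses import dataclass, fields


@dataclass
class HangConditions:
    # Hypothetical labels, one per condition in the list above.
    cleanup_policy_enabled: bool
    profiles_old_enough_for_cleanup: bool
    component_registered_for_delete_notification: bool
    component_calls_into_scm: bool
    service_starts_automatically: bool
    service_runs_as_domain_account: bool

    def can_hang(self) -> bool:
        """Return True only when every one of the six conditions holds."""
        return all(getattr(self, field.name) for field in fields(self))


# Example: all conditions hold except the last one.
print(HangConditions(True, True, True, True, True, False).can_hang())
```

If any single condition is false, `can_hang()` returns `False`, which mirrors the statement that the boot only gets stuck when all six conditions are true.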
50+
51+
### What is the code defect?
52+
53+
The defect occurs because the Service Control Manager (SCM) and the Profile Service each try to acquire a lock that the other already holds. Locks exist to prevent multiple services from changing the same data at the same time, which would cause corruption. Ordinarily, multiple lock requests wouldn’t cause an issue. However, because this happens during boot, neither service can finish its processing: each one is stuck waiting on the other.
54+
55+
*OS Bug 5880648 - Service Control Manager deadlocks with the "Delete user profiles on restart" policy.*
56+
57+
There are two actions that overlap so that:
58+
59+
- Action 1 acquires the profile lock but hasn't yet acquired the SCM lock.
60+
61+
**AND**
62+
63+
- Action 2 acquires the SCM lock but hasn't yet acquired the profile lock.
64+
65+
Once this overlap has occurred, each action hangs when it tries to acquire the second lock that it requires.
66+
67+
**Action 1: Old profile deletion notification (has Profile Lock, needs SCM Lock)**
68+
69+
1. First, the policy that is set to delete old profiles grabs an internal profile service lock.
70+
71+
This lock is there to prevent two threads from interacting with the profiles while the delete operation is in progress.
72+
73+
2. It then finds profiles that are old enough to be deleted.
74+
3. As a step of the profile deletion, a component that has registered for notifications of the deletions of a profile tries to start a service.
75+
4. Before the Service Control Manager (SCM) can start the service, it first needs to acquire an internal SCM lock, which is held by threads in **Action 2**.
76+
77+
**Action 2: Profile load or creation for user-specific data (has SCM Lock, needs Profile Lock)**
78+
79+
1. At boot, SCM needs to first order all autostart services by their group, as well as any services that those services are dependent upon.
80+
2. SCM acquires an internal SCM lock that is used to control access to starting, stopping, or configuring services as it orders the services.
81+
3. Once the services are in order, the SCM loops through each service and starts it.
82+
4. If the service is running under the context of a domain account, a profile needs to be loaded or created for the domain account in order to store user-specific data.
83+
5. This request is sent to the Profile Service.
84+
6. The profile service needs access to the internal lock acquired in **Action 1**.
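
The overlap of the two actions can be sketched as a classic lock-ordering deadlock. This is an illustration only, not the Windows implementation: the two Python locks merely stand in for the internal profile and SCM locks, the function and variable names are made up, and a timeout is used so that the demonstration ends, whereas the real services wait indefinitely.

```python
import threading


def simulate_overlap(timeout=0.5):
    """Run two actions that take the same two locks in opposite order."""
    profile_lock = threading.Lock()   # stand-in for the Profile Service lock
    scm_lock = threading.Lock()       # stand-in for the internal SCM lock
    overlap = threading.Barrier(2)    # forces both actions to hold their first lock
    got_second_lock = {}

    def run(name, first, second):
        with first:
            overlap.wait()            # both actions now hold their first lock
            acquired = second.acquire(timeout=timeout)
            got_second_lock[name] = acquired
            overlap.wait()            # keep the first lock until both attempts end
            if acquired:
                second.release()

    threads = [
        # Action 1: has the profile lock, needs the SCM lock.
        threading.Thread(target=run, args=("action1", profile_lock, scm_lock)),
        # Action 2: has the SCM lock, needs the profile lock.
        threading.Thread(target=run, args=("action2", scm_lock, profile_lock)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return got_second_lock


print(simulate_overlap())
```

Neither action obtains its second lock, so both entries in the returned dictionary are `False`. Without the timeout, both threads would block forever, which corresponds to the stuck boot screen.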
85+
86+
## Resolution
87+
88+
### Process overview
89+
90+
1. [Create and access a Repair VM](#create-and-access-a-repair-vm)
91+
2. [Enable Serial Console and memory dump collection](#enable-serial-console-and-memory-dump-collection)
92+
3. [Rebuild the VM](#rebuild-the-vm)
93+
4. [Collect the memory dump file](#collect-the-memory-dump-file)
94+
95+
> [!NOTE]
96+
> When this boot error occurs, the Guest OS is not operational. You will need to troubleshoot in Offline mode to resolve this issue.
97+
98+
### Step 1: Create and access a Repair VM
99+
100+
1. Use [steps 1-3 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to prepare a Repair VM.
101+
2. Use Remote Desktop Connection to connect to the Repair VM.
102+
103+
### Step 2: Enable Serial Console and memory dump collection
104+
105+
To enable memory dump collection and Serial Console, follow these steps:
106+
107+
1. Open an elevated command prompt session (Run as administrator).
108+
2. Run the following commands:
109+
110+
Enable Serial Console:
111+
112+
```
113+
bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /ems {<BOOT LOADER IDENTIFIER>} ON
114+
```
115+
116+
```
117+
bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /emssettings EMSPORT:1 EMSBAUDRATE:115200
118+
```
119+
3. Verify that the free space on the OS disk is at least as large as the memory size (RAM) of the VM.
120+
121+
If there's not enough space on the OS disk, change the location where the memory dump file will be created to any data disk attached to the VM that has enough free space. To change the location, replace “%SystemRoot%” with the drive letter (for example, “F:”) of the data disk in the following commands.
122+
123+
Suggested configuration to enable OS dump:
124+
125+
Load broken OS disk:
126+
127+
```
128+
REG LOAD HKLM\BROKENSYSTEM <VOLUME LETTER OF BROKEN OS DISK>:\windows\system32\config\SYSTEM
129+
```
130+
131+
Enable on ControlSet001:
132+
133+
```
134+
REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f
135+
REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f
136+
REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f
137+
```
138+
139+
Enable on ControlSet002:
140+
141+
```
142+
REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f
143+
REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f
144+
REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f
145+
```
146+
147+
Unload broken OS disk:
148+
149+
```
150+
REG UNLOAD HKLM\BROKENSYSTEM
151+
```
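
The free-space check and the `DumpFile` substitution from this step can be sketched in a few lines. This is a hedged helper, not part of the official procedure: the function names are hypothetical, the RAM size of the broken VM has to be supplied by you (it cannot be read from the Repair VM), and the script only builds the `REG ADD` command strings shown above without running them.

```python
import shutil

HIVE = "HKLM\\BROKENSYSTEM"
CONTROL_SETS = ("ControlSet001", "ControlSet002")
DEFAULT_DUMP_FILE = "%SystemRoot%\\MEMORY.DMP"


def free_bytes(path):
    """Free space, in bytes, on the volume that contains path."""
    return shutil.disk_usage(path).free


def choose_dump_file(os_free_bytes, ram_bytes, data_drive=None):
    """Keep the default location if the OS disk has room, else use the data disk."""
    if os_free_bytes >= ram_bytes:
        return DEFAULT_DUMP_FILE
    if data_drive is None:
        raise ValueError("OS disk is too small and no data disk was given")
    return data_drive.rstrip("\\") + "\\MEMORY.DMP"


def crash_control_commands(dump_file):
    """Build the six REG ADD commands for both control sets."""
    commands = []
    for control_set in CONTROL_SETS:
        key = "\\".join((HIVE, control_set, "Control", "CrashControl"))
        commands.append(f'REG ADD "{key}" /v CrashDumpEnabled /t REG_DWORD /d 1 /f')
        commands.append(f'REG ADD "{key}" /v DumpFile /t REG_EXPAND_SZ /d "{dump_file}" /f')
        commands.append(f'REG ADD "{key}" /v NMICrashDump /t REG_DWORD /d 1 /f')
    return commands


# Example with made-up sizes: 4 GiB free on the OS disk, 8 GiB of RAM, data disk F:.
for command in crash_control_commands(choose_dump_file(4 * 2**30, 8 * 2**30, "F:")):
    print(command)
```

On the Repair VM, `free_bytes("F:\\")` could supply the first argument, assuming the broken OS disk is attached as F. The commands still need to be run between the `REG LOAD` and `REG UNLOAD` steps.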
152+
153+
### Step 3: Rebuild the VM
154+
155+
Use [step 5 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to reassemble the VM.
156+
157+
### Step 4: Collect the memory dump file
158+
159+
To resolve this problem, first gather the memory dump file for the crash, and then contact support with the memory dump file. To collect the dump file, follow these steps:
160+
161+
1. Attach the OS disk to a new Repair VM:
162+
163+
- Use [steps 1-3 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to prepare a new Repair VM.
164+
- Use Remote Desktop Connection to connect to the Repair VM.
165+
166+
2. Locate the dump file and submit a support ticket:
167+
168+
- On the Repair VM, go to the Windows folder on the attached OS disk. If the drive letter that is assigned to the attached OS disk is F, go to `F:\Windows`.
169+
- Locate the memory.dmp file, and then [submit a support ticket](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade) with the memory dump file.
170+
- If you are having trouble locating the memory.dmp file, you may wish to use [non-maskable interrupt (NMI) calls in serial console](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/serial-console-windows#use-the-serial-console-for-nmi-calls) instead. You can follow the guide to [generate a crash dump file using NMI calls](https://docs.microsoft.com/windows/client-management/generate-kernel-or-complete-crash-dump).
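
If the dump file was redirected to a data disk, or you are unsure where it was written, a short search can help locate it. The following is a minimal sketch, assuming Python is available on the Repair VM; the function name is hypothetical, and the drive letter in the example is only the one used above.

```python
import os


def find_memory_dumps(root):
    """Return every file under root named memory.dmp, ignoring case."""
    matches = []
    # os.walk skips folders it cannot read instead of raising an error.
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower() == "memory.dmp":
                matches.append(os.path.join(folder, name))
    return sorted(matches)


# Example: search the Windows folder of the attached OS disk (assumed to be F:).
for dump_path in find_memory_dumps("F:\\Windows"):
    print(dump_path)
```

Searching a whole drive can take a while, so it is usually quicker to start with the folder that the `DumpFile` value points to.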
