|
| 1 | +--- |
| 2 | +title: VM is unresponsive while applying policy |
| 3 | +description: This article provides steps to resolve issues where the load screen is stuck when applying a policy during boot in an Azure VM. |
| 4 | +services: virtual-machines-windows |
| 5 | +documentationcenter: '' |
| 6 | +author: TobyTu |
| 7 | +manager: dcscontentpm |
| 8 | +editor: '' |
| 9 | +tags: azure-resource-manager |
| 10 | +ms.assetid: a97393c3-351d-4324-867d-9329e31b5628 |
| 11 | +ms.service: virtual-machines-windows |
| 12 | +ms.workload: infrastructure-services |
| 13 | +ms.tgt_pltfrm: na |
| 14 | +ms.topic: troubleshooting |
| 15 | +ms.date: 05/07/2020 |
| 16 | +ms.author: v-mibufo |
| 17 | +--- |
| 18 | + |
| 19 | +# VM is unresponsive while applying ‘Group Policy Local Users & Groups’ policy |
| 20 | + |
| 21 | +This article provides steps to resolve issues where the load screen is stuck when applying a policy during boot in an Azure VM. |
| 22 | + |
| 23 | +## Symptoms |
| 24 | + |
| 25 | +When you use [Boot diagnostics](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/boot-diagnostics) to view the screenshot of the VM, you will see that the screen is stuck loading with the message: ‘Applying Group Policy Local Users and Groups policy’. |
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | + |
| 30 | + |
| 31 | +## Cause |
| 32 | + |
| 33 | +This issue is caused by a code defect in the Windows Profile Service Dynamic Link Library (*profsvc.dll*). |
| 34 | + |
| 35 | +> [!NOTE] |
| 36 | +> This applies only on Windows Server 2012 and Windows Server 2012 R2. |
| 37 | +
|
| 38 | +The policy being applied that won’t finish its processes is: |
| 39 | + |
| 40 | +`Computer Configuration\Policies\Administrative Templates\System\User Profiles\Delete user profiles older than a specified number of days on system restart` |
| 41 | + |
| 42 | +It will only get stuck if all six of the following conditions are true: |
| 43 | + |
| 44 | +1. The **Delete user profiles older than a specified number of days on system restart** policy is enabled. |
| 45 | +2. You have profiles that have met the age requirements to require cleanup. |
| 46 | +3. You have any components that have registered for delete notification for profiles. |
| 47 | +4. The components make any calls (direct or indirect) that need to acquire data from the Service Control Manager (SCM) components of Windows, for example, Start, Stop, or Query information about a service. |
| 48 | +5. You have a service that is configured to start automatically. |
| 49 | +6. This same service is set to run under the context of a domain account (as opposed to using a built-in account, for example, local system). |
| 50 | + |
| 51 | +### What's the code defect |
| 52 | + |
| 53 | +The defect occurs because the Service Control Manager (SCM) and the Profile Service each try to acquire a lock that the other already holds. Locks exist to prevent multiple services from changing the same data at the same time, which would cause corruption. Ordinarily, multiple lock requests wouldn’t cause an issue. However, because this happens during boot, neither service can complete its work: each is stuck waiting for the other. |
| 54 | + |
| 55 | +*OS Bug 5880648 - Service Control Manager deadlocks with the "Delete user profiles on restart" policy.* |
| 56 | + |
| 57 | +There are two actions that overlap so that: |
| 58 | + |
| 59 | +- Action 1 acquires the profile lock but hasn't yet acquired the SCM lock. |
| 60 | + |
| 61 | + **AND** |
| 62 | + |
| 63 | +- Action 2 acquires the SCM lock but hasn't yet acquired the profile lock. |
| 64 | + |
| 65 | +Once this has occurred, the next attempt to acquire the second required lock hangs the action. |
| 66 | + |
| 67 | +**Action 1: Old profile deletion notification (has Profile Lock, needs SCM Lock)** |
| 68 | + |
| 69 | +1. First, the policy that is set to delete old profiles grabs an internal profile service lock. |
| 70 | + |
| 71 | + This lock is there to prevent two threads from interacting with the profiles while the delete operation is in progress. |
| 72 | + |
| 73 | +2. It then finds profiles that are old enough to be deleted. |
| 74 | +3. As a step of the profile deletion, a component that has registered for notifications of the deletions of a profile tries to start a service. |
| 75 | +4. Before the Service Control Manager (SCM) can start the service, it first needs to acquire an internal SCM lock, which is held by threads in **Action 2**. |
| 76 | + |
| 77 | +**Action 2: Profile load or creation for user-specific data (has SCM Lock, needs Profile Lock)** |
| 78 | + |
| 79 | +1. At boot, SCM needs to first order all autostart services by their group, as well as any services that those services are dependent upon. |
| 80 | +2. SCM acquires an internal SCM lock that is used to control access to starting, stopping, or configuring services as it orders the services. |
| 81 | +3. Once the services are in order, the SCM loops through each service and starts it. |
| 82 | +4. If the service is running under the context of a domain account, a profile needs to be loaded or created for the domain account in order to store user-specific data. |
| 83 | +5. This request is sent to the Profile Service. |
| 84 | +6. The profile service needs access to the internal lock acquired in **Action 1**. |
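The two actions above follow the classic lock-ordering deadlock pattern. The following is a minimal Python sketch of that pattern, not the Windows implementation: `simulate_boot`, `profile_lock`, and `scm_lock` are hypothetical names chosen for illustration, and a timeout stands in for the indefinite hang so that the example terminates.

```python
import threading


def simulate_boot(timeout=0.5):
    """Run two actions that take the same two locks in opposite order.

    Returns a dict mapping each action name to True if it obtained its
    second lock, or False if the attempt timed out (the "hang").
    """
    profile_lock = threading.Lock()  # stands in for the internal profile lock
    scm_lock = threading.Lock()      # stands in for the internal SCM lock
    both_hold_first = threading.Barrier(2)  # forces the overlap described above
    both_attempted = threading.Barrier(2)   # keeps the first lock held until both tried
    got_second = {}

    def action(name, first, second):
        with first:
            # Wait until the other action also holds its first lock.
            both_hold_first.wait()
            # In the real defect this wait never returns; here it times out.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            if acquired:
                second.release()
            # Do not release the first lock until both attempts have finished.
            both_attempted.wait()

    threads = [
        # Action 1: has the profile lock, needs the SCM lock.
        threading.Thread(target=action, args=("Action 1", profile_lock, scm_lock)),
        # Action 2: has the SCM lock, needs the profile lock.
        threading.Thread(target=action, args=("Action 2", scm_lock, profile_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second
```

Calling `simulate_boot()` returns `False` for both actions, because each one holds the lock that the other is waiting for. Without the timeout, both threads would wait indefinitely, which corresponds to the stuck boot screen.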
| 85 | + |
| 86 | +## Resolution |
| 87 | + |
| 88 | +### Process overview |
| 89 | + |
| 90 | +1. [Create and access a Repair VM](#create-and-access-a-repair-vm) |
| 91 | +2. [Enable Serial Console and memory dump collection](#enable-serial-console-and-memory-dump-collection) |
| 92 | +3. [Rebuild the VM](#rebuild-the-vm) |
| 93 | +4. [Collect the memory dump file](#collect-the-memory-dump-file) |
| 94 | + |
| 95 | +> [!NOTE] |
| 96 | +> When encountering this boot error, the Guest OS is not operational. You will be troubleshooting in Offline mode to resolve this issue. |
| 97 | +
|
| 98 | +### Step 1: Create and access a Repair VM |
| 99 | + |
| 100 | +1. Use [steps 1-3 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to prepare a Repair VM. |
| 101 | +2. Use Remote Desktop Connection to connect to the Repair VM. |
| 102 | + |
| 103 | +### Step 2: Enable Serial Console and memory dump collection |
| 104 | + |
| 105 | +To enable memory dump collection and Serial Console, follow these steps: |
| 106 | + |
| 107 | +1. Open an elevated command prompt session (Run as administrator). |
| 108 | +2. Run the following commands: |
| 109 | + |
| 110 | + Enable Serial Console: |
| 111 | + |
| 112 | + ``` |
| 113 | + bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /ems {<BOOT LOADER IDENTIFIER>} ON |
| 114 | + ``` |
| 115 | +
|
| 116 | + ``` |
| 117 | + bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /emssettings EMSPORT:1 EMSBAUDRATE:115200 |
| 118 | + ``` |
| 119 | +3. Verify that the free space on the OS disk is at least as large as the memory size (RAM) of the VM. |
| 120 | +
|
| 121 | + If there isn't enough space on the OS disk, change the location where the memory dump file will be created to any data disk attached to the VM that has enough free space. To change the location, replace “%SystemRoot%” with the drive letter (for example, “F:”) of the data disk in the commands below. |
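The free-space check above can be sketched in Python, assuming Python happens to be available on the Repair VM (the article itself doesn't require it). The helper name `pick_dump_drive` and the drive letters in the usage note are hypothetical. The RAM size of the broken VM must be supplied by you, because it can't be read from the attached disk.

```python
import shutil

GIB = 1024 ** 3  # bytes in one gibibyte


def pick_dump_drive(candidates, ram_bytes,
                    free_space=lambda path: shutil.disk_usage(path).free):
    """Return the first path whose free space is at least ram_bytes.

    candidates: drive roots to check, in order of preference.
    ram_bytes:  memory size of the VM that will write the dump.
    free_space: callable returning free bytes for a path (replaceable for testing).
    Returns None if no candidate has enough room.
    """
    for path in candidates:
        if free_space(path) >= ram_bytes:
            return path
    return None
```

For example, `pick_dump_drive(["G:\\", "F:\\"], 16 * GIB)` checks the attached OS disk first (assumed here to be mounted as `G:`) and falls back to a data disk at `F:` if the OS disk is too small.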
| 122 | +
|
| 123 | +Suggested configuration to enable OS dump: |
| 124 | +
|
| 125 | +Load Broken OS Disk: |
| 126 | +
|
| 127 | +``` |
| 128 | +REG LOAD HKLM\BROKENSYSTEM <VOLUME LETTER OF BROKEN OS DISK>:\windows\system32\config\SYSTEM |
| 129 | +``` |
| 130 | +
|
| 131 | +Enable on ControlSet001: |
| 132 | +
|
| 133 | +``` |
| 134 | +REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f |
| 135 | +REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f |
| 136 | +REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f |
| 137 | +``` |
| 138 | +
|
| 139 | +Enable on ControlSet002: |
| 140 | +
|
| 141 | +``` |
| 142 | +REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f |
| 143 | +REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f |
| 144 | +REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f |
| 145 | +``` |
| 146 | +
|
| 147 | +Unload broken OS disk: |
| 148 | +
|
| 149 | +``` |
| 150 | +REG UNLOAD HKLM\BROKENSYSTEM |
| 151 | +``` |
| 152 | +
|
| 153 | +### Step 3: Rebuild the VM |
| 154 | +
|
| 155 | +Use [step 5 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to reassemble the VM. |
| 156 | +
|
| 157 | +### Step 4: Collect the memory dump file |
| 158 | +
|
| 159 | +To resolve this problem, first collect the memory dump file for the crash, and then contact support with the memory dump file. To collect the dump file, follow these steps: |
| 160 | +
|
| 161 | +1. Attach the OS disk to a new Repair VM: |
| 162 | +
|
| 163 | + - Use [steps 1-3 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to prepare a new Repair VM. |
| 164 | + - Use Remote Desktop Connection to connect to the Repair VM. |
| 165 | +
|
| 166 | +2. Locate the dump file and submit a support ticket: |
| 167 | +
|
| 168 | + - On the Repair VM, go to the Windows folder on the attached OS disk. For example, if the drive letter assigned to the attached OS disk is F, go to `F:\Windows`. |
| 169 | + - Locate the memory.dmp file, and then [submit a support ticket](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade) with the memory dump file. |
| 170 | + - If you have trouble locating the memory.dmp file, you can use [non-maskable interrupt (NMI) calls in serial console](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/serial-console-windows#use-the-serial-console-for-nmi-calls) instead. Follow the guide to [generate a crash dump file using NMI calls](https://docs.microsoft.com/windows/client-management/generate-kernel-or-complete-crash-dump). |