Commit f69d917

Merge pull request #114286 from v-miegge/v-miegge/directory-service-initialization-failure
CI 117468 - Created file, added images
2 parents 2486b1c + 604657d commit f69d917

4 files changed

+231
-0
lines changed

Lines changed: 231 additions & 0 deletions
@@ -0,0 +1,231 @@
1+
---
2+
title: Troubleshoot Windows stop error – directory service initialization failure
3+
description: Resolve issues where an Active Directory domain controller virtual machine (VM) in Azure is stuck in a loop stating it needs to restart.
4+
services: virtual-machines-windows, azure-resource-manager
5+
documentationcenter: ''
6+
author: v-miegge
7+
manager: dcscontentpm
8+
editor: ''
9+
tags: azure-resource-manager
10+
11+
ms.assetid: 3396f8fe-7573-4a15-a95d-a1e104c6b76d
12+
ms.service: virtual-machines-windows
13+
ms.workload: na
14+
ms.tgt_pltfrm: vm-windows
15+
ms.topic: troubleshooting
16+
ms.date: 05/05/2020
17+
ms.author: v-miegge
18+
---
19+
20+
# Troubleshoot Windows stop error – directory service initialization failure
21+
22+
This article provides steps to resolve issues where an Active Directory domain controller virtual machine (VM) in Azure is stuck in a loop and states that it needs to restart.
23+
24+
## Symptom
25+
26+
When you use [Boot diagnostics](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/boot-diagnostics) to view the screenshot of the VM, the screenshot shows that the VM needs to restart because of an error, displaying the stop code **0xC00002E1** in Windows Server 2008 R2, or **0xC00002E2** in Windows Server 2012 or later.
27+
28+
![Windows Server 2012 startup screen states "Your PC ran into a problem and needs to restart. We're just collecting some error info, and then we'll restart for you.".](./media/troubleshoot-directory-service-initialization-failure/1.png)
29+
30+
## Cause
31+
32+
Error code **0xC00002E2** represents **STATUS_DS_INIT_FAILURE**, and error code **0xC00002E1** represents **STATUS_DS_CANT_START**. Both errors occur when there's an issue with the directory service.
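
For quick reference when reading a Boot diagnostics screenshot, the two stop codes named above can be kept in a small lookup table. The following is a minimal Python sketch: the `describe_stop_code` helper is hypothetical (it is not part of any Azure or Windows tooling) and covers only the two codes discussed in this article.

```python
# Stop codes discussed in this article, mapped to their NTSTATUS names.
STOP_CODES = {
    0xC00002E1: "STATUS_DS_CANT_START",    # shown by Windows Server 2008 R2
    0xC00002E2: "STATUS_DS_INIT_FAILURE",  # shown by Windows Server 2012 or later
}


def describe_stop_code(code: str) -> str:
    """Return the NTSTATUS name for a stop code string such as '0xC00002E2'."""
    value = int(code, 16)  # base 16 accepts an optional '0x' prefix
    return STOP_CODES.get(value, "not a directory service stop code covered here")


print(describe_stop_code("0xC00002E2"))
```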
33+
34+
As the OS boots, the Local Security Authority Subsystem Service (**LSASS.exe**), which authenticates user logins, forces it to restart automatically. Authentication can't happen when the operating system on the VM is a domain controller that doesn't have read/write access to its local Active Directory database. Without access to **Active Directory (AD)**, LSASS.exe can't authenticate, so it forces the OS to restart.
35+
36+
This error can be caused by any of the following conditions:
37+
38+
- There's no access to the disk holding the local AD database (**NTDS.DIT**).
39+
- The disk holding the local AD database (NTDS.DIT) has run out of free space.
40+
- The local AD database (NTDS.DIT) file is missing.
41+
- The VM has multiple disks and the Storage Area Network (SAN) policy is configured improperly. The SAN policy isn't set to **ONLINEALL**, and the non-OS disks are attached in offline mode in Disk Management.
42+
- The local AD database (NTDS.DIT) file is corrupt.
43+
44+
## Solution
45+
46+
### Process overview:
47+
48+
1. Create and access a repair VM.
49+
1. Free up space on disk.
50+
1. Check that the drive containing the AD database is attached.
51+
1. Enable Directory Services Restore Mode.
52+
1. **Recommended**: Before you rebuild the VM, enable serial console and memory dump collection.
53+
1. Rebuild the VM.
54+
1. Reconfigure the SAN policy.
55+
56+
> [!NOTE]
57+
> When this error occurs, the guest OS isn't operational. You'll troubleshoot in offline mode to resolve the issue.
58+
59+
### Create and access a repair VM
60+
61+
1. Use [steps 1-3 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to prepare a Repair VM.
62+
1. Using Remote Desktop Connection, connect to the repair VM.
63+
64+
### Free up space on disk
65+
66+
Now that the disk is attached to a repair VM, verify that the disk holding the Active Directory database has enough free space to operate correctly.
67+
68+
1. Check whether the disk is full by right-clicking on the drive and selecting **Properties**.
69+
1. If the disk has less than 300 MB of free space, [expand it to a maximum of 1 TB using PowerShell](https://docs.microsoft.com/azure/virtual-machines/windows/expand-os-disk).
70+
1. If the disk has reached 1 TB of used space, perform a disk cleanup.
71+
72+
1. Use PowerShell to [detach the data disk](https://docs.microsoft.com/azure/virtual-machines/windows/detach-disk#detach-a-data-disk-using-powershell) from the broken VM.
73+
1. Once detached from the broken VM, [attach the data disk](https://docs.microsoft.com/azure/virtual-machines/windows/attach-disk-ps#attach-an-existing-data-disk-to-a-vm) to a functioning VM.
74+
1. Use the [Disk Cleanup tool](https://support.microsoft.com/help/4026616/windows-10-disk-cleanup) to free up additional space.
75+
76+
1. **Optional** - If more space is needed, open a CMD instance and enter the `defrag <LETTER ASSIGNED TO THE OS DISK>: /u /x /g` command to defragment the drive:
77+
78+
* In the command, replace `<LETTER ASSIGNED TO THE OS DISK>` with the OS Disk's letter. For example, if the disk letter is `F:`, then the command would be `defrag F: /u /x /g`.
79+
80+
* Depending on the level of fragmentation, the defragmentation could take hours.
81+
82+
If there's enough space on the disk, continue to the next task.
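
The thresholds in the steps above can be expressed as a small decision helper. This is a sketch under stated assumptions rather than an official tool: `next_step` and `check_drive` are hypothetical names, the units are assumed to be binary (1 MB = 1024² bytes), and the logic follows the wording of the steps literally (free space below 300 MB triggers an expansion unless 1 TB is already in use).

```python
import shutil

MB = 1024 ** 2
TB = 1024 ** 4
MIN_FREE = 300 * MB  # free-space threshold from the steps above
MAX_USED = 1 * TB    # used-space limit from the steps above


def next_step(free_bytes: int, used_bytes: int) -> str:
    """Map disk usage to the action described in the steps above."""
    if free_bytes >= MIN_FREE:
        return "continue"  # enough space, move on to the next task
    if used_bytes >= MAX_USED:
        return "cleanup"   # the disk can't grow further, so free up space
    return "expand"        # grow the disk, up to 1 TB


def check_drive(path: str) -> str:
    """Read usage for a mounted drive root and classify it."""
    usage = shutil.disk_usage(path)
    return next_step(usage.free, usage.used)
```

On the repair VM, a call such as `check_drive("F:\\")` would classify the attached disk, assuming it was mounted as drive `F:`.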
83+
84+
### Check that the drive containing the Active Directory database is attached
85+
86+
1. Open an elevated CMD instance and run the following commands:
87+
88+
1. Load registry file:
89+
90+
`REG LOAD HKLM\BROKENSYSTEM f:\windows\system32\config\SYSTEM`
91+
92+
The designation `f:` assumes that the disk is drive `F:`. Use the drive letter assigned to the attached OS disk.
93+
94+
1. Determine the drive letter and folder of **NTDS.DIT**:
95+
96+
```
REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "DSA Working Directory"
REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "DSA Database file"
REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "Database backup path"
REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "Database log files path"
```
102+
103+
1. Unload registry file:
104+
105+
`REG UNLOAD HKLM\BROKENSYSTEM`
106+
107+
1. Using the Azure portal, verify that the drive where NTDS.DIT is stored is attached to the VM.
108+
1. Using the Disk Management console from the guest OS, verify that the disk containing NTDS.DIT is online.
109+
1. The Disk Management tool can be found in **Administrative Tools > Computer Management > Storage**, or may be accessed using the `diskmgmt.msc` command in a CMD instance.
110+
1. If the disk isn't attached to the VM, reattach the data disk to fix the issue.
111+
112+
If the disk was attached normally, continue with the next task.
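
The `REG QUERY` output from the earlier step can be reduced to a drive letter and folder with a short parser. The following Python sketch is illustrative only: `SAMPLE_OUTPUT` is made-up output in the usual `REG QUERY` layout, and `parse_reg_value` and `database_location` are hypothetical helper names. The drive letter stored in the registry is the one the domain controller used, which may differ from the letter assigned on the repair VM.

```python
import ntpath
import re

# Illustrative output only; real values depend on how the domain controller was set up.
SAMPLE_OUTPUT = r"""
HKEY_LOCAL_MACHINE\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters
    DSA Database file    REG_SZ    E:\NTDS\ntds.dit
"""

# A value line is indented and has the shape: <name> <REG_type> <data>.
VALUE_LINE = re.compile(r"^\s+(?P<name>.+?)\s+(?P<type>REG_[A-Z_]+)\s+(?P<data>.+)$")


def parse_reg_value(output: str) -> str:
    """Return the data of the first value line found in REG QUERY output."""
    for line in output.splitlines():
        match = VALUE_LINE.match(line)
        if match:
            return match.group("data").strip()
    raise ValueError("no registry value found in output")


def database_location(output: str) -> tuple:
    """Split the NTDS.DIT path into (drive, folder)."""
    path = parse_reg_value(output)
    drive, rest = ntpath.splitdrive(path)  # ntpath applies Windows path rules on any OS
    return drive, ntpath.dirname(rest)


print(database_location(SAMPLE_OUTPUT))
```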
113+
114+
### Enable Directory Services Restore Mode
115+
116+
Set up the VM to boot in **Directory Services Restore Mode (DSRM)** to bypass the check for the NTDS.DIT file during boot.
117+
118+
1. Before you continue, verify that you've completed the previous tasks to attach the disk to a repair VM, and that you've determined which disk the NTDS.DIT file is located on.
119+
1. Using an elevated CMD instance, list the boot entries in the BCD store on that disk to find the identifier of the boot loader on the active partition:
120+
121+
`bcdedit /store <Drive Letter>:\boot\bcd /enum`
122+
123+
Replace `<Drive Letter>` with the letter determined in the previous steps.
124+
125+
![The screenshot shows an elevated CMD instance after entering the 'bcdedit /store <Drive Letter>:\boot\bcd /enum' command, which displays Windows Boot Manager with the identifier.](./media/troubleshoot-directory-service-initialization-failure/2.png)
126+
127+
1. Enable the `safeboot DsRepair` flag on the booting partition:
128+
129+
`bcdedit /store <Drive Letter>:\boot\bcd /set {<Identifier>} safeboot dsrepair`
130+
131+
Replace `<Drive Letter>` and `<Identifier>` with the values determined in the previous steps.
132+
133+
1. Query the booting options again to ensure that your change was properly set.
134+
135+
![The screenshot shows an elevated CMD instance after enabling the safeboot DsRepair flag.](./media/troubleshoot-directory-service-initialization-failure/3.png)
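
Finding the identifier and confirming the `safeboot` value can also be done by parsing the text that `bcdedit /enum` prints. The following Python sketch assumes the usual layout of that output (an entry title, a dashed rule, then name/value lines). `SAMPLE_ENUM` is illustrative rather than captured from a real VM, and `parse_bcd_entries` and `safeboot_command` are hypothetical helper names.

```python
# Illustrative output only; identifiers and devices differ per VM.
SAMPLE_ENUM = r"""
Windows Boot Manager
--------------------
identifier              {bootmgr}
device                  partition=F:

Windows Boot Loader
-------------------
identifier              {default}
device                  partition=F:
safeboot                DsRepair
"""


def _is_rule(line: str) -> bool:
    """A rule is a non-empty line made only of dashes."""
    return bool(line) and set(line) == {"-"}


def parse_bcd_entries(output: str) -> list:
    """Split bcdedit /enum output into one dict per boot entry."""
    entries = []
    lines = output.splitlines()
    for index, line in enumerate(lines):
        following = lines[index + 1] if index + 1 < len(lines) else ""
        if _is_rule(line) and index > 0:
            # The line directly above a dashed rule is the entry title.
            entries.append({"title": lines[index - 1].strip()})
        elif entries and line.strip() and not _is_rule(following):
            # Any other non-blank line that isn't a title is "<name> <value>".
            key, _, value = line.partition(" ")
            entries[-1][key] = value.strip()
    return entries


def safeboot_command(store: str, identifier: str) -> str:
    """Build the command from the step above for a given store and identifier."""
    return f"bcdedit /store {store} /set {identifier} safeboot dsrepair"


loader = next(e for e in parse_bcd_entries(SAMPLE_ENUM) if e["title"] == "Windows Boot Loader")
print(safeboot_command(r"F:\boot\bcd", loader["identifier"]))
print("DSRM enabled:", loader.get("safeboot") == "DsRepair")
```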
136+
137+
### Recommended: before you rebuild the VM, enable serial console and memory dump collection
138+
139+
To enable memory dump collection and the serial console, open an elevated command prompt session (**Run as administrator**) and run the following commands.
140+
141+
1. Enable the Serial Console:
142+
143+
```
bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /ems {<BOOT LOADER IDENTIFIER>} ON
bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /emssettings EMSPORT:1 EMSBAUDRATE:115200
```
147+
148+
1. Verify that the free space on the OS disk is at least equal to the memory size (RAM) on the VM.
149+
150+
1. If there isn't enough space on the OS disk, change the location where the memory dump file will be created, and point it to any data disk attached to the VM that has enough free space.
151+
152+
To change the location, replace `%SystemRoot%` with the drive letter (for example, `F:`) of the data disk in the following commands.
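
The space check described above comes down to comparing free space with the VM's memory size. A minimal Python sketch of that decision follows. `choose_dump_root` is a hypothetical helper, and the byte counts passed to it are assumed to have been collected separately (for example, from the drive properties and the VM size).

```python
def choose_dump_root(os_free: int, ram: int, data_disks: dict) -> str:
    """Pick the location for MEMORY.DMP.

    os_free and ram are byte counts; data_disks maps a drive letter such as
    'F:' to its free bytes. Returns '%SystemRoot%' or a data disk letter.
    """
    if os_free >= ram:
        return "%SystemRoot%"  # the default location on the OS disk is fine
    for letter, free in sorted(data_disks.items()):
        if free >= ram:
            return letter      # first data disk that can hold a full dump
    raise RuntimeError("no attached disk has enough free space for a full dump")


GB = 1024 ** 3
print(choose_dump_root(4 * GB, 16 * GB, {"F:": 8 * GB, "G:": 64 * GB}))
```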
153+
154+
#### Suggested configuration to enable OS dump
155+
156+
**Load Broken OS Disk**:
157+
158+
`REG LOAD HKLM\BROKENSYSTEM <VOLUME LETTER OF BROKEN OS DISK>:\windows\system32\config\SYSTEM`
159+
160+
**Enable on ControlSet001**:
161+
162+
```
REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f
REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f
REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f
```
167+
168+
**Enable on ControlSet002**:
169+
170+
```
REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f
REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f
REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f
```
175+
176+
**Unload Broken OS Disk**:
177+
178+
`REG UNLOAD HKLM\BROKENSYSTEM`
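
Because the same three values are written to both control sets, the `REG ADD` lines above can be generated from a single dump location instead of being edited by hand. This Python sketch only builds the command text and doesn't run anything. `crash_dump_commands` is a hypothetical helper, and the `BROKENSYSTEM` hive name matches the load command used in this section.

```python
def crash_dump_commands(dump_root: str = "%SystemRoot%") -> list:
    """Return the REG ADD commands for ControlSet001 and ControlSet002."""
    commands = []
    for control_set in ("ControlSet001", "ControlSet002"):
        key = "HKLM\\BROKENSYSTEM\\" + control_set + "\\Control\\CrashControl"
        commands.append(f'REG ADD "{key}" /v CrashDumpEnabled /t REG_DWORD /d 1 /f')
        commands.append(
            f'REG ADD "{key}" /v DumpFile /t REG_EXPAND_SZ /d "{dump_root}\\MEMORY.DMP" /f'
        )
        commands.append(f'REG ADD "{key}" /v NMICrashDump /t REG_DWORD /d 1 /f')
    return commands


# Example: write the dump to a data disk mounted as F: instead of the OS disk.
for command in crash_dump_commands("F:"):
    print(command)
```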
179+
180+
### Rebuild the VM
181+
182+
1. Use [step 5 of the VM Repair Commands](https://docs.microsoft.com/azure/virtual-machines/troubleshooting/repair-windows-vm-using-azure-virtual-machine-repair-commands#repair-process-example) to reassemble the VM.
183+
184+
### Reconfigure the Storage Area Network policy
185+
186+
1. When booting in DSRM, the only user available to log in is the recovery administrator, whose password was set when the VM was promoted to a domain controller. All other users will show an authentication error.
187+
188+
1. If no other DC is available, you must log in locally using `.\administrator` or `machinename\administrator` and the DSRM password.
189+
190+
1. Set up the SAN policy so that all the disks are online.
191+
192+
1. Open an elevated CMD instance and enter `DISKPART`.
193+
1. Query for the list of the disks.
194+
195+
`DISKPART> list disk`
196+
197+
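1. Enter the following commands to select the disk that needs to be brought online, clear its read-only attribute, bring it online, and display the SAN policy. If the policy shown isn't **Online All**, set it with `san policy=OnlineAll`: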
198+
199+
```
DISKPART> select disk 1
Disk 1 is now the selected disk.

DISKPART> attributes disk clear readonly
Disk attributes cleared successfully.

DISKPART> attributes disk
Current Read-only State : No
Read-only : No
Boot Disk : No
Pagefile Disk : No
Hibernation File Disk : No
Crashdump Disk : No
Clustered Disk : No

DISKPART> online disk
DiskPart successfully onlined the selected disk.

DISKPART> san
SAN Policy : Online All
```
221+
222+
1. Once the issue is fixed, ensure that the `safeboot` flag is removed:
223+
224+
`bcdedit /deletevalue {default} safeboot`
225+
226+
1. Restart your VM.
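
When several data disks are offline, the DiskPart commands from the earlier step can be collected into a script file and run with `diskpart /s <file>` instead of being typed interactively. The following Python sketch only generates the script text. `diskpart_script` is a hypothetical helper, and the disk numbers are assumptions that should be taken from the `list disk` output. DiskPart normally stops a script at the first error, so list only disks that are actually offline.

```python
def diskpart_script(disk_numbers: list) -> str:
    """Build a DiskPart script that sets the SAN policy and onlines each disk."""
    lines = ["san policy=OnlineAll"]  # bring all disks online automatically in future
    for number in disk_numbers:
        lines += [
            f"select disk {number}",
            "attributes disk clear readonly",
            "online disk",
        ]
    lines.append("san")  # display the resulting policy for confirmation
    return "\n".join(lines) + "\n"


print(diskpart_script([1]))
```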
227+
228+
> [!NOTE]
229+
> If your VM was just migrated from on-premises and you want to migrate more domain controllers from on-premises to Azure, consider following the steps in the article below to prevent this issue in future migrations:
230+
>
231+
> [How to upload existing on-premises Hyper-V domain controllers to Azure by using Azure PowerShell](https://support.microsoft.com/help/2904015)
