
Commit 3188bcf

author
Simonx Xu
authored
Merge pull request #9577 from v-lianna/CI_6880
AB#6880 Create hyper-v-start-state-access-failures-clustered-standalone.md
2 parents f6a0820 + 2921e90 commit 3188bcf

File tree

2 files changed

+203
-0
lines changed

support/windows-server/toc.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2854,6 +2854,8 @@ items:
28542854
items:
28552855
- name: Get-VMNetworkAdapter command doesn't report IP addresses
28562856
href: ./virtualization/get-vmnetworkadapter-doesnt-report-ip-addresses.md
2857+
- name: Troubleshoot Hyper-V virtual machine start, state, and access failures
2858+
href: ./virtualization/hyper-v-start-state-access-failures-clustered-standalone.md
28572859
- name: Virtual Machines enter the paused state
28582860
href: ./virtualization/virtual-machines-enter-paused-state-low-disk-free.md
28592861
- name: Virtual machine shutdown actions don't run
Lines changed: 201 additions & 0 deletions
@@ -0,0 +1,201 @@
1+
---
2+
title: Troubleshoot Hyper-V Virtual Machine Startup, State, and Access Failures
3+
description: Helps resolve issues related to Hyper-V VMs that fail to start, become stuck in transitional states, or become inaccessible in clustered and standalone environments.
4+
ms.date: 09/01/2025
5+
manager: dcscontentpm
6+
audience: itpro
7+
ms.topic: troubleshooting
8+
ms.reviewer: kaushika, jeffhugh, v-lianna
9+
ms.custom:
10+
- sap:virtualization and hyper-v\virtual machine state
11+
- pcy:WinComm Storage High Avail
12+
---
13+
# Troubleshoot Hyper-V virtual machine startup, state, and access failures in clustered and standalone environments
14+
15+
This article provides a detailed troubleshooting guide to help you resolve issues related to Hyper-V virtual machines (VMs) that fail to start, become stuck in transitional states (such as starting, stopping, saved, or paused), or become inaccessible in both clustered and standalone environments. Common causes include VM configuration file corruption, storage or network problems, process lockups, checkpoint or automatic virtual hard disk (AVHDX) issues, and permission or driver errors. Identifying and resolving these problems promptly is essential to minimizing VM downtime, preventing business disruption, and avoiding data loss in production environments.
16+
17+
When dealing with Hyper-V VM issues, you might encounter various symptoms, including:
18+
19+
## End-user and technical symptoms
20+
21+
- VMs fail to start or power on in Hyper-V Manager or Failover Cluster Manager.
22+
- VMs are stuck in states like "starting," "stopping," "saved-critical," "paused," or "restoring."
23+
- VMs are missing or invisible in Hyper-V Manager or the output of `Get-VM`.
24+
- VM states are displayed as "running critical," "stopping," or "online pending."
25+
- VM consoles are inaccessible, and remote desktop connections are unavailable.
26+
- VMs fail to migrate successfully between cluster nodes.
27+
- Hyper-V Manager or Failover Cluster Manager can't change VM states or report their status.
28+
- The virtual machine management service (VMMS) or VMM services are stuck in a "Stopping" state.
29+
- Storage volumes, such as Cluster Shared Volumes (CSVs), appear as RAW or offline, and VHDX files are inaccessible or locked.
30+
31+
## Error messages, event logs, and codes
32+
33+
- Error messages:
34+
35+
- > A virtual machine or container with the specified identifier already exists in Hyper-V.
36+
- > Failed to start worker process: Catastrophic failure 0x8000FFFF.
37+
- > Virtual machine failed to generate VHD tree: The system cannot find the file specified (0x80070002).
38+
- > The process cannot access the file because it is being used by another process.
39+
- > Failed to perform the Cleaning up stale reference point(s) operation. The virtual machine is currently performing: Turning Off.
40+
- > The file or directory is corrupted and unreadable. (0x80070570)
41+
42+
- Event IDs: 21502, 1069, 1205, 5120, 1135, 225, 15500, 1793, 1795, 7034, 7031, 7036, 16300, 14102, 4092, 18012, 18016, 20848, 20864, 12620, 12240, 153, 18524, 1146, 1230.
43+
- Cluster resources are stuck in "online pending" or "failed" states.
44+
- VMs are unavailable after patching, host restarts, or storage and network events.
45+
46+
Hyper-V VM failures might originate from several root causes, which are categorized as follows, along with their respective resolutions:
47+
48+
- [Cause 1: Configuration and metadata corruption](#cause-1-configuration-and-metadata-corruption)
49+
- [Cause 2: Storage and file system issues](#cause-2-storage-and-file-system-issues)
50+
- [Cause 3: Process and service lockups](#cause-3-process-and-service-lockups)
51+
- [Cause 4: Permissions, security, and driver problems](#cause-4-permissions-security-and-driver-problems)
52+
- [Cause 5: Cluster, network, and failover issues](#cause-5-cluster-network-and-failover-issues)
53+
54+
## Initial checks before proceeding
55+
56+
To resolve these issues, perform the initial checks using the following steps:
57+
58+
1. Identify error messages, event IDs, and affected VMs using Hyper-V Manager, Failover Cluster Manager, or PowerShell.
59+
2. Review system logs, Hyper-V logs, and cluster event logs for relevant entries.
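The first check can be scripted. The following is a minimal sketch rather than part of the documented procedure: `Select-TransitionalVM` is a hypothetical helper name, and it assumes that each input object exposes a `State` property, as the objects returned by `Get-VM` do. It keeps only the VMs whose state is neither `Running` nor `Off`.

```powershell
# Hypothetical helper (not a built-in cmdlet): keeps only VMs whose state
# is something other than Running or Off, for example Paused or Saved.
function Select-TransitionalVM {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object]$VM
    )
    process {
        # Cast to string so the comparison also works on enum-typed State values.
        if ([string]$VM.State -notin @('Running', 'Off')) {
            $VM
        }
    }
}
```

On a Hyper-V host, a typical call might be `Get-VM | Select-TransitionalVM | Select-Object Name, State, Status`. To review recent entries for the affected VMs, you can pair it with `Get-WinEvent -LogName Microsoft-Windows-Hyper-V-VMMS-Admin -MaxEvents 50`.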
60+
61+
## Cause 1: Configuration and metadata corruption
62+
63+
- Corrupt or missing VM configuration files (for example, `.VMCX` and `.XML`) prevent Hyper-V from recognizing or starting the VM, often after failed migrations, storage issues, or abrupt shutdowns.
64+
- Checkpoint (AVHDX) chain corruption or missing differencing disks prevent the VM from starting.
65+
- Orphaned checkpoints, incomplete merges, or invalid entries in configuration files block VM operations.
66+
- Duplicate VM GUIDs or object entries, particularly with System Center Virtual Machine Manager (SCVMM), can cause "already exists" errors and prevent VM imports or starts.
67+
68+
To resolve this issue, see [File system and storage checks](#resolution-file-system-and-storage-checks).
69+
70+
## Cause 2: Storage and file system issues
71+
72+
- CSVs or volumes are offline, RAW, or inaccessible due to storage subsystem failures, disk corruption, or drive letter conflicts.
73+
- VHD or VHDX files are locked or in use by another process, such as a backup or antivirus program.
74+
- Missing or corrupt VM runtime state files (VMRS) impede VM operations.
75+
- BitLocker-locked disks prevent VMs from starting after patching or rebooting.
76+
77+
### Resolution: File system and storage checks
78+
79+
1. Verify storage volumes:
80+
81+
- Use Disk Management or `diskpart` to ensure volumes are online and properly assigned.
82+
- If volumes are RAW or missing, reassign drive letters and repair disk corruption using `chkdsk`:
83+
84+
```console
85+
chkdsk <drive_letter>: /f /r
86+
```
87+
88+
2. Check VM configuration and disk file presence:
89+
90+
- Confirm the existence of `.VMCX`, `.VMRS`, `.VHDX`, and `.AVHDX` files in the VM folder.
91+
- For missing or corrupt configuration files, rebuild the VM using existing VHDX files or restore from a backup.
92+
- For missing or corrupt AVHDX files:
93+
94+
```powershell
95+
Set-VHD -Path <vhdx path> -ParentPath <parent vhdx path> -IgnoreIDMismatch
96+
```
97+
98+
- If BitLocker is enabled, unlock the disk:
99+
100+
```console
101+
manage-bde -unlock D: -RecoveryPassword <yourrecoverypassword>
102+
```
103+
104+
- For locked or in-use files, use Process Explorer to identify and terminate the locking process, or reboot the host to release the lock.
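The file presence check in step 2 can be sketched as a small script. `Get-MissingVMFileType` is a hypothetical helper, and the default extension list is an assumption: `.avhdx` is left out because those files exist only when the VM has checkpoints, and the sketch assumes that the configuration files and disks all sit under one folder.

```powershell
# Hypothetical helper: lists which of the expected Hyper-V file types
# are absent from a VM folder (searched recursively).
function Get-MissingVMFileType {
    param(
        [Parameter(Mandatory)]
        [string]$Path,
        # Assumed default set; pass -Expected to include '.avhdx' or other types.
        [string[]]$Expected = @('.vmcx', '.vmrs', '.vhdx')
    )
    # Collect the distinct, lower-cased extensions that are present.
    $present = Get-ChildItem -Path $Path -Recurse -File |
        ForEach-Object { $_.Extension.ToLowerInvariant() } |
        Sort-Object -Unique
    # Return every expected extension that was not found.
    $Expected | Where-Object { $_ -notin $present }
}
```

For example, `Get-MissingVMFileType -Path <vm folder>` returns nothing when all three file types are present. To inspect a checkpoint chain before you run `Set-VHD`, `Get-VHD -Path <avhdx path> | Select-Object Path, ParentPath` shows which parent each differencing disk points to.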
105+
106+
## Cause 3: Process and service lockups
107+
108+
- Stale VM Worker Process (VMWP) or VMMS processes are stuck due to storage or network issues or deadlocks.
109+
- Attempts to terminate VM processes via Task Manager, `taskkill`, or Process Explorer fail because of kernel or resource locks.
110+
111+
### Resolution: Process and service recovery
112+
113+
1. If the VM is stuck in transitional states:
114+
115+
- End the VM process:
116+
117+
```console
118+
taskkill /PID <pid> /F
119+
```
120+
121+
- Restart VMMS or the host if processes remain stuck.
122+
2. Remove saved states or checkpoints:
123+
124+
```powershell
125+
Get-VMSnapshot <vmname> | Remove-VMSavedState
Remove-VMSavedState <vmname>
126+
```
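The `taskkill` step above needs the process ID of the VM's worker process (`vmwp.exe`). Each worker process carries its VM's GUID on its command line, so one way to find the right PID is to match on that GUID. The following is a sketch only; `Find-VMWorkerProcessId` is a hypothetical helper name, and it assumes that the process objects expose `ProcessId` and `CommandLine` properties, as `Win32_Process` instances do.

```powershell
# Hypothetical helper: returns the ID of each process whose command line
# contains the given VM GUID.
function Find-VMWorkerProcessId {
    param(
        [Parameter(Mandatory)]
        [guid]$VMId,
        [Parameter(Mandatory)]
        [AllowEmptyCollection()]
        [object[]]$Process
    )
    $Process |
        # -like is case-insensitive, so the GUID's letter case doesn't matter.
        Where-Object { $_.CommandLine -like "*$VMId*" } |
        ForEach-Object { $_.ProcessId }
}
```

On a Hyper-V host, you might collect the candidates by using `$procs = Get-CimInstance Win32_Process -Filter "Name = 'vmwp.exe'"` and then call `Find-VMWorkerProcessId -VMId (Get-VM -Name <vmname>).VMId -Process $procs`. Run the session elevated, because the command line of a worker process might not be readable otherwise.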
127+
128+
## Cause 4: Permissions, security, and driver problems
129+
130+
- Permissions issues restrict the Hyper-V service account from accessing VM files or folders.
131+
- Antivirus or third-party filter drivers interfere with Hyper-V, blocking file access or causing merge failures.
132+
- Outdated or misconfigured storage or network drivers lead to connectivity loss or failover events.
133+
134+
### Resolution: Permission and security configuration
135+
136+
1. Ensure the Hyper-V service account has full control over VM files and folders.
137+
2. Apply antivirus exclusions as per Microsoft's Hyper-V documentation.
138+
3. Identify and unload problematic filter drivers:
139+
140+
```console
141+
fltmc
142+
fltmc unload <drivername>
143+
```
144+
145+
## Cause 5: Cluster, network, and failover issues
146+
147+
- CSV or network communication failures, such as cluster node isolation, result in mass VM failovers or reboots.
148+
- Improper cluster configurations or inconsistent patching across nodes cause instability.
149+
- Live migration or failover failures occur due to insufficient memory, incompatible settings, or node misconfigurations.
150+
151+
### Resolution 1: Cluster and network remediation
152+
153+
1. Validate the cluster health and configuration using the cluster validation wizard or:
154+
155+
```powershell
156+
Test-Cluster
157+
```
158+
159+
2. Resolve network issues by reviewing event IDs (for example, 5120 and 1135) and adjusting parameters:
160+
161+
```powershell
162+
(Get-Cluster).SameSubnetThreshold = <value>
163+
```
164+
165+
3. Ensure consistent patching and proper network/storage configurations across nodes.
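Before you change `SameSubnetThreshold` in step 2, it helps to know how long the cluster currently tolerates missed heartbeats. That tolerance is the heartbeat interval (`SameSubnetDelay`, in milliseconds) multiplied by the number of missed heartbeats allowed (`SameSubnetThreshold`). The following sketch does the arithmetic; `Get-HeartbeatToleranceSeconds` is a hypothetical helper name.

```powershell
# Hypothetical helper: converts a heartbeat delay (milliseconds) and a
# missed-heartbeat threshold into the tolerated outage time in seconds.
function Get-HeartbeatToleranceSeconds {
    param(
        [Parameter(Mandatory)]
        [int]$DelayMs,
        [Parameter(Mandatory)]
        [int]$Threshold
    )
    ($DelayMs * $Threshold) / 1000
}
```

On a cluster node, you might call it as `$c = Get-Cluster; Get-HeartbeatToleranceSeconds -DelayMs $c.SameSubnetDelay -Threshold $c.SameSubnetThreshold`. A delay of 1,000 ms with a threshold of 10 gives a tolerance of 10 seconds.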
166+
167+
### Resolution 2: VM configuration repairs and rebuilds
168+
169+
1. For corrupt configuration files, create a new VM with the existing disks. The `.VMCX` file is a binary format and isn't meant to be edited directly.
170+
2. Address saved state or checkpoint issues by removing invalid checkpoints or reattaching disks.
171+
172+
## Escalation and bug reference
173+
174+
If known bugs or product defects are involved (for example, UEFI firmware bugs or cluster communication issues), review vendor advisories and apply the recommended updates or fixes.
175+
176+
## Data collection
177+
178+
Gather the following logs and diagnostic information to assist with troubleshooting:
179+
180+
- Hyper-V event logs:
181+
182+
```powershell
183+
Get-WinEvent -LogName Microsoft-Windows-Hyper-V-VMMS-Admin | Export-Csv -Path <unc path>
184+
```
185+
186+
- Cluster logs:
187+
188+
```powershell
189+
Get-ClusterLog -UseLocalTime -Destination <folder>
190+
```
191+
192+
- Process dumps for stuck services:
193+
194+
```console
195+
procdump -ma <pid> <output_path>
196+
```
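To narrow the collected logs to specific event IDs from the symptom list, you can pass a filter hashtable to `Get-WinEvent`. The following sketch builds such a filter. `New-EventFilter` is a hypothetical helper name, and the 24-hour default window is an assumption for illustration.

```powershell
# Hypothetical helper: builds a hashtable for Get-WinEvent -FilterHashtable
# that selects the given event IDs from one log within a recent time window.
function New-EventFilter {
    param(
        [Parameter(Mandatory)]
        [string]$LogName,
        [Parameter(Mandatory)]
        [int[]]$Id,
        # Assumed default: look back 24 hours.
        [int]$LastHours = 24
    )
    @{
        LogName   = $LogName
        Id        = $Id
        StartTime = (Get-Date).AddHours(-$LastHours)
    }
}
```

For example, `Get-WinEvent -FilterHashtable (New-EventFilter -LogName System -Id 1069, 1135, 5120) -ErrorAction SilentlyContinue` returns the matching failover clustering events from the System log. `Get-WinEvent` reports an error when nothing matches, which `-ErrorAction SilentlyContinue` suppresses here.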
197+
198+
## References
199+
200+
- [Hyper-V performance tuning guide](/windows-server/virtualization/hyper-v)
201+
- [Failover cluster troubleshooting](/sql/sql-server/failover-clusters/windows/failover-cluster-troubleshooting)
