|
| 1 | +--- |
| 2 | +title: Windows Server iSCSI Storage Connectivity Troubleshooting Guidance |
| 3 | +description: Resolves issues that occur in SAN-based and iSCSI storage environments in Windows Server. |
| 4 | +ms.date: 10/08/2025 |
| 5 | +manager: dcscontentpm |
| 6 | +audience: itpro |
| 7 | +ms.topic: troubleshooting |
| 8 | +ms.reviewer: kaushika |
| 9 | +ms.custom: |
| 10 | +- sap:Backup, Recovery, Disk, and Storage\iSCSI |
| 11 | +- pcy:WinComm Storage High Avail |
| 12 | +appliesto: |
| 13 | + - <a href=https://learn.microsoft.com/windows/release-health/windows-server-release-info target=_blank>Supported versions of Windows Server</a> |
| 14 | +--- |
| 15 | + |
| 16 | +# Windows Server iSCSI storage connectivity troubleshooting guidance |
| 17 | + |
| 18 | +## Summary |
| 19 | + |
| 20 | +SAN-based and iSCSI storage environments in Windows Server (2025, 2022, 2019, and 2016) are essential for clustering, high availability, virtualization, and large-scale file services. However, these environments can experience various issues, from connectivity dropouts and disk corruption to performance degradation and cluster failures. Causes range from misconfiguration and driver-firmware mismatches to underlying network instability, hardware faults, and OS storage subsystem bugs. This article provides a step-by-step approach to diagnose and resolve common iSCSI, disk, and cluster-related failures to help administrators maintain high service availability, data integrity, and operational efficiency. |
| 21 | + |
| 22 | +## Troubleshooting checklist |
| 23 | + |
| 24 | +Use this checklist for systematic troubleshooting: |
| 25 | + |
| 26 | +- **Networking** |
| 27 | + - Make sure that iSCSI, management, and client networks are segregated and correctly routed. |
| 28 | + - Are MTU, VLANs, Jumbo Frames, and Flow Control/RoCE/PFC consistently configured? |
| 29 | +- **Firmware/Driver Updates** |
| 30 | + - Are network adapters, storage controllers, and storage array firmware current and vendor-supported? |
| 31 | +- **Storage Infrastructure** |
| 32 | + - Are all SCSI, multipath or MPIO, and iSCSI target device drivers and tools up to date? |
| 33 | + - Verify that all SAN zoning and LUN masking are correct. |
| 34 | +- **Windows Configuration** |
| 35 | + - Does the appropriate MPIO policy exist? Verify that disks and LUNs are visible and healthy in Disk Management. |
| 36 | + - Are cluster and quorum configurations validated (Test-Cluster, validation reports)? |
| 37 | + - Are antivirus exclusions set for storage and cluster paths? |
| 38 | + - Is the correct Group Policy setting enabled for disk and volume access? |
| 39 | +- **Backup/Restore** |
| 40 | + - Do the VSS and backup configurations comply with storage guidelines? |
| 41 | + - Are shadow copies and snapshots managed and not excessive? |
| 42 | +- **Documentation/Change Log** |
| 43 | + - Document all infrastructure changes that occurred before the incident (updates, config edits, firmware, hardware swaps, and so on). |
| 44 | + |
| 45 | +## Common issues and solutions |
| 46 | + |
| 47 | +The following sections detail the most common failure modes and provide step-by-step solutions. |
| 48 | + |
| 49 | +### iSCSI disk disconnections, failover issues, surprise removal |
| 50 | + |
| 51 | +#### Symptoms |
| 52 | + |
| 53 | +- Drives or volumes offline or RAW after restart. |
| 54 | +- "Disk X has been surprise removed" (Event ID 157). |
| 55 | +- iSCSI errors: Event IDs 9, 20, 27, 39, 153, "Target did not respond," "Initiator failed to connect," "IO operation at logical block address was retried." |
| 56 | + |
| 57 | +#### Cause |
| 58 | + |
| 59 | +- Network instability. |
| 60 | +- Multi-path configuration errors. |
| 61 | +- Mismatched VLAN/MTU/Jumbo settings, improper failover scripts. |
| 62 | +- Outdated firmware/drivers. |
| 63 | +- Resource exhaustion on SAN/NAS array. |
| 64 | + |
| 65 | +#### Resolution |
| 66 | + |
| 67 | +1. Network and Multipath review: |
| 68 | + - Verify network hardware configuration (MTU, VLAN, Jumbo, Flow Control). |
| 69 | + - Use mpclaim -v and Get-MSDSMAutomaticClaim to review the current MPIO configuration and claim settings. |
| 70 | + - Remove duplicate iSCSI sessions and unnecessary paths. |
| 71 | +2. Firmware and driver update: |
| 72 | + - Update all storage and network firmware and drivers. |
| 73 | +3. Timeout registry tuning: |
| 74 | + - Set disk and iSCSI timeout (HKLM\SYSTEM\CurrentControlSet\Services\disk, TimeOutValue=179). |
| 75 | +4. Check SAN Health: |
| 76 | + - Coordinate with vendor to verify logs and event triggers. |
| 77 | +5. Application and script adjustments: |
| 78 | + - Avoid reliance on disk numbers (can change after path failover). |
| 79 | +6. Command tools: |
| 80 | + - Get-Disk, Get-PhysicalDisk, Out-GridView (review mapping) |
| 81 | + - netsh trace start scenario=netconnection capture=yes tracefile=c:\os.etl |
| 82 | +7. Known bugs: |
| 83 | + - For ReFS/backup unresponsiveness on Windows Server 2025, apply KB5062660. |
| 84 | + - Review cluster logs for evidence of quorum or heartbeat-related outages. |
| 85 | + |
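As a rough illustration of steps 3 and 5 above, the following PowerShell sketch reads the current disk timeout and lists iSCSI disks by identifiers that stay stable when disk numbers change after a path failover. It's read-only and makes no changes. It assumes an elevated session on a server where the Storage module is available; `$diskKey`, `$timeout`, and `$knownUniqueId` are illustrative names.

```powershell
# Read the current disk timeout (in seconds). The value is empty if it was never set.
$diskKey = 'HKLM:\SYSTEM\CurrentControlSet\Services\disk'
$timeout = (Get-ItemProperty -Path $diskKey -Name TimeOutValue -ErrorAction SilentlyContinue).TimeOutValue
"Current disk TimeOutValue: $timeout"

# List iSCSI disks together with identifiers that don't depend on the disk number.
Get-Disk |
    Where-Object BusType -eq 'iSCSI' |
    Select-Object Number, FriendlyName, SerialNumber, UniqueId, OperationalStatus

# In scripts, look up a disk by UniqueId instead of by number.
# $knownUniqueId is a hypothetical placeholder for a value taken from the list above.
# $disk = Get-Disk -UniqueId $knownUniqueId
```

Comparing the UniqueId column before and after a failover test is one way to confirm that a script targets the same LUN even when the Number column changes.
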
| 86 | +### Volumes changing to RAW, file system, metadata corruption |
| 87 | + |
| 88 | +#### Symptoms |
| 89 | + |
| 90 | +- NTFS or ReFS volume shows as RAW, shares are inaccessible, or copy or backup operations fail. |
| 91 | +- "The file system detected a checksum error" or "Device... has a bad block." |
| 92 | +- ReFS volume unexpectedly becomes RAW. |
| 93 | + |
| 94 | +#### Cause |
| 95 | + |
| 96 | +- Unclean shutdowns, storage disconnects. |
| 97 | +- Physical disk or controller hardware failure. |
| 98 | +- Partition table or volume boundary errors. |
| 99 | +- Service dependencies not set (file shares missing after restart). |
| 100 | + |
| 101 | +#### Resolution |
| 102 | + |
| 103 | +1. File system repair |
| 104 | + - NTFS: chkdsk X: /f /r (back up data first because "/r" might take a long time and could risk further loss). |
| 105 | + - ReFS: Use refsutil for salvage: |
| 106 | + |
| 107 | + ```console |
| 108 | + refsutil salvage -QA D: <log path> <recovery path> -x |
| 109 | + refsutil salvage -FA D: <log path> <recovery path> -x |
| 110 | + ``` |
| 111 | + |
| 112 | + - Adjust partition boundaries with disk management tool if mismatched. |
| 113 | +2. Address underlying disk issues |
| 114 | + - Use the vendor's diagnostic tools, review SMART data, and replace hardware as necessary. |
| 115 | +3. Service dependencies |
| 116 | + - Set dependency: sc config LanmanServer depend= MSiSCSI (depend= replaces the whole dependency list, so also include any existing dependencies, separated by /) |
| 117 | + - Or set critical services to "Automatic (Delayed Start)" in services.msc |
| 118 | +4. Check Hardware/Virtualization Layer |
| 119 | + - Fix drives incorrectly presented as removable in VM configuration or hypervisor layer. |
| 120 | +5. Recover file shares |
| 121 | + - Manually restart LanmanServer or use delayed start. |
| 122 | + |
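Because `sc config <service> depend=` replaces a service's whole dependency list instead of appending to it, it can help to build the new list from the current one. The following sketch shows one possible way to do that. `Join-ServiceDependency` is a hypothetical helper written for this example, not a built-in cmdlet, and the existing dependencies shown (`SamSS`, `Srv2`) are sample values only; read the real list on the affected server first.

```powershell
# Hypothetical helper: merge current and additional dependencies into the
# slash-separated form that "sc.exe config <service> depend=" expects.
function Join-ServiceDependency {
    param(
        [string[]]$Current = @(),
        [Parameter(Mandatory)][string[]]$Add
    )
    # Keep the existing order, append the new names, and drop empty entries and exact duplicates.
    $merged = @($Current) + @($Add) | Where-Object { $_ } | Select-Object -Unique
    return ($merged -join '/')
}

# Sample input only. On a live server, read the current list with:
#   (Get-Service -Name LanmanServer).ServicesDependedOn.Name
$depend = Join-ServiceDependency -Current 'SamSS', 'Srv2' -Add 'MSiSCSI'
$depend

# Then apply it from an elevated prompt (left commented out on purpose):
#   sc.exe config LanmanServer depend= $depend
```

In PowerShell, call `sc.exe` explicitly, because `sc` on its own can resolve to the `Set-Content` alias.
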
| 123 | +### Cluster resource, ownership, or quorum failures |
| 124 | + |
| 125 | +#### Symptoms |
| 126 | + |
| 127 | +- Cluster disks go offline, or disk resources fail to come online. |
| 128 | +- "Cluster Disk X contains an invalid mount point." |
| 129 | +- Event IDs 98, 55 |
| 130 | +- Failover Cluster Manager shows failed resources, lost quorum, invalid signatures |
| 131 | + |
| 132 | +#### Cause |
| 133 | + |
| 134 | +- Incorrect cluster quorum configuration. |
| 135 | +- Disk or signature conflicts (duplicate VHDs, snapshots attached). |
| 136 | +- Permissions on cluster-related files (MachineKeys), missing cert/key pairs. |
| 137 | +- Storage vendor PR (persistent reservation) mismatch or stale entries. |
| 138 | + |
| 139 | +#### Resolution |
| 140 | + |
| 141 | +1. Verify cluster and disk resource. Use: |
| 142 | + |
| 143 | + ```powershell |
| 144 | + Test-Cluster |
| 145 | + Get-ClusterLog -Destination <path> |
| 146 | + ``` |
| 147 | + |
| 148 | + - Review and reassign quorum if necessary. |
| 149 | +2. Correct Permissions |
| 150 | + - On certificate store: SYSTEM and Administrators must have Full Control of C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys |
| 151 | + - Review the cluster service certificate store by using certutil -store -service clussvc\my, and re-import required certificates as necessary. |
| 152 | +3. Handle disk ownership and signature: |
| 153 | + - Detach conflicting or double-mapped VHDs. |
| 154 | + - Verify that all signatures are unique. |
| 155 | + - Involve storage vendor to clear persistent reservations (PRs). |
| 156 | +4. Check Group Policy/Security |
| 157 | + - Make sure that the Group Policy setting "Network access: Restrict anonymous access to Named Pipes and Shares" isn't blocking share access. |
| 158 | + |
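The checks in steps 1 and 2 above can be sketched as follows. This is a read-only sketch that reports state and changes nothing. It assumes an elevated session on a cluster node where the FailoverClusters module is installed.

```powershell
# Show the current quorum configuration.
Get-ClusterQuorum | Format-List *

# List cluster resources that aren't online.
Get-ClusterResource |
    Where-Object State -ne 'Online' |
    Select-Object Name, ResourceType, OwnerGroup, OwnerNode, State

# Show who has access to the MachineKeys folder.
# SYSTEM and Administrators are expected to appear with FullControl.
(Get-Acl -Path 'C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys').Access |
    Select-Object IdentityReference, FileSystemRights, AccessControlType
```

Running the same commands on each node makes it easier to spot a node whose permissions or resource view differs from the others.
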
| 159 | +### Performance and latency issues |
| 160 | + |
| 161 | +#### Symptoms |
| 162 | + |
| 163 | +- High "Avg. Disk sec/Write" (for example, 500 ms) |
| 164 | +- Slow backups or queries |
| 165 | +- Frequent IO retries |
| 166 | +- Excessive shadow copies (more than 1,000). |
| 167 | +- Disks appear "full" regardless of free space |
| 168 | + |
| 169 | +#### Cause |
| 170 | + |
| 171 | +- Disk fragmentation, lack of TRIM on SSDs. |
| 172 | +- Accumulated VSS or shadow copy snapshots. |
| 173 | +- Network or switch misconfiguration or packet loss. |
| 174 | +- Full or oversized virtual disks matching physical disk boundary. |
| 175 | + |
| 176 | +#### Resolution |
| 177 | + |
| 178 | +1. Disk and file system optimization |
| 179 | + - SSD: Optimize-Volume -DriveLetter X -ReTrim -Verbose |
| 180 | + - Remove excessive shadow copies: vssadmin delete shadows /all (this deletes every shadow copy, so first confirm that none are needed for a restore) |
| 181 | +2. NFS and iSCSI optimization |
| 182 | + - Verify jumbo frames and set the correct MTU: netsh interface ipv4 set subinterface "\<Name>" mtu=9000 store=persistent |
| 183 | +3. Disk cleanup |
| 184 | + - Resize VHDs appropriately. |
| 185 | + - Remove orphaned or unused disks by using DevNodeClean: devnodeclean /n (dry run), devnodeclean /r (remove) |
| 186 | +4. Firmware, driver and policy review |
| 187 | + - Update HBA, network adapter, and storage driver. |
| 188 | + - Ensure correct antivirus exclusions. |
| 189 | + |
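To put a number on the "Avg. Disk sec/Write" symptom before and after a change, a short sampling script can help. The following is a minimal sketch under a few assumptions: `Get-WriteLatencySummary` is a hypothetical helper written for this example, the counter path uses the English counter names, and the sample count and interval (12 samples, 5 seconds apart) are arbitrary.

```powershell
# Hypothetical helper: average counter samples per disk and convert seconds to milliseconds.
function Get-WriteLatencySummary {
    param([Parameter(Mandatory)][object[]]$CounterSample)
    $CounterSample |
        Group-Object InstanceName |
        ForEach-Object {
            $avgSeconds = ($_.Group | Measure-Object -Property CookedValue -Average).Average
            [pscustomobject]@{
                Disk       = $_.Name
                AvgWriteMs = [math]::Round($avgSeconds * 1000, 2)
            }
        } |
        Sort-Object AvgWriteMs -Descending
}

# Run only where the Windows performance counter cmdlet exists.
if (Get-Command Get-Counter -ErrorAction SilentlyContinue) {
    # 12 samples, 5 seconds apart, for every physical disk instance.
    $raw = Get-Counter -Counter '\PhysicalDisk(*)\Avg. Disk sec/Write' -SampleInterval 5 -MaxSamples 12
    Get-WriteLatencySummary -CounterSample $raw.CounterSamples
}
```

The output lists the slowest disks first, which gives a baseline to compare against after a TRIM pass, shadow copy cleanup, or driver update.
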
| 190 | +### System, driver, registry, service problems |
| 191 | + |
| 192 | +#### Symptoms |
| 193 | + |
| 194 | +- Storage commands or PowerShell scripts fail and return MetadataError or "Initiator instance does not exist." |
| 195 | +- MOF, WMI, provider errors. |
| 196 | +- iSCSI Initiator "Favorites" UI lists old IPs after restarts. |
| 197 | +- Service doesn't start because of missing permissions or keys. |
| 198 | + |
| 199 | +#### Cause |
| 200 | + |
| 201 | +- MOF file or WMI provider corruption (often triggered by interrupted recompilation). |
| 202 | +- OS-level iSCSI subsystem or storage service corruption. |
| 203 | +- Known bugs (for example, ReFS backup with Veeam on Windows Server 2025, iSCSI UI bug). |
| 204 | + |
| 205 | +#### Resolution |
| 206 | + |
| 207 | +1. MOF file recovery |
| 208 | + |
| 209 | + - Recompile MOFs: |
| 210 | + |
| 211 | + ```console |
| 212 | + mofcomp iscsiwmiv2_uninstall.mof |
| 213 | + mofcomp iscsirem.mof |
| 214 | + ... |
| 215 | + ``` |
| 216 | + - Verify WMI and storage provider states and repair as necessary. |
| 217 | +2. Permissions and certificate issues |
| 218 | + - Use certutil and file permissions tools and fix missing private keys. |
| 219 | +3. Apply relevant hotfixes |
| 220 | + - For Windows Server 2025 ReFS with Veeam: [Apply KB5062660](https://support.microsoft.com/help/5062660). |
| 221 | + - For known bugs, monitor for product group guidance. |
| 222 | +4. UI and registry updates |
| 223 | + - For the iSCSI Initiator Favorites display bug, functionality is unaffected, and a fix might not be available. |
| 224 | + |
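Before recompiling MOF files, it can be useful to confirm what state WMI and the iSCSI initiator are in. The following read-only sketch is one way to do that. It assumes an elevated session on a server where the iSCSI cmdlets are available.

```powershell
# Check that the WMI repository is consistent (read-only check).
winmgmt /verifyrepository

# Confirm that the iSCSI initiator and WMI services are running.
Get-Service -Name MSiSCSI, Winmgmt | Select-Object Name, Status, StartType

# If this returns sessions instead of an error, the iSCSI WMI provider is responding.
Get-IscsiSession |
    Select-Object TargetNodeAddress, InitiatorPortalAddress, IsConnected, IsPersistent
```

If the last command fails with an error such as "Initiator instance does not exist" while the services are running, that points toward the provider or MOF registration instead of the network path.
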
| 225 | +## Common issues quick reference table |
| 226 | + |
| 227 | +| Symptom | Root cause | Resolution | |
| 228 | +| --- | --- | --- | |
| 229 | +| Disk surprise removed (157) | Storage/network instability | Network configuration, MPIO settings, update drivers | |
| 230 | +| Disk RAW/Inaccessible | File system or partition issue | chkdsk, refsutil, adjust partition, backup data | |
| 231 | +| "The device has a bad block" (7) | Hardware fault | Replace disk/controller, run vendor diagnostics | |
| 232 | +| iSCSI failed to connect (20,27) | Config/network/target issue | Review target IP, reset sessions, fix mappings | |
| 233 | +| Cluster disk/ownership errors | Quorum/mount/perm issue | Verify cluster configuration, fix permissions, vendor PR | |
| 234 | +| Performance/Backup slowness | Fragmentation, VSS, driver | Disk trim, cleanup shadow copies, update drivers | |
| 235 | +| File shares disappear after reboot | Service dep. not set | sc config, Delayed Start, review logs | |
| 236 | +| iSCSI Initiator UI shows old IP | OS bug | Ignore, no functional impact | |
| 237 | +| ReFS/backup hang on 2025 + Veeam | MS bug (KB5062660) | Apply KB5062660 update | |
| 238 | +| Incorrect disk mapping | Wrong IP/session/favorite | Remove/add correct targets, use persistent sessions | |
| 239 | +| More than 1,000 shadow copies | Accumulated VSS snapshots drain performance | vssadmin delete shadows /all | |
| 240 | +| MOF/WMI errors | WMI/MOF provider corruption | Recompile MOF files, verify repository | |
| 241 | + |
| 242 | +## Data collection |
| 243 | + |
| 244 | +Before you contact Microsoft Support, you can gather the following information about your issue. |
| 245 | + |
| 246 | +- **Event Viewer:** Export system, application, and storage logs |
| 247 | + - Filter: Event IDs 9, 20, 27, 39, 43, 55, 98, 129, 153, 157, 507 (and others, if relevant) |
| 248 | +- **Cluster:** |
| 249 | + - Get-ClusterLog -Destination \<path> |
| 250 | + - Test-Cluster |
| 251 | +- **Disk/Physical Mapping:** |
| 252 | + - Get-WmiObject -Class win32_diskdrive | select \* |
| 253 | + - Get-Disk, Get-PhysicalDisk |
| 254 | +- **Network Traces:** |
| 255 | + - netsh trace start scenario=netconnection capture=yes tracefile=\<path> |
| 256 | + - Wireshark .pcap as necessary |
| 257 | +- **Performance:** |
| 258 | + - Perfmon, logman for disk/network stats |
| 259 | + - Storport traces: logman create trace drivers_storage |
| 260 | +- **Advanced:** |
| 261 | + - refsutil logs for ReFS recovery |
| 262 | + - Output from tools like DevNodeClean, MPIO and iscsicli.exe |
| 263 | +- **Service Config:** |
| 264 | + - sc qc \<service>, sc config \<service> depend= MSiSCSI |
| 265 | +- **Screenshots:** For UI anomalies (disk management, iSCSI Initiator "Favorites," and so on) |
| 266 | + |
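The Event Viewer export described above can also be scripted. The following is a minimal sketch that assumes the events of interest are in the System log; the seven-day window and the C:\Temp output folder are arbitrary placeholders, and the folder must already exist.

```powershell
# Event IDs from the filter list above.
$ids = 9, 20, 27, 39, 43, 55, 98, 129, 153, 157, 507

# Collect matching System log events from the last 7 days (example window).
$events = Get-WinEvent -FilterHashtable @{
    LogName   = 'System'
    Id        = $ids
    StartTime = (Get-Date).AddDays(-7)
} -ErrorAction SilentlyContinue

# Save a summary for review. C:\Temp is a placeholder path.
$events |
    Select-Object TimeCreated, Id, ProviderName, LevelDisplayName, Message |
    Export-Csv -Path 'C:\Temp\storage-events.csv' -NoTypeInformation
```

Several of these event IDs are used by more than one provider, so the ProviderName column is kept in the output to tell iSCSI, disk, and cluster events apart.
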
| 267 | +## References |
| 268 | + |
| 269 | +- [Microsoft Docs: iSCSI Target Server](/windows-server/storage/iscsi/iscsi-target-server) |
| 270 | +- [Windows Server Failover Clustering](/windows-server/failover-clustering/failover-clustering-overview) |
| 271 | +- [KB5062660: Windows Server 2025 ReFS Fix](https://support.microsoft.com/help/5062660) |
| 272 | +- [Troubleshoot: iSCSI initiator not login to favorite targets](/troubleshoot/windows-server/backup-and-storage/iscsi-initiator-not-login-to-favorite-targets) |
| 273 | +- [Microsoft Docs: PowerShell Storage Cmdlets](/powershell/module/storage/) |