|
| 1 | +--- |
| 2 | +title: Configuring Witness Resource Troubleshooting Guide |
| 3 | +description: Resolves issues that affect Witness resources for Windows Server and Azure Stack HCI clusters. |
| 4 | +ms.date: 10/06/2025 |
| 5 | +author: kaushika-msft |
| 6 | +ms.author: kaushika |
| 7 | +manager: dcscontentpm |
| 8 | +audience: itpro |
| 9 | +ms.topic: troubleshooting |
| 10 | +ms.reviewer: kaushika |
| 11 | +ms.custom: |
| 12 | +- sap: clustering and high availability\configuring witness resource |
| 13 | +- pcy: High availability\setup and configuration of clustered services and applications |
| 14 | +appliesto: |
| 15 | + - <a href=https://learn.microsoft.com/windows/release-health/windows-server-release-info target=_blank>Supported versions of Windows Server</a> |
| 16 | +--- |
| 17 | + |
| 18 | +# Configuring Witness resource troubleshooting guide |
| 19 | + |
| 20 | +## Summary |
| 21 | + |
| 22 | +Failover cluster quorum and Witness resources (File Share Witness, Disk Witness, Cloud Witness) are foundational for Windows Server and Azure Stack HCI clusters. A witness provides an additional quorum vote that helps the cluster maintain high availability. Failures in Witness configuration or operation can jeopardize production workloads and trigger loss of quorum, unplanned failovers, or node shutdowns. |
| 23 | + |
| 24 | +Witness resources depend on correct permissions, resilient networking, proper registration, and accurate configuration. Each potential failure scenario requires a tailored resolution. This guide provides a thorough checklist, diagnoses the most common issues, lists data collection requirements, and provides a solution matrix for quick field troubleshooting. |
| 25 | + |
| 26 | +## Troubleshooting checklist |
| 27 | + |
| 28 | +Use this checklist for systematic troubleshooting: |
| 29 | + |
| 30 | +- **Permissions and share validation** |
| 31 | + - Is the Cluster Name Object (CNO) or computer account granted Full Control on both share and NTFS permissions? |
| 32 | + - Was the share created on a Windows server or on a compatible NAS device (one that supports SMB 2.0 or later and Kerberos)? |
| 33 | + - If you use a cloud witness, are the storage account name, access key, and endpoint correct? |
| 34 | +- **Network and connectivity** |
| 35 | + - Can all cluster nodes reach the witness over the expected port (TCP 445 for a file share witness, TCP 443 for a cloud witness)? |
| 36 | + - Are there any proxy settings that might block internet connectivity (for cloud witness)? |
| 37 | + - Do DNS and the route tables resolve and reach the file share or cloud witness endpoint correctly from each node? |
| 38 | +- **Cluster configuration** |
| 39 | + - Is only one witness resource configured for the cluster? |
| 40 | + - Is the witness assigned to the cluster core resources and not to a non-cluster-related role? |
| 41 | + - Are there any duplicate or obsolete witness resources visible in Failover Cluster Manager? |
| 42 | +- **Storage health** |
| 43 | + - Are quorum disks healthy, online, and not in use by antivirus scans or backup jobs? |
| 44 | + - Does chkdsk/Repair-Volume report any issues? |
| 45 | +- **Cluster node/OS health** |
| 46 | + - Are all cluster nodes running supported operating systems and up-to-date drivers? |
| 47 | + - Have any recent migrations or domain moves been fully completed (CNO/CreatingDC registry updated)? |
| 48 | +- **Log review** |
| 49 | + - Are there Access Denied (error 5), Logon Failure (1326), Network Path Not Found (53), or Password Expired (1330) errors in the logs? |
| 50 | + - Are there recurring failover, resource offline/failed events, or authentication errors involving the witness? |
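| | + |
| | +The cluster configuration items in this checklist can be checked from any node. The following is a minimal sketch that assumes the FailoverClusters PowerShell module is installed and that you run it in an elevated session. It only reads state and changes nothing. |
| | + |
| | +```powershell |
| | +# Show the current quorum configuration, including the quorum (witness) resource, if any. |
| | +Get-ClusterQuorum | Format-List * |
| | + |
| | +# List file share and cloud witness resources. A disk witness has the "Physical Disk" resource type, |
| | +# so it appears in the Get-ClusterQuorum output instead of in this list. |
| | +Get-ClusterResource | Where-Object { $_.ResourceType -like "*witness*" } | Format-Table Name, State, OwnerGroup, ResourceType |
| | +``` |
| | + |
| | +If the second command returns more than one resource, or a resource whose owner group isn't the core cluster group, treat that as a sign of duplicate or stale witness resources. |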
| 51 | + |
| 52 | +## Common issues and solutions |
| 53 | + |
| 54 | +### Issue 1: File share witness doesn't come online |
| 55 | + |
| 56 | +#### Symptoms |
| 57 | + |
| 58 | +The resource state is "Failed" or "Offline," and you see errors such as the following: |
| 59 | +- "File share witness resource... failed to arbitrate for the file share." |
| 60 | +- "Failed to open or create file... Witness.log, error 5." |
| 61 | +- Event IDs 1069, 1564, and 1205. |
| 62 | + |
| 63 | +#### Root cause and resolution |
| 64 | + |
| 65 | +- Incorrect permissions: |
| 66 | + - Resolution: Assign Full Control permissions on the share and NTFS to the CNO/computer account. |
| 67 | + - On the file share **Security** tab, add "Cluster_Name$" (or cluster computer account), and grant Full Control. |
| 68 | + - Make sure that the share permission and NTFS permission both reflect this change. |
| 69 | +- Network issues and blocked SMB: |
| 70 | + - Resolution: Open TCP 445 between all nodes and the FSW host. |
| 71 | + - Run Test-NetConnection -ComputerName \<FileShareHost> -Port 445 from each node. |
| 72 | +- Share hosted on an unsupported device or platform: |
| 73 | + - Resolution: Host the FSW only on a Windows server or on a supported NAS device that has SMB and Kerberos integration. |
| 74 | + - Avoid using DFS shares, replicated storage, or a cloud witness share as a file share witness. |
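| | + |
| | +The permission and connectivity checks for this issue can be sketched as follows. The host name and share name are hypothetical placeholders, and the last command assumes that the witness host is a Windows server that accepts CIM remoting. Run the commands from each cluster node. |
| | + |
| | +```powershell |
| | +# Hypothetical placeholder values; replace them with your own. |
| | +$fswHost   = "fs01.contoso.com" |
| | +$shareName = "ClusterWitness" |
| | +$sharePath = "\\$fswHost\$shareName" |
| | + |
| | +# Confirm that SMB (TCP 445) is reachable from this node. |
| | +Test-NetConnection -ComputerName $fswHost -Port 445 |
| | + |
| | +# Review the NTFS permissions on the share path. Look for the CNO (Cluster_Name$) with FullControl. |
| | +(Get-Acl -Path $sharePath).Access | Format-Table IdentityReference, FileSystemRights, AccessControlType |
| | + |
| | +# Review the share-level permissions on the witness host. |
| | +Get-SmbShareAccess -Name $shareName -CimSession $fswHost |
| | +``` |
| | + |
| | +Both the NTFS output and the share-level output should list the CNO with Full Control. A gap in either one is enough to cause error 5. |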
| 75 | + |
| 76 | +### Issue 2: Disk witness intermittently goes offline |
| 77 | + |
| 78 | +#### Symptoms |
| 79 | + |
| 80 | +Cluster logs: |
| 81 | +- "Cluster Disk X... failed. Error code was 0x5 (Access is denied)" |
| 82 | +- Event IDs: 1558, 1069, 1792, 1795 |
| 83 | +- The physical disk moves between nodes or intermittently enters a failed state |
| 84 | + |
| 85 | +#### Root cause and resolution |
| 86 | + |
| 87 | +- Antivirus interference: |
| 88 | + - Resolution: Exclude cluster disks (and their underlying volumes/paths) from antivirus scans. |
| 89 | + - Check the loaded minifilter drivers by running fltmc. |
| 90 | + - Apply the antivirus exclusions, and then restart all nodes. |
| 91 | +- Disk and storage health or corruption: |
| 92 | + - Resolution: Run CHKDSK/Repair-Volume on the disk. Reformat and restore if the disk is corrupted. |
| 93 | + - Remove the disk from the cluster, run chkdsk /f, reformat the disk as NTFS if the corruption persists, and then add it back as the quorum disk. |
| 94 | +- Persistent reservations and controllers: |
| 95 | + - Resolution: Make sure that storage controllers and drivers are up to date and that single, unique reservations are held per disk. |
| 96 | + - Use vendor-specific tools to check SCSI-3 persistent reservation status. |
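| | + |
| | +A minimal, read-only sketch of the first two checks is shown here. It assumes that you run it in an elevated session on the node that currently owns the disk. The drive letter "Q" is a hypothetical placeholder for the witness volume. |
| | + |
| | +```powershell |
| | +# List the loaded file system minifilter drivers. Antivirus and backup filters appear in this list. |
| | +fltmc filters |
| | + |
| | +# List the cluster's physical disk resources together with their state and current owner. |
| | +Get-ClusterResource | Where-Object { $_.ResourceType -like "Physical Disk" } | Format-Table Name, State, OwnerNode, OwnerGroup |
| | + |
| | +# Scan the witness volume and report errors without trying to repair them. |
| | +Repair-Volume -DriveLetter Q -Scan |
| | +``` |
| | + |
| | +The scan only reports problems. Run chkdsk /f or a full repair after you take the disk out of cluster use, as described in the resolution steps. |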
| 97 | + |
| 98 | +### Issue 3: Cloud witness can't be configured or brought online |
| 99 | + |
| 100 | +#### Symptoms |
| 101 | + |
| 102 | +Resource can't be added or brought online. |
| 103 | + |
| 104 | +Error messages: |
| 105 | +- "An error occurred while validating access." |
| 106 | +- "Cloud Witness failed to come online. Error: The client and server cannot communicate, because they do not possess a common algorithm." |
| 107 | +- "The version of the connection isn't permitted on this storage account." |
| 108 | + |
| 109 | +#### Root cause and resolution |
| 110 | + |
| 111 | +- TLS mismatch or version error: |
| 112 | + - Resolution: Enforce TLS 1.2 on all nodes. |
| 113 | + - PowerShell: |
| 114 | + |
| 115 | + ```powershell |
| 116 | + [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12 # Applies to the current PowerShell session only |
| 117 | + Set-ClusterQuorum -CloudWitness -AccountName <StorageAccount> -AccessKey <AccessKey> # Replace both placeholders with your values |
| 118 | + ``` |
| 119 | + |
| 120 | + > [!NOTE] |
| 121 | + > |
| 122 | + > Remove any Disk Witness before you add a Cloud Witness. |
| 123 | + |
| 124 | +- Incorrect storage account/endpoint/network: |
| 125 | + - Resolution: |
| 126 | + - Verify the storage account name, access key, and endpoint. |
| 127 | + - Make sure that TCP 443 is open and not blocked by a firewall or proxy. |
| 128 | + - For private endpoints, update the hosts file on each node if DNS doesn't resolve the endpoint. |
| 129 | +- Simultaneous witness types: |
| 130 | + - Resolution: Only one witness (disk/file/cloud) is allowed at a time. |
| 131 | + - Remove any existing witness: |
| 132 | + |
| 133 | + ```powershell |
| 134 | + Set-ClusterQuorum -NoWitness # Removes the existing witness (-NodeAndDiskMajority would configure a disk witness instead) |
| 135 | + Set-ClusterQuorum -CloudWitness -AccountName <StorageAccount> -AccessKey <AccessKey> |
| 136 | + ``` |
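| | + |
| | +The endpoint and network checks for a cloud witness can be sketched as follows. The storage account name is a hypothetical placeholder, and the sketch assumes the default public Azure blob endpoint suffix (core.windows.net). Adjust the suffix if your storage account uses a different endpoint. |
| | + |
| | +```powershell |
| | +# Hypothetical storage account name; replace it with your own. |
| | +$storageAccount = "contosowitness" |
| | +$endpoint = "${storageAccount}.blob.core.windows.net"   # Assumes the default public endpoint suffix |
| | + |
| | +# Confirm that the endpoint name resolves from this node. |
| | +Resolve-DnsName -Name $endpoint |
| | + |
| | +# Confirm that HTTPS (TCP 443) is reachable from this node. |
| | +Test-NetConnection -ComputerName $endpoint -Port 443 |
| | + |
| | +# Show the machine-wide WinHTTP proxy configuration. |
| | +netsh winhttp show proxy |
| | +``` |
| | + |
| | +Run the checks on every node, because a single node that can't reach the endpoint is enough to cause intermittent witness failures. |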
| 137 | + |
| 138 | +### Issue 4: Other common causes and fixes |
| 139 | + |
| 140 | +- Password synchronization failures (Error 1330, 1326): |
| 141 | + - Use Failover Cluster Manager or the Repair-ClusterNameAccount PowerShell cmdlet to resynchronize the password, or reset the CNO in Active Directory. |
| 142 | +- Multiple witness resources / configuration drift: |
| 143 | + - Remove all witness resources, set quorum to "none." |
| 144 | + - Manually delete duplicate or stale witness resources. |
| 145 | + - Restore witness cleanly. |
| 146 | +- Network path not found / Access Denied (Error 53/5): |
| 147 | + - Verify network routes/firewall, name resolution, and DNS registration. |
| 148 | + - Restart cluster resource/service after correcting. |
| 149 | +- Kerberos failures or a disabled or deleted computer object: |
| 150 | + - If the computer object was deleted, restore it from the Active Directory Recycle Bin, or re-create it. |
| 151 | + - If the Active Directory object is disabled, enable it and reset its password. |
| 152 | +- Cluster quorum loss after node down/resource migration: |
| 153 | + - Migrate the core cluster resources before you shut down a node (a workaround for a known issue in Windows Server 2025). |
| 154 | + - If all nodes except one are lost and quorum isn't achieved, use forced quorum (Start-ClusterNode -ForceQuorum), but only as a last resort. |
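| | + |
| | +Before you shut down a node or force quorum, it can help to review the current vote distribution. The following sketch assumes the default core group name ("Cluster Group"). The target node name "Node2" is a hypothetical placeholder. |
| | + |
| | +```powershell |
| | +# Show whether dynamic quorum is enabled (1) and whether the witness currently has a vote. |
| | +Get-Cluster | Format-List Name, DynamicQuorum, WitnessDynamicWeight |
| | + |
| | +# Show each node's state, its assigned vote (NodeWeight), and its current dynamic vote (DynamicWeight). |
| | +Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight |
| | + |
| | +# Move the core cluster resources to another node before a planned shutdown. |
| | +Move-ClusterGroup -Name "Cluster Group" -Node "Node2" |
| | +``` |
| | + |
| | +The first two commands are read-only. Move-ClusterGroup changes ownership of the core resources, so run it only as part of a planned shutdown. |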
| 155 | + |
| 156 | +## Common issues quick reference table |
| 157 | + |
| 158 | +| Symptom | Event ID/Error code | Cause | Resolution | |
| 159 | +| --- | --- | --- | --- | |
| 160 | +| File share witness (FSW) is offline or fails arbitration | 1069, 1564, 5, 53 | Permissions, network, misconfiguration | Set Full Control on share/NTFS, open TCP 445, verify CNO | |
| 161 | +| Can't bring cloud witness online | "Not permitted," TLS | Protocol/version mismatch, endpoint | Enforce TLS1.2, verify storage account, remove disk witness | |
| 162 | +| Disk witness intermittent offline | 1558, 1792, 1069, 5 | AV interference, disk corruption | Add AV exclusions, chkdsk/repair, reformat disk | |
| 163 | +| "The specified network password is not correct," password expired | 86, 1330, 1326 | CNO password unsynced, AD object issues | Repair CNO, reset password, check AD object status | |
| 164 | +| Multiple witness resources/duplicate | -- | Configuration drift | Remove all, set quorum none, re-add single witness | |
| 165 | +| After node loss, cluster offline/quorum loss | 1177, 7024, 7031 | Quorum configuration, known issue in Windows Server 2025 | Migrate core role, enable dynamic quorum, apply hotfix | |
| 166 | +| "Unable to save property changes for FSW" (NAS/unsupported share) | -- | Non-Windows/non-compatible share | Host witness on Windows or supported device | |
| 167 | +| Witness works on one cluster, not others | -- | Path/permissions mismatch, hosts lookup | Update permissions, hosts file, re-add witness | |
| 168 | +| After domain/OS migration, FSW fails | -- | Registry (CreatingDC), AD object wrong | Update registry, check CreatingDC, restart cluster service | |
| 169 | + |
| 170 | +## Data collection |
| 171 | + |
| 172 | +- Cluster logs: |
| 173 | + - Get-ClusterLog -UseLocalTime |
| 174 | + - Standard logs for the time of occurrence. |
| 175 | +- System and Application event logs: |
| 176 | + - Export the System and Application logs, and filter for the relevant event IDs and error codes (1558, 1069, 1326, and so on). |
| 177 | +- FSW and cloud witness resources: |
| 178 | + - Verify by using PowerShell: Get-ClusterResource | Where-Object { $_.ResourceType -like "*witness*" } |
| 179 | +- Connectivity tests: |
| 180 | + - Test-NetConnection \<FileShareHostOrCloudEndpoint> -Port 445 (file share witness) or -Port 443 (cloud witness) |
| 181 | + - tracert \<FileShare> |
| 182 | +- Permissions checks: |
| 183 | + - icacls \<witness_path> |
| 184 | + - Review share and NTFS security settings. |
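| 185 | +- Active Directory objects: |
| 186 | + - Verify the status of the CNO and other cluster objects in Active Directory Users and Computers. |
| 187 | +- If the share is hosted on a NAS or other non-Windows device: |
| 188 | + - Confirm that the device supports SMB 2.0 or later and that Kerberos authentication is enabled. |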
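| | + |
| | +The log collection steps above can be sketched as a short script. The output folder, the 60-minute window, and the event ID list are assumptions for illustration. Adjust them to match the time of the failure. |
| | + |
| | +```powershell |
| | +# Hypothetical output folder; replace it with your own. |
| | +$out = "C:\Temp\WitnessLogs" |
| | +New-Item -Path $out -ItemType Directory -Force | Out-Null |
| | + |
| | +# Generate cluster logs from all nodes for the last 60 minutes, using local time stamps. |
| | +Get-ClusterLog -UseLocalTime -TimeSpan 60 -Destination $out |
| | + |
| | +# Export matching System log events from the last day. The ID list is only an example. |
| | +Get-WinEvent -FilterHashtable @{ LogName = "System"; Id = 1069, 1558, 1564; StartTime = (Get-Date).AddDays(-1) } -ErrorAction SilentlyContinue | Export-Csv -Path "$out\SystemEvents.csv" -NoTypeInformation |
| | +``` |
| | + |
| | +Get-WinEvent reads only the local node's log, so repeat the export on each node or add the -ComputerName parameter. |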
| 189 | + |
| 190 | +## References |
| 191 | + |
| 192 | +- [Configure a File Share Witness](/windows-server/failover-clustering/file-share-witness?tabs=domain-joined-witness) |
| 193 | +- [Cluster Quorum Best Practices](/windows-server/failover-clustering/manage-cluster-quorum) |
| 194 | +- [Dynamic Quorum in Windows Server](/windows-server/storage/storage-spaces/quorum#dynamic-quorum-behavior) |