Commit 18b2b7f

Merge pull request #77922 from flyPacific/SeedNodeStatusHealthReportUpdate
Add Seed Node Status health report information
2 parents 62cde8b + d1c50c2 commit 18b2b7f

1 file changed: +23 -0 lines changed

articles/service-fabric/service-fabric-understand-and-troubleshoot-with-system-health-reports.md

Lines changed: 23 additions & 0 deletions
@@ -68,6 +68,29 @@ When one of the previous conditions happens, **System.FM** or **System.FMM** fla
* **Property**: Rebuild.
* **Next steps**: Investigate the network connection between the nodes, as well as the state of any specific nodes that are listed on the description of the health report.

### Seed Node Status
**System.FM** reports a cluster-level warning if some seed nodes are unhealthy. Seed nodes are the nodes that maintain the availability of the underlying cluster. They help ensure the cluster remains up by establishing leases with other nodes and serving as tiebreakers during certain kinds of network failures. If a majority of the seed nodes in the cluster are down and are not brought back, the cluster automatically shuts down.

A seed node is unhealthy if its node status is Down, Removed, or Unknown.
The warning report for seed node status lists all the unhealthy seed nodes with detailed information.

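The health rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Service Fabric's implementation: the `Node` record and both function names are hypothetical, and the majority check assumes "majority" means strictly more than half of the seed nodes.

```python
from dataclasses import dataclass

# Node statuses that make a seed node unhealthy, per the text above.
UNHEALTHY_STATUSES = {"Down", "Removed", "Unknown"}


@dataclass(frozen=True)
class Node:
    """Hypothetical record for illustration; not a Service Fabric type."""
    name: str
    status: str
    is_seed: bool


def unhealthy_seed_nodes(nodes):
    """Return the seed nodes whose status is Down, Removed, or Unknown."""
    return [n for n in nodes if n.is_seed and n.status in UNHEALTHY_STATUSES]


def majority_of_seeds_down(nodes):
    """Assumption: 'majority' means strictly more than half of the seed nodes."""
    seeds = [n for n in nodes if n.is_seed]
    down = [n for n in seeds if n.status == "Down"]
    return len(down) * 2 > len(seeds)


cluster = [
    Node("seed0", "Up", True),
    Node("seed1", "Down", True),
    Node("seed2", "Unknown", True),
    Node("worker0", "Down", False),
]
print([n.name for n in unhealthy_seed_nodes(cluster)])  # ['seed1', 'seed2']
print(majority_of_seeds_down(cluster))  # False
```

Non-seed nodes such as `worker0` are ignored even when they are Down, because the warning concerns seed nodes only.
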
* **SourceID**: System.FM
* **Property**: SeedNodeStatus
* **Next steps**: If this warning appears in the cluster, follow the instructions below to fix it:
For clusters running Service Fabric version 6.5 or higher:
For a Service Fabric cluster on Azure, after a seed node goes down, Service Fabric tries to change it to a non-seed node automatically. To make this possible, make sure the number of non-seed nodes in the primary node type is greater than or equal to the number of Down seed nodes. If necessary, add more nodes to the primary node type to achieve this.
Depending on the cluster status, it may take some time to fix the issue. Once this is done, the warning report is automatically cleared.

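The capacity condition for Azure clusters comes down to simple arithmetic. As a rough sketch, assuming the node counts for the primary node type are already known (the function name and parameters are hypothetical and not part of any Service Fabric API):

```python
def extra_nodes_needed(non_seed_nodes: int, down_seed_nodes: int) -> int:
    """Nodes to add to the primary node type so that the number of
    non-seed nodes is at least the number of Down seed nodes."""
    if non_seed_nodes < 0 or down_seed_nodes < 0:
        raise ValueError("node counts must be non-negative")
    return max(0, down_seed_nodes - non_seed_nodes)


print(extra_nodes_needed(non_seed_nodes=2, down_seed_nodes=1))  # 0
print(extra_nodes_needed(non_seed_nodes=0, down_seed_nodes=2))  # 2
```

A result of 0 means the condition stated above already holds; any positive result is the minimum number of nodes to add.
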
For a Service Fabric standalone cluster, all the seed nodes need to become healthy before the warning report clears. The action to take depends on why a seed node is unhealthy: if the seed node is Down, bring it back up; if the seed node is Removed or Unknown, it [needs to be removed from the cluster](https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-windows-server-add-remove-nodes).
The warning report is automatically cleared when all seed nodes become healthy.

For clusters running a Service Fabric version older than 6.5:
In this case, the warning report needs to be cleared manually. **Users should make sure all the seed nodes become healthy before clearing the report**: if a seed node is Down, bring it back up; if a seed node is Removed or Unknown, remove it from the cluster.
After all the seed nodes become healthy, use the following PowerShell command to [clear the warning report](https://docs.microsoft.com/en-us/powershell/module/servicefabric/send-servicefabricclusterhealthreport):

```powershell
PS C:\> Send-ServiceFabricClusterHealthReport -SourceId "System.FM" -HealthProperty "SeedNodeStatus" -HealthState OK
```
## Node system health reports
System.FM, which represents the Failover Manager service, is the authority that manages information about cluster nodes. Each node should have one report from System.FM showing its state. The node entities are removed when the node state is removed. For more information, see [RemoveNodeStateAsync](https://docs.microsoft.com/dotnet/api/system.fabric.fabricclient.clustermanagementclient.removenodestateasync).
