Commit 62dc333

Merge pull request #267939 from vladelekic/user/vladelekic/IBThrottlingDocumentation
[Service Fabric] [Cluster Resource Manager] InBuild Replicas per Node Throttling Documentation
2 parents 24e51bb + 7af25b5 commit 62dc333

2 files changed

+170
-0
lines changed

Lines changed: 168 additions & 0 deletions
@@ -0,0 +1,168 @@
---
title: InBuild throttling
description: Configure, understand, and apply InBuild Throttling constraint.
ms.topic: conceptual
ms.author: vladelekic
author: vladelekic
ms.service: service-fabric
services: service-fabric
ms.date: 02/28/2024
---

# Throttling InBuild Replicas per Node

Replicas of stateful services pass through several states during their lifecycle: InBuild, Ready, Closing, and Dropped. Only replicas in the Ready and Closing states can declare either default or dynamic load. Replicas in the Dropped state are irrelevant in terms of resource utilization. InBuild (IB) replicas are a special case, so the Cluster Resource Manager (CRM) provides extra support for them. IB replicas can't report dynamic load, and their resource utilization might be higher than that of lightly loaded replicas in the Ready state, especially for I/O and memory-related metrics. For that reason, the CRM allows limiting the number of concurrent builds per node.

Limits on the number of IB replicas per node can be defined in two ways:
* Per node type
* For all nodes that satisfy a placement constraint

Depending on the constraint priority, and following the general rules of constraint prioritization, the CRM halts movements and creations of replicas on a node if those actions would violate the IB limit defined for that node. The constraint blocks only operations that could cause high I/O and memory consumption during the InBuild phase, especially when extensively replicating context from other active replicas.

Movements of any kind and promotions of StandBy replicas are restricted operations, because they cause state replication and extensive resource utilization during the InBuild phase. On the other hand, promoting an active Secondary replica to Primary isn't a problematic operation, so the constraint doesn't block it. During such a promotion, the state of the Secondary replica is already up to date with the Primary replica, which eliminates the need for extra replication.

> [!NOTE]
> The promotion of StandBy replicas could be blocked due to InBuild replicas per node throttling. The transition from StandBy to Ready could cause extensive I/O and memory utilization, depending on the amount of context that needs to be replicated from active replicas. Thus, ignoring promotions of StandBy replicas could cause the very issues that the InBuild replicas per node throttling constraint aims to prevent.

## Configuring the InBuild Replicas per Node Throttling

### Enable InBuild Replicas per Node Throttling for Action

There are three different categories of actions that the Cluster Resource Manager performs:

* _Placement_: Places missing replicas, orchestrates swaps during upgrades, and removes extra replicas.
* _Constraint Check_: Enforces rules.
* _Balancing_: Performs actions that reduce the imbalance of total node utilization in a cluster.

The Cluster Resource Manager lets you enable or disable throttling of IB replicas per node for each category. The settings that control whether throttling is active for each category are *ThrottlePlacementPhase*, *ThrottleConstraintCheckPhase*, and *ThrottleBalancingPhase*, respectively. Each setting takes a boolean value. The following cluster manifest section defines these settings explicitly:

```xml
<Section Name="PlacementAndLoadBalancing">
    <Parameter Name="ThrottlePlacementPhase" Value="true" />
    <Parameter Name="ThrottleConstraintCheckPhase" Value="true" />
    <Parameter Name="ThrottleBalancingPhase" Value="true" />
</Section>
```

Here's an example of the same settings, which enable IB replica throttling per node, defined via ClusterConfig.json for standalone deployments or Template.json for Azure-hosted clusters:

```json
"fabricSettings": [
  {
    "name": "PlacementAndLoadBalancing",
    "parameters": [
      {
        "name": "ThrottlePlacementPhase",
        "value": "true"
      },
      {
        "name": "ThrottleConstraintCheckPhase",
        "value": "true"
      },
      {
        "name": "ThrottleBalancingPhase",
        "value": "true"
      }
    ]
  }
]
```
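
As a rough mental model of how these three switches interact with a per-node limit, the following Python sketch may help. It is not Service Fabric code: the operation labels, the `is_blocked` helper, and the rule that a build is refused once the node is at its limit are all assumptions made for illustration, and the sketch ignores constraint priority entirely.

```python
from typing import Dict, Optional

# Hypothetical in-memory view of the three switches shown above.
THROTTLE_FLAGS: Dict[str, bool] = {
    "placement": True,
    "constraint_check": True,
    "balancing": True,
}

# Operations the article describes as putting a replica through InBuild.
# These string labels are illustrative, not Service Fabric identifiers.
BUILDING_OPERATIONS = {"create", "move", "promote_standby"}


def is_blocked(operation: str, phase: str, inbuild_on_node: int,
               limit: Optional[int],
               flags: Dict[str, bool] = THROTTLE_FLAGS) -> bool:
    """Return True if the operation would be held back by IB throttling."""
    if not flags.get(phase, False):
        return False  # throttling is switched off for this category
    if operation not in BUILDING_OPERATIONS:
        return False  # e.g. promoting an active Secondary to Primary
    if limit is None:
        return False  # no limit defined for this node
    # Assumption: one more build is refused once the node is at its limit.
    return inbuild_on_node + 1 > limit


print(is_blocked("move", "placement", inbuild_on_node=10, limit=10))               # True
print(is_blocked("promote_secondary", "placement", inbuild_on_node=10, limit=10))  # False
```

In this sketch, setting a switch to false simply skips the limit check for that category, so the other two categories stay throttled.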

### Configure InBuild Replicas per Node Limits

The Cluster Resource Manager allows defining IB limits globally and for each category of actions:

* MaximumInBuildReplicasPerNode: Defines IB limits globally. These limits are used to evaluate the final IB limit for each category.
* MaximumInBuildReplicasPerNodePlacementThrottle: Defines IB limits for the placement category. These limits are used to evaluate the final IB limit only for the placement category.
* MaximumInBuildReplicasPerNodeConstraintCheckThrottle: Defines IB limits for the constraint check category. These limits are used to evaluate the final IB limit only for the constraint check category.
* MaximumInBuildReplicasPerNodeBalancingThrottle: Defines IB limits for the balancing category. These limits are used to evaluate the final IB limit only for the balancing category.

For each of these settings, the Cluster Resource Manager provides two ways to define the limit of IB replicas:

* Define IB limit for all nodes in a single node type.
* Define IB limit for all nodes with a matching placement constraint.

These rules allow you to define multiple values for a single category, and the CRM always respects the strictest limit that you provided. The limit for a node in a specific phase is the lowest value among the entries that apply to that node, whether they match by node type or by placement property, for both global limits and category limits. If no limit is defined for an action category on a specific node, the CRM assumes that there's no upper bound on the IB replica count for that node.

The following cluster manifest sections define limits for each phase explicitly:

```xml
<Section Name="MaximumInBuildReplicasPerNode">
    <Parameter Name="NodeTypeA" Value="10" />
    <Parameter Name="NodeTypeB" Value="20" />
    <Parameter Name="NodeTypeName == NodeTypeA || NodeTypeName == NodeTypeC" Value="15" />
</Section>

<Section Name="MaximumInBuildReplicasPerNodePlacementThrottle">
    <Parameter Name="NodeTypeC" Value="20" />
</Section>

<Section Name="MaximumInBuildReplicasPerNodeConstraintCheckThrottle">
    <Parameter Name="NodeTypeD" Value="10" />
    <Parameter Name="Color == Blue" Value="8" />
</Section>

<Section Name="MaximumInBuildReplicasPerNodeBalancingThrottle">
    <Parameter Name="Color == Red" Value="25" />
</Section>
```

Here's an example of the same IB limits defined via ClusterConfig.json for standalone deployments or Template.json for Azure-hosted clusters:

```json
"fabricSettings": [
  {
    "name": "MaximumInBuildReplicasPerNode",
    "parameters": [
      {
        "name": "NodeTypeA",
        "value": "10"
      },
      {
        "name": "NodeTypeB",
        "value": "20"
      },
      {
        "name": "NodeTypeName == NodeTypeA || NodeTypeName == NodeTypeC",
        "value": "15"
      }
    ]
  },
  {
    "name": "MaximumInBuildReplicasPerNodePlacementThrottle",
    "parameters": [
      {
        "name": "NodeTypeC",
        "value": "20"
      }
    ]
  },
  {
    "name": "MaximumInBuildReplicasPerNodeConstraintCheckThrottle",
    "parameters": [
      {
        "name": "NodeTypeD",
        "value": "10"
      },
      {
        "name": "Color == Blue",
        "value": "8"
      }
    ]
  },
  {
    "name": "MaximumInBuildReplicasPerNodeBalancingThrottle",
    "parameters": [
      {
        "name": "Color == Red",
        "value": "25"
      }
    ]
  }
]
```
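
To make the "strictest limit wins" rule concrete, here is a minimal Python sketch that applies it to the example limits above. It is an interpretation of the rule as described in this article, not the CRM's implementation: the `Node` class, the `node_type_is` and `prop_equals` helpers, and the phase labels are hypothetical, and placement constraints are modeled as plain Python predicates rather than parsed expressions. The sketch also assumes that global and category entries are combined by taking the minimum over everything that matches the node.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Node:
    node_type: str
    properties: Dict[str, str] = field(default_factory=dict)


# A rule pairs a predicate (does this entry apply to the node?) with a limit.
Rule = Tuple[Callable[[Node], bool], int]


def node_type_is(*names: str) -> Callable[[Node], bool]:
    return lambda node: node.node_type in names


def prop_equals(key: str, value: str) -> Callable[[Node], bool]:
    return lambda node: node.properties.get(key) == value


# Mirrors the MaximumInBuildReplicasPerNode section of the example.
GLOBAL_LIMITS: List[Rule] = [
    (node_type_is("NodeTypeA"), 10),
    (node_type_is("NodeTypeB"), 20),
    (node_type_is("NodeTypeA", "NodeTypeC"), 15),
]

# Mirrors the three per-category sections of the example.
CATEGORY_LIMITS: Dict[str, List[Rule]] = {
    "placement": [(node_type_is("NodeTypeC"), 20)],
    "constraint_check": [(node_type_is("NodeTypeD"), 10),
                         (prop_equals("Color", "Blue"), 8)],
    "balancing": [(prop_equals("Color", "Red"), 25)],
}


def effective_limit(node: Node, phase: str) -> Optional[int]:
    """Lowest matching limit for the node in this phase; None means no limit."""
    rules = GLOBAL_LIMITS + CATEGORY_LIMITS.get(phase, [])
    matching = [limit for applies, limit in rules if applies(node)]
    return min(matching) if matching else None


print(effective_limit(Node("NodeTypeA"), "placement"))                          # 10
print(effective_limit(Node("NodeTypeC"), "placement"))                          # 15
print(effective_limit(Node("NodeTypeD", {"Color": "Blue"}), "constraint_check"))  # 8
print(effective_limit(Node("NodeTypeD"), "balancing"))                          # None
```

For NodeTypeC, the global constraint-based entry (15) is stricter than the placement entry (20), so under the assumptions above the lower value applies during placement.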

## Next Steps
- For more information about replica states, see the article on [replica lifecycle](service-fabric-concepts-replica-lifecycle.md).
- For more information about balancing and other action categories, see the article on [balancing action](service-fabric-cluster-resource-manager-balancing.md).

articles/service-fabric/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -379,6 +379,8 @@
 href: service-fabric-cluster-resource-manager-autoscaling.md
 - name: Node tagging
 href: service-fabric-cluster-resource-manager-node-tagging.md
+- name: InBuild replicas per node throttling
+href: service-fabric-cluster-resource-manager-inbuild-throttling.md
 - name: Monitoring and diagnostics
 items:
 - name: Monitoring overview
