articles/service-fabric/service-fabric-cluster-resource-manager-autoscaling.md
23 additions & 18 deletions
@@ -10,9 +10,9 @@ ms.date: 07/14/2022
 ---
 
 # Introduction to Auto Scaling
-Auto scaling is another capability of Service Fabric to dynamically scale your services based on the load that services are reporting, or based on their usage of resources. Auto scaling gives great elasticity and enables provisioning of extra instances or partitions of your service on demand. The entire auto scaling process is automated and transparent, and once you set up your policies on a service there is no need for manual scaling operations at the service level. Auto scaling can be turned on either at service creation time, or at any time by updating the service.
+Auto scaling is another capability of Service Fabric to dynamically scale your services based on the load that services are reporting, or based on their usage of resources. Auto scaling gives great elasticity and enables provisioning of extra instances or partitions of your service on demand. The entire auto scaling process is automated and transparent, and once you set up your policies on a service there's no need for manual scaling operations at the service level. Auto scaling can be turned on either at service creation time, or at any time by updating the service.
 
-A common scenario where auto-scaling is useful is when the load on a particular service varies over time. For example, a service such as a gateway can scale based on the amount of resources necessary to handle incoming requests. Let's take a look at an example of what those scaling rules could look like:
+A common scenario where autoscaling is useful is when the load on a particular service varies over time. For example, a service such as a gateway can scale based on the amount of resources necessary to handle incoming requests. Let's take a look at an example of what those scaling rules could look like:
 * If all instances of my gateway are using more than two cores on average, then scale out the gateway service by adding one more instance. Do this addition every hour, but never have more than seven instances in total.
 * If all instances of my gateway are using less than 0.5 cores on average, then scale the service in by removing one instance. Do this removal every hour, but never have fewer than three instances in total.
 
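Reviewer note: the two gateway rules in this hunk can be read as a single decision made once per scaling interval. The following is a minimal Python sketch of that reading, not Service Fabric code. The function name `next_instance_count` and its parameters are hypothetical; the default values (two cores, 0.5 cores, one instance, three to seven instances) are taken from the example rules above.

```python
def next_instance_count(current, avg_cores, *, upper=2.0, lower=0.5,
                        increment=1, min_count=3, max_count=7):
    """Return the instance count after one hourly evaluation.

    Hypothetical helper: models the example gateway rules only.
    """
    if avg_cores > upper and current < max_count:
        # Scale out, but never beyond the configured maximum.
        return min(current + increment, max_count)
    if avg_cores < lower and current > min_count:
        # Scale in, but never below the configured minimum.
        return max(current - increment, min_count)
    # Load is between the thresholds, or a limit has been reached.
    return current
```

For example, a gateway with five instances averaging 0.2 cores would drop to four instances, while one already at seven instances stays at seven no matter how high the load is.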
@@ -24,9 +24,9 @@ The rest of this article describes the scaling policies, ways to enable or to di
 Auto scaling policies can be defined for each service in a Service Fabric cluster. Each scaling policy consists of two parts:
 * **Scaling trigger** describes when scaling of the service is performed. Conditions that are defined in the trigger are checked periodically to determine if a service should be scaled or not.
 
-* **Scaling mechanism** describes how scaling is performed when it is triggered. Mechanism is only applied when the conditions from the trigger are met.
+* **Scaling mechanism** describes how scaling is performed when it's triggered. Mechanism is only applied when the conditions from the trigger are met.
 
-All triggers that are currently supported work either with [logical load metrics](service-fabric-cluster-resource-manager-metrics.md), or with physical metrics like CPU or memory usage. Either way, Service Fabric monitors the reported load for the metric, and will evaluate the trigger periodically to determine if scaling is needed.
+All triggers that are currently supported work either with [logical load metrics](service-fabric-cluster-resource-manager-metrics.md), or with physical metrics like CPU or memory usage. Either way, Service Fabric monitors the reported load for the metric, and evaluates the trigger periodically to determine if scaling is needed.
 
 There are two mechanisms that are currently supported for auto scaling. The first one is meant for stateless services or for containers where auto scaling is performed by adding or removing [instances](service-fabric-concepts-replica-lifecycle.md). For both stateful and stateless services, auto scaling can also be performed by adding or removing named [partitions](service-fabric-concepts-partitioning.md) of the service.
 
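Reviewer note: the trigger/mechanism split described in this hunk can be pictured as two small value types combined into a policy. The sketch below is only an illustration in Python and does not mirror the Service Fabric API: the class names, field names, and the `kind` labels are all hypothetical, and the threshold fields anticipate the lower and upper load thresholds that the article defines later.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class ScalingTrigger:
    """When to scale: thresholds on one reported metric (hypothetical type)."""
    metric_name: str        # a logical load metric, or a physical one such as CPU
    lower_threshold: float
    upper_threshold: float

    def direction(self, reported_loads: Sequence[float]) -> int:
        """Return +1 to scale out, -1 to scale in, 0 for no action."""
        if not reported_loads:
            return 0
        average = mean(reported_loads)
        if average > self.upper_threshold:
            return 1
        if average < self.lower_threshold:
            return -1
        return 0


@dataclass(frozen=True)
class ScalingMechanism:
    """How to scale: what is added or removed (hypothetical type)."""
    kind: str               # "instances" or "named-partitions" (made-up labels)
    increment: int


@dataclass(frozen=True)
class ScalingPolicy:
    trigger: ScalingTrigger
    mechanism: ScalingMechanism

    def evaluate(self, reported_loads: Sequence[float]) -> int:
        """Signed change to apply; 0 when the trigger conditions are not met."""
        return self.trigger.direction(reported_loads) * self.mechanism.increment
```

Keeping the two parts separate in the sketch reflects the statement that the mechanism is applied only when the trigger conditions are met: `evaluate` returns a non-zero change only if `direction` does.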
@@ -38,16 +38,16 @@ The first type of trigger is based on the load of instances in a stateless servi
 
 * _Lower load threshold_ is a value that determines when the service is **scaled in**. If the average load of all instances of the partition is lower than this value, then the service is scaled in.
 * _Upper load threshold_ is a value that determines when the service is **scaled out**. If the average load of all instances of the partition is higher than this value, then the service is scaled out.
-* _Scaling interval_ determines how often the trigger is checked. Once the trigger is checked, if scaling is needed the mechanism will be applied. If scaling is not needed, then no action will be taken. In both cases, trigger will not be checked again before scaling interval expires again.
+* _Scaling interval_ determines how often the trigger is checked. Once the trigger is checked, if scaling is needed the mechanism is applied. If scaling isn't needed, then no action is taken. In both cases, trigger isn't checked again before scaling interval expires again.
 
-This trigger can be used only with stateless services (either stateless containers or Service Fabric services). In case when a service has multiple partitions, the trigger is evaluated for each partition separately, and each partition has the specified mechanism applied to it independently. Hence, the scaling behaviors of service partitions could vary based on their load. It is possible that some partitions of the service are scaled out, while some others are scaled in. Some partitions might not be scaled at all at the same time.
+This trigger can be used only with stateless services (either stateless containers or Service Fabric services). In case when a service has multiple partitions, the trigger is evaluated for each partition separately, and each partition has the specified mechanism applied to it independently. Hence, the scaling behaviors of service partitions could vary based on their load. It's possible that some partitions of the service are scaled out, while some others are scaled in. Some partitions might not be scaled at all at the same time.
 
 The only mechanism that can be used with this trigger is PartitionInstanceCountScaleMechanism. There are three factors that determine how this mechanism is applied:
 * _Scale Increment_ determines how many instances are added or removed when mechanism is triggered.
-* _Maximum Instance Count_ defines the upper limit for scaling. If number of instances of the partition reaches this limit, then the service is not scaled out, regardless of the load. It is possible to omit this limit by specifying value of -1, and in that case the service is scaled out as much as possible (the limit is the number of nodes that are available in the cluster).
-* _Minimum Instance Count_ defines the lower limit for scaling. If number of instances of the partition reaches this limit, then service is not scaled in regardless of the load.
+* _Maximum Instance Count_ defines the upper limit for scaling. If number of instances of the partition reaches this limit, then the service isn't scaled out, regardless of the load. It's possible to omit this limit by specifying value of -1, and in that case the service is scaled out as much as possible (the limit is the number of nodes that are available in the cluster).
+* _Minimum Instance Count_ defines the lower limit for scaling. If number of instances of the partition reaches this limit, then service isn't scaled in regardless of the load.
 
-## Setting auto scaling policy for instancebased scaling
+## Setting auto scaling policy for instance-based scaling
 
 ### Using application manifest
 ```xml
@@ -108,9 +108,9 @@ The second trigger is based on the load of all partitions of one service. Metric
 
 * _Lower load threshold_ is a value that determines when the service is **scaled in**. If the average load of all partitions of the service is lower than this value, then the service is scaled in.
 * _Upper load threshold_ is a value that determines when the service is **scaled out**. If the average load of all partitions of the service is higher than this value, then the service is scaled out.
-* _Scaling interval_ determines how often the trigger is checked. Once the trigger is checked, if scaling is needed the mechanism is applied. If scaling is not needed, then no action is taken. In both cases, trigger is not checked again before scaling interval expires again.
+* _Scaling interval_ determines how often the trigger is checked. Once the trigger is checked, if scaling is needed the mechanism is applied. If scaling isn't needed, then no action is taken. In both cases, trigger isn't checked again before scaling interval expires again.
 
-This trigger can be used both with stateful and stateless services. The only mechanism that can be used with this trigger is AddRemoveIncrementalNamedPartitionScalingMechanism. When service is scaled out then a new partition is added, and when service is scaled in one of existing partitions is removed. There are restrictions that are checked when service is created or updated and service creation/update fails if these conditions are not met:
+This trigger can be used both with stateful and stateless services. The only mechanism that can be used with this trigger is AddRemoveIncrementalNamedPartitionScalingMechanism. When service is scaled out then a new partition is added, and when service is scaled in one of existing partitions is removed. There are restrictions that are checked when service is created or updated and service creation/update fails if these conditions aren't met:
 * Named partition scheme must be used for the service.
 * Partition names must be consecutive integer numbers, like "0," "1," ...
 * First partition name must be "0."
@@ -123,8 +123,8 @@ The actual auto scaling operation that is performed respects this naming scheme
 
 Same as with mechanism that uses scaling by adding or removing instances, there are three parameters that determine how this mechanism is applied:
 * _Scale Increment_ determines how many partitions are added or removed when mechanism is triggered.
-* _Maximum Partition Count_ defines the upper limit for scaling. If number of partitions of the service reaches this limit, then the service is not scaled out, regardless of the load. It is possible to omit this limit by specifying value of -1, and in that case the service is scaled out as much as possible (the limit is the actual capacity of the cluster).
-* _Minimum Partition Count_ defines the lower limit for scaling. If number of partitions of the service reaches this limit, then service is not scaled in regardless of the load.
+* _Maximum Partition Count_ defines the upper limit for scaling. If number of partitions of the service reaches this limit, then the service isn't scaled out, regardless of the load. It's possible to omit this limit by specifying value of -1, and in that case the service is scaled out as much as possible (the limit is the actual capacity of the cluster).
+* _Minimum Partition Count_ defines the lower limit for scaling. If number of partitions of the service reaches this limit, then service isn't scaled in regardless of the load.
 
 > [!WARNING]
 > When AddRemoveIncrementalNamedPartitionScalingMechanism is used with stateful services, Service Fabric will add or remove partitions **without notification or warning**. Repartitioning of data will not be performed when scaling mechanism is triggered. In case of scale out operation, new partitions will be empty, and in case of scale in operation, **partition will be deleted together with all the data that it contains**.
-In order to enable the resource monitor service to scale based on actual resources, one could add the feature `ResourceMonitorService`.
+To enable the resource monitor service to scale based on actual resources, you can add the `ResourceMonitorService` feature as follows:
 
 ```json
 "fabricSettings": [
-...
+...
 ],
 "addonFeatures": [
 "ResourceMonitorService"
 ],
 ```
-There are two metrics that represent actual physical resources. One of them is servicefabric:/_CpuCores which represent the actual cpu usage (so 0.5 represents half a core) and the other being servicefabric:/_MemoryInMB which represents the memory usage in MBs.
-ResourceMonitorService is responsible for tracking cpu and memory usage of user services. This service will apply weighted moving average in order to account for potential short-lived spikes. Resource monitoring is supported for both containerized and non-containerized applications on Windows and for containerized ones on Linux. Auto scaling on resources is only enabled for services activated in [exclusive process model](service-fabric-hosting-model.md#exclusive-process-model).
+Service Fabric supports CPU and memory governance using two built-in metrics: `servicefabric:/_CpuCores` for CPU and `servicefabric:/_MemoryInMB` for memory. The Resource Monitor Service is responsible for tracking CPU and memory usage and updating the Cluster Resource Manager with the current resource usage. This service applies a weighted moving average to account for potential short-lived spikes. Resource monitoring is supported for both containerized and noncontainerized applications on Windows and for containerized applications on Linux.
+
+> [!NOTE]
+> CPU and memory consumption monitored in the Resource Monitor Service and updated to the Cluster Resource Manager do not impact any decision-making process outside of auto scaling. If [resource governance](service-fabric-resource-governance.md#resource-governance-metrics) is needed, it can be configured without interfering with auto scaling functionalities, and vice versa.
+
+> [!IMPORTANT]
+> Resource-based auto scaling is supported only for services activated in the [exclusive process model](service-fabric-hosting-model.md#exclusive-process-model).
 
 ## Next steps
 Learn more about [application scalability](service-fabric-concepts-scalability.md).