pages/serverless-containers/reference-content/containers-autoscaling.mdx (7 additions & 10 deletions)
@@ -46,19 +46,16 @@ When the maximum scale is reached, new requests are queued for processing. When
### Autoscaler behavior

- The autoscaler decides to start new instances when:
+ The autoscaler decides to add new instances (scale up) when the defined number of concurrent requests per instance (default: `80`) is reached.

-   - the existing instances are no longer able to handle the load because they are busy responding to other ongoing requests. By default, this happens if an instance is already processing 80 requests (`max_concurrency = 80`).
-   - our system detects an unusual number of requests. In this case, some instances may be started in anticipation to avoid a potential cold start.
+ The same autoscaler decides to remove instances (scale down), down to `1` instance, when no requests have been received for 30 seconds.

- The same autoscaler decides to remove instances when:
-   - no more requests are being processed. If even a single request is being processed (or detected as being processed), the autoscaler cannot remove that instance. The system also prioritizes removing instances with the fewest ongoing requests, or, when very few requests are being sent, it tries to direct them to a single instance so that the others can be shut down and the service can scale down.
-   - an instance has not responded to a request for more than 15 minutes of inactivity. The instance is only shut down after this interval, once again to absorb any potential new peaks and thus avoid a cold start. These 15 minutes of inactivity are not configurable.
+ Scaling down to zero (if min-scale is set to `0`) happens after 15 minutes of inactivity.

<Message type="note">
- Redeploying your resource results in the termination of existing instances and a return to the minimum scale.
+ Redeploying your resource does not entail downtime. Instances are gradually replaced with new ones.
+
+ Old instances remain running to handle traffic, while new instances are brought up and verified before fully replacing the old ones. This method helps maintain application availability and service continuity throughout the update process.
</Message>
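
For illustration only, here is a minimal Python sketch of the concurrency-based scale-up and scale-down behavior described above. All names and the single `desired_instances` helper are assumptions made for clarity, not Scaleway's actual autoscaler implementation.

```python
import math

# Illustrative constants taken from the documented behavior above.
CONCURRENT_REQUESTS_PER_INSTANCE = 80    # default concurrency threshold per instance
SCALE_DOWN_IDLE_SECONDS = 30             # scale down towards 1 instance
SCALE_TO_ZERO_IDLE_SECONDS = 15 * 60     # scale to zero if min_scale == 0


def desired_instances(concurrent_requests: int, idle_seconds: float,
                      current: int, min_scale: int, max_scale: int) -> int:
    """Return a target instance count for the observed load (sketch only)."""
    if concurrent_requests > 0:
        # Scale up so that no instance handles more than the concurrency threshold.
        needed = math.ceil(concurrent_requests / CONCURRENT_REQUESTS_PER_INSTANCE)
        return max(min_scale, min(max_scale, needed))
    if min_scale == 0 and idle_seconds >= SCALE_TO_ZERO_IDLE_SECONDS:
        return 0                         # scale to zero after 15 minutes of inactivity
    if idle_seconds >= SCALE_DOWN_IDLE_SECONDS:
        return max(min_scale, 1)         # scale down to 1 after 30 seconds without requests
    return current                       # otherwise, keep the current count


# Example: 200 concurrent requests with the default threshold of 80 -> 3 instances.
print(desired_instances(200, idle_seconds=0, current=1, min_scale=0, max_scale=5))
```
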
## CPU and RAM percentage
@@ -81,4 +78,4 @@ This parameter sets the maximum number of instances of your resource. You should
The autoscaler decides to start new instances when the existing instances' CPU or RAM usage exceeds the threshold you defined for a certain amount of time.

- The same autoscaler decides to remove existing instances when the CPU or RAM usage of certain instances is reduced, and the remaining instances' usage does not exceed the threshold.
+ The same autoscaler decides to remove existing instances when the CPU or RAM usage of certain instances is reduced, and the remaining instances' usage does not exceed the threshold.
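
As a rough illustration of the CPU/RAM policy above, the sketch below adds an instance when usage stays above the threshold and removes one when usage falls back under it. The evaluation window and the one-instance step size are assumptions, not documented values.

```python
# Sketch of threshold-based scaling on CPU or RAM usage percentage.
# The sustained-load window and the +/- 1 instance step are assumptions.
def adjust_for_usage(usage_percent: float, threshold_percent: float,
                     seconds_above_threshold: float, window_seconds: float,
                     current: int, min_scale: int, max_scale: int) -> int:
    if usage_percent > threshold_percent and seconds_above_threshold >= window_seconds:
        return min(max_scale, current + 1)   # sustained high usage: add an instance
    if usage_percent < threshold_percent and current > min_scale:
        return current - 1                   # usage back under the threshold: remove one
    return current


# Example: 85% CPU usage for 60 s against an 80% threshold -> scale from 2 to 3 instances.
print(adjust_for_usage(85, 80, 60, 30, current=2, min_scale=1, max_scale=5))
```
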
pages/serverless-functions/concepts.mdx (1 addition & 6 deletions)
@@ -17,11 +17,6 @@ categories:
Autoscaling refers to the ability of Serverless Functions to automatically adjust the number of instances without manual intervention.
Scaling mechanisms ensure that resources are provisioned dynamically to handle incoming requests efficiently while minimizing idle capacity and cost.

- Autoscaling parameters are [min-scale](/serverless-functions/concepts/#min-scale) and [max-scale](/serverless-functions/concepts/#max-scale). Available scaling policies are:
- * **Concurrent requests:** requests incoming to the resource at the same time. The default value is suitable for most use cases.
- * **CPU usage:** to scale based on CPU percentage, suitable for CPU-intensive workloads.
- * **RAM usage:** to scale based on RAM percentage, suitable for memory-intensive workloads.

## Build step
Before deploying Serverless Functions, they have to be built. This step occurs during deployment.
@@ -215,4 +210,4 @@ Triggers can take many forms, such as HTTP requests, messages from a queue or a
## vCPU-s

- Unit used to measure the resource consumption of a container. It reflects the amount of vCPU used over time.
+ Unit used to measure the resource consumption of a container. It reflects the amount of vCPU used over time.
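
As a quick worked example of this unit, assuming vCPU-s is simply the allocated vCPU multiplied by the execution time in seconds:

```python
# Assumption for illustration: a resource allocated 0.5 vCPU that runs for 120 seconds.
allocated_vcpu = 0.5
duration_seconds = 120
vcpu_seconds = allocated_vcpu * duration_seconds
print(vcpu_seconds)  # 60.0 vCPU-s
```
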
pages/serverless-functions/reference-content/functions-autoscaling.mdx (7 additions & 9 deletions)
@@ -38,16 +38,14 @@ When the maximum scale is reached, new requests are queued for processing. When
### Autoscaler behavior

- The autoscaler decides to start new instances when:
+ The autoscaler decides to add new instances (scale up) for each concurrent request. For example, 5 concurrent requests will generate 5 instances of your Serverless Function. This concurrency setting can be customized on Serverless Containers, but not on Serverless Functions.

-   - the existing instances are no longer able to handle the load because they are busy responding to other ongoing requests. By default, this happens if an instance is already processing 80 requests (`max_concurrency = 80`).
-   - our system detects an unusual number of requests. In this case, some instances may be started in anticipation to avoid a potential cold start.
+ The same autoscaler decides to remove instances (scale down), down to `1` instance, when no requests have been received for 30 seconds.

- The same autoscaler decides to remove instances when:
-   - no more requests are being processed. If even a single request is being processed (or detected as being processed), the autoscaler cannot remove that instance. The system also prioritizes removing instances with the fewest ongoing requests, or, when very few requests are being sent, it tries to direct them to a single instance so that the others can be shut down and the service can scale down.
-   - an instance has not responded to a request for more than 15 minutes of inactivity. The instance is only shut down after this interval, once again to absorb any potential new peaks and thus avoid a cold start. These 15 minutes of inactivity are not configurable.
+ Scaling down to zero (if min-scale is set to `0`) happens after 15 minutes of inactivity.

<Message type="note">
- Redeploying your resource results in the termination of existing instances and a return to the minimum scale, which you can observe when redeploying.
- </Message>
+ Redeploying your resource does not entail downtime. Instances are gradually replaced with new ones.
+
+ Old instances remain running to handle traffic, while new instances are brought up and verified before fully replacing the old ones. This method helps maintain application availability and service continuity throughout the update process.
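
To make the per-request scaling rule above concrete, here is a minimal sketch. The function name is hypothetical; the floor at min-scale and the cap at max-scale follow the parameters described on this page.

```python
# Sketch: Serverless Functions start one instance per concurrent request
# (a fixed concurrency of 1), bounded by min-scale and max-scale.
def desired_function_instances(concurrent_requests: int,
                               min_scale: int, max_scale: int) -> int:
    return max(min_scale, min(max_scale, concurrent_requests))


# Example from the text above: 5 concurrent requests -> 5 instances.
print(desired_function_instances(5, min_scale=0, max_scale=20))  # 5
```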