
Commit 9042a70

docs: stabilization window (#5360)
* docs: stabilization window
* Minor wording change

---------

Co-authored-by: Sherlock Xu <65327072+Sherlock113@users.noreply.github.com>
1 parent 23cf939 commit 9042a70

1 file changed (+5, -13 lines)
docs/source/scale-with-bentocloud/scaling/autoscaling.rst

Lines changed: 5 additions & 13 deletions
@@ -93,19 +93,11 @@ It's worth noting that when the external queue is enabled, ``max_concurrency`` w
 Autoscaling policies
 --------------------
 
-You can customize scaling behavior to match your Service's needs with scaling-up and scaling-down policies.
+You can customize scaling behavior to match your Service's needs with the stabilization window.
 
-Allowed scaling-up policies (``scale_up_behavior``):
+The stabilization window defines a time period during which the autoscaler temporarily holds off on scaling the number of replicas up or down. This helps prevent rapid or unnecessary scaling in response to short-lived spikes or drops in traffic.
 
-- ``fast`` (default): There is no stabilization window, so the autoscaler can increase the number of replicas immediately if necessary. It can increase the number of replicas by 100% or by 4 replicas, whichever is higher, every 15 seconds.
-- ``stable``: The autoscaler can increase the number of replicas, but it will stabilize the number of replicas for 600 seconds (10 minutes) before deciding to scale up further. It can increase the number of replicas by 100% every 15 seconds.
-- ``disabled``: Scaling-up is turned off.
-
-Allowed scaling-down policies (``scale_down_behavior``):
-
-- ``fast``: There is no stabilization window, so the autoscaler can reduce the number of replicas immediately if necessary. It can decrease the number of replicas by 100% or by 4 replicas, whichever is higher, every 15 seconds.
-- ``stable`` (default): The autoscaler can reduce the number of replicas, but it will stabilize the number of replicas for 600 seconds (10 minutes) before deciding to scale down further. It can decrease the number of replicas by 100% every 15 seconds.
-- ``disabled``: Scaling-down is turned off.
+You can set the stabilization window to any value between 0 and 3600 seconds.
 
 To set autoscaling policies, you need to configure the above fields in a separate YAML or JSON file. For example:
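
Reviewer note: the new wording describes the stabilization window only in prose, so a small simulation may help readers see the effect. The sketch below is a toy model, not BentoCloud's implementation. It assumes semantics similar to the Kubernetes HorizontalPodAutoscaler, where a scale-up uses the lowest and a scale-down the highest recommendation seen inside the respective window. The class name `StabilizedScaler` and its methods are hypothetical.

```python
from collections import deque


class StabilizedScaler:
    """Toy autoscaler that filters raw recommendations through stabilization windows.

    Assumption: modeled on Kubernetes HPA behavior, not on BentoCloud internals.
    """

    def __init__(self, replicas, up_window, down_window):
        self.replicas = replicas
        self.up_window = up_window      # seconds to wait before scaling up
        self.down_window = down_window  # seconds to wait before scaling down
        self._history = deque()         # (timestamp, recommended replicas)

    def observe(self, now, recommended):
        """Record a recommendation at time `now` and return the replica count."""
        self._history.append((now, recommended))

        # Forget samples that are older than the longer of the two windows.
        horizon = now - max(self.up_window, self.down_window)
        while self._history and self._history[0][0] < horizon:
            self._history.popleft()

        # Scale up only to a level that every recent sample supports.
        up_floor = min(r for t, r in self._history if t >= now - self.up_window)
        # Scale down only to a level that no recent sample exceeds.
        down_ceiling = max(r for t, r in self._history if t >= now - self.down_window)

        if up_floor > self.replicas:
            self.replicas = up_floor
        elif down_ceiling < self.replicas:
            self.replicas = down_ceiling
        return self.replicas
```

With windows of 180 and 600 seconds and one recommendation per minute, a jump from 1 to 3 recommended replicas only takes effect once every sample from the last 180 seconds agrees on it. With both windows set to 0, each recommendation is applied immediately.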

@@ -118,8 +110,8 @@ To set autoscaling policies, you need to configure the above fields in a separat
 max_replicas: 2
 min_replicas: 1
 policy:
-  scale_down_behavior: "disabled | stable | fast" # Choose the behavior
-  scale_up_behavior: "disabled | stable | fast" # Choose the behavior
+  scale_up_stabilization_window: 180
+  scale_down_stabilization_window: 600
 
 You can then deploy your project by referencing this file.
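
Reviewer note: since the docs now state a 0 to 3600 second range, a quick pre-deployment check of the `policy` mapping could catch out-of-range values early. The sketch below is a hypothetical helper, not part of BentoML or BentoCloud. It assumes the bounds are inclusive, that values are whole seconds, and that an omitted field falls back to a platform default. Only the two field names come from the diff.

```python
import json

MIN_WINDOW = 0     # seconds, lower bound stated in the docs
MAX_WINDOW = 3600  # seconds, upper bound stated in the docs

# Field names taken from the example configuration in this commit.
WINDOW_FIELDS = (
    "scale_up_stabilization_window",
    "scale_down_stabilization_window",
)


def check_policy(policy):
    """Return a list of problems found in a parsed `policy` mapping."""
    problems = []
    for field in WINDOW_FIELDS:
        if field not in policy:
            continue  # assumption: omitted fields use a platform default
        value = policy[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{field}: expected whole seconds, got {value!r}")
        elif not MIN_WINDOW <= value <= MAX_WINDOW:
            problems.append(
                f"{field}: {value} is outside {MIN_WINDOW}-{MAX_WINDOW} seconds"
            )
    return problems


# Example: the values from the diff, expressed as JSON.
policy = json.loads(
    '{"scale_up_stabilization_window": 180, "scale_down_stabilization_window": 600}'
)
assert check_policy(policy) == []
```

The helper returns every problem instead of raising on the first one, so a single run can report both fields.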
