docs(srv): add content reference on autoscaling MTA-5516

SamyOubouaziz · SamyOubouaziz · commit 84adcadb9e11 · 2025-02-18T11:45:09.000+01:00
diff --git a/pages/serverless-functions/reference-content/functions-autoscaling.mdx b/pages/serverless-functions/reference-content/functions-autoscaling.mdx
@@ -0,0 +1,50 @@
+---
+meta:
+  title: Functions autoscaling reference
+  description: Understand how autoscaling works for Serverless Functions in Scaleway.
+content:
+  h1: Functions autoscaling reference
+  paragraph: Understand how autoscaling works for Serverless Functions in Scaleway.
+tags: serverless functions autoscaling scale up down min max
+dates:
+  validation: 2025-02-18
+  posted: 2025-02-18
+categories:
+  - serverless
+  - functions
+---
+
+## Benefits of autoscaling
+
+## Minimum and maximum scales
+
+### Minimum scale (min-scale)
+
+The this parameter sets the lowest value your resource is allowed to scale down to:
+
+- If you set a value of `0`, all instances of your resource will be terminated after 15 minutes of inactivity.
+
+- If you set a value of `1` or more, the corresponding number of instances of your resource will remain available at all time.
+
+Customizing the minimum scale for Serverless can help ensure that an instance remains pre-allocated and ready to handle requests, reducing delays associated with [cold starts](/serverless-functions/concepts/#cold-start). However, this setting also impacts the costs of your Serverless Function.
+
+### Maximum scale (max-scale)
+
+This parameter sets the maximum number of instances of your resource. You should adjust it based on your function’s traffic spikes, keeping in mind that you may wish to limit the max scale to manage costs effectively.
+
+### Autoscaler behavior
+
+The autoscaler decides to start new instances when:
+
+  - the existing instances are no longer able to handle the load because they are busy responding to other ongoing requests. By default, this happens if an instance is already processing 80 requests (max_concurrency = 80).
+  
+  - our system detects an unusual number of requests. In this case, some instances may be started in anticipation to avoid a potential cold start.
+
+The same autoscaler decides to remove instances when:
+
+  - no more requests are being processed. If even a single request is being processed (or detected as being processed), then the autoscaler will not be able to remove this instance. The system also prioritizes instances with the fewest ongoing requests, or if very few requests are being sent, it tries to select a particular instance to shut down the others, and therefore scale down.
+  - an instance has not responded to a request for more than 15 minutes of inactivity. The instance is only shut down after this interval, once again to absorb any potential new peaks and thus avoid the cold start. These 15 minutes of inactivity are not configurable.
+
+<Message type="note">
+Redeploying your resource results in the termination of existing instances and a return to the min scale, which you observe when redeploying.
+</Message>