docs(srv): add content reference on autoscaling MTA-5516 #4438
Merged
Changes from all commits (5 commits):

- 84adcad  docs(srv): add content reference on autoscaling MTA-5516 (SamyOubouaziz)
- 0f47574  docs(srv): update (SamyOubouaziz)
- 48db852  docs(srv): update (SamyOubouaziz)
- 5f64f9f  docs(srv): update (SamyOubouaziz)
- 7af524c  docs(srv): update (SamyOubouaziz)
pages/serverless-containers/reference-content/containers-autoscaling.mdx (54 additions, 0 deletions)
---
meta:
  title: Containers autoscaling reference
  description: Understand how autoscaling works for Serverless Containers in Scaleway.
content:
  h1: Containers autoscaling reference
  paragraph: Understand how autoscaling works for Serverless Containers in Scaleway.
tags: serverless containers autoscaling scale up down min max
dates:
  validation: 2025-02-18
  posted: 2025-02-18
categories:
  - serverless
  - containers
---
## Benefits of autoscaling

[Autoscaling](/serverless-containers/concepts/#autoscaling) offers several benefits, including improved responsiveness and cost efficiency. By automatically adjusting the number of instances of your resource based on current demand, you ensure that your applications can handle varying loads without manual intervention. This not only enhances the user experience by maintaining performance levels, but also reduces costs by only using resources when necessary. Additionally, autoscaling promotes optimal resource utilization, minimizing the risk of performance degradation during peak times.

## Minimum and maximum scales

### Minimum scale (min-scale)
This parameter sets the lowest value your resource is allowed to scale down to:

- If you set a value of `0`, all instances of your resource will be terminated after 15 minutes of inactivity.
- If you set a value of `1` or more, the corresponding number of instances of your resource will remain available at all times.

Customizing the minimum scale for Serverless can help ensure that an instance remains pre-allocated and ready to handle requests, reducing delays associated with [cold starts](/serverless-containers/concepts/#cold-start). However, this setting also impacts the costs of your Serverless Container.
### Maximum scale (max-scale)

This parameter sets the maximum number of instances of your resource. You should adjust it based on your resource's traffic spikes, keeping in mind that you may wish to limit the max scale to manage costs effectively.

When the maximum scale is reached, new requests are queued for processing. When the queue is full, the resource returns `503` errors for requests received beyond this point.
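As a purely illustrative, client-side sketch (not part of the Scaleway tooling, and using a placeholder URL you would replace with your container's endpoint), the following Python snippet retries a request with exponential backoff when it receives a `503`, giving the autoscaler time to start additional instances:

```python
import time
import urllib.error
import urllib.request

# Placeholder URL: replace with your own container endpoint.
ENDPOINT = "https://example.com/"


def call_with_backoff(url: str, max_retries: int = 5) -> bytes:
    """Call the endpoint, retrying with exponential backoff on 503 responses."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            with urllib.request.urlopen(url, timeout=30) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Only retry on 503 (max scale reached and queue full).
            if err.code != 503 or attempt == max_retries - 1:
                raise
            time.sleep(delay)
            delay *= 2  # exponential backoff
    raise RuntimeError("unreachable")


if __name__ == "__main__":
    print(call_with_backoff(ENDPOINT)[:100])
```

Raising the maximum scale reduces how often callers hit this path, at the cost of potentially higher resource usage.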
### Autoscaler behavior

The autoscaler decides to start new instances when:

- the existing instances are no longer able to handle the load because they are busy responding to other ongoing requests. By default, this happens if an instance is already processing 80 requests (`max_concurrency = 80`); a sketch of this calculation is shown below.
- our system detects an unusual number of requests. In this case, some instances may be started in anticipation to avoid a potential cold start.
The same autoscaler decides to remove instances when:

- no more requests are being processed. If even a single request is being processed (or detected as being processed), the autoscaler cannot remove that instance. When scaling down, the system prioritizes instances with the fewest ongoing requests; if very few requests are being received, it routes them to a single instance so that the other instances can be shut down.
- an instance has not received any request for more than 15 minutes. The instance is only shut down after this period of inactivity, so that it can absorb any new peaks in traffic and avoid cold starts. This 15-minute inactivity period is not configurable.
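To make the scale-up rule concrete, here is a minimal Python sketch of the documented defaults (an illustration only, not the actual autoscaler implementation): the target number of instances is the number of in-flight requests divided by `max_concurrency` (80 by default), rounded up and clamped between the minimum and maximum scale:

```python
import math


def desired_instances(in_flight_requests: int,
                      min_scale: int,
                      max_scale: int,
                      max_concurrency: int = 80) -> int:
    """Illustrative scaling target: enough instances so that none handles
    more than max_concurrency requests, clamped to [min_scale, max_scale]."""
    needed = math.ceil(in_flight_requests / max_concurrency)
    return max(min_scale, min(max_scale, needed))


# 200 concurrent requests -> ceil(200 / 80) = 3 instances.
print(desired_instances(200, min_scale=0, max_scale=5))  # 3

# The clamp keeps at least one warm instance when min_scale is 1.
print(desired_instances(0, min_scale=1, max_scale=5))    # 1
```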
<Message type="note">
  Redeploying your resource terminates its existing instances and resets it to the minimum scale.
</Message>
pages/serverless-functions/reference-content/functions-autoscaling.mdx (53 additions, 0 deletions)
---
meta:
  title: Functions autoscaling reference
  description: Understand how autoscaling works for Serverless Functions in Scaleway.
content:
  h1: Functions autoscaling reference
  paragraph: Understand how autoscaling works for Serverless Functions in Scaleway.
tags: serverless functions autoscaling scale up down min max
dates:
  validation: 2025-02-18
  posted: 2025-02-18
categories:
  - serverless
  - functions
---
## Benefits of autoscaling

[Autoscaling](/serverless-functions/concepts/#autoscaling) offers several benefits, including improved responsiveness and cost efficiency. By automatically adjusting the number of instances of your resource based on current demand, you ensure that your applications can handle varying loads without manual intervention. This not only enhances the user experience by maintaining performance levels, but also reduces costs by only using resources when necessary. Additionally, autoscaling promotes optimal resource utilization, minimizing the risk of performance degradation during peak times.

## Minimum and maximum scales

### Minimum scale (min-scale)
This parameter sets the lowest value your resource is allowed to scale down to:

- If you set a value of `0`, all instances of your resource will be terminated after 15 minutes of inactivity.
- If you set a value of `1` or more, the corresponding number of instances of your resource will remain available at all times.

Customizing the minimum scale for Serverless can help ensure that an instance remains pre-allocated and ready to handle requests, reducing delays associated with [cold starts](/serverless-functions/concepts/#cold-start). However, this setting also impacts the costs of your Serverless Function.
### Maximum scale (max-scale)

This parameter sets the maximum number of instances of your resource. You should adjust it based on your resource's traffic spikes, keeping in mind that you may wish to limit the max scale to manage costs effectively.

When the maximum scale is reached, new requests are queued for processing. When the queue is full, the resource returns `503` errors for requests received beyond this point.
### Autoscaler behavior

The autoscaler decides to start new instances when:

- the existing instances are no longer able to handle the load because they are busy responding to other ongoing requests. By default, this happens if an instance is already processing 80 requests (`max_concurrency = 80`).
- our system detects an unusual number of requests. In this case, some instances may be started in anticipation to avoid a potential cold start.
The same autoscaler decides to remove instances when:

- no more requests are being processed. If even a single request is being processed (or detected as being processed), the autoscaler cannot remove that instance. When scaling down, the system prioritizes instances with the fewest ongoing requests; if very few requests are being received, it routes them to a single instance so that the other instances can be shut down.
- an instance has not received any request for more than 15 minutes. The instance is only shut down after this period of inactivity, so that it can absorb any new peaks in traffic and avoid cold starts. This 15-minute inactivity period is not configurable (see the sketch below).
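As a rough illustration of these two conditions (a sketch only, not Scaleway's actual implementation), an instance only becomes a candidate for removal once it has no in-flight requests and has been idle for the full 15-minute window:

```python
from dataclasses import dataclass

IDLE_TIMEOUT_SECONDS = 15 * 60  # fixed 15-minute inactivity window (not configurable)


@dataclass
class Instance:
    in_flight_requests: int
    seconds_since_last_request: float


def can_scale_down(instance: Instance) -> bool:
    """An instance can only be removed when it is fully idle and the
    inactivity window has elapsed."""
    return (instance.in_flight_requests == 0
            and instance.seconds_since_last_request >= IDLE_TIMEOUT_SECONDS)


print(can_scale_down(Instance(0, 16 * 60)))  # True: idle and past the window
print(can_scale_down(Instance(1, 20 * 60)))  # False: still processing a request
```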
<Message type="note">
  Redeploying your resource terminates its existing instances and resets it to the minimum scale.
</Message>