Add section about autoscaling #200
Merged
bf4c93d
Add section about autoscaling
nullfunc dba40ac
text updates
nullfunc 8a6281d
change field to proper extension
nullfunc f035d83
add concepts page.
nullfunc 437e9d9
review updates
nullfunc 73c1c97
update support chart to be consistent with other charts
nullfunc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
--- | ||
title: Scaling | ||
description: Defang can help you handle irregular service loads. | ||
sidebar_position: 375 | ||
--- | ||
|
||
# Scaling | ||
|
||
Scaling is the process of adjusting the number of instances (or replicas) of a service to meet the current demand. Services that receive requests—such as APIs, workers, or background jobs—can be scaled up or down to optimize performance, availability, and cost. | ||
|
||
Scaling is a core concept in distributed systems and cloud-native applications. It ensures your system can handle varying workloads without degrading user experience or over-provisioning resources. | ||
|
||
## Why Scale? | ||
|
||
Scaling enables services to respond effectively under different conditions: | ||
|
||
- **High Traffic**: When demand spikes, scaling up ensures your service can process more requests in parallel. | ||
- **Cost Optimization**: Scaling down during periods of low demand helps reduce unnecessary resource usage and cloud costs. | ||
- **Fault Tolerance**: Multiple instances of a service provide redundancy in case of instance failure. | ||
- **Throughput & Latency**: Additional instances can reduce response times and increase the number of operations your service can perform per second. | ||
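
As a rough illustration of the throughput and fault-tolerance points above, the number of instances needed can be estimated from expected demand and per-instance capacity. This is a minimal sketch: the function name `replicas_needed`, its parameters, and the default floor of two instances are assumptions made for illustration and are not part of Defang.

```python
import math


def replicas_needed(
    requests_per_second: float,
    per_instance_rps: float,
    min_replicas: int = 2,  # assumed floor so one instance failure leaves a spare
) -> int:
    """Estimate how many instances are needed to serve the given load."""
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    # Round up: a fractional instance still requires a whole one.
    by_load = math.ceil(requests_per_second / per_instance_rps)
    # Never drop below the redundancy floor, even when traffic is low.
    return max(min_replicas, by_load)


if __name__ == "__main__":
    print(replicas_needed(450, 100))  # high traffic
    print(replicas_needed(50, 100))   # low traffic, floor applies
```

In practice the per-instance capacity would come from load testing rather than a fixed constant.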
|
||
## Types of Scaling | ||
|
||
There are two main ways to scale a service: | ||
|
||
- **Horizontal Scaling**: Adds or removes instances of a service. This is the most common approach for stateless services. | ||
- **Vertical Scaling**: Increases or decreases the resources (CPU, memory) available to a single instance. | ||
|
||
In most modern deployments, horizontal scaling is preferred because it aligns well with cloud-native principles and is easier to automate and distribute. | ||
|
||
## Auto-Scaling | ||
|
||
**Auto-scaling** refers to automatically adjusting the number of service instances based on defined policies or metrics. | ||
|
||
Instead of manually adding more instances when traffic increases, an auto-scaling system watches key indicators (like CPU usage) and takes action in real time. | ||
|
||
### How It Works | ||
|
||
Auto-scaling systems typically rely on: | ||
|
||
- **Metrics Collection**: Real-time monitoring of system metrics. | ||
- **Scaling Policies**: Rules that define when to scale up or down. For example: | ||
  - If average CPU > 85% for 5 minutes → scale up by 2 instances. | ||
- **Cooldown Periods**: Delays between scaling events to prevent rapid, repeated changes (flapping). | ||
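
The three pieces above can be sketched together in a few lines. The example below is a hypothetical, simplified scale-up loop and not how Defang or any cloud provider implements auto-scaling. The class names, the 300-second cooldown, and the 10-instance cap are assumptions; only the "CPU > 85% for 5 minutes → scale up by 2" rule comes from the text. Scale-down is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingPolicy:
    # Values mirror the example rule above; cooldown and cap are assumed.
    cpu_threshold: float = 85.0    # percent
    sustained_seconds: int = 300   # breach must last 5 minutes
    step: int = 2                  # instances added per scaling event
    cooldown_seconds: int = 300    # minimum gap between scaling events
    max_instances: int = 10


@dataclass
class AutoScaler:
    policy: ScalingPolicy
    instances: int = 1
    breach_started_at: Optional[float] = None
    last_scale_at: float = float("-inf")

    def observe(self, now: float, avg_cpu: float) -> int:
        """Feed one metric sample; return the desired instance count."""
        p = self.policy
        if avg_cpu <= p.cpu_threshold:
            # Metric recovered: the sustained-breach timer starts over.
            self.breach_started_at = None
            return self.instances
        if self.breach_started_at is None:
            self.breach_started_at = now
        sustained = now - self.breach_started_at >= p.sustained_seconds
        cooled_down = now - self.last_scale_at >= p.cooldown_seconds
        if sustained and cooled_down:
            self.instances = min(self.instances + p.step, p.max_instances)
            self.last_scale_at = now
            # Require a fresh sustained breach before scaling again.
            self.breach_started_at = now
        return self.instances


if __name__ == "__main__":
    scaler = AutoScaler(ScalingPolicy())
    for t in range(0, 301, 60):  # one sample per minute at 90% CPU
        print(t, scaler.observe(t, 90.0))
```

Resetting the breach timer after each scaling event, together with the cooldown, is what keeps the scaler from adding instances on every sample while the new ones are still starting up.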
|
||
### Supported Platforms | ||
|
||
| Platform | Auto-Scaling Support | | ||
|----------------|----------------------| | ||
| AWS | ✅ Supported | | ||
| GCP | ✅ Supported | | ||
| DigitalOcean | ❌ Not yet supported | | ||
|
||
> 💡 We're actively working on support for additional platforms. Let us know which ones you’d like to see next! | ||
|
||
### Benefits of Auto-Scaling | ||
|
||
- **Elasticity**: Automatically adapts to changing workloads. | ||
- **Resilience**: Helps maintain performance during traffic surges or partial outages. | ||
- **Efficiency**: Reduces the need for manual intervention or over-provisioning. | ||
|
||
### Considerations | ||
|
||
- Ensure services are **stateless** or use **externalized state** (e.g., databases, caches) so that any instance can serve any request. ([Twelve-Factor App](https://12factor.net/processes)) | ||
- Test services under load to identify scaling bottlenecks. | ||
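
To see why externalized state matters, the sketch below compares two hypothetical request handlers behind a round-robin load balancer. Everything here is illustrative: the class names are invented, and a plain `dict` stands in for a shared database or cache (a real store would also need atomic increments).

```python
from typing import Dict, List


class InMemoryCounter:
    """Stateful handler: each instance keeps its own private count."""

    def __init__(self) -> None:
        self.count = 0

    def handle(self) -> int:
        self.count += 1
        return self.count


class ExternalStoreCounter:
    """Stateless handler: the count lives in a store shared by all instances."""

    def __init__(self, store: Dict[str, int]) -> None:
        self.store = store  # stand-in for a database or cache client

    def handle(self) -> int:
        self.store["hits"] = self.store.get("hits", 0) + 1
        return self.store["hits"]


def round_robin(instances: List, requests: int) -> List[int]:
    """Send requests to instances in turn and collect each response."""
    return [instances[i % len(instances)].handle() for i in range(requests)]


if __name__ == "__main__":
    # Two stateful instances disagree about the total.
    print(round_robin([InMemoryCounter(), InMemoryCounter()], 4))
    # Two stateless instances share one store and stay consistent.
    shared: Dict[str, int] = {}
    print(round_robin([ExternalStoreCounter(shared), ExternalStoreCounter(shared)], 4))
```

With in-memory state, adding a second instance silently changes the answers clients receive; with the shared store, the instance count can change without affecting results.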
|