| 1 | +--- |
| 2 | +title: Scaling |
| 3 | +description: Defang can help your services handle irregular loads. |
| 4 | +sidebar_position: 375 |
| 5 | +--- |
| 6 | + |
| 7 | +# Scaling |
| 8 | + |
| 9 | +Scaling is the process of adjusting the number of instances (or replicas) of a service to meet current demand. Services that handle work, such as APIs, workers, or background jobs, can be scaled up or down to balance performance, availability, and cost. |
| 10 | + |
| 11 | +Scaling is a core concept in distributed systems and cloud-native applications. It ensures your system can handle varying workloads without degrading user experience or over-provisioning resources. |
| 12 | + |
| 13 | +## Why Scale? |
| 14 | + |
| 15 | +Scaling enables services to respond effectively under different conditions: |
| 16 | + |
| 17 | +- **High Traffic**: When demand spikes, scaling up ensures your service can process more requests in parallel. |
| 18 | +- **Cost Optimization**: Scaling down during periods of low demand helps reduce unnecessary resource usage and cloud costs. |
| 19 | +- **Fault Tolerance**: Multiple instances of a service provide redundancy in case of instance failure. |
| 20 | +- **Throughput & Latency**: Additional instances can reduce response times and increase the number of operations your service can perform per second. |
| 21 | + |
| 22 | +## Types of Scaling |
| 23 | + |
| 24 | +There are two main ways to scale a service: |
| 25 | + |
| 26 | +- **Horizontal Scaling**: Adds or removes instances of a service. This is the most common approach for stateless services. |
| 27 | +- **Vertical Scaling**: Increases or decreases the resources (CPU, memory) available to a single instance. |
| 28 | + |
| 29 | +In most modern deployments, horizontal scaling is preferred because it aligns well with cloud-native principles and is easier to automate and distribute. |
| 30 | + |
| 31 | +## Auto-Scaling |
| 32 | + |
| 33 | +**Auto-scaling** refers to automatically adjusting the number of service instances based on defined policies or metrics. |
| 34 | + |
| 35 | +Instead of manually adding more instances when traffic increases, an auto-scaling system watches key indicators (like CPU usage) and takes action in real time. |
| 36 | + |
| 37 | +### How It Works |
| 38 | + |
| 39 | +Auto-scaling systems typically rely on: |
| 40 | + |
| 41 | +- **Metrics Collection**: Real-time monitoring of system metrics. |
| 42 | +- **Scaling Policies**: Rules that define when to scale up or down. For example: |
| 43 | + - If average CPU > 85% for 5 minutes → scale up by 2 instances. |
| 44 | +- **Cooldown Periods**: Delays between scaling events to prevent rapid, repeated changes (flapping). |
| 45 | + |
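The three pieces above (metrics, a policy, and a cooldown) can be sketched in a few lines of Python. This is only an illustration of the general idea, not how Defang or any cloud provider implements auto-scaling. The names `ScalingPolicy` and `desired_instances` are hypothetical, the cooldown length and instance cap are assumed values, and "CPU > 85% for 5 minutes" is read here as "the average over the last 5 minutes exceeds 85%".

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy mirroring the example rule in the list above."""
    cpu_threshold: float = 85.0   # percent
    window_seconds: int = 300     # look at the last 5 minutes of samples
    step: int = 2                 # instances to add per scaling event
    cooldown_seconds: int = 300   # assumed value; the text does not give one
    max_instances: int = 10       # assumed upper bound


def desired_instances(policy, samples, current, now, last_scaled_at=None):
    """Return the instance count the policy asks for.

    samples: list of (timestamp_seconds, cpu_percent) tuples, oldest first.
    last_scaled_at: timestamp of the previous scaling event, or None.
    """
    # Cooldown: make no change while the previous event is still recent.
    if last_scaled_at is not None and now - last_scaled_at < policy.cooldown_seconds:
        return current

    window_start = now - policy.window_seconds
    # Require metrics that cover the whole window before acting.
    if not samples or samples[0][0] > window_start:
        return current

    recent = [cpu for ts, cpu in samples if ts >= window_start]
    average = sum(recent) / len(recent)

    # Policy: scale up by a fixed step, never past the cap.
    if average > policy.cpu_threshold:
        return min(current + policy.step, policy.max_instances)
    return current
```

Calling `desired_instances(ScalingPolicy(), samples, current=3, now=300)` with five minutes of samples at 90% CPU asks for 5 instances. The same call with `last_scaled_at=200` keeps the count at 3, because the cooldown has not yet elapsed.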
| 46 | +### Supported Platforms |
| 47 | + |
| 48 | +| Platform | Auto-Scaling Support | |
| 49 | +|----------------|----------------------| |
| 50 | +| AWS | ✅ Supported | |
| 51 | +| GCP | ✅ Supported | |
| 52 | +| DigitalOcean | ❌ Not yet supported | |
| 53 | + |
| 54 | +> 💡 We're actively working on support for additional platforms. Let us know which ones you’d like to see next! |
| 55 | + |
| 56 | +### Benefits of Auto-Scaling |
| 57 | + |
| 58 | +- **Elasticity**: Automatically adapts to changing workloads. |
| 59 | +- **Resilience**: Helps maintain performance during traffic surges or partial outages. |
| 60 | +- **Efficiency**: Reduces the need for manual intervention or over-provisioning. |
| 61 | + |
| 62 | +### Considerations |
| 63 | + |
| 64 | +- Ensure services are **stateless** or use **externalized state** (e.g., databases, caches) so that instances can be added or removed without losing data. (See [The Twelve-Factor App: Processes](https://12factor.net/processes).) |
| 65 | +- Test services under load to identify scaling bottlenecks. |
| 66 | + |