Commit f035d83

add concepts page.
1 parent 8a6281d commit f035d83

2 files changed

+69
-3
lines changed

docs/concepts/scaling.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
---
2+
title: Scaling
3+
description: Defang can help your services handle irregular loads.
4+
sidebar_position: 375
5+
---
6+
7+
# Scaling
8+
9+
Scaling is the process of adjusting the number of instances (or replicas) of a service to meet current demand. Services that handle incoming work, such as APIs, workers, or background jobs, can be scaled up or down to optimize performance, availability, and cost.
10+
11+
Scaling is a core concept in distributed systems and cloud-native applications. It ensures your system can handle varying workloads without degrading user experience or over-provisioning resources.
12+
13+
## Why Scale?
14+
15+
Scaling enables services to respond effectively under different conditions:
16+
17+
- **High Traffic**: When demand spikes, scaling up ensures your service can process more requests in parallel.
18+
- **Cost Optimization**: Scaling down during periods of low demand helps reduce unnecessary resource usage and cloud costs.
19+
- **Fault Tolerance**: Multiple instances of a service provide redundancy in case of instance failure.
20+
- **Throughput & Latency**: Additional instances can reduce response times and increase the number of operations your service can perform per second.
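
As a rough illustration of how throughput and fault tolerance translate into an instance count, here is a minimal sketch. The function name, the per-instance capacity figure, and the single spare instance are assumptions made for the example; they are not Defang settings.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float, spare: int = 1) -> int:
    """Estimate replicas: enough to serve peak load, plus spares for redundancy.

    All inputs are hypothetical numbers you would measure for your own service.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    # Round up: a fractional instance still needs a whole replica.
    return math.ceil(peak_rps / rps_per_instance) + spare


# 450 requests/s at 100 requests/s per instance needs 5 replicas, plus 1 spare.
print(instances_needed(450, 100))  # 6
```

Real capacity depends on measured behavior under load, so treat the result as a starting point rather than a guarantee.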
21+
22+
## Types of Scaling
23+
24+
There are two main ways to scale a service:
25+
26+
- **Horizontal Scaling**: Adds or removes instances of a service. This is the most common approach for stateless services.
27+
- **Vertical Scaling**: Increases or decreases the resources (CPU, memory) available to a single instance.
28+
29+
In most modern deployments, horizontal scaling is preferred because it aligns well with cloud-native principles and is easier to automate and distribute.
30+
31+
## Auto-Scaling
32+
33+
**Auto-scaling** refers to automatically adjusting the number of service instances based on defined policies or metrics.
34+
35+
Instead of manually adding more instances when traffic increases, an auto-scaling system watches key indicators (like CPU usage) and takes action in real time.
36+
37+
### How It Works
38+
39+
Auto-scaling systems typically rely on:
40+
41+
- **Metrics Collection**: Real-time monitoring of system metrics.
42+
- **Scaling Policies**: Rules that define when to scale up or down. For example:
43+
  - If average CPU > 85% for 5 minutes → scale up by 2 instances.
44+
- **Cooldown Periods**: Delays between scaling events to prevent rapid, repeated changes (flapping).
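
The three pieces above can be sketched in a few lines of Python. This is an illustrative model only, not how Defang or any cloud provider implements auto-scaling. The class name, the cooldown length, and the instance cap are assumptions; the 85% threshold, 5-minute window, and step of 2 come from the example policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CpuScaleUpPolicy:
    """Hypothetical scale-up rule: sustained high CPU adds instances."""
    threshold: float = 85.0       # percent CPU (from the example policy)
    sustain_seconds: int = 300    # CPU must stay high for 5 minutes
    step: int = 2                 # instances added per scaling event
    cooldown_seconds: int = 300   # assumed delay between scaling events
    max_instances: int = 10       # assumed upper bound
    _above_since: Optional[float] = None
    _last_scaled: Optional[float] = None

    def decide(self, now: float, avg_cpu: float, current: int) -> int:
        """Return the desired instance count for one metric sample."""
        if avg_cpu <= self.threshold:
            self._above_since = None  # load dropped, so reset the window
            return current
        if self._above_since is None:
            self._above_since = now
        sustained = now - self._above_since >= self.sustain_seconds
        cooling = (
            self._last_scaled is not None
            and now - self._last_scaled < self.cooldown_seconds
        )
        if sustained and not cooling:
            self._last_scaled = now
            self._above_since = None  # require a fresh sustained period
            return min(current + self.step, self.max_instances)
        return current


policy = CpuScaleUpPolicy()
print(policy.decide(now=0, avg_cpu=90, current=2))    # 2: just crossed the threshold
print(policy.decide(now=300, avg_cpu=90, current=2))  # 4: high for 5 minutes
```

Resetting the window after each scaling event, together with the cooldown, is what keeps a single long spike from triggering a burst of back-to-back changes.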
45+
46+
### Supported Platforms
47+
48+
| Platform | Auto-Scaling Support |
49+
|----------------|----------------------|
50+
| AWS | ✅ Supported |
51+
| GCP | ✅ Supported |
52+
| DigitalOcean | ❌ Not yet supported |
53+
54+
> 💡 We're actively working on support for additional platforms. Let us know which ones you’d like to see next!
55+
56+
### Benefits of Auto-Scaling
57+
58+
- **Elasticity**: Automatically adapts to changing workloads.
59+
- **Resilience**: Helps maintain performance during traffic surges or partial outages.
60+
- **Efficiency**: Reduces the need for manual intervention or over-provisioning.
61+
62+
### Considerations
63+
64+
- Ensure services are **stateless** or use **externalized state** (e.g., databases, caches) for smooth scaling. ([12 Factor App](https://12factor.net/processes))
65+
- Test services under load to identify scaling bottlenecks.
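
To see why statelessness matters, consider the following sketch. Both classes are hypothetical, and a plain `dict` stands in for an external store such as a database or cache; a real store would also need an atomic increment to be safe under concurrent requests.

```python
class StatefulCounter:
    """Keeps the count in process memory, so every replica has its own copy."""

    def __init__(self):
        self.count = 0

    def handle(self):
        self.count += 1
        return self.count


class StatelessCounter:
    """Keeps the count in a shared store supplied from outside the process."""

    def __init__(self, store):
        self.store = store

    def handle(self):
        self.store["count"] = self.store.get("count", 0) + 1
        return self.store["count"]


# Two replicas behind a load balancer, receiving requests in turn.
a, b = StatefulCounter(), StatefulCounter()
print([a.handle(), b.handle(), a.handle()])  # [1, 1, 2]: replicas disagree

shared = {}  # stand-in for an external store (assumption for this example)
c, d = StatelessCounter(shared), StatelessCounter(shared)
print([c.handle(), d.handle(), c.handle()])  # [1, 2, 3]: consistent
```

With in-process state, the answer a client sees depends on which replica served the request; with externalized state, any replica can serve any request.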
66+

docs/tutorials/scaling-your-services.mdx

Lines changed: 3 additions & 3 deletions
@@ -83,13 +83,13 @@ Once deployed, your services' CPU usage is monitored for how much load it is han
8383

8484
Requirements
8585

86-
- BYOC, your own cloud platform account (AWS or GCP).
87-
- You must be on the Pro or higher plan to use autoscaling.
86+
- BYOC, your own cloud platform account. ([Scaling](/docs/concepts/scaling))
87+
- You must be on the Pro or higher plan to use autoscaling. ([Defang plans](https://defang.io/#pricing))
8888
- Only staging and production deployment modes are supported. ([Deployment modes](/docs/concepts/deployment-modes))
8989
- The service must be stateless or able to run in multiple instances.
9090
- Only CPU metrics are used for scaling decisions.
9191

9292
Best Practices
9393

94-
- Design your services to be horizontally scalable.
94+
- Design your services to be horizontally scalable. ([12 Factor App](https://12factor.net/processes))
9595
- Use shared or external storage if your service writes data (e.g., Postgres or Redis [managed services](/docs/concepts/managed-storage)).
