4 changes: 2 additions & 2 deletions docs/concepts/scaling.md
@@ -9,7 +9,6 @@ Scaling is the process of adjusting the number of instances (or replicas) of a s

Scaling is a core concept in distributed systems and cloud-native applications. It ensures your system can handle varying workloads without degrading user experience or over-provisioning resources.


## Why Scale?

Scaling enables services to respond effectively under different conditions:
@@ -32,7 +31,7 @@ In most modern deployments, horizontal scaling is preferred because it aligns we

**Auto-scaling** refers to automatically adjusting the number of service instances based on defined policies or metrics.

-Instead of manually adding more instances when traffic increases, an auto-scaling system watches key indicators (like CPU usage) and takes action in real time.
+Instead of manually adding more instances when traffic increases, an auto-scaling system watches key indicators (like CPU usage) and takes action in real time. Defang autoscaling will create up to 10 new instances of your service as load demands. The maximum 10 replicas is not user configurable.

### Example

@@ -63,6 +62,7 @@ Auto-scaling systems typically rely on:
- **Scaling Policies**: Rules that define when to scale up or down. For example:
  - If average CPU > 85% for 5 minutes → scale up by 2 instances.
- **Cooldown Periods**: Delays between scaling events to prevent rapid, repeated changes (flapping).
- **Max Replicas**: There is a maximum of 10 replicas per service.
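
The policy, cooldown, and replica cap listed above can be sketched as a small decision function. This is only an illustration under stated assumptions, not how Defang implements autoscaling: the names `ScalerState` and `step` are hypothetical, the 5-minute cooldown is an assumed value (the text gives no number), and scale-down is left out because the text does not describe a scale-down rule.

```python
from dataclasses import dataclass
from typing import Optional

MAX_REPLICAS = 10          # cap stated in the text
CPU_THRESHOLD = 85.0       # percent, from the example policy
SUSTAIN_SECONDS = 5 * 60   # "for 5 minutes"
SCALE_UP_STEP = 2          # "scale up by 2 instances"
COOLDOWN_SECONDS = 5 * 60  # assumed value; not given in the text


@dataclass
class ScalerState:
    """Hypothetical bookkeeping for one service."""
    replicas: int = 1
    breach_started_at: Optional[float] = None  # when CPU first exceeded the threshold
    last_scaled_at: Optional[float] = None     # time of the last scaling event


def step(state: ScalerState, avg_cpu: float, now: float) -> int:
    """Feed one average-CPU sample (timestamp in seconds); return the replica count."""
    if avg_cpu <= CPU_THRESHOLD:
        # Load dropped below the threshold: forget any breach in progress.
        state.breach_started_at = None
        return state.replicas

    if state.breach_started_at is None:
        state.breach_started_at = now

    sustained = now - state.breach_started_at >= SUSTAIN_SECONDS
    cooling_down = (
        state.last_scaled_at is not None
        and now - state.last_scaled_at < COOLDOWN_SECONDS
    )

    if sustained and not cooling_down and state.replicas < MAX_REPLICAS:
        # Scale up by the policy step, never past the cap.
        state.replicas = min(state.replicas + SCALE_UP_STEP, MAX_REPLICAS)
        state.last_scaled_at = now
        # Require a fresh sustained breach before scaling again.
        state.breach_started_at = now

    return state.replicas
```

Resetting `breach_started_at` after each scale-up, together with the cooldown check, is what keeps a single long CPU spike from triggering a burst of back-to-back scaling events (flapping).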

### Supported Providers
