From bf4c93dac755cebf45a3ffb83208826e2216a0b6 Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Mon, 7 Apr 2025 21:53:15 -0700
Subject: [PATCH 1/6] Add section about autoscaling

---
 docs/tutorials/scaling-your-services.mdx | 40 ++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/docs/tutorials/scaling-your-services.mdx b/docs/tutorials/scaling-your-services.mdx
index 6186ba8e3..63657a003 100644
--- a/docs/tutorials/scaling-your-services.mdx
+++ b/docs/tutorials/scaling-your-services.mdx
@@ -27,14 +27,14 @@ services:
     deploy:
       resources:
         reservations:
-          cpus: '2'
-          memory: '512M'
+          cpus: "2"
+          memory: "512M"
 ```
 
 The minimum resources which can be reserved:
 
 | Resource | Minimum |
-|----------|---------|
+| -------- | ------- |
 | CPUs     | 0.5     |
 | Memory   | 512M    |
 
@@ -57,3 +57,37 @@ services:
     deploy:
       replicas: 3
 ```
+
+## Autoscaling Your Services
+
+Autoscaling allows your services to automatically adjust the number of replicas based on CPU usage — helping you scale up during traffic spikes and scale down during quieter periods.
+
+> **Note:** Autoscaling is only available to **Pro** tier users.
+
+### Enabling Autoscaling
+
+To enable autoscaling for a service, add the `x-defang-autoscaling: true` field under the service definition in your `compose.yaml` file.
+
+Example:
+
+```yaml
+services:
+  web:
+    image: myorg/web:latest
+    ports:
+      - 80:80
+    x-defang-autoscaling: true
+```
+
+Once deployed, your services' CPU usage is monitored for how much load it is handling, sustained high loads will result in more replicas being started.
+
+Requirements
+
+- You must be on the Pro plan to use autoscaling.
+- The service must be stateless or able to run in multiple instances.
+- Only CPU metrics are used for scaling decisions.
+
+Best Practices
+
+- Design your services to be horizontally scalable.
+- Use shared or external storage if your service writes data.
From dba40acbd5c0ed9d9c56c585faab4d63fd4b7dad Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Tue, 8 Apr 2025 12:18:11 -0700
Subject: [PATCH 2/6] text updates

---
 docs/tutorials/scaling-your-services.mdx | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/tutorials/scaling-your-services.mdx b/docs/tutorials/scaling-your-services.mdx
index 63657a003..8c05f35bb 100644
--- a/docs/tutorials/scaling-your-services.mdx
+++ b/docs/tutorials/scaling-your-services.mdx
@@ -62,7 +62,7 @@ services:
 
 Autoscaling allows your services to automatically adjust the number of replicas based on CPU usage — helping you scale up during traffic spikes and scale down during quieter periods.
 
-> **Note:** Autoscaling is only available to **Pro** tier users.
+> **Note:** Autoscaling is only available to **Pro** tier or higher users.
 
 ### Enabling Autoscaling
 
@@ -83,11 +83,13 @@ Once deployed, your services' CPU usage is monitored for how much load it is han
 
 Requirements
 
-- You must be on the Pro plan to use autoscaling.
+- BYOC, your own cloud platform account (AWS or GCP).
+- You must be on the Pro or higher plan to use autoscaling.
+- Only staging and production deployment modes supported. ([Deployment modes](/docs/concepts/deployment-modes))
 - The service must be stateless or able to run in multiple instances.
 - Only CPU metrics are used for scaling decisions.
 
 Best Practices
 
 - Design your services to be horizontally scalable.
-- Use shared or external storage if your service writes data.
+- Use shared or external storage if your service writes data. (e.g. Postgres or Redis [managed services](/docs/concepts/managed-storage) )

From 8a6281ddcb1b80d997765db36f1c5548ece3cee4 Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Tue, 8 Apr 2025 12:21:13 -0700
Subject: [PATCH 3/6] change field to proper extension

---
 docs/tutorials/scaling-your-services.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/tutorials/scaling-your-services.mdx b/docs/tutorials/scaling-your-services.mdx
index 8c05f35bb..c4fdda04a 100644
--- a/docs/tutorials/scaling-your-services.mdx
+++ b/docs/tutorials/scaling-your-services.mdx
@@ -66,7 +66,7 @@ Autoscaling allows your services to automatically adjust the number of replicas
 
 ### Enabling Autoscaling
 
-To enable autoscaling for a service, add the `x-defang-autoscaling: true` field under the service definition in your `compose.yaml` file.
+To enable autoscaling for a service, add the `x-defang-autoscaling: true` extension under the service definition in your `compose.yaml` file.
 
 Example:
 

From f035d83c9ebf527062eb5f465913352aab4430b8 Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Tue, 8 Apr 2025 13:45:20 -0700
Subject: [PATCH 4/6] add concepts page.

---
 docs/concepts/scaling.md                 | 66 ++++++++++++++++++++++++
 docs/tutorials/scaling-your-services.mdx |  6 +--
 2 files changed, 69 insertions(+), 3 deletions(-)
 create mode 100644 docs/concepts/scaling.md

diff --git a/docs/concepts/scaling.md b/docs/concepts/scaling.md
new file mode 100644
index 000000000..3148de9e3
--- /dev/null
+++ b/docs/concepts/scaling.md
@@ -0,0 +1,66 @@
+---
+title: Scaling
+description: Defang can help you handle service irregular loads.
+sidebar_position: 375
+---
+
+# Scaling
+
+Scaling is the process of adjusting the number of instances (or replicas) of a service to meet the current demand. Services that receive requests—such as APIs, workers, or background jobs—can be scaled up or down to optimize performance, availability, and cost.
+
+Scaling is a core concept in distributed systems and cloud-native applications. It ensures your system can handle varying workloads without degrading user experience or over-provisioning resources.
+
+## Why Scale?
+
+Scaling enables services to respond effectively under different conditions:
+
+- **High Traffic**: When demand spikes, scaling up ensures your service can process more requests in parallel.
+- **Cost Optimization**: Scaling down during periods of low demand helps reduce unnecessary resource usage and cloud costs.
+- **Fault Tolerance**: Multiple instances of a service provide redundancy in case of instance failure.
+- **Throughput & Latency**: Additional instances can reduce response times and increase the number of operations your service can perform per second.
+
+## Types of Scaling
+
+There are two main ways to scale a service:
+
+- **Horizontal Scaling**: Adds or removes instances of a service. This is the most common approach for stateless services.
+- **Vertical Scaling**: Increases or decreases the resources (CPU, memory) available to a single instance.
+
+In most modern deployments, horizontal scaling is preferred because it aligns well with cloud-native principles and is easier to automate and distribute.
+
+## Auto-Scaling
+
+**Auto-scaling** refers to automatically adjusting the number of service instances based on defined policies or metrics.
+
+Instead of manually adding more instances when traffic increases, an auto-scaling system watches key indicators (like CPU usage) and takes action in real time.
+
+### How It Works
+
+Auto-scaling systems typically rely on:
+
+- **Metrics Collection**: Real-time monitoring of system metrics.
+- **Scaling Policies**: Rules that define when to scale up or down. For example:
+  - If average CPU > 85% for 5 minutes → scale up by 2 instances.
+- **Cooldown Periods**: Delays between scaling events to prevent rapid, repeated changes (flapping).
+
+### Supported Platforms
+
+| Platform       | Auto-Scaling Support |
+|----------------|----------------------|
+| AWS            | ✅ Supported         |
+| GCP            | ✅ Supported         |
+| DigitalOcean   | ❌ Not yet supported |
+
+> 💡 We're actively working on support for additional platforms. Let us know which ones you’d like to see next!
+
+### Benefits of Auto-Scaling
+
+- **Elasticity**: Automatically adapts to changing workloads.
+- **Resilience**: Helps maintain performance during traffic surges or partial outages.
+- **Efficiency**: Reduces the need for manual intervention or over-provisioning.
+
+### Considerations
+
+- Ensure services are **stateless** or use **externalized state** (e.g., databases, caches) for smooth scaling. ([12 Factor App](https://12factor.net/processes))
+- Test services under load to identify scaling bottlenecks.
+ 
\ No newline at end of file
diff --git a/docs/tutorials/scaling-your-services.mdx b/docs/tutorials/scaling-your-services.mdx
index c4fdda04a..e68794c68 100644
--- a/docs/tutorials/scaling-your-services.mdx
+++ b/docs/tutorials/scaling-your-services.mdx
@@ -83,13 +83,13 @@ Once deployed, your services' CPU usage is monitored for how much load it is han
 
 Requirements
 
-- BYOC, your own cloud platform account (AWS or GCP).
-- You must be on the Pro or higher plan to use autoscaling.
+- BYOC, your own cloud platform account. ([Scaling](/docs/concepts/scaling))
+- You must be on the Pro or higher plan to use autoscaling. ([Defang plans](https://defang.io/#pricing))
 - Only staging and production deployment modes supported. ([Deployment modes](/docs/concepts/deployment-modes))
 - The service must be stateless or able to run in multiple instances.
 - Only CPU metrics are used for scaling decisions.
 
 Best Practices
 
-- Design your services to be horizontally scalable.
+- Design your services to be horizontally scalable. ([12 Factor App](https://12factor.net/processes))
 - Use shared or external storage if your service writes data. (e.g. Postgres or Redis [managed services](/docs/concepts/managed-storage) )

From 437e9d9cc89439f3bd9b0e87c695fc55cf5ae35e Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Tue, 8 Apr 2025 15:00:02 -0700
Subject: [PATCH 5/6] review updates

---
 docs/concepts/scaling.md                 | 3 +--
 docs/tutorials/scaling-your-services.mdx | 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/docs/concepts/scaling.md b/docs/concepts/scaling.md
index 3148de9e3..c68000195 100644
--- a/docs/concepts/scaling.md
+++ b/docs/concepts/scaling.md
@@ -50,8 +50,7 @@ Auto-scaling systems typically rely on:
 | AWS            | ✅ Supported         |
 | GCP            | ✅ Supported         |
 | DigitalOcean   | ❌ Not yet supported |
-
-> 💡 We're actively working on support for additional platforms. Let us know which ones you’d like to see next!
+| Playground     | ❌ Not supported     |
 
 ### Benefits of Auto-Scaling
 
diff --git a/docs/tutorials/scaling-your-services.mdx b/docs/tutorials/scaling-your-services.mdx
index e68794c68..5dd7035bb 100644
--- a/docs/tutorials/scaling-your-services.mdx
+++ b/docs/tutorials/scaling-your-services.mdx
@@ -83,10 +83,10 @@ Once deployed, your services' CPU usage is monitored for how much load it is han
 
 Requirements
 
-- BYOC, your own cloud platform account. ([Scaling](/docs/concepts/scaling))
+- BYOC, your own cloud platform account.
 - You must be on the Pro or higher plan to use autoscaling. ([Defang plans](https://defang.io/#pricing))
 - Only staging and production deployment modes supported. ([Deployment modes](/docs/concepts/deployment-modes))
-- The service must be stateless or able to run in multiple instances.
+- The service must be stateless or able to run in multiple instances. ([Scaling](/docs/concepts/scaling))
 - Only CPU metrics are used for scaling decisions.
 
 Best Practices

From 73c1c973021bc4623c9c30c03592f52a66ee5c2c Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Tue, 8 Apr 2025 15:05:55 -0700
Subject: [PATCH 6/6] update support chart to be consistent with other charts

---
 docs/concepts/scaling.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/concepts/scaling.md b/docs/concepts/scaling.md
index c68000195..a94938e19 100644
--- a/docs/concepts/scaling.md
+++ b/docs/concepts/scaling.md
@@ -46,11 +46,11 @@ Auto-scaling systems typically rely on:
 ### Supported Platforms
 
 | Platform       | Auto-Scaling Support |
-|----------------|----------------------|
-| AWS            | ✅ Supported         |
-| GCP            | ✅ Supported         |
-| DigitalOcean   | ❌ Not yet supported |
-| Playground     | ❌ Not supported     |
+|----------------|:----------------------:|
+| Playground     |           ❌           |
+| AWS            |           ✅           |
+| DigitalOcean   |           ❌           |
+| GCP            |           ✅           |
 
 ### Benefits of Auto-Scaling
 
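
Illustrative note on the "How It Works" list added in `docs/concepts/scaling.md`: the example rule ("If average CPU > 85% for 5 minutes → scale up by 2 instances") together with a cooldown period can be sketched roughly as below. This is only a toy model under stated assumptions, not how Defang or the underlying cloud providers implement autoscaling. The class names, the one-sample-per-minute cadence, the 5-minute cooldown, and the `max_replicas` cap are all hypothetical choices made for the illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ScalingPolicy:
    """Hypothetical policy mirroring the example rule; all values are illustrative."""
    cpu_threshold: float = 85.0  # average CPU percent that triggers a scale-up
    window: int = 5              # number of one-minute samples to average
    step: int = 2                # replicas added per scale-up event
    cooldown: int = 5            # samples to wait after a scaling event (assumed)
    max_replicas: int = 10       # upper bound on replicas (assumed)


@dataclass
class Autoscaler:
    policy: ScalingPolicy
    replicas: int = 1
    _samples: deque = field(default_factory=deque)
    _cooldown_left: int = 0

    def observe(self, cpu_percent: float) -> int:
        """Record one per-minute CPU sample and return the desired replica count."""
        self._samples.append(cpu_percent)
        if len(self._samples) > self.policy.window:
            self._samples.popleft()

        # During the cooldown, keep collecting samples but do not scale again.
        if self._cooldown_left > 0:
            self._cooldown_left -= 1
            return self.replicas

        window_full = len(self._samples) == self.policy.window
        if window_full and sum(self._samples) / len(self._samples) > self.policy.cpu_threshold:
            self.replicas = min(self.replicas + self.policy.step, self.policy.max_replicas)
            self._cooldown_left = self.policy.cooldown
            self._samples.clear()  # start a fresh window after scaling
        return self.replicas


scaler = Autoscaler(ScalingPolicy())
print([scaler.observe(90.0) for _ in range(5)])  # [1, 1, 1, 1, 3]
```

The cooldown counter is what keeps the sketch from adding replicas on every sample once the threshold is crossed, which is the "flapping" the concepts page warns about. A fuller model would also need a scale-down rule, which the patch text does not specify.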