Use request count for autoscaling policy by tomrf1 · Pull Request #1303 · guardian/support-dotcom-components

tomrf1 · 2025-03-03T09:48:34Z

Currently we scale up when average CPU hits 40%. This is bad because:
a) CPU clearly isn't the only relevant factor when considering error spikes
b) it soon scales back down because CPU drops, even if traffic is still high

Instead we can use the RequestCountPerTarget metric (documented here) to scale up when traffic increases. This is more suitable for e.g. the morning traffic spike.

I've set it to 20,000 requests per target (where a target is an EC2 instance).
This means that at e.g. 80k requests per minute the ASG will have 4 instances.
I've reset the minimum count to 3, so during the night (UTC) when it's quiet it'll go back to 3 instances.
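
The arithmetic above can be sketched as a small function. This is a simplified model rather than the actual autoscaling algorithm: it assumes the 20,000 figure is a per-minute target (as the 80k example implies), ignores any maximum capacity, cooldowns, and the smoothing that target tracking applies. The names `desiredInstances`, `TARGET_REQUESTS_PER_INSTANCE` and `MIN_INSTANCES` are hypothetical and exist only for illustration.

```typescript
// Hypothetical constants, taken from the figures in this description.
const TARGET_REQUESTS_PER_INSTANCE = 20_000; // assumed to be requests per minute per target
const MIN_INSTANCES = 3;

// Simplified model: run enough instances to keep each one at or below the
// target request count, but never fewer than the configured minimum.
function desiredInstances(
  requestsPerMinute: number,
  targetPerInstance: number = TARGET_REQUESTS_PER_INSTANCE,
  minInstances: number = MIN_INSTANCES,
): number {
  if (requestsPerMinute < 0 || targetPerInstance <= 0) {
    throw new RangeError("requestsPerMinute must be >= 0 and targetPerInstance > 0");
  }
  const needed = Math.ceil(requestsPerMinute / targetPerInstance);
  return Math.max(minInstances, needed);
}

console.log(desiredInstances(80_000)); // 4, matching the example above
console.log(desiredInstances(30_000)); // 3, the minimum applies when traffic is quiet
console.log(desiredInstances(100_001)); // 6, rounds up once the target is exceeded
```

Unlike the CPU-based policy, the result here depends only on traffic, so the instance count stays up for as long as the request rate does.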

[Chart: recent requests per minute]

Use request count for autoscaling policy

8a70a27

tomrf1 requested a review from a team as a code owner March 3, 2025 09:48

LAKSHMIRPILLAI approved these changes Mar 3, 2025

shtukas approved these changes Mar 3, 2025

tomrf1 merged commit 347a5c8 into main Mar 3, 2025
4 checks passed

tomrf1 deleted the tf-asg-policy branch March 3, 2025 10:01
