-
Notifications
You must be signed in to change notification settings - Fork 693
chore: new major release branch #4936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
…ingle invocation (#4603) Currently we restrict the `scale-up` Lambda to only handle a single event at a time. In very busy environments this can prove to be a bottleneck: there are calls to GitHub and AWS APIs that happen each time, and they can end up taking long enough that we can't process job queued events faster than they arrive. In our environment we are also using a pool, and typically we have responded to the alerts generated by this (SQS queue length growing) by expanding the size of the pool. This helps because we will more frequently find that we don't need to scale up, which allows the lambdas to exit a bit earlier, so we can get through the queue faster. But it makes the environment much less responsive to changes in usage patterns. At its core, this Lambda's task is to construct an EC2 `CreateFleet` call to create instances, after working out how many are needed. This is a job that can be batched. We can take any number of events, calculate the diff between our current state and the number of jobs we have, capping at the maximum, and then issue a single call. The thing to be careful about is how to handle partial failures, if EC2 creates some of the instances we wanted but not all of them. Lambda has a configurable function response type which can be set to `ReportBatchItemFailures`. In this mode, we return a list of failed messages from our handler and those are retried. We can make use of this to give back as many events as we failed to process. Now we're potentially processing multiple events in a single Lambda, one thing we should optimise for is not recreating GitHub API clients. We need one client for the app itself, which we use to find out installation IDs, and then one client for each installation which is relevant to the batch of events we are processing. This is done by creating a new client the first time we see an event for a given installation. We also remove the same `batch_size = 1` constraint from the `job-retry` Lambda. This Lambda is used to retry events that previously failed. However, instead of reporting failures to be retried, here we maintain the pre-existing fault-tolerant behaviour where errors are logged but explicitly do not cause message retries, avoiding infinite loops from persistent GitHub API issues or malformed events. Tests are added for all of this. Tests in a private repo (sorry) look good. This was running ephemeral runners with no pool, SSM high throughput enabled, the job queued check \_dis_abled, batch size of 200, wait time of 10 seconds. The workflow runs are each a matrix with 250 jobs.  --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Niek Palm <[email protected]> Co-authored-by: Niek Palm <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Upgrade to Lambda runtime Node24.x - Upgrade minimal AWS Terraform provider - Upgrade all lambda runtimes by default to 24x - Breaking change! ## Dependency and environment upgrades: * Updated all references to Node.js from version 22 to version 24 in GitHub Actions workflows (`.github/workflows/lambda.yml`, `.github/workflows/release.yml`) and Dockerfiles (`.ci/Dockerfile`, `.devcontainer/Dockerfile`). [[1]](diffhunk://#diff-b0732b88b9e5a3df48561602571a10179d2b28cbb21ba8032025932bc9606426L23-R23) [[2]](diffhunk://#diff-87db21a973eed4fef5f32b267aa60fcee5cbdf03c67fafdc2a9b553bb0b15f34L33-R33) [[3]](diffhunk://#diff-fd0c8401badda82156f9e7bd621fa3a0e586d8128e4a80af17c7cbff70bee11eL2-R2) [[4]](diffhunk://#diff-13bd9d7a30bf46656bc81f1ad5b908a627f9247be3f7d76df862b0578b534fc6L1-R1) * Upgraded the base Docker images for both the CI and devcontainer environments to use newer Node.js image digests. [[1]](diffhunk://#diff-fd0c8401badda82156f9e7bd621fa3a0e586d8128e4a80af17c7cbff70bee11eL2-R2) [[2]](diffhunk://#diff-13bd9d7a30bf46656bc81f1ad5b908a627f9247be3f7d76df862b0578b534fc6L1-R1) Terraform provider updates: * Increased the minimum required version for the AWS Terraform provider to `>= 6.21` in all example `versions.tf` files. [[1]](diffhunk://#diff-61160e0ae9e70de675b98889710708e1a9edcccd5194a4a580aa234caa5103aeL5-R5) [[2]](diffhunk://#diff-debb96ea7aef889f9deb4de51c61ca44a7e23832098e1c9d8b09fa54b1a96582L5-R5) * Updated the `.terraform.lock.hcl` files in all examples to lock the AWS provider at version `6.22.1`, the local provider at `2.6.1`, and the null provider at `3.2.4` where applicable, along with updated hash values and constraints. [[1]](diffhunk://#diff-101dfb4a445c2ab4a46c53cbc92db3a43f3423ba1e8ee7b8a11b393ebe835539L5-R43) [[2]](diffhunk://#diff-2a8b3082767f86cfdb88b00e963894a8cdd2ebcf705c8d757d46b55a98452a6cL5-R43) --------- Co-authored-by: github-aws-runners-pr|bot <github-aws-runners-pr[bot]@users.noreply.github.com> Co-authored-by: Guilherme Caulada <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Is there any other commits you plan to add to this PR besides those 2? Or it's ready for review?
Potentially will merge #4929 But that one can also go later this week. And I prefer we get rid of deprecated variables. But was not able to get the PR out. |
This pull request removes support for several deprecated AMI-related variables across all modules, fully migrating the configuration to the consolidated `ami` object. This change simplifies how AMI settings are managed, improves consistency, and reduces confusion for users. All references to the old variables (`ami_filter`, `ami_owners`, `ami_id_ssm_parameter_name`, `ami_kms_key_arn`) have been removed from module inputs, outputs, templates, documentation, and internal logic. **Migration to consolidated AMI configuration:** * Removed all deprecated AMI variables (`ami_filter`, `ami_owners`, `ami_id_ssm_parameter_name`, `ami_kms_key_arn`) from module variable definitions, outputs, and internal usage in `variables.tf`, `outputs.tf`, and related files. [[1]](diffhunk://#diff-05b5a57c136b6ff596500bcbfdcff145ef6cddea2a0e86d184d9daa9a65a288eL396-L424) [[2]](diffhunk://#diff-23e8f44c0f21971190244acdb8a35eaa21af7578ed5f1b97bef83f1a566d979cL138-L165) [[3]](diffhunk://#diff-de6c47c2496bd028a84d55ab12d8a4f90174ebfb6544b8b5c7b07a7ee4f27ec7L78-L90) [[4]](diffhunk://#diff-2daea3e8167ce5d859f6f1bee08138dbe216003262325490e8b90477277c104aL70-L89) [[5]](diffhunk://#diff-52d0673ff466b6445542e17038ea73a1cf41b8112f49ee57da4cebf8f0cb99c5L73-R73) [[6]](diffhunk://#diff-52d0673ff466b6445542e17038ea73a1cf41b8112f49ee57da4cebf8f0cb99c5L186-L187) [[7]](diffhunk://#diff-951f6bd1e32c3d27dd90e2dfb1f5232a704ef01fd925f3ee4323d6adc2dcdf5aL15-L20) [[8]](diffhunk://#diff-3937b99021390c0192952207dd2e26a409e0c03446478fb09ac3cd360bb60ee5L9-L14) * Updated example and documentation files to use the new `ami` object structure, replacing previous usage of the deprecated variables. [[1]](diffhunk://#diff-ef2038e9f8d807236d2acebe3c3a191039f8021cc4a0188f4778de908f0d453bL36-R40) [[2]](diffhunk://#diff-ef2038e9f8d807236d2acebe3c3a191039f8021cc4a0188f4778de908f0d453bL52-R57) [[3]](diffhunk://#diff-0a0d2ecd774e69a1397a913b6230f45692b49c0b17ccb103d318f6ab078353e2L48-R51) [[4]](diffhunk://#diff-b2b9df08c45240d599f6260d246bad6e67129932174131db209341d8464247a8L18-R19) [[5]](diffhunk://#diff-61032a0bb5f9d7ae65ba5155b2e58e12901f39bb7068f16b419d94c6f7a5b922L86-R89) * Refactored module runner logic to only use the new `ami` object, removing all fallback and compatibility code for the old variables. [[1]](diffhunk://#diff-dc46acf24afd63ef8c556b77c126ccc6e578bc87e3aa09a931f33d9bf2532fbbL182-L185) [[2]](diffhunk://#diff-57f00cdf57feef92ffcb35d6618e62e14ad652d5d520f331068332d5f3cade51L30-L32) [[3]](diffhunk://#diff-e9624b388e62ca51cf1fe073eb0919588e8c36a1143ecdb49580996a89f13bebL40-R51) * Updated internal references to AMI SSM parameter names and related policies to use the new configuration, ensuring all resource and environment variable logic is aligned with the consolidated approach. [[1]](diffhunk://#diff-bc00c0efa92f360635d026350da2fb775718514e2b1ae718281400e661b7469bL13-R13) [[2]](diffhunk://#diff-bc00c0efa92f360635d026350da2fb775718514e2b1ae718281400e661b7469bL28-R28) [[3]](diffhunk://#diff-5921c9e3315946068538b290966e7e4e51b6e49d04c2466b0bdd4b298629b29dL56-R57) [[4]](diffhunk://#diff-a598ba79d09e4770d55ed09e6b1d51e68c4a54562a3e3cbb46619d625a609d23L28-R28) [[5]](diffhunk://#diff-a598ba79d09e4770d55ed09e6b1d51e68c4a54562a3e3cbb46619d625a609d23L151-R151) With these updates, all AMI configuration is now handled through the unified `ami` object, making runner setup more straightforward and future-proof. ## Tested - [x] default example - [x] multi runner example --------- Co-authored-by: github-aws-runners-pr|bot <github-aws-runners-pr[bot]@users.noreply.github.com> Co-authored-by: Copilot <[email protected]>
## Description This PR adds support for configuring EC2 placement groups for GitHub Actions runners in the multi-runner module. It plumbs a new placement option from the runner configuration through to the underlying EC2 runner module. ## Details Updated modules/multi-runner/runners.tf to pass placement = each.value.runner_config.placement into the runners module. This allows specifying AWS placement groups for EC2 runners, enabling tighter control over instance placement. The change is backwards compatible: if placement is unset in runner_config, behavior remains unchanged. ## Motivation / Future work Placement groups are a prerequisite for supporting macOS runners, which require a host_id. A follow-up PR will add explicit macOS support leveraging this new placement wiring. --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Niek Palm <[email protected]>
guicaulada
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR description is missing:
- Add support to use placement group in launch template
But given each PR was individually reviewed, I'm confident this can be approved.
Uh oh!
There was an error while loading. Please reload this page.