@KCSesh KCSesh commented Dec 5, 2025

Issue Related to aws/amazon-ecs-agent#4538

Code change dependency: bottlerocket-os/bottlerocket#4715

Description of changes:
Recent ecs-agent updates introduced an incompatibility when using FIPS ECR endpoints alongside use_fips_endpoint=true, requiring us to choose one approach.

We opted to let users specify FIPS ECR endpoints directly. However, amazon-ecr-containerd-resolver doesn't support FIPS endpoints without use_fips_endpoint=true, and the library has been tech debt we've wanted to migrate away from.

This change replaces amazon-ecr-containerd-resolver with containerd's Docker resolver.

Testing done:

ECS Agent Conformance Testing

Ran internal ECS conformance tests across multiple variants and architectures with use_fips_endpoint=false:

| Variant | Architecture | Region |
|---|---|---|
| aws-ecs-2 | x86_64 | commercial |
| aws-ecs-2-nvidia-fips | aarch64 | commercial |
| aws-ecs-3 | x86_64 | commercial |
| aws-ecs-2 | aarch64 | us-gov-west-1 |
| aws-ecs-2-nvidia-fips | x86_64 | us-gov-west-1 |

Additionally verified ECS task execution with both FIPS and non-FIPS containers to confirm expected behavior.

Host Container Image Pull Testing

GovCloud (us-gov-west-1)

| Variant | Image Endpoint | FIPS | ECR Auth Endpoint | Result |
|---|---|---|---|---|
| aws-ecs-2 | dkr.ecr | false | api.ecr.us-gov-west-1.amazonaws.com | |
| aws-ecs-2 | dkr.ecr-fips | true | api.ecr-fips.us-gov-west-1.amazonaws.com | |
| aws-ecs-2-fips | dkr.ecr | false | api.ecr.us-gov-west-1.amazonaws.com | |
| aws-ecs-2-fips | dkr.ecr-fips | true | api.ecr-fips.us-gov-west-1.amazonaws.com | |

China Region (cn-north-1)

| Variant | Image Endpoint | FIPS | Result |
|---|---|---|---|
| aws-ecs-2 | dkr.ecr | false | |
| aws-ecs-2-fips | dkr.ecr-fips | true | ❌ Expected failure: invalid FIPS region: cn-north-1 |
| aws-ecs-2-fips | dkr.ecr | false | |

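The tables above pair each image endpoint with the ECR API endpoint used for authentication. The sketch below illustrates that mapping as plain string formatting. The function name `ecrAPIEndpoint` is hypothetical, and the China DNS suffix handling is an assumption based on the standard `amazonaws.com.cn` domain. The actual change may well configure the AWS SDK rather than build hostnames by hand, and this sketch does not check whether the region supports FIPS.

```go
package main

import (
	"fmt"
	"strings"
)

// ecrAPIEndpoint returns the ECR API hostname for a region, choosing the
// FIPS service prefix when fips is true. Hypothetical helper for illustration.
func ecrAPIEndpoint(region string, fips bool) string {
	service := "api.ecr"
	if fips {
		service = "api.ecr-fips"
	}
	// Assumption: China regions use the amazonaws.com.cn DNS suffix;
	// all other regions in this sketch use amazonaws.com.
	suffix := "amazonaws.com"
	if strings.HasPrefix(region, "cn-") {
		suffix = "amazonaws.com.cn"
	}
	return fmt.Sprintf("%s.%s.%s", service, region, suffix)
}
```

For example, `ecrAPIEndpoint("us-gov-west-1", true)` yields `api.ecr-fips.us-gov-west-1.amazonaws.com`, matching the FIPS rows in the GovCloud table.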
Special Region Test For New Regions (ap-southeast-7)

Verified host container image pull works in special region:

level=info msg="setting up ECR client" fips=false region=ap-southeast-7
level=info msg="pulling private ECR image" ref="<account>.dkr.ecr.ap-southeast-7.amazonaws.com/test-alpine:latest"
level=info msg="pulled image successfully"
Digest test:
bash-5.1# journalctl -u host-containers@test
Dec 24 02:24:13 ip-172-31-24-3.us-west-2.compute.internal systemd[1]: Started Host container: test.
Dec 24 02:24:14 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:14Z" level=info msg="Image does not exist, proceeding to pull image from source." ref="111111111111.dkr.ecr.us-west-2.amazonaws.com/test-alpine@sha256:7b9b6a044d921dfcaea2a843ff19d725948590352198f93cb878fd2c19d7ba3c"
Dec 24 02:24:14 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:14Z" level=info msg="setting up ECR client" fips=false region=us-west-2
Dec 24 02:24:14 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:14Z" level=info msg="pulling private ECR image" ref="111111111111.dkr.ecr.us-west-2.amazonaws.com/test-alpine@sha256:7b9b6a044d921dfcaea2a843ff19d725948590352198f93cb878fd2c19d7ba3c" region=us-west-2
Dec 24 02:24:15 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:15Z" level=info msg="pulled image successfully" img="111111111111.dkr.ecr.us-west-2.amazonaws.com/test-alpine@sha256:7b9b6a044d921dfcaea2a843ff19d725948590352198f93cb878fd2c19d7ba3c"
Dec 24 02:24:15 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:15Z" level=info msg="unpacking image..." img="111111111111.dkr.ecr.us-west-2.amazonaws.com/test-alpine@sha256:7b9b6a044d921dfcaea2a843ff19d725948590352198f93cb878fd2c19d7ba3c"
Dec 24 02:24:15 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:15Z" level=info msg="Container does not exist, proceeding to create it" ctr-id=test
Dec 24 02:24:15 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:15Z" level=info msg="container task does not exist, proceeding to create it" container-id=test
Dec 24 02:24:15 ip-172-31-24-3.us-west-2.compute.internal host-containers@test[2236]: time="2025-12-24T02:24:15Z" level=info msg="successfully started container task"
bash-5.1#
**Terms of contribution:**

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@KCSesh KCSesh changed the title Use docker resolver [WIP] host-ctr: Use docker resolver Dec 5, 2025
@KCSesh KCSesh force-pushed the use-docker-resolver branch from e9a464e to 1270ef8 Compare December 5, 2025 08:12
@KCSesh KCSesh force-pushed the use-docker-resolver branch 5 times, most recently from 0a23786 to 9f148c1 Compare December 20, 2025 01:34
@KCSesh KCSesh force-pushed the use-docker-resolver branch 2 times, most recently from f3e0301 to 0f5fa44 Compare December 24, 2025 00:57
@KCSesh KCSesh changed the title [WIP] host-ctr: Use docker resolver host-ctr: Use docker resolver Dec 24, 2025
@KCSesh KCSesh marked this pull request as ready for review December 24, 2025 01:59
})
authorizer := docker.NewDockerAuthorizer(authOpt)
c.Resolver = docker.NewResolver(docker.ResolverOptions{
// TODO: Consider adding support for user-provided credentials with registryConfig as fallback,

@KCSesh KCSesh changed the title host-ctr: Use docker resolver host-ctr: use docker resolver Dec 24, 2025
KCSesh added 2 commits January 5, 2026 23:13
Replace the amazon-ecr-containerd-resolver dependency with direct
implementation using containerd's Docker resolver.

Signed-off-by: Kyle Sessions <[email protected]>
@KCSesh KCSesh force-pushed the use-docker-resolver branch from 0f5fa44 to 48d1c7e Compare January 5, 2026 23:26
[[package.metadata.build-package.external-files]]
url = "https://github.com/aws/amazon-ecs-agent/archive/v1.91.2/amazon-ecs-agent-1.91.2.tar.gz"
sha512 = "c079dc22ee60ff0701d9a66f59add26fcab02baae36c72f98e8397ea6747a1858c4df2cada9ed3e2af3657d65920d2495b0b94c88dfbd573a6485ce2a4d6a816"
# Verify the Git submodule commit of amazon-vpc-cni-plugins matches what is shipped

Good comment, but admittedly it took me a moment to realize you meant shipped separately in core-kit. Maybe revise to "... shipped in ../amazon-vpc-cni-plugins".

//
// Capture groups: [1] = account ID, [2] = "-fips" or empty, [3] = region
//
// ECR hostname pattern also used in the ecr-credential-provider:

We've since deviated from this in order to support .eu domain suffix.

I think this regex predates your PR. It's a gnarly one to read. It would be great if we could rein it in or do away with it somehow. I know Python regexes have a "verbose" mode where you can add inline comments explaining parts of the regex.

I want to say I remember interacting with this one in the past, so there's a chance I tried to do battle with it and failed. Might be a dead end.
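Go's `regexp` package has no verbose flag, but a similar effect is possible by assembling the pattern from commented fragments. The sketch below uses the capture groups described in the code comment ([1] account ID, [2] "-fips" or empty, [3] region). The pattern is a simplified illustration and is not the regex in host-ctr; `ecrHostPattern`, `ecrRef`, and `parseECRRef` are hypothetical names, and other DNS suffixes, such as the .eu one mentioned above, would need their own alternation.

```go
package main

import (
	"fmt"
	"regexp"
)

// ecrHostPattern is a simplified, illustrative ECR hostname pattern built
// from commented fragments. It is NOT the pattern used in host-ctr.
var ecrHostPattern = regexp.MustCompile(
	`^` +
		`(\d{12})` + // [1] 12-digit AWS account ID
		`\.dkr\.ecr` + // literal ".dkr.ecr"
		`(-fips)?` + // [2] "-fips" or empty
		`\.` +
		`([a-z0-9-]+)` + // [3] region, e.g. "us-west-2"
		`\.amazonaws\.com(?:\.cn)?` + // DNS suffix (assumed: amazonaws.com or amazonaws.com.cn)
		`(?:/|$)`, // end of host: start of the repository path or end of input
)

// ecrRef holds the fields extracted from an ECR image reference.
type ecrRef struct {
	Account string
	FIPS    bool
	Region  string
}

// parseECRRef extracts the account, FIPS flag, and region from an image
// reference, or returns an error if the reference is not an ECR hostname.
func parseECRRef(ref string) (ecrRef, error) {
	m := ecrHostPattern.FindStringSubmatch(ref)
	if m == nil {
		return ecrRef{}, fmt.Errorf("not an ECR reference: %q", ref)
	}
	return ecrRef{Account: m[1], FIPS: m[2] == "-fips", Region: m[3]}, nil
}
```

Keeping one fragment per line means a future suffix change touches a single commented line rather than the middle of a long pattern string.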


// Check if the image source is an ECR image. If it is, then we need to handle it with the ECR resolver.
isECRImage := ecrRegex.MatchString(source)
// Use unified image fetching for all registries

nit: Feel a bit mixed about this comment, as it is more a comment about the history of the code than the current state.

log.G(ctx).WithField("ref", source).Error(err)
return err
}
// Use unified image fetching for all registries

same nit here

Comment on lines +43 to +45
// A set of the currently supported FIPS regions for ECR: https://docs.aws.amazon.com/general/latest/gr/ecr.html
// FIPS-supported ECR regions: https://docs.aws.amazon.com/general/latest/gr/ecr.html
var fipsSupportedEcrRegionSet = map[string]bool{

nit: dupe comment and the official list from the link appears to be larger now.

not nit: Is this something we can lean on the SDK for now? It seems like in the old code we needed to understand if we were doing FIPS to avoid hitting an error condition in the resolver.

Now we only use it to raise an error - but the SDK might take care of that for us.
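For context, the allow-list approach under discussion can be sketched as below. The map name comes from the diff and the error text from the China region test result. The region entries are an illustrative subset rather than the authoritative list, and `validateFIPSRegion` is a hypothetical name. Whether the SDK can replace this check is the open question raised above, and this sketch does not answer it.

```go
package main

import "fmt"

// fipsSupportedEcrRegionSet is an illustrative subset of regions with FIPS
// ECR endpoints; the linked AWS endpoints page is the authoritative source.
var fipsSupportedEcrRegionSet = map[string]bool{
	"us-east-1":     true,
	"us-east-2":     true,
	"us-west-1":     true,
	"us-west-2":     true,
	"us-gov-east-1": true,
	"us-gov-west-1": true,
}

// validateFIPSRegion returns an error when a FIPS endpoint is requested for
// a region that is not in the allow-list. Hypothetical helper for illustration.
func validateFIPSRegion(region string) error {
	if !fipsSupportedEcrRegionSet[region] {
		return fmt.Errorf("invalid FIPS region: %s", region)
	}
	return nil
}
```

The cost of this design is the one noted in the nit: a hard-coded set drifts out of date as new FIPS regions launch.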
