jfroche
Collaborator

@jfroche jfroche commented Aug 15, 2025

Tests often fail because of SSH access to the instance.

EC2 Instance Connect pushes a temporary SSH key, which is then available
for only 60 seconds. Recently, sending the SSH key to the instance has
often failed, resulting in a timeout.

We replace runtime SSH key injection via the EC2 Instance Connect API with
a cloud-init configuration that adds the SSH public key during instance initialization.

Note that we are still using EC2 Instance Connect to create the SSH key
pair, but we are not using it to push the key to the instance.
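
As a rough illustration of the cloud-init approach described above (not the code from this PR), the sketch below builds a `#cloud-config` user-data document that authorizes one public key for the image's default user through the top-level `ssh_authorized_keys` key. The function name `build_user_data` and the sample key are hypothetical; the resulting string would typically be passed as user data when the instance is launched.

```python
import json


def build_user_data(public_key: str) -> str:
    """Return cloud-init user data that authorizes one SSH public key.

    Hypothetical helper for illustration; the PR's actual code may differ.
    """
    key = public_key.strip()
    if not key or "\n" in key:
        raise ValueError("expected a single-line OpenSSH public key")
    # A JSON string is also a valid YAML double-quoted scalar, so json.dumps
    # safely quotes key comments containing characters such as ':' or '#'.
    return (
        "#cloud-config\n"
        "ssh_authorized_keys:\n"
        f"  - {json.dumps(key)}\n"
    )


if __name__ == "__main__":
    # Placeholder key material, not a real key.
    print(build_user_data("ssh-ed25519 AAAAC3Nza test@ci"))
```

Because the key is written during instance initialization, the test run no longer depends on a key push that expires after 60 seconds.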

@jfroche jfroche requested review from a team as code owners August 15, 2025 12:14
@jfroche jfroche changed the title from "fix(ci): replace EC2 Instance Connect with cloud-init SSH key injection" to "fix(ci): avoid testinfra failure due to loss of ssh connection" Aug 15, 2025
Currently, the first matrix job that finishes terminates all the EC2
instances running in the current workflow run. This is not what we
want.

This change terminates only the instance that is running the matrix job.
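
One possible way to scope termination to a single matrix job is to select instances by tags. The sketch below is an assumption, not the mechanism used in this PR: the tag names `workflow-run-id` and `matrix-job` and both helper names are hypothetical. It only builds the filter list and extracts instance IDs from a response shaped like the output of EC2 `describe_instances`; no AWS call is made.

```python
from __future__ import annotations


def instance_filters(run_id: str, matrix_job: str) -> list[dict]:
    """Build EC2 filters matching only this matrix job's live instances.

    The tag names are hypothetical; use whatever tags the workflow sets.
    """
    return [
        {"Name": "tag:workflow-run-id", "Values": [run_id]},
        {"Name": "tag:matrix-job", "Values": [matrix_job]},
        {"Name": "instance-state-name", "Values": ["pending", "running"]},
    ]


def instance_ids(response: dict) -> list[str]:
    """Collect instance IDs from a describe_instances-style response."""
    return [
        instance["InstanceId"]
        for reservation in response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
    ]


# With boto3 (not executed here), the two helpers would be combined as:
#   ec2 = boto3.client("ec2")
#   resp = ec2.describe_instances(Filters=instance_filters(run_id, job))
#   ids = instance_ids(resp)
#   if ids:
#       ec2.terminate_instances(InstanceIds=ids)
```

Filtering on both the run and the job means a job that finishes early cannot terminate instances that sibling jobs are still using.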
@jfroche jfroche force-pushed the push-wvxxyloylnrr branch from 1125e52 to 3882f3e August 15, 2025 16:13
Add an optional timeout parameter to run_ssh_command() so that the init completion status can be checked with a 5-second timeout.
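
To illustrate the idea of an optional timeout, here is a minimal sketch built on `subprocess.run`. The signature and return shape are assumptions; the real `run_ssh_command()` in the repository is not shown in this conversation and may look quite different. The stand-in takes a full argument vector so it can be tried without an SSH server.

```python
import subprocess
from typing import Optional, Sequence


def run_ssh_command(argv: Sequence[str], timeout: Optional[float] = None) -> dict:
    """Run a command and report whether it succeeded or timed out.

    Hypothetical stand-in for the helper named in the commit message.
    timeout=None waits indefinitely, which keeps existing callers unchanged.
    """
    try:
        proc = subprocess.run(
            list(argv), capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return {"succeeded": False, "timed_out": True, "stdout": "", "stderr": ""}
    return {
        "succeeded": proc.returncode == 0,
        "timed_out": False,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

A status probe could then pass `timeout=5`, so that a hung connection surfaces as a quick, retryable failure instead of blocking the test run.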

@jchancojr jchancojr left a comment

lgtm

Collaborator

@samrose samrose left a comment

great improvement on this thank you!

@samrose samrose merged commit 266832a into develop Aug 18, 2025
14 checks passed
@samrose samrose deleted the push-wvxxyloylnrr branch August 18, 2025 15:12

4 participants