/kind bug
What steps did you take and what happened:
[A clear and concise description of what the bug is.]
The panic verified and fixed in #5792 resulted in the repeated restart of manager process, and thereby the Machine controller. The panic happened just after the Machine controller had called the EC2 RunInstances API to create an EC2 instance. However, because of the panic, the controller was not able to record the Instance ID in AWSMachine Status. After every restart, the Machine controller created a new EC2 instance.
I originally noted this problem in #2915
What did you expect to happen:
The Machine controller should create at most one EC2 instance for an AWSMachine.
Anything else you would like to add:
There is a well-understood way to solve this problem: create and store a client token, and once it is stored, use it in the RunInstances API call. We can derive the client token from immutable information in the AWSMachine spec.
Environment:
- Cluster-api-provider-aws version: v2.10.0
- Kubernetes version: (use
kubectl version): v1.34.1
- OS (e.g. from
/etc/os-release): N/A