Machine controller loses track of created VM instance, in turn creating multiple VM instances for one AWSMachine, but only being aware of one #5794

@dlipovetsky

/kind bug

What steps did you take and what happened:
The panic verified and fixed in #5792 caused the manager process, and with it the Machine controller, to restart repeatedly. The panic occurred just after the Machine controller had called the EC2 RunInstances API to create an EC2 instance. Because of the panic, the controller was unable to record the instance ID in the AWSMachine status, so after every restart it created a new EC2 instance.

I originally noted this problem in #2915.

What did you expect to happen:

The Machine controller should create at most one EC2 instance for an AWSMachine.

Anything else you would like to add:
There is a well-understood way to solve this problem: create and store a client token, and once it is stored, use it in the RunInstances API call. We can derive the client token from immutable information in the AWSMachine spec.
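
As a rough illustration of the derivation step, here is a minimal sketch in Go. The helper name `deriveClientToken` and the choice of inputs (namespace, name, and UID) are assumptions made for this example, not existing CAPA code; any immutable, per-machine information would serve. It hashes the inputs with SHA-256 and hex-encodes the result, which yields 64 ASCII characters and so fits the 64-character limit EC2 places on client tokens.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// deriveClientToken is a hypothetical helper (not part of CAPA).
// It maps immutable identifying fields of an AWSMachine to a stable
// token: the same inputs always produce the same token, so a controller
// that restarts before recording the instance ID would compute the
// same value again.
func deriveClientToken(namespace, name, uid string) string {
	// Kubernetes namespaces and object names cannot contain "/",
	// so joining with "/" keeps distinct inputs distinct.
	sum := sha256.Sum256([]byte(namespace + "/" + name + "/" + uid))
	// Hex-encoding 32 bytes gives exactly 64 ASCII characters.
	return hex.EncodeToString(sum[:])
}

func main() {
	// Example values only; the UID here is made up.
	token := deriveClientToken("default", "worker-0", "1b4e28ba-2fa1-11d2-883f-0016d3cca427")
	fmt.Println(len(token), token)
}
```

The resulting string would then be supplied as the `ClientToken` of the RunInstances request, so that a retried call with the same token and parameters is treated by EC2 as the same request instead of launching another instance.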

Environment:

  • Cluster-api-provider-aws version: v2.10.0
  • Kubernetes version: (use kubectl version): v1.34.1
  • OS (e.g. from /etc/os-release): N/A

Labels:

  • kind/bug
  • needs-priority
  • needs-triage