OnDelete rollout strategy produces negative replica counts causing infinite reconciliation loop #12815

@mczimm

What steps did you take and what happened?

I was switching a MachineDeployment from RollingUpdate strategy to OnDelete strategy. During this transition, the controller encountered a negative replica count calculation in the OnDelete rollout logic.

Specifically, at line 129 in machinedeployment_rollout_ondelete.go:
https://github.com/kubernetes-sigs/cluster-api/blob/main/internal/controllers/machinedeployment/machinedeployment_rollout_ondelete.go#L129

The machineSetScaleDownAmountDueToMachineDeletion calculation resulted in -1, which then caused:

- A log message: "Unexpected negative scale down amount"
- The negative value being used in the replica calculation at line 132 (subtracting a negative number, effectively adding)
- The controller to enter an infinite loop, continuously flipping the MachineSet replicas up and down
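
The arithmetic behind the second point can be shown with a minimal sketch. This is not the controller's code: `replicasAfterScaleDown` is a hypothetical helper that only assumes the new replica count is the current count minus the scale down amount, as described above.

```go
package main

import "fmt"

// replicasAfterScaleDown is a hypothetical helper (not taken from cluster-api).
// It subtracts the scale down amount from the current replica count.
func replicasAfterScaleDown(current, scaleDownAmount int32) int32 {
	return current - scaleDownAmount
}

func main() {
	// Expected case: one Machine was deleted, so the MachineSet shrinks by one.
	fmt.Println(replicasAfterScaleDown(3, 1)) // prints 2

	// Reported case: the amount is -1, so the subtraction adds a replica.
	fmt.Println(replicasAfterScaleDown(3, -1)) // prints 4
}
```

If a later reconcile then scales the MachineSet back down, the count would move up and down in the way described in this report.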

What did you expect to happen?

The controller should:

- Never allow negative scale down amounts to propagate through the replica calculations
- Handle edge cases gracefully without entering infinite reconciliation loops
- Either:
  - Clamp the value to 0 if negative, or
  - Skip processing that MachineSet and continue to the next one
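
The two options could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `machineSet` struct, `clampScaleDown`, and `desiredReplicas` are hypothetical names, not cluster-api types or functions. In this simplified model both options leave the affected MachineSet's replica count unchanged; in the real controller, skipping would presumably also bypass any later steps for that MachineSet.

```go
package main

import "fmt"

// machineSet is a hypothetical stand-in holding only the values the
// calculation needs; it is not the cluster-api MachineSet type.
type machineSet struct {
	name            string
	replicas        int32
	scaleDownAmount int32
}

// clampScaleDown returns amount, or 0 when amount is negative.
func clampScaleDown(amount int32) int32 {
	if amount < 0 {
		return 0
	}
	return amount
}

// desiredReplicas computes a replica count per MachineSet. When skipNegative
// is true, a MachineSet with a negative amount is skipped (option 2);
// otherwise the amount is clamped to 0 before subtracting (option 1).
func desiredReplicas(sets []machineSet, skipNegative bool) map[string]int32 {
	out := make(map[string]int32, len(sets))
	for _, ms := range sets {
		if skipNegative && ms.scaleDownAmount < 0 {
			// Leave this MachineSet untouched and move on to the next one.
			out[ms.name] = ms.replicas
			continue
		}
		out[ms.name] = ms.replicas - clampScaleDown(ms.scaleDownAmount)
	}
	return out
}

func main() {
	sets := []machineSet{
		{name: "ms-a", replicas: 3, scaleDownAmount: -1}, // the unexpected case
		{name: "ms-b", replicas: 2, scaleDownAmount: 1},  // a normal scale down
	}
	clamped := desiredReplicas(sets, false)
	skipped := desiredReplicas(sets, true)
	fmt.Println(clamped["ms-a"], clamped["ms-b"]) // prints 3 1
	fmt.Println(skipped["ms-a"], skipped["ms-b"]) // prints 3 1
}
```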

Cluster API version

1.8.12

Kubernetes version

1.30.11

Anything else you would like to add?

Proposed Solution:

Add a safeguard to prevent negative values from affecting replica counts:

if machineSetScaleDownAmountDueToMachineDeletion < 0 {
	log.V(4).Error(errors.Errorf("Unexpected negative scale down amount: %d", machineSetScaleDownAmountDueToMachineDeletion), fmt.Sprintf("Error reconciling MachineSet %s", oldMS.Name))
	machineSetScaleDownAmountDueToMachineDeletion = 0 // clamp to zero so the amount is never negative
	continue
}

This would keep the replica count correct and prevent the infinite reconciliation loop.

Label(s) to be applied

/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

Labels: kind/bug, needs-priority, needs-triage