Commit fad84ba: MachinePool annotation for externally managed autoscaler
1 parent 542ac03

8 files changed: +170 additions, -7 deletions

api/v1beta1/common_types.go (6 additions, 0 deletions)

```diff
@@ -128,6 +128,12 @@ const (
 	// any changes to the actual object because it is a dry run) and the topology controller
 	// will receive the resulting object.
 	TopologyDryRunAnnotation = "topology.cluster.x-k8s.io/dry-run"
+
+	// ReplicasManagedByAnnotation is an annotation that indicates external (non-Cluster API) management of infra scaling.
+	// The practical effect of this is that the capi "replica" count should be passively derived from the number of observed infra machines,
+	// instead of being a source of truth for eventual consistency.
+	// This annotation can be used to inform MachinePool status during in-progress scaling scenarios.
+	ReplicasManagedByAnnotation = "cluster.x-k8s.io/replicas-managed-by"
 )

 const (
```

docs/book/src/developer/architecture/controllers/machine-pool.md (29 additions, 0 deletions)

````diff
@@ -96,6 +96,30 @@ The `status` object **may** define several fields that do not affect functionality:
 * `failureReason` - is a string that explains why a fatal error has occurred, if possible.
 * `failureMessage` - is a string that holds the message contained by the error.
 
+Example:
+```yaml
+kind: MyMachinePool
+apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
+spec:
+  providerIDList:
+    - cloud:////my-cloud-provider-id-0
+    - cloud:////my-cloud-provider-id-1
+status:
+  ready: true
+```
+
+#### Externally Managed Autoscaler
+
+A provider may implement an InfrastructureMachinePool that is externally managed by an autoscaler. For example, if you are using a Managed Kubernetes provider, it may include its own autoscaler solution. To indicate this to Cluster API, you would decorate the MachinePool object with the following annotation:
+
+`"cluster.x-k8s.io/replicas-managed-by": ""`
+
+Cluster API treats the annotation as a "boolean", meaning that the presence of the annotation is sufficient to indicate external replica count management, with one exception: if the value is `"false"`, then that indicates to Cluster API that replica enforcement is nominal, and managed by Cluster API.
+
+Providers may choose to implement the `cluster.x-k8s.io/replicas-managed-by` annotation with different values (e.g., `external-autoscaler`, or `karpenter`) that may inform different provider-specific behaviors, but those values will have no effect upon Cluster API.
+
+The effect upon Cluster API of this annotation is that during autoscaling events (initiated externally, not by Cluster API), when more or fewer MachinePool replicas are observed compared to the `Spec.Replicas` configuration, it will update its `Status.Phase` property to the value of `"Scaling"`.
+
 Example:
 ```yaml
 kind: MyMachinePool
@@ -104,10 +128,15 @@ spec:
   providerIDList:
     - cloud:////my-cloud-provider-id-0
     - cloud:////my-cloud-provider-id-1
+    - cloud:////my-cloud-provider-id-2
+  replicas: 1
 status:
   ready: true
+  phase: Scaling
 ```
 
+It is the provider's responsibility to update Cluster API's `Spec.Replicas` property to the value observed in the underlying infra environment as it changes in response to external autoscaling behaviors. Once that is done, and the number of providerID items is equal to the `Spec.Replicas` property, the MachinePool's `Status.Phase` property will be set to `Running` by Cluster API.
+
 ### Secrets
 
 The machine pool controller will use a secret in the following format:
````
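To make the behavior above concrete, here is a sketch of what a MachinePool decorated for an external autoscaler might look like. The pool name, namespace, and cluster name are hypothetical; the annotation key and its "presence means true, unless the value is `false`" semantics come from the commit itself:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: my-pool            # hypothetical name
  namespace: default       # hypothetical namespace
  annotations:
    # Presence of this annotation (with any value other than "false")
    # tells Cluster API that an external autoscaler owns replica management,
    # so the phase is reported as "Scaling" rather than "ScalingUp"/"ScalingDown".
    cluster.x-k8s.io/replicas-managed-by: ""
spec:
  clusterName: my-cluster  # hypothetical cluster name
  replicas: 1
```

A provider-specific value such as `karpenter` could be used in place of the empty string; Cluster API treats both identically.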

docs/book/src/developer/providers/v1.2-to-v1.3.md (3 additions, 2 deletions)

```diff
@@ -41,6 +41,7 @@ in Cluster API are kept in sync with the versions used by `sigs.k8s.io/controlle
 - A new timeout `nodeVolumeDetachTimeout` has been introduced that defines how long the controller will spend on waiting for all volumes to be detached.
   The default value is 0, meaning that the volume can be detached without any time limitations.
 - A new annotation `machine.cluster.x-k8s.io/exclude-wait-for-node-volume-detach` has been introduced that allows explicitly skipping the wait for node volume detaching.
+- A new annotation `"cluster.x-k8s.io/replicas-managed-by"` has been introduced to indicate that a MachinePool's replica enforcement is delegated to an external autoscaler (not managed by Cluster API). For more information see the documentation [here](../architecture/controllers/machine-pool.md#externally-managed-autoscaler).
 - The `Path` func in the `sigs.k8s.io/cluster-api/cmd/clusterctl/client/repository.Overrider` interface has been adjusted to also return an error.
 
 ### Other
@@ -52,8 +53,8 @@ The default value is 0, meaning that the volume can be detached without any time
   * the `--junit-report` argument [replaces JUnit custom reporter](https://onsi.github.io/ginkgo/MIGRATING_TO_V2#improved-reporting-infrastructure) code
   * see the ["Update tests to Ginkgo v2" PR](https://github.com/kubernetes-sigs/cluster-api/pull/6906) for a reference example
 - Cluster API introduced new [logging guidelines](../../developer/logging.md). All reconcilers in the core repository were updated
-  to [log the entire object hierarchy](../../developer/logging.md#keyvalue-pairs). It would be great if providers would be adjusted
-  as well to make it possible to cross-reference log entries across providers (please see CAPD for an infra provider reference implementation).
+  to [log the entire object hierarchy](../../developer/logging.md#keyvalue-pairs). It would be great if providers would be adjusted
+  as well to make it possible to cross-reference log entries across providers (please see CAPD for an infra provider reference implementation).
 - The `CreateLogFile` function and `CreateLogFileInput` struct in the E2E test framework for clusterctl has been renamed to `OpenLogFile` and `OpenLogFileInput` because the function will now append to the logfile instead of truncating the content.
 - The `Move` function in E2E test framework for clusterctl has been modified to:
   * print the `clusterctl move` command including the arguments similar to `Init`.
```

docs/book/src/reference/labels_and_annotations.md (2 additions, 1 deletion)

```diff
@@ -33,6 +33,7 @@
 | cluster.x-k8s.io/cloned-from-groupkind | It is the infrastructure machine annotation that stores the group-kind of the infrastructure template resource that was cloned for the machine. This annotation is set only during cloning a template. Older/adopted machines will not have this annotation. |
 | cluster.x-k8s.io/skip-remediation | It is used to mark the machines that should not be considered for remediation by MachineHealthCheck reconciler. |
 | cluster.x-k8s.io/managed-by | It can be applied to InfraCluster resources to signify that some external system is managing the cluster infrastructure. Provider InfraCluster controllers will ignore resources with this annotation. An external controller must fulfill the contract of the InfraCluster resource. External infrastructure providers should ensure that the annotation, once set, cannot be removed. |
+| cluster.x-k8s.io/replicas-managed-by | It can be applied to MachinePool resources to signify that some external system is managing infrastructure scaling for that pool. See [the MachinePool documentation](../developer/architecture/controllers/machine-pool.md#externally-managed-autoscaler) for more details. |
 | topology.cluster.x-k8s.io/dry-run | It is an annotation that gets set on objects by the topology controller only during a server side dry run apply operation. It is used for validating update webhooks for objects which get updated by template rotation (e.g. InfrastructureMachineTemplate). When the annotation is set and the admission request is a dry run, the webhook should deny validation due to immutability. By that the request will succeed (without any changes to the actual object because it is a dry run) and the topology controller will receive the resulting object. |
 | machine.cluster.x-k8s.io/certificates-expiry | It captures the expiry date of the machine certificates in RFC3339 format. It is used to trigger rollout of control plane machines before certificates expire. It can be set on BootstrapConfig and Machine objects. The value set on Machine object takes precedence. The annotation is only used by control plane machines. |
 | machine.cluster.x-k8s.io/exclude-node-draining | It explicitly skips node draining if set. |
@@ -45,4 +46,4 @@
 | machinedeployment.clusters.x-k8s.io/max-replicas | It is the maximum replicas a deployment can have at a given point, which is machinedeployment.spec.replicas + maxSurge. Used by the underlying machine sets to estimate their proportions in case the deployment has surge replicas. |
 | controlplane.cluster.x-k8s.io/skip-coredns | It explicitly skips reconciling CoreDNS if set. |
 | controlplane.cluster.x-k8s.io/skip-kube-proxy | It explicitly skips reconciling kube-proxy if set. |
-| controlplane.cluster.x-k8s.io/kubeadm-cluster-configuration | It is a machine annotation that stores the json-marshalled string of KCP ClusterConfiguration. This annotation is used to detect any changes in ClusterConfiguration and trigger machine rollout in KCP. |
+| controlplane.cluster.x-k8s.io/kubeadm-cluster-configuration | It is a machine annotation that stores the json-marshalled string of KCP ClusterConfiguration. This annotation is used to detect any changes in ClusterConfiguration and trigger machine rollout in KCP. |
```

exp/api/v1beta1/machinepool_types.go (5 additions, 0 deletions)

```diff
@@ -163,6 +163,11 @@ const (
 	// MachinePool infrastructure is scaling down.
 	MachinePoolPhaseScalingDown = MachinePoolPhase("ScalingDown")
 
+	// MachinePoolPhaseScaling is the MachinePool state when the
+	// MachinePool infrastructure is scaling.
+	// This phase value is appropriate to indicate an active state of scaling by an external autoscaler.
+	MachinePoolPhaseScaling = MachinePoolPhase("Scaling")
+
 	// MachinePoolPhaseDeleting is the MachinePool state when a delete
 	// request has been sent to the API Server,
 	// but its infrastructure has not yet been fully deleted.
```

exp/internal/controllers/machinepool_controller_phases.go (16 additions, 4 deletions)

```diff
@@ -67,14 +67,26 @@ func (r *MachinePoolReconciler) reconcilePhase(mp *expv1.MachinePool) {
 		mp.Status.SetTypedPhase(expv1.MachinePoolPhaseRunning)
 	}
 
-	// Set the phase to "scalingUp" if the infrastructure is scaling up.
+	// Set the appropriate phase in response to the MachinePool replica count being greater than the observed infrastructure replicas.
 	if mp.Status.InfrastructureReady && *mp.Spec.Replicas > mp.Status.ReadyReplicas {
-		mp.Status.SetTypedPhase(expv1.MachinePoolPhaseScalingUp)
+		// If we are being managed by an external autoscaler and can't predict scaling direction, set to "Scaling".
+		if annotations.ReplicasManagedByExternalAutoscaler(mp) {
+			mp.Status.SetTypedPhase(expv1.MachinePoolPhaseScaling)
+		} else {
+			// Set the phase to "ScalingUp" if we are actively scaling the infrastructure out.
+			mp.Status.SetTypedPhase(expv1.MachinePoolPhaseScalingUp)
+		}
 	}
 
-	// Set the phase to "scalingDown" if the infrastructure is scaling down.
+	// Set the appropriate phase in response to the MachinePool replica count being less than the observed infrastructure replicas.
 	if mp.Status.InfrastructureReady && *mp.Spec.Replicas < mp.Status.ReadyReplicas {
-		mp.Status.SetTypedPhase(expv1.MachinePoolPhaseScalingDown)
+		// If we are being managed by an external autoscaler and can't predict scaling direction, set to "Scaling".
+		if annotations.ReplicasManagedByExternalAutoscaler(mp) {
+			mp.Status.SetTypedPhase(expv1.MachinePoolPhaseScaling)
+		} else {
+			// Set the phase to "ScalingDown" if we are actively scaling the infrastructure in.
+			mp.Status.SetTypedPhase(expv1.MachinePoolPhaseScalingDown)
+		}
 	}
 
 	// Set the phase to "failed" if any of Status.FailureReason or Status.FailureMessage is not-nil.
```

util/annotations/helpers.go (17 additions, 0 deletions)

```diff
@@ -58,6 +58,11 @@ func HasWithPrefix(prefix string, annotations map[string]string) bool {
 	return false
 }
 
+// ReplicasManagedByExternalAutoscaler returns true if the standard annotation for external autoscaler is present.
+func ReplicasManagedByExternalAutoscaler(o metav1.Object) bool {
+	return hasTruthyAnnotationValue(o, clusterv1.ReplicasManagedByAnnotation)
+}
+
 // AddAnnotations sets the desired annotations on the object and returns true if the annotations have changed.
 func AddAnnotations(o metav1.Object, desired map[string]string) bool {
 	if len(desired) == 0 {
@@ -87,3 +92,15 @@ func hasAnnotation(o metav1.Object, annotation string) bool {
 	_, ok := annotations[annotation]
 	return ok
 }
+
+// hasTruthyAnnotationValue returns true if the object has an annotation with a value that is not "false".
+func hasTruthyAnnotationValue(o metav1.Object, annotation string) bool {
+	annotations := o.GetAnnotations()
+	if annotations == nil {
+		return false
+	}
+	if val, ok := annotations[annotation]; ok {
+		return val != "false"
+	}
+	return false
+}
```

util/annotations/helpers_test.go (92 additions, 0 deletions)

```diff
@@ -151,3 +151,95 @@ func TestAddAnnotations(t *testing.T) {
 		})
 	}
 }
+
+func TestHasTruthyAnnotationValue(t *testing.T) {
+	tests := []struct {
+		name          string
+		obj           metav1.Object
+		annotationKey string
+		expected      bool
+	}{
+		{
+			name: "annotation does not exist",
+			obj: &corev1.Node{
+				ObjectMeta: metav1.ObjectMeta{
+					Annotations: map[string]string{
+						"cluster.x-k8s.io/some-other-annotation": "",
+					},
+				},
+				Spec:   corev1.NodeSpec{},
+				Status: corev1.NodeStatus{},
+			},
+			annotationKey: "cluster.x-k8s.io/replicas-managed-by",
+			expected:      false,
+		},
+		{
+			name: "no val",
+			obj: &corev1.Node{
+				ObjectMeta: metav1.ObjectMeta{
+					Annotations: map[string]string{
+						"cluster.x-k8s.io/replicas-managed-by": "",
+					},
+				},
+				Spec:   corev1.NodeSpec{},
+				Status: corev1.NodeStatus{},
+			},
+			annotationKey: "cluster.x-k8s.io/replicas-managed-by",
+			expected:      true,
+		},
+		{
+			name: "annotation exists, true value",
+			obj: &corev1.Node{
+				ObjectMeta: metav1.ObjectMeta{
+					Annotations: map[string]string{
+						"cluster.x-k8s.io/replicas-managed-by": "true",
+					},
+				},
+				Spec:   corev1.NodeSpec{},
+				Status: corev1.NodeStatus{},
+			},
+			annotationKey: "cluster.x-k8s.io/replicas-managed-by",
+			expected:      true,
+		},
+		{
+			name: "annotation exists, random string value",
+			obj: &corev1.Node{
+				ObjectMeta: metav1.ObjectMeta{
+					Annotations: map[string]string{
+						"cluster.x-k8s.io/replicas-managed-by": "foo",
+					},
+				},
+				Spec:   corev1.NodeSpec{},
+				Status: corev1.NodeStatus{},
+			},
+			annotationKey: "cluster.x-k8s.io/replicas-managed-by",
+			expected:      true,
+		},
+		{
+			name: "annotation exists, false value",
+			obj: &corev1.Node{
+				ObjectMeta: metav1.ObjectMeta{
+					Annotations: map[string]string{
+						"cluster.x-k8s.io/replicas-managed-by": "false",
+					},
+				},
+				Spec:   corev1.NodeSpec{},
+				Status: corev1.NodeStatus{},
+			},
+			annotationKey: "cluster.x-k8s.io/replicas-managed-by",
+			expected:      false,
+		},
+	}
+	for _, tt := range tests {
+		tt := tt
+		t.Run(tt.name, func(t *testing.T) {
+			g := NewWithT(t)
+			ret := hasTruthyAnnotationValue(tt.obj, tt.annotationKey)
+			if tt.expected {
+				g.Expect(ret).To(BeTrue())
+			} else {
+				g.Expect(ret).To(BeFalse())
+			}
+		})
+	}
+}
```
