
Commit 95b5681

Merge pull request #49823 from ArvindParekh/merged-main-dev-1.33
[Branch Sync] - Merge main into dev-1.33
2 parents d2f678e + 6e43d05 commit 95b5681

133 files changed: 3,745 additions and 800 deletions


OWNERS_ALIASES

Lines changed: 0 additions & 15 deletions
```diff
@@ -52,10 +52,8 @@ aliases:
     - sajibAdhi
   sig-docs-de-owners: # Admins for German content
     - bene2k1
-    - rlenferink
   sig-docs-de-reviews: # PR reviews for German content
     - bene2k1
-    - rlenferink
   sig-docs-en-owners: # Admins for English content
     - dipesh-rawat
     - divya-mohan0209
@@ -73,7 +71,6 @@ aliases:
     - katcosgrove
     - kbhawkey
     - mengjiao-liu
-    - mickeyboxell
     - natalisucks
     - nate-double-u
     - reylejano
@@ -105,19 +102,16 @@ aliases:
     - dipesh-rawat
     - divya-mohan0209
   sig-docs-hi-reviews: # PR reviews for Hindi content
-    - Babapool
     - bishal7679
     - dipesh-rawat
     - divya-mohan0209
     - niranjandarshann
   sig-docs-id-owners: # Admins for Indonesian content
     - ariscahyadi
-    - danninov
     - girikuncoro
     - habibrosyad
   sig-docs-id-reviews: # PR reviews for Indonesian content
     - ariscahyadi
-    - danninov
     - girikuncoro
     - habibrosyad
   sig-docs-it-owners: # Admins for Italian content
@@ -169,8 +163,6 @@ aliases:
     - mengjiao-liu
     - my-git9
     - SataQiu
-    - Sea-n
-    - tanjunchen
     - tengqm
     - windsonsea
     - xichengliudui
@@ -180,33 +172,26 @@ aliases:
     - chenrui333
     - howieyuen
     # idealhack
-    - kinzhi
     - mengjiao-liu
     - my-git9
     # pigletfly
     - SataQiu
-    - Sea-n
-    - tanjunchen
     - tengqm
     - windsonsea
     - xichengliudui
     - ydFu
   sig-docs-pt-owners: # Admins for Portuguese content
-    - devlware
     - edsoncelio
     - jcjesus
     - stormqueen1990
   sig-docs-pt-reviews: # PR reviews for Portugese content
-    - devlware
     - edsoncelio
     - jcjesus
     - mrerlison
     - stormqueen1990
   sig-docs-vi-owners: # Admins for Vietnamese content
-    - huynguyennovem
     - truongnh1992
   sig-docs-vi-reviews: # PR reviews for Vietnamese content
-    - huynguyennovem
     - truongnh1992
   sig-docs-ru-owners: # Admins for Russian content
     - Arhell
```

assets/js/detect-js.js

Lines changed: 3 additions & 0 deletions
```js
$(document).ready(function () {
  document.documentElement.classList.remove('no-js');
});
```

content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem/ccm-chicken-egg-problem-sequence-diagram.svg

Lines changed: 3 additions & 0 deletions
Lines changed: 243 additions & 0 deletions
---
layout: blog
title: "The Cloud Controller Manager Chicken and Egg Problem"
date: 2025-02-14
slug: cloud-controller-manager-chicken-egg-problem
author: >
  Antonio Ojea,
  Michael McCune
---

Kubernetes 1.31 [completed the largest migration in Kubernetes history][migration-blog], removing the in-tree cloud provider. While the component migration is now done, it leaves some additional complexity for users and installer projects (for example, kOps or Cluster API). We will go over those additional steps and failure points and make recommendations for cluster owners. This migration was complex, and some logic had to be extracted from the core components into four new subsystems:

1. **Cloud controller manager** ([KEP-2392][kep2392])
2. **API server network proxy** ([KEP-1281][kep1281])
3. **kubelet credential provider plugins** ([KEP-2133][kep2133])
4. **Storage migration to use [CSI][csi]** ([KEP-625][kep625])

The [cloud controller manager is part of the control plane][ccm]. It is a critical component that replaces some functionality that previously existed in the kube-controller-manager and the kubelet.

{{< figure
  src="/images/docs/components-of-kubernetes.svg"
  alt="Components of Kubernetes"
  caption="Components of Kubernetes"
>}}

One of the most critical functionalities of the cloud controller manager is the node controller, which is responsible for the initialization of the nodes.

As you can see in the following diagram, when the **kubelet** starts, it registers the Node object with the apiserver, tainting the node so it can be processed first by the cloud-controller-manager. The initial Node is missing the cloud-provider-specific information, such as the node addresses and the labels with details like the region and the instance type.

{{< figure
  src="ccm-chicken-egg-problem-sequence-diagram.svg"
  alt="Chicken and egg problem sequence diagram"
  caption="Chicken and egg problem sequence diagram"
  class="diagram-medium"
>}}

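To make the problem concrete, here is a sketch of the relevant fields of a Node object right after registration by a kubelet started with `--cloud-provider=external`, before the cloud-controller-manager has processed it (the node name and the exact label set are illustrative):

```yaml
# Abbreviated Node object, pre-initialization (values are illustrative)
apiVersion: v1
kind: Node
metadata:
  name: worker-01
  labels:
    kubernetes.io/hostname: worker-01
    # topology.kubernetes.io/region, topology.kubernetes.io/zone and
    # node.kubernetes.io/instance-type are added later by the
    # cloud-controller-manager
spec:
  taints:
    - key: node.cloudprovider.kubernetes.io/uninitialized
      value: "true"
      effect: NoSchedule
status:
  addresses: []  # populated by the cloud provider during initialization
```

Once the cloud-controller-manager reconciles this node, it fills in the addresses and labels and removes the taint.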
This new initialization process adds some latency to node readiness. Previously, the kubelet was able to initialize the node at the same time it created it. Since the logic has moved to the cloud-controller-manager, this can cause a [chicken and egg problem][chicken-and-egg] during cluster bootstrapping for those Kubernetes architectures that do not deploy the cloud-controller-manager in the same way as the other control plane components: commonly as static pods, standalone binaries, or DaemonSets/Deployments with tolerations for the taints and with `hostNetwork` enabled (more on this below).

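For illustration, one of the architectures mentioned above that sidesteps the scheduling dependency entirely is running the cloud-controller-manager as a static Pod, launched directly by the kubelet from its manifests directory. A minimal sketch, with a placeholder image and binary path rather than any real provider's:

```yaml
# /etc/kubernetes/manifests/cloud-controller-manager.yaml
# Static Pod: started directly by the kubelet, so no scheduling
# (and therefore no taint toleration) is involved.
apiVersion: v1
kind: Pod
metadata:
  name: cloud-controller-manager
  namespace: kube-system
spec:
  hostNetwork: true  # reachable before the container network exists
  containers:
    - name: cloud-controller-manager
      image: registry/my-infrastructure-cloud-controller-manager:v1.0.0  # placeholder
      command:
        - /bin/my-infrastructure-cloud-controller-manager  # placeholder
        - --leader-elect=true
```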
## Examples of the dependency problem

As noted above, during bootstrapping the cloud-controller-manager can be unschedulable, and as a result the cluster will not initialize properly. The following are a few concrete examples of how this problem can manifest and the root causes for why it might occur.

These examples assume you are running your cloud-controller-manager using a Kubernetes resource (e.g. Deployment, DaemonSet, or similar) to control its lifecycle. Because these methods rely on Kubernetes to schedule the cloud-controller-manager, care must be taken to ensure it will schedule properly.

### Example: Cloud controller manager not scheduling due to uninitialized taint

As [noted in the Kubernetes documentation][kubedocs0], when the kubelet is started with the command line flag `--cloud-provider=external`, its corresponding `Node` object will have a no schedule taint named `node.cloudprovider.kubernetes.io/uninitialized` added. Because the cloud-controller-manager is responsible for removing the no schedule taint, this can create a situation where a cloud-controller-manager that is being managed by a Kubernetes resource, such as a `Deployment` or `DaemonSet`, may not be able to schedule.

If the cloud-controller-manager cannot be scheduled during the initialization of the control plane, the resulting `Node` objects will all have the `node.cloudprovider.kubernetes.io/uninitialized` no schedule taint. It also means that this taint will not be removed, as the cloud-controller-manager is responsible for its removal. If the no schedule taint is not removed, then critical workloads, such as the container network interface controllers, will not be able to schedule, and the cluster will be left in an unhealthy state.

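The way out of this deadlock is for the cloud-controller-manager's pod template to tolerate the very taint it is responsible for removing, as the full Deployment example later in this post does:

```yaml
# Pod template fragment for the cloud-controller-manager
tolerations:
  - key: node.cloudprovider.kubernetes.io/uninitialized
    operator: Exists
    effect: NoSchedule
```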
### Example: Cloud controller manager not scheduling due to not-ready taint

The next example would be possible in situations where the container network interface (CNI) is waiting for IP address information from the cloud-controller-manager (CCM), and the CCM has not tolerated the taint which would be removed by the CNI.

The [Kubernetes documentation describes][kubedocs1] the `node.kubernetes.io/not-ready` taint as follows:

> "The Node controller detects whether a Node is ready by monitoring its health and adds or removes this taint accordingly."

One of the conditions that can lead to a Node resource having this taint is when the container network has not yet been initialized on that node. As the cloud-controller-manager is responsible for adding the IP addresses to a Node resource, and the IP addresses are needed by the container network controllers to properly configure the container network, it is possible in some circumstances for a node to become permanently stuck as not ready and uninitialized.

This situation occurs for a similar reason as the first example, although in this case the `node.kubernetes.io/not-ready` taint is used with the no execute effect and thus will cause the cloud-controller-manager not to run on the node with the taint. If the cloud-controller-manager cannot execute, it will not initialize the node. This will cascade into the container network controllers not being able to run properly, and the node will end up carrying both the `node.cloudprovider.kubernetes.io/uninitialized` and `node.kubernetes.io/not-ready` taints, leaving the cluster in an unhealthy state.

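Because `node.kubernetes.io/not-ready` can be applied with either effect, the cloud-controller-manager needs to tolerate it for `NoExecute` (to keep running on the node) as well as `NoSchedule` (to be placed there at all); a minimal pod template fragment:

```yaml
tolerations:
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 120  # optional grace period
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoSchedule
```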
## Our Recommendations

There is no one “correct way” to run a cloud-controller-manager. The details will depend on the specific needs of the cluster administrators and users. When planning your clusters and the lifecycle of the cloud-controller-managers, please consider the following guidance:

For cloud-controller-managers running in the same cluster that they manage:

1. Use host network mode, rather than the pod network: in most cases, a cloud controller manager will need to communicate with an API service endpoint associated with the infrastructure. Setting `hostNetwork` to true will ensure that the cloud controller is using the host network instead of the container network and, as such, will have the same network access as the host operating system. It will also remove the dependency on the networking plugin. This will ensure that the cloud controller has access to the infrastructure endpoint (always check your networking configuration against your infrastructure provider’s instructions).
2. Use a scalable resource type. `Deployments` and `DaemonSets` are useful for controlling the lifecycle of a cloud controller. They allow easy access to running multiple copies for redundancy as well as using the Kubernetes scheduler to ensure proper placement in the cluster. When using these primitives to control the lifecycle of your cloud controllers and running multiple replicas, you must remember to enable leader election, or else your controllers will collide with each other, which could lead to nodes not being initialized in the cluster.
3. Target the controller manager containers to the control plane. There might exist other controllers which need to run outside the control plane (for example, Azure’s node manager controller). Still, the controller managers themselves should be deployed to the control plane. Use a node selector or affinity stanza to direct the scheduling of cloud controllers to the control plane to ensure that they are running in a protected space. Cloud controllers are vital to adding and removing nodes to a cluster as they form a link between Kubernetes and the physical infrastructure. Running them on the control plane will help to ensure that they run with a similar priority as other core cluster controllers and that they have some separation from non-privileged user workloads.
   1. It is worth noting that an anti-affinity stanza to prevent cloud controllers from running on the same host is also very useful to ensure that a single node failure will not degrade the cloud controller performance.
4. Ensure that the tolerations allow operation. Use tolerations on the manifest for the cloud controller container to ensure that it will schedule to the correct nodes and that it can run in situations where a node is initializing. This means that cloud controllers should tolerate the `node.cloudprovider.kubernetes.io/uninitialized` taint, and they should also tolerate any taints associated with the control plane (for example, `node-role.kubernetes.io/control-plane` or `node-role.kubernetes.io/master`). It can also be useful to tolerate the `node.kubernetes.io/not-ready` taint to ensure that the cloud controller can run even when the node is not yet available for health monitoring.

For cloud-controller-managers that will not be running on the cluster they manage (for example, in a hosted control plane on a separate cluster), the rules are much more constrained by the dependencies of the environment of the cluster running the cloud-controller-manager. The advice for running on a self-managed cluster may not be appropriate, as the types of conflicts and network constraints will be different. Please consult the architecture and requirements of your topology for these scenarios.

### Example

This is an example of a Kubernetes Deployment highlighting the guidance shown above. It is important to note that this is for demonstration purposes only; for production uses, please consult your cloud provider’s documentation.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: cloud-controller-manager
  name: cloud-controller-manager
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: cloud-controller-manager
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app.kubernetes.io/name: cloud-controller-manager
      annotations:
        kubernetes.io/description: Cloud controller manager for my infrastructure
    spec:
      containers: # the container details will depend on your specific cloud controller manager
        - name: cloud-controller-manager
          command:
            - /bin/my-infrastructure-cloud-controller-manager
            - --leader-elect=true
            - -v=1
          image: registry/my-infrastructure-cloud-controller-manager:latest
          resources:
            requests:
              cpu: 200m
              memory: 50Mi
      hostNetwork: true # these Pods are part of the control plane
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: cloud-controller-manager
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 120
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 120
        - effect: NoSchedule
          key: node.cloudprovider.kubernetes.io/uninitialized
          operator: Exists
        - effect: NoSchedule
          key: node.kubernetes.io/not-ready
          operator: Exists
```

When deciding how to deploy your cloud controller manager, it is worth noting that cluster-proportional or resource-based pod autoscaling is not recommended. Running multiple replicas of a cloud controller manager is good practice for ensuring high availability and redundancy, but it does not contribute to better performance. In general, only a single instance of a cloud controller manager will be reconciling a cluster at any given time.

[migration-blog]: /blog/2024/05/20/completing-cloud-provider-migration/
[kep2392]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cloud-provider/2392-cloud-controller-manager/README.md
[kep1281]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1281-network-proxy
[kep2133]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2133-kubelet-credential-providers
[csi]: https://github.com/container-storage-interface/spec?tab=readme-ov-file#container-storage-interface-csi-specification-
[kep625]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/625-csi-migration/README.md
[ccm]: /docs/concepts/architecture/cloud-controller/
[chicken-and-egg]: /docs/tasks/administer-cluster/running-cloud-controller/#chicken-and-egg
[kubedocs0]: /docs/tasks/administer-cluster/running-cloud-controller/#running-cloud-controller-manager
[kubedocs1]: /docs/reference/labels-annotations-taints/#node-kubernetes-io-not-ready
