---
title: Support Instance Metadata Service with Cloud Controller Manager
authors:
  - "@feiskyer"
owning-sig: sig-cloud-provider
participating-sigs:
  - sig-node
reviewers:
  - "@andrewsykim"
  - "@jagosan"
approvers:
  - "@andrewsykim"
editor: "@feiskyer"
creation-date: 2019-07-22
last-updated: 2019-12-15
status: implementable
see-also:
  - "/keps/sig-cloud-provider/20180530-cloud-controller-manager.md"
---

# Support Instance Metadata Service with Cloud Controller Manager

## Table of Contents

<!-- TOC -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [Alternatives](#alternatives)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Graduation Criteria](#graduation-criteria)
    - [Examples](#examples)
      - [Alpha -> Beta Graduation](#alpha---beta-graduation)
      - [Beta -> GA Graduation](#beta---ga-graduation)
      - [Removing a deprecated flag](#removing-a-deprecated-flag)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Implementation History](#implementation-history)
<!-- /TOC -->

## Release Signoff Checklist

**ACTION REQUIRED:** In order to merge code into a release, there must be an issue in [kubernetes/enhancements] referencing this KEP and targeting a release milestone **before [Enhancement Freeze](https://github.com/kubernetes/sig-release/tree/master/releases) of the targeted release**.

For enhancements that make changes to code or processes/procedures in core Kubernetes, i.e. [kubernetes/kubernetes], we require the following Release Signoff checklist to be completed.

Check these off as they are completed for the Release Team to track. These checklist items _must_ be updated for the enhancement to be released.

- [ ] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
- [ ] KEP approvers have set the KEP status to `implementable`
- [ ] Design details are appropriately documented
- [ ] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [ ] Graduation criteria is in place
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

**Note:** Any PRs to move a KEP to `implementable`, or significant changes once it is marked `implementable`, should be approved by each of the KEP approvers. If any of those approvers is no longer appropriate, then changes to that list should be approved by the remaining approvers and/or the owning SIG (or SIG-arch for cross-cutting KEPs).

**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.

[kubernetes.io]: https://kubernetes.io/
[kubernetes/enhancements]: https://github.com/kubernetes/enhancements/issues
[kubernetes/kubernetes]: https://github.com/kubernetes/kubernetes
[kubernetes/website]: https://github.com/kubernetes/website

## Summary

With the [cloud-controller-manager](https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/) (CCM), Kubelet no longer initializes itself from the Instance Metadata Service (IMDS). Instead, CCM gets the node information from cloud APIs. This introduces more cloud API calls and more chances of being throttled, especially for large clusters.

This proposal aims to add Instance Metadata Service (IMDS) support to CCM, so that all nodes can still initialize themselves and reconcile their IP addresses from IMDS.

## Motivation

Before CCM, kubelet supported getting Node information from the cloud provider's instance metadata service. This includes:

- NodeName
- ProviderID
- NodeAddresses
- InstanceType
- AvailabilityZone

The instance metadata service helps reduce API throttling issues and shortens node initialization, which is especially helpful for large clusters. But with CCM, this is not possible anymore because the above functionality has been moved to CCM.

Take the Azure cloud provider as an example:

- According to the Azure documentation [here](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits), for each Azure subscription and tenant, Resource Manager allows up to **12,000 read requests per hour** and 1,200 write requests per hour. That means only 200 read requests can be sent per minute on average.
- Different Azure APIs also carry additional rate limits over various durations. For example, there are 3Min and 30Min read limits for VMSS APIs (the numbers below are for reference only, since they are not officially [documented](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits)):
  - Microsoft.Compute/HighCostGetVMScaleSet3Min;200
  - Microsoft.Compute/HighCostGetVMScaleSet30Min;1000
  - Microsoft.Compute/VMScaleSetVMViews3Min;5000

Based on those rate limits, getting node information for a 5,000-node cluster may take hours, as the back-of-the-envelope calculation below shows. Things would be much worse for multiple clusters in the same tenant and subscription.

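As a rough illustration, assume conservatively a single read request per node and take the unofficial per-API numbers above at face value:

```latex
% Subscription-wide budget: 12,000 reads/hour = 200 reads/minute, so
\frac{5000 \text{ calls}}{200 \text{ calls/minute}} = 25 \text{ minutes per full pass}

% Against HighCostGetVMScaleSet3Min;200, i.e. 200 calls per 3-minute window:
\left\lceil \tfrac{5000}{200} \right\rceil \times 3 \text{ minutes} = 75 \text{ minutes}
```

Real syncs need several calls per node, and these budgets are shared with every other controller and cluster in the same subscription, which is how a full pass stretches into hours.
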
So this proposal aims to add IMDS support back alongside CCM, so that Kubernetes clusters can still be scaled to a large number of nodes.

### Goals

- Allow nodes to be initialized from IMDS.
- Allow nodes to reconcile their node addresses from IMDS.

### Non-Goals

- Authentication and authorization for each provider implementation.
- The API throttling [issue](https://github.com/kubernetes/kubernetes/issues/60646) on the route controller.

## Proposal

As in kube-controller-manager, the [cloud provider interfaces](https://github.com/kubernetes/cloud-provider/blob/master/cloud.go#L43) can be split into two parts (sketched below):

- Instance-level interfaces: Instances and Zones
- Control-plane interfaces: e.g. LoadBalancer and Routes

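For a concrete picture of the split, the sketch below groups the provider methods accordingly. It is a trimmed-down illustration of the linked cloud.go interfaces, not their full method sets:

```go
// Sketch only: simplified interface groups, reduced from the full
// method sets in k8s.io/cloud-provider's cloud.go.
package sketch

import (
	"context"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	cloudprovider "k8s.io/cloud-provider"
)

// Instance-level interfaces: answerable from IMDS on the node itself,
// so they can move into the per-node cloud-node-manager.
type Instances interface {
	NodeAddresses(ctx context.Context, name types.NodeName) ([]v1.NodeAddress, error)
	InstanceID(ctx context.Context, name types.NodeName) (string, error)
	InstanceType(ctx context.Context, name types.NodeName) (string, error)
}

type Zones interface {
	GetZone(ctx context.Context) (cloudprovider.Zone, error)
}

// Control-plane interfaces: require cloud management APIs, so they stay
// in cloud-controller-manager on the masters.
type LoadBalancer interface {
	EnsureLoadBalancer(ctx context.Context, clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error)
	EnsureLoadBalancerDeleted(ctx context.Context, clusterName string, service *v1.Service) error
}

type Routes interface {
	ListRoutes(ctx context.Context, clusterName string) ([]*cloudprovider.Route, error)
	CreateRoute(ctx context.Context, clusterName string, nameHint string, route *cloudprovider.Route) error
	DeleteRoute(ctx context.Context, clusterName string, route *cloudprovider.Route) error
}
```
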
The control-plane interfaces are kept in CCM (named `cloud-controller-manager`) and deployed on masters. For the instance-level interfaces, a new daemonset (named `cloud-node-manager`) would be introduced to implement them.

With these changes, the whole node initialization workflow would be:

- Kubelet, started with `--cloud-provider=external`, adds the taint `node.cloudprovider.kubernetes.io/uninitialized` with effect NoSchedule during initialization.
- `cloud-node-manager` then initializes the node with `Instances` and `Zones`.
- `cloud-controller-manager` takes care of the rest, e.g. configuring Routes and LoadBalancers for the node.

After the node is initialized, cloud-node-manager periodically reconciles the node's IP addresses from IMDS, as in the sketch below.

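For example, the periodic reconcile in cloud-node-manager could look roughly like the following. The IMDS endpoint, `api-version`, and the `Metadata: true` header follow Azure's documented IMDS; the `NODE_NAME` environment variable, the 5-minute period, and the overwrite-style address update are illustrative assumptions, not the actual implementation:

```go
// Sketch: cloud-node-manager's address reconciliation against IMDS.
package main

import (
	"context"
	"encoding/json"
	"net/http"
	"os"
	"time"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// imdsInstance mirrors the relevant slice of Azure IMDS instance metadata.
type imdsInstance struct {
	Network struct {
		Interface []struct {
			IPv4 struct {
				IPAddress []struct {
					PrivateIPAddress string `json:"privateIpAddress"`
				} `json:"ipAddress"`
			} `json:"ipv4"`
		} `json:"interface"`
	} `json:"network"`
}

func main() {
	cfg, err := rest.InClusterConfig() // runs as a daemonset pod
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	nodeName := os.Getenv("NODE_NAME") // assumed: injected via the downward API

	for ; ; time.Sleep(5 * time.Minute) { // assumed sync period
		ctx := context.Background()

		// 1. Read metadata from the link-local IMDS endpoint: a purely
		// node-local call, so no management-API quota is consumed.
		req, _ := http.NewRequestWithContext(ctx, http.MethodGet,
			"http://169.254.169.254/metadata/instance?api-version=2019-06-01", nil)
		req.Header.Set("Metadata", "true")
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			continue // retry on the next tick
		}
		var md imdsInstance
		err = json.NewDecoder(resp.Body).Decode(&md)
		resp.Body.Close()
		if err != nil {
			continue
		}

		// 2. Push the discovered addresses onto the Node object. A real
		// controller would merge with existing addresses, not overwrite.
		var addrs []v1.NodeAddress
		for _, iface := range md.Network.Interface {
			for _, ip := range iface.IPv4.IPAddress {
				addrs = append(addrs, v1.NodeAddress{Type: v1.NodeInternalIP, Address: ip.PrivateIPAddress})
			}
		}
		node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
		if err != nil {
			continue
		}
		node.Status.Addresses = addrs
		_, _ = client.CoreV1().Nodes().UpdateStatus(ctx, node, metav1.UpdateOptions{})
	}
}
```
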
Considering that some providers may not require IMDS, cloud-node-manager could be enabled optionally via a new option `--enable-node-controller` on cloud-controller-manager. With this new option, there would be three node initialization modes after this proposal:

- 1) Centrally via cloud-controller-manager. All node initialization, node IP address reconciling, and other cloud provider operations are done in CCM.
  - `cloud-controller-manager --enable-node-controller=true`
- 2) Using IMDS with cloud-node-manager.
  - cloud-node-manager running as a daemonset on each node
  - `cloud-controller-manager --enable-node-controller=false`
- 3) Arbitrary, via custom controllers. Customers may also choose their own controllers, which implement the same functions as the cloud provider interfaces. Their design and deployment are out of this proposal's scope.

## Alternatives

Since there are already a lot of plugins in Kubelet (e.g. CNI, CRI, and CSI), an alternative is to introduce another cloud-provider plugin, e.g. a Cloud Provider Interface (CPI).

When Kubelet starts, the cloud provider plugin would register itself with Kubelet, and Kubelet would then invoke the plugin to initialize the node, as in the hypothetical sketch below.

One problem is the deployment of those plugins. If daemonsets are used to deploy the cloud provider plugins, then they must be schedulable before kubelet has fully initialized the node. That means Kubelet may need to initialize itself in two passes:

- Register the node in Kubernetes without any cloud-specific information.
- Wait for the cloud provider plugin to register, then invoke the plugin to add the cloud-specific information.

The problem with this approach is that the cloud provider plugin would block the node's initialization, while the plugin itself must be scheduled to that node. Although the taint _node.cloudprovider.kubernetes.io/uninitialized_ with effect NoSchedule could still be applied to solve this issue, separating the logic into cloud-node-manager makes the whole architecture clearer.

## Design Details

### Test Plan

**Note:** *Section not required until targeted at a release.*

Consider the following in developing a test plan for this enhancement:

- Will there be e2e and integration tests, in addition to unit tests?
- How will it be tested in isolation vs with other components?

No need to outline all of the test cases, just the general strategy. Anything that would count as tricky in the implementation, and anything particularly challenging to test, should be called out.

All code is expected to have adequate tests (eventually with coverage expectations). Please adhere to the [Kubernetes testing guidelines][testing-guidelines] when drafting this test plan.

[testing-guidelines]: https://git.k8s.io/community/contributors/devel/sig-testing/testing.md

### Graduation Criteria

**Note:** *Section not required until targeted at a release.*

Define graduation milestones.

These may be defined in terms of API maturity, or as something else. The initial KEP should keep this high-level, with a focus on what signals will be looked at to determine graduation.

Consider the following in developing the graduation criteria for this enhancement:

- [Maturity levels (`alpha`, `beta`, `stable`)][maturity-levels]
- [Deprecation policy][deprecation-policy]

Clearly define what graduation means, either by linking to the [API doc definition](https://kubernetes.io/docs/concepts/overview/kubernetes-api/#api-versioning) or by redefining what graduation means.

In general, we try to use the same stages (alpha, beta, GA), regardless of how the functionality is accessed.

[maturity-levels]: https://git.k8s.io/community/contributors/devel/sig-architecture/api_changes.md#alpha-beta-and-stable-versions
[deprecation-policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/

#### Examples

These are generalized examples to consider, in addition to the aforementioned [maturity levels][maturity-levels].

##### Alpha -> Beta Graduation

- Gather feedback from developers and surveys
- Complete features A, B, C
- Tests are in Testgrid and linked in KEP

##### Beta -> GA Graduation

- N examples of real-world usage
- N installs
- More rigorous forms of testing, e.g., downgrade tests and scalability tests
- Allowing time for feedback

**Note:** Generally we also wait at least 2 releases between beta and GA/stable, since there's no opportunity for user feedback, or even bug reports, in back-to-back releases.

##### Removing a deprecated flag

- Announce deprecation and support policy of the existing flag
- Two versions passed since introducing the functionality which deprecates the flag (to address version skew)
- Address feedback on usage/changed behavior, provided on GitHub issues
- Deprecate the flag

**For non-optional features moving to GA, the graduation criteria must include [conformance tests].**

[conformance tests]: https://github.com/kubernetes/community/blob/master/contributors/devel/conformance-tests.md

### Upgrade / Downgrade Strategy

If applicable, how will the component be upgraded and downgraded? Make sure this is in the test plan.

Consider the following in developing an upgrade/downgrade strategy for this enhancement:

- What changes (in invocations, configurations, API use, etc.) is an existing cluster required to make on upgrade in order to keep previous behavior?
- What changes (in invocations, configurations, API use, etc.) is an existing cluster required to make on upgrade in order to make use of the enhancement?

### Version Skew Strategy

If applicable, how will the component handle version skew with other components? What are the guarantees? Make sure this is in the test plan.

Consider the following in developing a version skew strategy for this enhancement:

- Does this enhancement involve coordinating behavior in the control plane and in the kubelet? How does an n-2 kubelet without this feature available behave when this feature is used?
- Will any other components on the node change? For example, changes to CSI, CRI or CNI may require updating that component before the kubelet.

## Implementation History

Major milestones in the life cycle of a KEP should be tracked in `Implementation History`. Major milestones might include:

- the `Summary` and `Motivation` sections being merged, signaling SIG acceptance
- the `Proposal` section being merged, signaling agreement on a proposed design
- the date implementation started
- the first Kubernetes release where an initial version of the KEP was available
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded
