|
| 1 | +<!-- |
| 2 | +**Note:** When your KEP is complete, all of these comment blocks should be removed. |
| 3 | +
|
| 4 | +To get started with this template: |
| 5 | +
|
| 6 | +- [ ] **Pick a hosting SIG.** |
| 7 | + Make sure that the problem space is something the SIG is interested in taking |
| 8 | + up. KEPs should not be checked in without a sponsoring SIG. |
| 9 | +- [ ] **Create an issue in kubernetes/enhancements** |
| 10 | + When filing an enhancement tracking issue, please make sure to complete all |
| 11 | + fields in that template. One of the fields asks for a link to the KEP. You |
| 12 | + can leave that blank until this KEP is filed, and then go back to the |
| 13 | + enhancement and add the link. |
| 14 | +- [ ] **Make a copy of this template directory.** |
| 15 | + Copy this template into the owning SIG's directory and name it |
| 16 | + `NNNN-short-descriptive-title`, where `NNNN` is the issue number (with no |
| 17 | + leading-zero padding) assigned to your enhancement above. |
| 18 | +- [ ] **Fill out as much of the kep.yaml file as you can.** |
| 19 | + At minimum, you should fill in the "Title", "Authors", "Owning-sig", |
| 20 | + "Status", and date-related fields. |
| 21 | +- [ ] **Fill out this file as best you can.** |
| 22 | + At minimum, you should fill in the "Summary" and "Motivation" sections. |
| 23 | + These should be easy if you've preflighted the idea of the KEP with the |
| 24 | + appropriate SIG(s). |
| 25 | +- [ ] **Create a PR for this KEP.** |
| 26 | + Assign it to people in the SIG who are sponsoring this process. |
| 27 | +- [ ] **Merge early and iterate.** |
| 28 | + Avoid getting hung up on specific details and instead aim to get the goals of |
| 29 | + the KEP clarified and merged quickly. The best way to do this is to just |
| 30 | + start with the high-level sections and fill out details incrementally in |
| 31 | + subsequent PRs. |
| 32 | +
|
| 33 | +Just because a KEP is merged does not mean it is complete or approved. Any KEP |
| 34 | +marked as `provisional` is a working document and subject to change. You can |
| 35 | +denote sections that are under active debate as follows: |
| 36 | +
|
| 37 | +``` |
| 38 | +<<[UNRESOLVED optional short context or usernames ]>> |
| 39 | +Stuff that is being argued. |
| 40 | +<<[/UNRESOLVED]>> |
| 41 | +``` |
| 42 | +
|
| 43 | +When editing KEPS, aim for tightly-scoped, single-topic PRs to keep discussions |
| 44 | +focused. If you disagree with what is already in a document, open a new PR |
| 45 | +with suggested changes. |
| 46 | +
|
| 47 | +One KEP corresponds to one "feature" or "enhancement" for its whole lifecycle. |
| 48 | +You do not need a new KEP to move from beta to GA, for example. If |
| 49 | +new details emerge that belong in the KEP, edit the KEP. Once a feature has become |
| 50 | +"implemented", major changes should get new KEPs. |
| 51 | +
|
| 52 | +The canonical place for the latest set of instructions (and the likely source |
| 53 | +of this file) is [here](/keps/NNNN-kep-template/README.md). |
| 54 | +
|
| 55 | +**Note:** Any PRs to move a KEP to `implementable`, or significant changes once |
| 56 | +it is marked `implementable`, must be approved by each of the KEP approvers. |
| 57 | +If none of those approvers are still appropriate, then changes to that list |
| 58 | +should be approved by the remaining approvers and/or the owning SIG (or |
| 59 | +SIG Architecture for cross-cutting KEPs). |
| 60 | +--> |
| 61 | +# KEP-4656: Add kubelet instance configuration to configure CRI socket for each node |
| 62 | + |
| 63 | +<!-- |
| 64 | +This is the title of your KEP. Keep it short, simple, and descriptive. A good |
| 65 | +title can help communicate what the KEP is and should be considered as part of |
| 66 | +any review. |
| 67 | +--> |
| 68 | + |
| 69 | +<!-- |
| 70 | +A table of contents is helpful for quickly jumping to sections of a KEP and for |
| 71 | +highlighting any additional information provided beyond the standard KEP |
| 72 | +template. |
| 73 | +
|
| 74 | +Ensure the TOC is wrapped with |
| 75 | + <code><!-- toc --&rt;<!-- /toc --&rt;</code> |
| 76 | +tags, and then generate with `hack/update-toc.sh`. |
| 77 | +--> |
| 78 | + |
| 79 | +<!-- toc --> |
| 80 | +- [Release Signoff Checklist](#release-signoff-checklist) |
| 81 | +- [Summary](#summary) |
| 82 | +- [Motivation](#motivation) |
| 83 | + - [Goals](#goals) |
| 84 | + - [Non-Goals](#non-goals) |
| 85 | +- [Proposal](#proposal) |
| 86 | + - [Risks and Mitigations](#risks-and-mitigations) |
| 87 | +- [Design Details](#design-details) |
| 88 | + - [Test Plan](#test-plan) |
| 89 | + - [Prerequisite testing updates](#prerequisite-testing-updates) |
| 90 | + - [Unit tests](#unit-tests) |
| 91 | + - [Integration tests](#integration-tests) |
| 92 | + - [e2e tests](#e2e-tests) |
| 93 | + - [Graduation Criteria](#graduation-criteria) |
| 94 | + - [Alpha](#alpha) |
| 95 | + - [Beta](#beta) |
| 96 | + - [GA](#ga) |
| 97 | + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) |
| 98 | + - [Version Skew Strategy](#version-skew-strategy) |
| 99 | +- [Implementation History](#implementation-history) |
| 100 | +- [Drawbacks](#drawbacks) |
| 101 | +- [Alternatives](#alternatives) |
| 102 | +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) |
| 103 | +<!-- /toc --> |
| 104 | + |
| 105 | +## Release Signoff Checklist |
| 106 | + |
| 107 | +<!-- |
| 108 | +**ACTION REQUIRED:** In order to merge code into a release, there must be an |
| 109 | +issue in [kubernetes/enhancements] referencing this KEP and targeting a release |
| 110 | +milestone **before the [Enhancement Freeze](https://git.k8s.io/sig-release/releases) |
| 111 | +of the targeted release**. |
| 112 | +
|
| 113 | +For enhancements that make changes to code or processes/procedures in core |
| 114 | +Kubernetes—i.e., [kubernetes/kubernetes], we require the following Release |
| 115 | +Signoff checklist to be completed. |
| 116 | +
|
| 117 | +Check these off as they are completed for the Release Team to track. These |
| 118 | +checklist items _must_ be updated for the enhancement to be released. |
| 119 | +--> |
| 120 | + |
| 121 | +Items marked with (R) are required *prior to targeting to a milestone / release*. |
| 122 | + |
| 123 | +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) |
| 124 | +- [ ] (R) KEP approvers have approved the KEP status as `implementable` |
| 125 | +- [ ] (R) Design details are appropriately documented |
| 126 | +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) |
| 127 | + - [ ] e2e Tests for all Beta API Operations (endpoints) |
| 128 | + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
| 129 | + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free |
| 130 | +- [ ] (R) Graduation criteria is in place |
| 131 | + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
| 132 | +- [ ] (R) Production readiness review completed |
| 133 | +- [ ] (R) Production readiness review approved |
| 134 | +- [ ] "Implementation History" section is up-to-date for milestone |
| 135 | +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] |
| 136 | +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes |
| 137 | + |
| 138 | +<!-- |
| 139 | +**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone. |
| 140 | +--> |
| 141 | + |
| 142 | +[kubernetes.io]: https://kubernetes.io/ |
| 143 | +[kubernetes/enhancements]: https://git.k8s.io/enhancements |
| 144 | +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes |
| 145 | +[kubernetes/website]: https://git.k8s.io/website |
| 146 | + |
| 147 | +## Summary |
| 148 | + |
| 149 | +The proposal is to stop setting the Container Runtime Interface (CRI) socket annotation `kubeadm.alpha.kubernetes.io/cri-socket` on the Kubernetes Node object. kubeadm currently adds this annotation during the `kubeadm init upload-config` phase. It specifies the CRI socket endpoint that the kubelet on each node uses to communicate with the container runtime. |
| 150 | + |
| 151 | +Instead of relying on the annotation, this KEP proposes creating an instance config per node and overriding the `ContainerRuntimeEndpoint` field in the kubelet config when calling kubeadm commands. This will eliminate the need for kubeadm to store CRI socket configuration on each Node object. |
| 152 | + |
| 153 | +## Motivation |
| 154 | + |
| 155 | +Currently, kubeadm adds a CRI socket annotation to the Node object during the `init upload-config` phase, which specifies the endpoint for the CRI that is being used by the kubelet on each node. This annotation is persistent on the Node object, even if the kubelet is updated or the CRI is changed. |
| 156 | + |
| 157 | +After migrating the container runtime endpoint flag to the instance configuration, we can use it |
| 158 | +to set the CRI socket by overriding the `ContainerRuntimeEndpoint` field in `/var/lib/kubelet/config.yaml`. |
| 159 | + |
| 160 | +### Goals |
| 161 | + |
| 162 | +* kubeadm currently adds an annotation with the key `kubeadm.alpha.kubernetes.io/cri-socket` to each Node object. We will deprecate this annotation and stop setting it. |
| 163 | +* Provide an instance configuration file named `/var/lib/kubelet/instance-config.yaml` for each node, in which the `ContainerRuntimeEndpoint` field is defined. During the `kubeadm init/join/upgrade` process, the instance configuration will be read and the `ContainerRuntimeEndpoint` field in `/var/lib/kubelet/config.yaml` will be overwritten. |
| 164 | +* The `--container-runtime-endpoint` flag is no longer written to the `/var/lib/kubelet/kubeadm-flags.env` file. |
| 165 | + |
| 166 | +### Non-Goals |
| 167 | + |
| 168 | +- Continue maintaining CRI socket paths on Node objects. |
| 169 | + |
| 170 | +## Proposal |
| 171 | + |
| 172 | +We will add a new file, `/var/lib/kubelet/instance-config.yaml`, to customize the CRI socket of each node. During `kubeadm init/join`, this file will be merged with `/var/lib/kubelet/config.yaml` by using the `kubeletconfiguration` patch target. If the user also supplies a `kubeletconfiguration` patch with `--patches`, the patch file provided by the user will be given priority. |
| 173 | + |
| 174 | +For different subcommands, there are the following changes: |
| 175 | + |
| 176 | +* kubeadm init: If a CRI socket is set in the kubeadm configuration, it takes precedence and `/var/lib/kubelet/instance-config.yaml` is generated based on it. If the CRI socket is not specified, the container runtime endpoint is automatically detected and uploaded to the global configuration, and `/var/lib/kubelet/instance-config.yaml` is generated. |
| 177 | + |
| 178 | +* kubeadm join: If a CRI socket is set in the kubeadm configuration, it takes precedence and `/var/lib/kubelet/instance-config.yaml` is generated based on it, overwriting the kubelet configuration downloaded from the global configuration (`kube-system/kubelet-config`). If no CRI socket is specified, the socket is automatically detected on the node and `/var/lib/kubelet/instance-config.yaml` is generated based on it. |
| 179 | + |
| 180 | +* kubeadm upgrade: future versions of `kubeadm upgrade apply/node` will only check `/var/lib/kubelet/instance-config.yaml`. |
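
The precedence rules above can be sketched roughly as follows. This is a minimal illustration in Go, not kubeadm's actual code: the function names `resolveCRISocket` and `applyOverrides` are hypothetical, and modeling the kubelet configuration as a flat string map is a simplifying assumption made only for the example.

```go
package main

import "fmt"

// resolveCRISocket (hypothetical helper) returns the socket from the kubeadm
// configuration when it is set, and otherwise falls back to the endpoint
// that was auto-detected on the node.
func resolveCRISocket(configured, detected string) string {
	if configured != "" {
		return configured
	}
	return detected
}

// applyOverrides (hypothetical helper) copies base and then applies each
// overlay in order, so that later overlays win over earlier ones.
func applyOverrides(base map[string]string, overlays ...map[string]string) map[string]string {
	merged := make(map[string]string, len(base))
	for k, v := range base {
		merged[k] = v
	}
	for _, overlay := range overlays {
		for k, v := range overlay {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	// No socket in the kubeadm configuration, so the detected one is used.
	socket := resolveCRISocket("", "unix:///run/containerd/containerd.sock")

	// Stand-ins for config.yaml, instance-config.yaml, and a user patch.
	base := map[string]string{"cgroupDriver": "systemd"}
	instance := map[string]string{"containerRuntimeEndpoint": socket}
	userPatch := map[string]string{"containerRuntimeEndpoint": "unix:///var/run/crio/crio.sock"}

	// The user patch is applied last because the proposal gives it priority.
	merged := applyOverrides(base, instance, userPatch)
	fmt.Println(merged["containerRuntimeEndpoint"])
}
```

Applying the user-provided patch last is what gives it priority over the instance configuration in this sketch; the real merge goes through kubeadm's patch mechanism rather than a map copy.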
| 181 | + |
| 182 | +### Risks and Mitigations |
| 183 | + |
| 184 | +## Design Details |
| 185 | + |
| 186 | +**We will add a new `NodeLocalCRISocket` feature gate. In the Alpha phase, the feature gate is disabled by default. If the feature gate is disabled, the kubeadm subcommands are unchanged. When the feature gate is enabled, the kubeadm subcommands change as follows:** |
| 187 | + |
| 188 | +kubeadm init: |
| 189 | + |
| 190 | +* kubeadm no longer writes the `--container-runtime-endpoint` flag to `/var/lib/kubelet/kubeadm-flags.env`. |
| 191 | +* kubeadm no longer adds the `kubeadm.alpha.kubernetes.io/cri-socket` annotation. |
| 192 | +* If a CRI socket is set in the kubeadm configuration, it takes precedence and the `/var/lib/kubelet/instance-config.yaml` configuration file is generated based on it. If the CRI socket is not set, the container runtime endpoint is automatically detected and uploaded to the global configuration, and `/var/lib/kubelet/instance-config.yaml` is generated. |
| 193 | + |
| 194 | +kubeadm join: |
| 195 | + |
| 196 | +* kubeadm no longer adds the `kubeadm.alpha.kubernetes.io/cri-socket` annotation. |
| 197 | +* If a CRI socket is set in the kubeadm configuration, it takes precedence and the `/var/lib/kubelet/instance-config.yaml` configuration file is generated based on it, overwriting the kubelet configuration downloaded from the global configuration. If no CRI socket is specified, the socket is automatically detected on the node and `/var/lib/kubelet/instance-config.yaml` is generated based on it. |
| 198 | + |
| 199 | +kubeadm reset: |
| 200 | + |
| 201 | +* No changes are needed. In the existing process, kubeadm gets the `CRISocketPath` before deleting the `/var/lib/kubelet` directory, and deleting that directory also cleans up `/var/lib/kubelet/instance-config.yaml`. |
| 202 | + |
| 203 | +kubeadm upgrade: |
| 204 | + |
| 205 | +* `kubeadm upgrade node/apply` will read the `--container-runtime-endpoint` flag from the `/var/lib/kubelet/kubeadm-flags.env` file, generate `/var/lib/kubelet/instance-config.yaml` based on it, and override the `ContainerRuntimeEndpoint` field in `/var/lib/kubelet/config.yaml`. |
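
As a rough sketch of the Alpha upgrade step described above, the Go snippet below extracts the `--container-runtime-endpoint` value from the contents of a flags file and renders a matching instance configuration. It is illustrative only: `endpointFromFlagsEnv` and `renderInstanceConfig` are hypothetical names, the `KUBELET_KUBEADM_ARGS="..."` line in the usage is an assumed example of the file's contents, and the exact contents of `instance-config.yaml` shown here are an assumption rather than something this KEP specifies.

```go
package main

import (
	"fmt"
	"strings"
)

const endpointFlag = "--container-runtime-endpoint"

// endpointFromFlagsEnv (hypothetical helper) scans env-file content of the
// form KEY="--flag=value --other=value" and returns the value of the
// container runtime endpoint flag, if present.
func endpointFromFlagsEnv(content string) (string, bool) {
	for _, line := range strings.Split(content, "\n") {
		_, args, found := strings.Cut(strings.TrimSpace(line), "=")
		if !found {
			continue
		}
		fields := strings.Fields(strings.Trim(args, `"`))
		for i, field := range fields {
			// Handle the --flag=value form.
			if strings.HasPrefix(field, endpointFlag+"=") {
				return strings.TrimPrefix(field, endpointFlag+"="), true
			}
			// Handle the "--flag value" form.
			if field == endpointFlag && i+1 < len(fields) {
				return fields[i+1], true
			}
		}
	}
	return "", false
}

// renderInstanceConfig (hypothetical helper) renders a minimal kubelet
// configuration document that only carries the runtime endpoint. The exact
// file layout is an assumption for this sketch.
func renderInstanceConfig(endpoint string) string {
	return fmt.Sprintf("apiVersion: kubelet.config.k8s.io/v1beta1\n"+
		"kind: KubeletConfiguration\n"+
		"containerRuntimeEndpoint: %s\n", endpoint)
}

func main() {
	env := `KUBELET_KUBEADM_ARGS="--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock"`
	if endpoint, ok := endpointFromFlagsEnv(env); ok {
		fmt.Print(renderInstanceConfig(endpoint))
	}
}
```

Returning a boolean alongside the value lets a caller distinguish a missing flag from an empty one, which matters when deciding whether to fall back to auto-detection.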
| 206 | + |
| 207 | +**In the Beta phase, the feature gate is enabled by default. If the feature gate is disabled, the kubeadm subcommands are unchanged. When the feature gate is enabled, the kubeadm subcommands change as follows:** |
| 208 | + |
| 209 | +* `kubeadm upgrade apply/node` will use `/var/lib/kubelet/instance-config.yaml` to override the `ContainerRuntimeEndpoint` field in `/var/lib/kubelet/config.yaml`. |
| 210 | + |
| 211 | +**In the GA phase, the feature gate is enabled by default and cannot be disabled. The kubeadm subcommands change as follows:** |
| 212 | + |
| 213 | +* `kubeadm upgrade apply/node` will use only `/var/lib/kubelet/instance-config.yaml` to override the `ContainerRuntimeEndpoint` field in `/var/lib/kubelet/config.yaml`. |
| 214 | + |
| 215 | +### Test Plan |
| 216 | + |
| 217 | +[x] I/we understand the owners of the involved components may require updates to |
| 218 | +existing tests to make this code solid enough prior to committing the changes necessary |
| 219 | +to implement this enhancement. |
| 220 | + |
| 221 | +##### Prerequisite testing updates |
| 222 | + |
| 223 | +##### Unit tests |
| 224 | + |
| 225 | +At least the following kubeadm packages will require updates and new unit tests: |
| 226 | + |
| 227 | +* `cmd/kubeadm/app/cmd/phases/init` |
| 228 | + |
| 229 | +* `cmd/kubeadm/app/phases/kubelet` |
| 230 | + |
| 231 | +* `cmd/kubeadm/app/phases/upgrade` |
| 232 | + |
| 233 | +* `cmd/kubeadm/app/cmd/phases/join` |
| 234 | + |
| 235 | +##### Integration tests |
| 236 | + |
| 237 | +- N/A |
| 238 | + |
| 239 | +##### e2e tests |
| 240 | + |
| 241 | +* A new e2e test will be added by using the kinder tool. |
| 242 | + |
| 243 | +### Graduation Criteria |
| 244 | + |
| 245 | +#### Alpha |
| 246 | + |
| 247 | +- Implement the feature behind the `NodeLocalCRISocket` feature gate. |
| 248 | +- Add corresponding e2e tests. |
| 249 | +- Add documentation for the feature gate. |
| 250 | + |
| 251 | +#### Beta |
| 252 | + |
| 253 | +- Enable the feature gate by default. |
| 254 | + |
| 255 | +- Gather feedback from developers and surveys. |
| 256 | +- Update the feature gate documentation. |
| 257 | +- Implement the Beta-phase changes in `kubeadm upgrade apply/node`. |
| 258 | + |
| 259 | +#### GA |
| 260 | + |
| 261 | +- Gather feedback from developers and surveys. |
| 262 | +- Implement the GA-phase changes in `kubeadm upgrade apply/node`. |
| 263 | +- Update the phases documentation. |
| 264 | +- Remove the `kubeadm.alpha.kubernetes.io/cri-socket` annotation from the https://kubernetes.io/docs/reference/labels-annotations-taints page. |
| 265 | +- Update the https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/change-runtime-containerd/ page, replacing the step that updates the annotation with one that updates the instance config. |
| 266 | + |
| 267 | +### Upgrade / Downgrade Strategy |
| 268 | + |
| 269 | +**Alpha**: Users can patch their `ClusterConfiguration` in the `kube-system/kubeadm-config` ConfigMap to enable the `NodeLocalCRISocket` feature gate before calling `kubeadm upgrade apply`. kubeadm will then generate `/var/lib/kubelet/instance-config.yaml` and use it to overwrite the `ContainerRuntimeEndpoint` field in `/var/lib/kubelet/config.yaml`. |
| 270 | + |
| 271 | +**Beta**: Users can modify `ClusterConfiguration` to disable the feature gate during upgrades. This will allow them to continue using the CRI socket annotation on nodes. |
| 272 | + |
| 273 | +**GA**: Users can no longer patch `ClusterConfiguration` to opt out of the feature. The feature gate will be locked to enabled. |
| 274 | + |
| 275 | +### Version Skew Strategy |
| 276 | + |
| 277 | +kubeadm will continue to support a skew of up to three versions from the kubelet. The `ContainerRuntimeEndpoint` field in `KubeletConfiguration` was [introduced](https://github.com/kubernetes/kubernetes/pull/112136) in v1.27, so when we overwrite the `ContainerRuntimeEndpoint` field in `/var/lib/kubelet/config.yaml` through `/var/lib/kubelet/instance-config.yaml`, it will be supported on all versions of the kubelet within the skew. |
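
The skew argument is simple arithmetic, sketched below in Go. The constants come from the paragraph above (field introduced in v1.27, skew of three versions); the function names and the kubeadm minor versions passed in are hypothetical and do not assert which release ships the feature gate.

```go
package main

import "fmt"

const (
	// Minor version in which the ContainerRuntimeEndpoint field appeared.
	fieldIntroducedMinor = 27
	// Number of minor versions the kubelet may lag behind kubeadm.
	maxKubeletSkew = 3
)

// oldestKubeletMinor (hypothetical helper) returns the oldest kubelet minor
// version that a given kubeadm minor version has to support.
func oldestKubeletMinor(kubeadmMinor int) int {
	return kubeadmMinor - maxKubeletSkew
}

// fieldSupportedAcrossSkew (hypothetical helper) reports whether every
// kubelet within the skew window understands the field.
func fieldSupportedAcrossSkew(kubeadmMinor int) bool {
	return oldestKubeletMinor(kubeadmMinor) >= fieldIntroducedMinor
}

func main() {
	for _, minor := range []int{29, 30, 31} {
		fmt.Printf("kubeadm 1.%d: oldest kubelet 1.%d, field supported: %t\n",
			minor, oldestKubeletMinor(minor), fieldSupportedAcrossSkew(minor))
	}
}
```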
| 278 | + |
| 279 | +## Implementation History |
| 280 | + |
| 281 | +- 2024-05-23: Initial draft KEP |
| 282 | +- 2024-10-03: KEP marked as implementable |
| 283 | + |
| 284 | +## Drawbacks |
| 285 | + |
| 286 | +* This KEP introduces a breaking change. Some users read or write the `kubeadm.alpha.kubernetes.io/cri-socket` annotation or the `/var/lib/kubelet/kubeadm-flags.env` file to declare the CRI socket endpoint on the Node, because these are the mechanisms they are familiar with. |
| 287 | + |
| 288 | +## Alternatives |
| 289 | + |
| 290 | +* Avoid a feature gate and preserve kubeadm compatibility by rolling the change out across multiple versions. However, a feature gate improves user awareness of the change. |
| 291 | +* Do nothing and continue to use the `/var/lib/kubelet/kubeadm-flags.env` file. However, the kubelet has deprecated the `--container-runtime-endpoint` flag. |
| 292 | + |
| 293 | +## Infrastructure Needed (Optional) |