keps/sig-auth/4872-harden-kubelet-cert-validation/README.md
Lines changed: 24 additions & 25 deletions
@@ -138,7 +138,10 @@ This will require cluster administrators to reissue any non-conforming certifica
138
138
### Risks and Mitigations
139
139
140
140
This could disrupt existing clusters that are using custom kubelet serving certificates.
141
-
These clusters will need to reissue their certificates before enabling this feature. We will allow to disable the validation through a command-line flag to allow for a smooth transition.
141
+
142
+
In order to maintain compatibility by default with these clusters even after this feature goes GA, we will make it opt-in.
143
+
144
+
Before enabling this feature on clusters with custom kubelet serving certificates, cluster administrators will need to reissue those certificates.
142
145
143
146
## Design Details
144
147
@@ -147,33 +150,29 @@ These clusters will need to reissue their certificates before enabling this feat
147
150
We will introduce a feature flag `KubeletCertCNValidation` that will gate the usage of the new validation.
148
151
This gate will be off by default in Alpha, on by default in Beta, and removed in GA.
149
152
150
-
In addition, we will allow to disable the validation through a command-line flag `--disable-kubelet-cert-cn-validation`.
151
-
This flag can only be set if the `KubeletCertCNValidation` feature flag is enabled.
152
-
This flag will allow cluster administrators to opt-out of this validation if they are using custom kubelet serving certificates that don't follow the `system:node:<nodename>` convention even after the feature gate is removed.
153
+
In addition, the validation will be opt-in and enabled through a new command-line flag `--enable-kubelet-cert-cn-validation`.
154
+
This flag can only be set if the `KubeletCertCNValidation` feature flag is enabled and if `--kubelet-certificate-authority` is set.
155
+
156
+
Making the feature opt-in maintains compatibility with existing clusters using custom kubelet serving certificates that don't follow the `system:node:<nodename>` convention even after the feature gate is removed.
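
The flag constraints above can be illustrated with a minimal Go sketch. This is not the kube-apiserver implementation: the `Options` struct, its field names, and the error messages are hypothetical and exist only to show the intended rule that `--enable-kubelet-cert-cn-validation` is rejected unless the `KubeletCertCNValidation` gate is enabled and `--kubelet-certificate-authority` is set.

```go
package main

import (
	"errors"
	"fmt"
)

// Options is a hypothetical subset of kube-apiserver options.
// The field names are illustrative and not taken from the Kubernetes code base.
type Options struct {
	FeatureGateEnabled bool   // state of the KubeletCertCNValidation feature gate
	EnableCNValidation bool   // value of --enable-kubelet-cert-cn-validation
	KubeletCAFile      string // value of --kubelet-certificate-authority
}

// Validate rejects flag combinations that the proposal disallows.
func (o Options) Validate() error {
	if !o.EnableCNValidation {
		// Opt-in flag not set: nothing to check, existing behavior is kept.
		return nil
	}
	if !o.FeatureGateEnabled {
		return errors.New("--enable-kubelet-cert-cn-validation requires the KubeletCertCNValidation feature gate")
	}
	if o.KubeletCAFile == "" {
		return errors.New("--enable-kubelet-cert-cn-validation requires --kubelet-certificate-authority")
	}
	return nil
}

func main() {
	// The flag is set without the feature gate, so an error is reported.
	fmt.Println(Options{EnableCNValidation: true}.Validate())
}
```

Checking the combination at startup, as sketched here, would surface a misconfiguration when the kube-apiserver starts rather than on the first connection to a kubelet.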
153
157
154
158
#### Metrics
155
159
156
160
In order to help cluster administrators determine if it's safe to enable the feature, we propose to add a new metric `kube_apiserver_validation_kubelet_cert_cn_errors` that will track the number of errors due to the new CN validation.
157
161
In addition, we will log the error including the node name, so cluster administrators can identify which nodes are affected and need to reissue their certificates.
158
162
159
-
If the feature gate is disabled, we won't publish the metric or run any validation code at all.
163
+
If the feature gate is disabled or if `--kubelet-certificate-authority` is not set, we won't publish the metric or run any validation code at all.
160
164
161
-
If the feature gate is enabled but the feature is disabled (with `--disable-kubelet-cert-cn-validation`), we will still add the validation code to the HTTP transport, however, if the validation fails we won't return an error, we will just increment the metric counter.
165
+
If the feature gate is enabled and the kubelet CA is set (`--kubelet-certificate-authority`) but the validation is not enabled (`--enable-kubelet-cert-cn-validation` is unset or false), we will still run the validation code to collect the metric. However, if the validation fails, we won't return an error; we will only increment the metric counter.
162
166
163
167
We intentionally don't add the node name to the metric to avoid a high cardinality.
164
168
The purpose of the metric is to easily/cheaply tell administrators if they can flip the feature on or not. If the answer is no (counter is greater than 0), the rest of the necessary information to detect the offending nodes will come from logs.
165
169
166
-
167
-
We will remove the metric once the feature is GA.
168
-
169
-
> TODO: let's discuss this in the review. We could consider adding the node name to the metric or even keeping the metric post GA if it's valuable.
170
-
171
170
### TLS insecure
172
171
173
172
Currently, if the kube-apiserver is not configured with `--kubelet-certificate-authority`, its TLS client for the kubelet server skips server certificate validation.
174
173
Additionally, `logs` requests can set `InsecureSkipTLSVerifyBackend` per request to skip server certificate validation.
175
174
176
-
To align with this behavior, we won't execute the CN validation if `--kubelet-certificate-authority` is not set or if `InsecureSkipTLSVerifyBackend` is set to true.
175
+
To align with this behavior, we won't allow the validation to be enabled if `--kubelet-certificate-authority` is not set, and we won't execute the CN validation if `InsecureSkipTLSVerifyBackend` is set to true.
177
176
178
177
### Test Plan
179
178
@@ -195,11 +194,12 @@ Existing test coverage for the packages we anticipate modifying:
195
194
##### Integration tests
196
195
197
196
Integration tests will be added to ensure the following:
198
-
* An error is returned if `--disable-kubelet-cert-cn-validation` is set but `KubeletCertCNValidation` feature flag is not enabled.
197
+
* An error is returned if `--enable-kubelet-cert-cn-validation` is set but the `KubeletCertCNValidation` feature flag is not enabled.
198
+
* An error is returned if the feature `KubeletCertCNValidation` is enabled, `--enable-kubelet-cert-cn-validation` is set to true but `--kubelet-certificate-authority` is not set.
199
199
* Validation for custom certificates works if feature flag is not enabled.
200
-
* Validation for custom certificates works if feature flag enabled and `--disable-kubelet-cert-cn-validation` is set to true.
201
-
* Validation for custom certificates fails if feature flag enabled and `--disable-kubelet-cert-cn-validation` is set to false or not set.
202
-
* Validation for kubernetes issued certificates works if feature flag enabled and `--disable-kubelet-cert-cn-validation` is set to false or not set.
200
+
* Validation for custom certificates works if the feature flag is enabled and `--enable-kubelet-cert-cn-validation` is not set or set to false.
201
+
* Validation for custom certificates fails if the feature flag is enabled, `--kubelet-certificate-authority` is set and `--enable-kubelet-cert-cn-validation` is set to true.
202
+
* Validation for Kubernetes-issued certificates works if the feature flag is enabled, `--kubelet-certificate-authority` is set and `--enable-kubelet-cert-cn-validation` is set to true.
203
203
204
204
##### e2e tests
205
205
@@ -222,9 +222,7 @@ We believe is likely end-to-end tests won't be needed as unit and integration te
222
222
223
223
### Upgrade / Downgrade Strategy
224
224
225
-
Once feature flag is on by default (starting in Beta), administrators using custom serving certs
226
-
can use the proposed flag to disable the extra validation and maintain current behavior.
227
-
They will be able to use this flag even after the feature flag is removed.
225
+
The feature is opt-in and it can be disabled at any time by just not setting the `--enable-kubelet-cert-cn-validation` flag.
228
226
229
227
### Version Skew Strategy
230
228
@@ -240,16 +238,17 @@ Not applicable.
240
238
- Feature gate name: `KubeletCertCNValidation`
241
239
- Components depending on the feature gate: kube-apiserver
242
240
- [x] Other
243
-
- Describe the mechanism: kube-apiserver command-line flag `--disable-kubelet-cert-cn-validation`
241
+
- Describe the mechanism: kube-apiserver command-line flag `--enable-kubelet-cert-cn-validation`
244
242
- Will enabling / disabling the feature require downtime of the control
245
243
plane? No. But requires restarting the kube-apiserver.
246
244
- Will enabling / disabling the feature require downtime or reprovisioning
247
245
of a node? No.
248
246
249
247
###### Does enabling the feature change any default behavior?
250
248
251
-
Yes. If a cluster is using custom kubelet serving certificates that don't follow the same convention as kubernetes issued certificates (CN is `system:node:<node-name>`),
252
-
enabling this feature will make any connection initiated by the kube-api server fail (logs, exec and port-forwarding).
249
+
Enabling the feature gate doesn't change any behavior.
250
+
251
+
Enabling the validation does change the default certificate validation behavior.
253
252
254
253
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
255
254
@@ -289,7 +288,7 @@ No.
289
288
###### How can an operator determine if the feature is in use by workloads?
290
289
291
290
The cluster administrators can check the flags passed to the kube-apiserver if they have access to the control plane nodes.
292
-
If the `--disable-kubelet-cert-cn-validation` flag is not set or set to false, the feature is being used.
291
+
If the `--enable-kubelet-cert-cn-validation` flag is set to true, the feature is being used.
293
292
Alternatively, they can check the `kubernetes_feature_enabled` metric.
294
293
295
294
###### How can someone using this feature know that it is working for their instance?
@@ -367,7 +366,7 @@ It's part of the API server, so the feature will be unavailable.
367
366
368
367
- [API server can't connect to Nodes with custom kubelet serving certificates that don't follow the `system:node:<node-name>` convention]
369
368
- Detection: `kubectl logs` returns a certificate validation error.
370
-
- Mitigations: disable the validation with the `--disable-kubelet-cert-cn-validation` flag.
369
+
- Mitigations: disable the validation by not setting the `--enable-kubelet-cert-cn-validation` flag.
371
370
- Diagnostics: error is returned by the API server, no additional logging needed.
372
371
- Testing: We will have tests for this; it is basically testing that the feature works.
373
372
@@ -377,7 +376,7 @@ It's part of the API server, so the feature will be unavailable.
377
376
378
377
## Drawbacks
379
378
380
-
This could disrupt clusters that are using custom kubelet serving certificates. These clusters will need to reissue their certificates before enabling this feature.