
Commit 509a2df

Merge pull request kubernetes#4928 from munnerz/4193-kep-updates-132
KEP-4193: update KEP for v1.32 release
2 parents bdc8b06 + feaef27 commit 509a2df

keps/sig-auth/4193-bound-service-account-token-improvements/README.md

Lines changed: 38 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -49,15 +49,15 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
4949
- [x] (R) KEP approvers have approved the KEP status as `implementable`
5050
- [x] (R) Design details are appropriately documented
5151
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
52-
- [ ] e2e Tests for all Beta API Operations (endpoints)
53-
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
52+
- [x] e2e Tests for all Beta API Operations (endpoints)
53+
- [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
5454
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
5555
- [x] (R) Graduation criteria is in place
5656
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
5757
- [x] (R) Production readiness review completed
5858
- [x] (R) Production readiness review approved
5959
- [x] "Implementation History" section is up-to-date for milestone
60-
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
60+
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
6161
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
6262

6363
[kubernetes.io]: https://kubernetes.io/
@@ -91,6 +91,9 @@ TokenRequest API, in a similar manner to how the ServiceAccount, Pod or Secret i
9191
Additionally, to provide a robust means of tracking token usage within the audit log we can embed a unique identifier for
9292
each token, which can then also be recorded in future audit entries made by this token.
9393

94+
As we are adding support for `node` metadata associated with Pods, we will also add the ability to bind a token/JWT
95+
to a Node object directly, similar to how a token can be bound to a Pod or Secret resource today.
96+
9497
## Motivation
9598

9699
### Goals
@@ -182,9 +185,9 @@ being issued.
182185

183186
* Adding additional cross-referencing validation checks into the TokenReview API may break some user workflows that
184187
involve deleting Node objects and restarting kubelets to allow them to be recreated. As a result, the TokenReview
185-
behaviour changes will be gated behind an additional flag in kube-apiserver, which defaults to 'off'.
186-
This may be revisited in future once we have a better understanding of user expectations around Node objects and
187-
associated JWTs.
188+
API will **NOT** be modified to permit tightening this validation behaviour. Instead, the existing protections &
189+
mechanisms for invalidating a Node<>Pod binding (i.e. auto-deletion after a fixed time period after the Node object
190+
is deleted) will continue to be relied upon.
188191

189192
## Design Details
190193

@@ -362,9 +365,24 @@ enhancement:
362365

363366
### Version Skew Strategy
364367

365-
This feature does not require any coordination between clients and the apiserver, as no components require this
366-
information to be embedded. This is purely additive, and the only rollback concerns would be around third party
367-
software that consumes this information.
368+
Embedding a Pod's assigned Node name into a JWT does not require any coordination between clients and the apiserver,
369+
as no components require this information to be embedded. This is purely additive, and the only rollback concerns
370+
would be around third party software that consumes this information. This software should always verify whether a
371+
`node` claim is embedded into tokens if they require using it, and provide a fall-back behaviour (i.e. a GET to the
372+
apiserver to fetch the Pod & Node object) if they need to maintain compatibility with older apiservers.
373+
374+
Binding a token to a Node introduces a new validation mechanism, and therefore we must allow one release cycle after
375+
introducing the ability to **validate** tokens, before we can begin permitting **issuance** of these tokens.
376+
This is a critical step from a security standpoint, as otherwise an administrator could:
377+
378+
1) upgrade their apiserver/control plane.
379+
2) a user could request a token bound to a Node, expecting it to be invalidated when the Node is deleted.
380+
3) rollback the apiserver to an older version.
381+
4) the Node object is deleted.
382+
5) the token issued in (2) would now continue to be accepted/validated, despite the Node object no longer existing.
383+
384+
By graduating validation a release **earlier** than issuance, we can ensure any tokens that are bound to a Node
385+
object will be correctly validated even after a rollback.
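
The fall-back behaviour recommended above for third-party consumers of the `node` claim can be sketched as follows. This is a minimal illustration, not the KEP's implementation: it assumes the claim sits under the `kubernetes.io` private claim as `node: {name, uid}`, and `nodeLookup`, `nodeNameFor` and `unsignedToken` are hypothetical names introduced here. The signature is deliberately not verified, so a real consumer would need to validate the token first (for example via the TokenReview API).

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// objectRef is the assumed shape of the `node` and `pod` claims.
type objectRef struct {
	Name string `json:"name"`
	UID  string `json:"uid"`
}

// tokenClaims models only the private claims this sketch needs.
type tokenClaims struct {
	Kubernetes struct {
		Namespace string     `json:"namespace"`
		Node      *objectRef `json:"node,omitempty"`
		Pod       *objectRef `json:"pod,omitempty"`
	} `json:"kubernetes.io"`
}

// nodeLookup is a hypothetical fallback, e.g. a GET of the Pod from the apiserver.
type nodeLookup func(namespace, podName string) (string, error)

// decodeClaims parses the JWT payload WITHOUT verifying the signature.
func decodeClaims(token string) (*tokenClaims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("token is not a three-part JWT")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, fmt.Errorf("decoding payload: %w", err)
	}
	var c tokenClaims
	if err := json.Unmarshal(payload, &c); err != nil {
		return nil, fmt.Errorf("parsing claims: %w", err)
	}
	return &c, nil
}

// nodeNameFor prefers the embedded node claim and only falls back to a
// lookup when the token was issued without it (e.g. by an older apiserver).
func nodeNameFor(token string, fallback nodeLookup) (string, error) {
	c, err := decodeClaims(token)
	if err != nil {
		return "", err
	}
	if n := c.Kubernetes.Node; n != nil && n.Name != "" {
		return n.Name, nil
	}
	if c.Kubernetes.Pod == nil {
		return "", errors.New("token carries neither a node nor a pod claim")
	}
	return fallback(c.Kubernetes.Namespace, c.Kubernetes.Pod.Name)
}

// unsignedToken builds a demo token with an empty header and a dummy signature.
func unsignedToken(payload string) string {
	return "e30." + base64.RawURLEncoding.EncodeToString([]byte(payload)) + ".sig"
}

func main() {
	// Token as a newer apiserver might issue it: the node claim is present.
	withNode := unsignedToken(`{"kubernetes.io":{"namespace":"default","node":{"name":"node-a","uid":"n-1"},"pod":{"name":"web-0","uid":"p-1"}}}`)
	// Token as an older apiserver might issue it: only the pod claim is present.
	withoutNode := unsignedToken(`{"kubernetes.io":{"namespace":"default","pod":{"name":"web-0","uid":"p-1"}}}`)

	lookup := func(namespace, podName string) (string, error) {
		return "node-from-api", nil // stand-in for a real apiserver request
	}
	for _, tok := range []string{withNode, withoutNode} {
		name, err := nodeNameFor(tok, lookup)
		fmt.Println(name, err)
	}
}
```

Keeping the lookup behind a function type means the consumer only pays for the extra apiserver round trip when the claim is actually missing.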
368386

369387
## Production Readiness Review Questionnaire
370388

@@ -403,6 +421,7 @@ to ensure a safe rollback from version v1.31 to v1.30 (more info below in rollba
403421
The `ServiceAccountTokenNodeBinding` feature gate must only be enabled once the `ServiceAccountTokenNodeBindingValidation` feature has been enabled.
404422
Disabling the `ServiceAccountTokenNodeBindingValidation` feature whilst keeping `ServiceAccountTokenNodeBinding` would allow tokens that are expected to
405423
be bound to the lifetime of a particular Node to validate even if that Node no longer exists.
424+
The [rollout & rollback section](#rollout-upgrade-and-rollback-planning) below goes into further detail.
406425

407426
All other feature flags can be disabled without any unexpected adverse effects or coordination required.
408427

@@ -426,18 +445,23 @@ All other feature flags can be disabled without any unexpected adverse affects o
426445

427446
###### Does enabling the feature change any default behavior?
428447

429-
Enabling the feature gate will cause additional information to be stored/persisted into service account JWTs, as well
430-
as new audit annotations being recorded in the audit log. This is all purely additive, so no changes to existing
431-
features, schemas or fields are expected.
448+
Enabling the `ServiceAccountTokenPodNodeInfo` and/or `ServiceAccountTokenJTI` feature gate will cause additional information
449+
to be stored/persisted into service account JWTs, as well as new audit annotations being recorded in the audit log.
450+
This is all purely additive, so no changes to existing features, schemas or fields are expected.
451+
452+
Enabling the `ServiceAccountTokenNodeBinding` feature gate will permit binding tokens to Node objects, which is a change in
453+
behaviour (albeit not to an existing feature, so is not problematic).
432454

433455
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
434456

435457
Yes. Future tokens will then not embed this information. Any existing issued tokens **will** still have this
436-
information embedded however.
458+
information embedded, however.
437459

438460
If these fields are deemed to be problematic for other systems interpreting these tokens, users will need to re-issue
439461
these tokens before presenting them elsewhere.
440462

463+
Once the feature(s) have graduated to GA, it will not be possible to disable this behaviour.
464+
441465
###### What happens if we reenable the feature if it was previously rolled back?
442466

443467
Future tokens will once again include this information/no adverse effects.
@@ -510,7 +534,7 @@ as part of the UserInfo in the audit event.
510534
As none of these fields are actually used to validate a token, enabling & disabling the feature
511535
does not cause any adverse side effects.
512536

513-
**For `ServiceAccountTokenNodeBinding` (alpha v1.29, beta v1.31) and `ServiceAccountTokenNodeBindingValidation` (alpha v1.29, beta v1.30, GA v1.32) feature:**
537+
**For `ServiceAccountTokenNodeBinding` (alpha v1.29, beta v1.31, GA v1.33) and `ServiceAccountTokenNodeBindingValidation` (alpha v1.29, beta v1.30, GA v1.32) feature:**
514538

515539
*Without* the feature gate enabled, service account tokens that have been bound to Node objects will not have their
516540
node reference claims validated (to ensure the referenced node exists).
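
The relationship between the two gates can be illustrated with a toy model. This is a sketch under stated assumptions rather than kube-apiserver code: `apiserver`, `boundToken`, `accepts` and `checkGates` are hypothetical names, and comparing the Node UID in addition to checking that the Node exists is an assumption of this model.

```go
package main

import (
	"errors"
	"fmt"
)

// boundToken is a toy stand-in for a service account token bound to a Node.
type boundToken struct {
	NodeName string
	NodeUID  string
}

// apiserver is a toy model; validateNodeBinding plays the role of the
// ServiceAccountTokenNodeBindingValidation feature gate.
type apiserver struct {
	validateNodeBinding bool
	nodes               map[string]string // node name -> UID
}

// accepts reports whether the modelled apiserver would accept the token.
func (a apiserver) accepts(t boundToken) bool {
	if !a.validateNodeBinding || t.NodeName == "" {
		return true // node reference claims are not validated
	}
	uid, ok := a.nodes[t.NodeName]
	return ok && uid == t.NodeUID // UID comparison is an assumption of this sketch
}

// checkGates rejects the unsafe combination: issuance without validation.
func checkGates(binding, validation bool) error {
	if binding && !validation {
		return errors.New("ServiceAccountTokenNodeBinding requires ServiceAccountTokenNodeBindingValidation")
	}
	return nil
}

func main() {
	tok := boundToken{NodeName: "node-a", NodeUID: "uid-1"}

	upgraded := apiserver{validateNodeBinding: true, nodes: map[string]string{"node-a": "uid-1"}}
	fmt.Println("node exists, validation on:", upgraded.accepts(tok))

	delete(upgraded.nodes, "node-a") // the Node object is deleted
	fmt.Println("node deleted, validation on:", upgraded.accepts(tok))

	// After a rollback to a server without validation, the same token passes.
	rolledBack := apiserver{validateNodeBinding: false, nodes: upgraded.nodes}
	fmt.Println("node deleted, validation off:", rolledBack.accepts(tok))

	fmt.Println("gate check:", checkGates(true, false))
}
```

The last case is the hazard the staggered graduation is meant to avoid: a token bound to a deleted Node is still accepted once validation is switched off.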
