As part of alpha implementation, the [e2e test has been updated](https://github.com/kubernetes/kubernetes/commit/2090a01e0a495301432276216bbf9af102fc431c) to cover the new credential provider configuration and the new behavior of the kubelet when the `TokenAttributes` field is set.
708
+
709
+
We created a symlink, under a different name, to the existing GCP credential provider executable and use it to test service account tokens for the credential provider. The credential provider has been updated to validate the following when the plugin runs in service account token mode:
710
+
711
+
1. Check the required annotations are sent as part of the `CredentialProviderRequest.ServiceAccountAnnotations` field.
712
+
2. Check the service account token is sent as part of the `CredentialProviderRequest.ServiceAccountToken` field.
713
+
3. Extract the claims from the service account token and validate the audience claim matches the `ServiceAccountTokenAudience` field in the kubelet's credential provider configuration.
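The third check above can be sketched as follows. This is a minimal illustration, not the code used by the e2e test plugin: the function names `audiences`, `hasAudience`, and `unsignedToken` are hypothetical, and the sketch only decodes the JWT payload without verifying the signature, which is acceptable for a test fixture but not for production use.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// audiences decodes the payload segment of a JWT and returns its "aud" claim.
// Assumption: the signature is NOT verified here; this is test-only logic.
func audiences(token string) ([]string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("token is not a three-part JWT")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, fmt.Errorf("decoding payload: %w", err)
	}
	var claims struct {
		Aud json.RawMessage `json:"aud"`
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return nil, fmt.Errorf("parsing claims: %w", err)
	}
	// Per RFC 7519, "aud" may be an array of strings or a single string.
	var list []string
	if err := json.Unmarshal(claims.Aud, &list); err == nil {
		return list, nil
	}
	var single string
	if err := json.Unmarshal(claims.Aud, &single); err != nil {
		return nil, errors.New("aud claim is missing or malformed")
	}
	return []string{single}, nil
}

// hasAudience reports whether the token's "aud" claim contains want, where
// want stands in for the configured ServiceAccountTokenAudience value.
func hasAudience(token, want string) (bool, error) {
	auds, err := audiences(token)
	if err != nil {
		return false, err
	}
	for _, a := range auds {
		if a == want {
			return true, nil
		}
	}
	return false, nil
}

// unsignedToken builds a JWT-shaped string with a placeholder signature.
// It exists only to exercise the functions above in examples and tests.
func unsignedToken(claimsJSON string) string {
	enc := base64.RawURLEncoding
	return enc.EncodeToString([]byte(`{}`)) + "." + enc.EncodeToString([]byte(claimsJSON)) + ".sig"
}
```

For example, `hasAudience(unsignedToken(`{"aud":["registry.example.com"]}`), "registry.example.com")` reports a match, while a token minted for any other audience does not.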
714
+
702
715
### Graduation Criteria
703
716
704
717
<!--
@@ -773,15 +786,13 @@ in back-to-back releases.
773
786
- `ServiceAccountNodeAudienceRestriction` feature gate implemented in KAS as a beta feature
774
787
- Audience validation is enabled by default for service account tokens requested by the kubelet
775
788
776
-
#### Post Alpha
777
-
778
-
- Make sure the feature is compatible with the [Ensure secret pull images KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2535-ensure-secret-pulled-images).
779
-
780
789
#### Beta
781
790
782
-
- The implementation works well with the Ensure secret pull images KEP and supports pod image pull policy set to any value.
791
+
- Make the feature compatible with the [Ensure secret pull images KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2535-ensure-secret-pulled-images).
783
792
- `ServiceAccountNodeAudienceRestriction` feature gate is beta in KAS and enabled by default. This feature needs to be beta/enabled by default at least one release before this KEP goes to beta. This is critical to support downgrade use cases.
784
-
- Add metrics
793
+
- Cache KSA tokens per pod-sa so that tokens are not regenerated in a hot loop or once per container when a pod has multiple containers with images.
794
+
- Some indication of whether the credentials are SA or SA+pod-scoped
795
+
- whether that's indicated in the config or in the plugin-returned content, and what the default is if unspecified (defaulting to pod is less performant, defaulting to SA risks incorrect cross-pod caching)
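One way to picture the caching criterion and the SA versus SA+pod scoping question is the sketch below. It is an assumption-laden illustration, not the kubelet's implementation: `tokenKey`, `tokenCache`, and `getOrFetch` are hypothetical names, the fetch callback stands in for a TokenRequest call and returns the token together with its lifetime in seconds, and the lock is held during the fetch for simplicity.

```go
package main

import (
	"sync"
	"time"
)

// tokenKey identifies a cached token. Leaving PodUID empty models SA-scoped
// credentials; setting it models SA+pod-scoped credentials.
// All names here are hypothetical.
type tokenKey struct {
	Namespace      string
	ServiceAccount string
	PodUID         string
	Audience       string
}

type cachedToken struct {
	token     string
	expiresAt time.Time
}

// tokenCache stores tokens until they expire.
type tokenCache struct {
	mu      sync.Mutex
	entries map[tokenKey]cachedToken
}

func newTokenCache() *tokenCache {
	return &tokenCache{entries: make(map[tokenKey]cachedToken)}
}

// getOrFetch returns a cached, unexpired token for key, or calls fetch and
// caches the result. fetch returns the token and its lifetime in seconds.
func (c *tokenCache) getOrFetch(key tokenKey, fetch func() (string, int64, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if e, ok := c.entries[key]; ok && time.Now().Before(e.expiresAt) {
		return e.token, nil
	}
	token, expirationSeconds, err := fetch()
	if err != nil {
		return "", err
	}
	c.entries[key] = cachedToken{
		token:     token,
		expiresAt: time.Now().Add(time.Duration(expirationSeconds) * time.Second),
	}
	return token, nil
}
```

With a key that includes the pod UID, two containers in the same pod share one token request, while a different pod using the same service account triggers its own request; dropping the pod UID from the key trades that isolation for fewer requests.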
785
796
786
797
#### GA
787
798
@@ -886,6 +897,14 @@ FeatureSpec{
886
897
}
887
898
```
888
899
900
+
```go
901
+
FeatureSpec{
902
+
Default: true,
903
+
LockToDefault: false,
904
+
PreRelease: featuregate.Beta,
905
+
}
906
+
```
907
+
889
908
- [x] Feature gate (also fill in values in `kep.yaml`)
- Components depending on the feature gate: kube-apiserver
@@ -933,7 +952,7 @@ Steps to disable the feature:
933
952
3. Restart the kubelet.
934
953
935
954
These steps need to be performed on all nodes in the cluster.
936
-
After restarting the kubelet on all nodes, remove the audiences used by kubelet from the KAS `--allowed-kubelet-audiences` flag.
955
+
After restarting the kubelet on all nodes, revoke the audiences for which the kubelet may request service account tokens for image pulls from KAS by deleting the previously created `ClusterRole` or `Role` that grants the `request-serviceaccounts-token-audience` verb.
937
956
938
957
###### What happens if we reenable the feature if it was previously rolled back?
939
958
@@ -974,13 +993,18 @@ rollout. Similarly, consider large clusters and how enablement/disablement
974
993
will rollout across nodes.
975
994
-->
976
995
996
+
The feature is enabled but the exec plugin does not properly fetch and return credentials to the kubelet.
997
+
The impact is that the kubelet cannot authenticate to those registries and pull images from them.
998
+
977
999
###### What specific metrics should inform a rollback?
978
1000
979
1001
<!--
980
1002
What signals should users be paying attention to when the feature is young
981
1003
that might indicate a serious problem?
982
1004
-->
983
1005
1006
+
High error rates from `kubelet_credential_provider_plugin_error` and long durations from `kubelet_credential_provider_plugin_duration`.
1007
+
984
1008
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
985
1009
986
1010
<!--
@@ -989,12 +1013,16 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
989
1013
are missing a bunch of machinery and tooling and can't do that now.
990
1014
-->
991
1015
1016
+
No, the upgrade->downgrade->upgrade path was not tested. Manual validation will be done prior to promoting this feature to beta in v1.34.
1017
+
992
1018
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
993
1019
994
1020
<!--
995
1021
Even if applying deprecation policies, they may still surprise some users.
996
1022
-->
997
1023
1024
+
No.
1025
+
998
1026
### Monitoring Requirements
999
1027
1000
1028
<!--
@@ -1012,6 +1040,9 @@ checking if there are objects with field X set) may be a last resort. Avoid
1012
1040
logs or events for this purpose.
1013
1041
-->
1014
1042
1043
+
Operators can check whether a config file is passed to the kubelet via the `--image-credential-provider-config` flag.
1044
+
The config has a field called `matchImages` which indicates the images a plugin will be invoked for.
1045
+
1015
1046
###### How can someone using this feature know that it is working for their instance?
1016
1047
1017
1048
<!--
@@ -1023,13 +1054,10 @@ and operation of this feature.
1023
1054
Recall that end users cannot usually observe component logs or access metrics.
1024
1055
-->
1025
1056
1026
-
- [ ] Events
1027
-
- Event Reason:
1028
-
- [ ] API .status
1029
-
- Condition name:
1030
-
- Other field:
1031
-
- [ ] Other (treat as last resort)
1032
-
- Details:
1057
+
Users can observe events for successful pulls of images that rely on the service account token for credentials.
1058
+
1059
+
- [x] Events
1060
+
- Event Reason: `Successfully pulled image "xxx" in 11.877s (11.877s including waiting). Image size: xxx bytes.`
1033
1061
1034
1062
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
1035
1063
@@ -1048,6 +1076,11 @@ These goals will help you determine what you need to measure (SLIs) in the next
1048
1076
question.
1049
1077
-->
1050
1078
1079
+
On failure to fetch credentials from an exec plugin, the kubelet will retry after some period and invoke the plugin again.
1080
+
The kubelet will retry whenever it attempts to pull an image, but until then, kubelet will not be able to authenticate to
1081
+
the registry and pull images. The SLO for successfully invoking exec plugins should be based on the SLO for successfully
1082
+
pulling images for the container registry in question.
1083
+
1051
1084
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
1052
1085
1053
1086
<!--
@@ -1093,6 +1126,8 @@ and creating new ones, as well as about cluster-level services (e.g. DNS):
1093
1126
- Impact of its degraded performance or high-error rates on the feature:
1094
1127
-->
1095
1128
1129
+
This feature depends on the existence of a credential provider plugin binary on the host and a configuration file for the plugin to be read by the kubelet.
1130
+
1096
1131
### Scalability
1097
1132
1098
1133
<!--
@@ -1222,6 +1257,8 @@ details). For now, we leave it here.
1222
1257
1223
1258
###### How does this feature react if the API server and/or etcd is unavailable?
1224
1259
1260
+
If the API server is unavailable, the kubelet will not be able to fetch service account tokens for image pulls. The kubelet will retry fetching the token after some period, but until then it will not be able to authenticate to the registry and pull images that rely on a credential provider plugin using service account tokens.
1261
+
1225
1262
###### What are other known failure modes?
1226
1263
1227
1264
<!--
@@ -1239,6 +1276,9 @@ For each of them, fill in the following information by copying the below templat
1239
1276
1240
1277
###### What steps should be taken if SLOs are not being met to determine the problem?
1241
1278
1279
+
- Check the kubelet logs
1280
+
- Check the availability of the container registries used by the cluster
1281
+
1242
1282
## Implementation History
1243
1283
1244
1284
<!--
@@ -1252,6 +1292,9 @@ Major milestones might include: