keps/sig-node/2535-ensure-secret-pulled-images/README.md
+12 -13 (12 additions & 13 deletions)
@@ -696,8 +696,12 @@ Yes.

 ###### What happens if we reenable the feature if it was previously rolled back?

-Images pulled during the period the feature was disabled will not be present in the cache, and thus could incur redundant pulls/container creation failures.
-However, the cache may still be present, and thus it will retain information from when it was previously enabled.
+Images pulled during the period the feature was disabled will not be present in the cache and will be treated
+as preloaded. Depending on the policy configured, this could either incur redundant pulls, and thus container
+creation failures if the image registry is unavailable, or it might make these images available
+to all pods on the node that pulled them.
+
+Only images that don't expect any credential verification should be kept on the node before the feature is reenabled.

 ###### Are there any tests for feature enablement/disablement?

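The re-enablement behavior described in the hunk above can be illustrated with a small model. This is only a sketch under stated assumptions, not the kubelet's implementation: `PullRecordCache`, `needs_registry_check`, the `trust_preloaded` switch, and the image and secret names are all hypothetical stand-ins for the real cache and the configured verification policy. It shows why an image pulled while the feature was disabled (no cache record) is either re-verified against the registry or made available to every pod, depending on policy.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class PullRecordCache:
    """Hypothetical model: image name -> credentials that successfully pulled it."""
    records: Dict[str, Set[str]] = field(default_factory=dict)

    def record_pull(self, image: str, credential_id: str) -> None:
        self.records.setdefault(image, set()).add(credential_id)


def needs_registry_check(cache: PullRecordCache, image: str,
                         credential_id: str, trust_preloaded: bool) -> bool:
    """Return True if the image must be re-verified against the registry."""
    known = cache.records.get(image)
    if known is None:
        # No record at all: the image looks preloaded. Whether it is trusted
        # depends on the (assumed) policy switch.
        return not trust_preloaded
    # A record exists: only credentials that pulled the image before are trusted.
    return credential_id not in known


cache = PullRecordCache()
# Pulled while the feature was enabled, so a record exists.
cache.record_pull("registry.example/team-a/app:1", "team-a-secret")
# Pulled while the feature was disabled: present on the node, absent from the cache.
rolled_back_image = "registry.example/team-b/app:1"

for trust in (True, False):
    check = needs_registry_check(cache, rolled_back_image, "any-secret", trust)
    print(f"trust_preloaded={trust}: registry check needed = {check}")
```

With `trust_preloaded=True` the unrecorded image is usable by any pod without a check; with `trust_preloaded=False` every use triggers a registry check, which fails pod creation if the registry is unavailable.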
@@ -715,17 +719,17 @@ the behavior of the pull policies will revert to the previous behavior.

 ###### What specific metrics should inform a rollback?

-If the feature gate is enabled, but the kubelet configuration field is not enabled, the kubelet will gather metrics `image_pull_secret_recheck_miss` and
-`image_pull_secret_recheck_hit` which will be both be a histogram counting the number of images that had a cache miss (despite the image potentially being present).
+If the feature gate is enabled, the kubelet will gather metrics `image_pull_secret_recheck_miss` and
+`image_pull_secret_recheck_hit` which are both histograms counting the number of images that had a cache miss/hit.

-This will allow an admin to see how many images would have reauthorization checks done.
+This will allow an admin to see how many images have authorization checks done.

 A histogram was chosen to allow an admin to compare registry uptime with cache misses, as the main failure scenario is that registry unavailability
 could cause pods not to come up, because the kubelet doesn't have credentials cached.

 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

-They can be. The presence of a feature gate and kubelet configuration will make this path safe. Plus, there are no API objects that cause issue
+No. The feature does not exist at the time of writing.

 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

@@ -769,31 +773,27 @@ TBD needed for Beta

 ### Dependencies

-TBD
-
 ###### Does this feature depend on any specific services running in the cluster?

 No.

 ### Scalability

-TBD
-
 ###### Will enabling / using this feature result in any new API calls?

 No.

 ###### Will enabling / using this feature result in introducing new API types?

-No.
+No REST API types, only kubelet configuration API is being extended.

 ###### Will enabling / using this feature result in any new calls to the cloud provider?

 No.

 ###### Will enabling / using this feature result in increasing size or count of the existing API objects?

-No existing API objects will be unchanged.
+No, existing API objects will be unchanged.

 ###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

@@ -834,7 +834,6 @@ Why should this KEP _not_ be implemented. TBD

 ## Alternatives [optional]

-- Make the behavior change enabled by default by changing the feature gate to true by default instead of false by default.
 - Discussions went back and forth on whether this should go directly to GA as a fix or alpha as a feature gate. It seems this should be the default security posture for pullIfNotPresent as it is not clear to admins/users that an image pulled by a first pod with authentication can be used by a second pod without authentication. The performance cost should be minimal as only the manifest needs to be re-authenticated. But after further review and discussion with MrunalP we'll go ahead and have a kubelet feature gate with default off for alpha in v1.23.
 - Set the flag at some other scope e.g. pod spec (doing it at the pod spec was rejected by SIG-Node).
 - For beta/ga we may revisit/replace the in memory hash map in kubelet design, with an extension to the CRI API for having the container runtime