KEP-1965: update kube-apiserver identity KEP to reflect current state
- Lease garbage collection actually runs in kube-apiserver in a post
start hook, not kube-controller-manager
- Lease namespace is `kube-system` and not `kube-apiserver-lease`
- Lease ID is kube-apiserver-<uuid>
Signed-off-by: Andrew Sy Kim <[email protected]>
- will be re-used. The heartbeat controller will be added to kube-apiserver in a post-start hook.
-
- Each kube-apiserver will run a lease controller in a post-start-hook to refresh its Lease every 10s by default. A separate controller named [storageversiongc](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/storageversiongc/gc_controller.go) running in kube-controller-manager will watch the Lease API using an informer, and periodically resync its local cache. On processing an item, the `storageversiongc` controller will delete the Lease if the last `renewTime` was more than `leaseDurationSeconds` ago (default to 1h). The default `leaseDurationSeconds` is chosen to be way longer than the default refresh period, to tolerate clock skew and/or accidental refresh failure. The default resync period is 1h. By default, assuming negligible clock skew, a Lease will be deleted if the kube-apiserver fails to refresh its Lease for one to two hours. The `storageversiongc` controller will run in kube-controller-manager, to leverage leader election and reduce conflicts.
-
- The refresh rate, lease duration will be configurable through kube-apiserver flags. The resync period will be configurable through a kube-controller-manager flag.
+ will be re-used. The lease creation and heartbeat will be managed by the `start-kube-apiserver-identity-lease-controller` post-start-hook, and expired leases will be garbage collected by the `start-kube-apiserver-identity-lease-garbage-collector` post-start-hook in kube-apiserver. The refresh rate and lease duration will be configurable through kube-apiserver flags.
+
+ The format of the lease ID will be `kube-apiserver-<UUID>`. The UUID is newly generated on every start-up. This ID format is preferred for the following reasons:
+ * No two kube-apiservers on the same host can share the same lease identity.
+ * Revealing the hostname of kube-apiserver may not be desirable for some Kubernetes platforms.
+ * The kube-apiserver version may change between restarts, which can trigger a storage version migration (see the KEP on StorageVersionAPI).
+
+ In some cases it can be desirable to use a predictable ID format (e.g. `kube-apiserver-<hostname>`). We may consider providing a flag in `kube-apiserver` to override the lease identity.
+
+ All kube-apiserver leases will also have a component label `k8s.io/component=kube-apiserver`.
### Test Plan

@@ -208,8 +206,8 @@ Alpha should provide basic functionality covered with tests described above.

###### Does enabling the feature change any default behavior?

- A namespace `kube-apiserver-lease` will be created to store kube-apiserver identity Leases.
- Old leases will be actively garbage collected by kube-controller-manager.
+ kube-apiserver will store identity Leases in the `kube-system` namespace.
+ Expired leases will be actively garbage collected by a post-start-hook in kube-apiserver.

###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

@@ -229,7 +227,8 @@ However, there are no tests validating feature enablement/disablement based on t
###### How can a rollout or rollback fail? Can it impact already running workloads?

Existing workloads should not be impacted by this feature, unless they were
- looking for Lease objects in the `kube-apiserver-lease` namespace.
+ looking for kube-apiserver Lease objects in the `kube-system` namespace, which can be
+ found using the `k8s.io/component=kube-apiserver` label.

###### What specific metrics should inform a rollback?

@@ -248,7 +247,7 @@ No.
###### How can an operator determine if the feature is in use by workloads?

- The existence of the `kube-apiserver-lease` namespace and Lease objects in the namespace
+ The existence of kube-apiserver Lease objects in the `kube-system` namespace
will determine if the feature is working. Operators can check for clients that are accessing
the Lease object to see if workloads or other controllers are relying on this feature.
@@ -265,9 +264,11 @@ the Lease object to see if workloads or other controllers are relying on this feature.
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

Some reasonable SLOs could be:
- * Number of (non-expired) Leases in `kube-apiserver-leases` is equal to the number of expected kube-apiservers 95% of the time.
+ * Number of (non-expired) Leases in `kube-system` is equal to the number of expected kube-apiservers 95% of the time.
* kube-apiservers hold a lease which is not older than 2 times the frequency of the lease heartbeat 95% of the time.
+
+ All leases owned by kube-apiservers can be found using the `k8s.io/component=kube-apiserver` label.

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

- [X] Metrics
@@ -280,7 +281,7 @@ Some reasonable SLOs could be:
A metric measuring the last updated time for a lease could be useful, but it could introduce cardinality problems
since the lease is changed on every restart of kube-apiserver.

- We may consider adding a metric exposing the count of leases in `kube-apiserver-lease`.
+ We may consider adding a metric exposing the count of leases in `kube-system`.