- [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
+ - [Future work](#future-work)
- [Alternatives](#alternatives)
- [Initialize watch cache from etcd history window](#initialize-watch-cache-from-etcd-history-window)
<!-- /toc -->
@@ -236,9 +237,6 @@ We are going to utilize this feature to solve the problems described above.
1. Change watch cache to utilize the resource version updates from Bookmark
events.

- 1. On top of recent changes that send Kubernetes Bookmark events every minute,
- we will add a support to send them also on kube-apiserver shutdown.
-
1. We will set the progress notify period to a reasonably small value.
The requirement is to ensure that in case of rolling upgrade of multiple
kube-apiservers, the next-to-be-updated one will get either a real event
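
To illustrate the mechanism above, here is a minimal sketch of an etcd watch that requests progress notifications and advances a cached resource version from them. This is not the actual watch cache code; the endpoint, the `/registry/pods/` prefix, and the `go.etcd.io/etcd/client/v3` package are assumptions made for the example.

```go
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Ask etcd to periodically send progress notifications on this watch even
	// when nothing under the prefix changes.
	ch := cli.Watch(context.Background(), "/registry/pods/",
		clientv3.WithPrefix(), clientv3.WithProgressNotify())

	var resourceVersion int64
	for resp := range ch {
		if resp.IsProgressNotify() {
			// No new events, but the store revision moved forward, so a watch
			// cache could advance its resourceVersion without any real change.
			resourceVersion = resp.Header.Revision
			log.Printf("progress notify: resourceVersion advanced to %d", resourceVersion)
			continue
		}
		for _, ev := range resp.Events {
			resourceVersion = ev.Kv.ModRevision
			log.Printf("%s %q at revision %d", ev.Type, ev.Kv.Key, resourceVersion)
		}
	}
}
```
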
@@ -325,8 +323,6 @@ n/a - watch bookmarks don't have any frequency guarantees

## Production Readiness Review Questionnaire

- TODO: Fill in before making `Implementable`.
-
### Feature Enablement and Rollback

_This section must be completed when targeting alpha to a release._
@@ -355,121 +351,80 @@ _This section must be completed when targeting alpha to a release._
_This section must be completed when targeting beta graduation to a release._

* **How can a rollout fail? Can it impact already running workloads?**
- Try to be as paranoid as possible - e.g., what if some components will restart
- mid-rollout?
+ In case of bugs, etcd progress notify events may be incorrectly parsed, leading
+ to kube-apiserver crashes.
+ It cannot affect already running workloads.

* **What specific metrics should inform a rollback?**
+ Crashes of kube-apiserver.

* **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
- Describe manual testing that was done and the outcomes.
- Longer term, we may want to require automated upgrade/rollback tests, but we
- are missing a bunch of machinery and tooling and can't do that now.
+ Manual tests are still to be run.

* **Is the rollout accompanied by any deprecations and/or removals of features, APIs,
fields of API types, flags, etc.?**
- Even if applying deprecation policies, they may still surprise some users.
+ No

### Monitoring Requirements

_This section must be completed when targeting beta graduation to a release._

* **How can an operator determine if the feature is in use by workloads?**
- Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
- checking if there are objects with field X set) may be a last resort. Avoid
- logs or events for this purpose.
+ It's not a workload feature.

* **What are the SLIs (Service Level Indicators) an operator can use to determine
the health of the service?**
- - [ ] Metrics
- - Metric name:
- - [Optional] Aggregation method:
- - Components exposing the metric:
- - [ ] Other (treat as last resort)
- - Details:
+ - [x] Metrics
+ - Metric name: etcd_bookmark_counts
+ - Components exposing the metric: kube-apiserver
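
As an informal illustration (not part of the KEP), an operator could confirm this SLI by scraping the kube-apiserver `/metrics` endpoint and filtering for `etcd_bookmark_counts`. The URL, service-account token path, and TLS handling below are assumptions for a typical in-cluster setup.

```go
package main

import (
	"bufio"
	"crypto/tls"
	"fmt"
	"net/http"
	"os"
	"strings"
)

func main() {
	// In-cluster defaults are assumptions; adjust the URL and credentials to
	// however you normally reach the kube-apiserver metrics endpoint.
	token, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/token")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	req, err := http.NewRequest("GET", "https://kubernetes.default.svc/metrics", nil)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	req.Header.Set("Authorization", "Bearer "+strings.TrimSpace(string(token)))

	client := &http.Client{Transport: &http.Transport{
		// Sketch only: use the cluster CA bundle instead of skipping verification.
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	}}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	// Print only the etcd_bookmark_counts series exposed by kube-apiserver.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if strings.HasPrefix(scanner.Text(), "etcd_bookmark_counts") {
			fmt.Println(scanner.Text())
		}
	}
}
```
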

* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
- At a high level, this usually will be in the form of "high percentile of SLI
- per day <= X". It's impossible to provide comprehensive guidance, but at the very
- high level (needs more precise definitions) those may be things like:
- - per-day percentage of API calls finishing with 5XX errors <= 1%
- - 99% percentile over day of absolute value from (job creation time minus expected
- job creation time) for cron job <= 10%
- - 99,9% of /health requests per day finish with 200 code
+ n/a [Bookmark and watch progress notify events are best effort in their nature]

* **Are there any missing metrics that would be useful to have to improve observability
of this feature?**
- Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
- implementation difficulties, etc.).
+ No

### Dependencies

_This section must be completed when targeting beta graduation to a release._

* **Does this feature depend on any specific services running in the cluster?**
- Think about both cluster-level services (e.g. metrics-server) as well
- as node-level agents (e.g. specific version of CRI). Focus on external or
- optional services that are needed. For example, if this feature depends on
- a cloud provider API, or upon an external software-defined storage or network
- control plane.
-
- For each of these, fill in the following—thinking about running existing user workloads
- and creating new ones, as well as about cluster-level services (e.g. DNS):
- - [Dependency name]
- - Usage description:
+
+ - etcd
+ - Usage description: We rely on etcd support for ProgressNotify events, which
+ was added in release 3.3. However, we also rely on the ability to configure
+ the notification period (the default of 10m is too high), which was added in 3.5
+ and backported to 3.4.11.
- Impact of its outage on the feature:
+ An etcd outage will translate into a cluster outage anyway.
- Impact of its degraded performance or high-error rates on the feature:
+ ProgressNotify events may not be sent as expected.
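
As a rough illustration of this dependency, the sketch below opens a watch with progress notifications enabled and flags when neither progress notifications nor real events arrive within a few multiples of the expected period. The endpoint, the 10-second interval, and the `--experimental-watch-progress-notify-interval` flag named in the comment are assumptions based on our reading of etcd 3.4.11+/3.5; verify them against the etcd version in use.

```go
package main

import (
	"context"
	"log"
	"sync"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Interval etcd is assumed to have been started with, e.g. via the
	// --experimental-watch-progress-notify-interval flag (an assumption for
	// etcd >= 3.4.11 / 3.5; check your etcd version and flags).
	const interval = 10 * time.Second

	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ch := cli.Watch(context.Background(), "/registry/",
		clientv3.WithPrefix(), clientv3.WithProgressNotify())

	var mu sync.Mutex
	last := time.Now()

	// Periodically check how long ago the watch last made progress; staleness
	// beyond a few intervals suggests degraded or misconfigured etcd.
	go func() {
		for range time.Tick(interval) {
			mu.Lock()
			stale := time.Since(last) > 3*interval
			mu.Unlock()
			if stale {
				log.Println("no progress notification or event for over 3x the expected interval")
			}
		}
	}()

	for resp := range ch {
		if resp.IsProgressNotify() || len(resp.Events) > 0 {
			mu.Lock()
			last = time.Now()
			mu.Unlock()
		}
	}
}
```
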

### Scalability

- _For alpha, this section is encouraged: reviewers should consider these questions
- and attempt to answer them._
-
- _For beta, this section is required: reviewers must answer these questions._
-
- _For GA, this section is required: approvers should be able to confirm the
- previous answers based on experience in the field._
-
* **Will enabling / using this feature result in any new API calls?**
- Describe them, providing:
- - API call type (e.g. PATCH pods)
- - estimated throughput
- - originating component(s) (e.g. Kubelet, Feature-X-controller)
- focusing mostly on:
- - components listing and/or watching resources they didn't before
- - API calls that may be triggered by changes of some Kubernetes resources
- (e.g. update of object X triggers new updates of object Y)
- - periodic API calls to reconcile state (e.g. periodic fetching state,
- heartbeats, leader election, etc.)
+ No, although new events are sent by etcd to kube-apiserver as part
+ of the open Watch requests.

* **Will enabling / using this feature result in introducing new API types?**
- Describe them, providing:
- - API type
- - Supported number of objects per cluster
- - Supported number of objects per namespace (for namespace-scoped objects)
+ No

* **Will enabling / using this feature result in any new calls to the cloud
provider?**

* **Will enabling / using this feature result in increasing size or count of
the existing API objects?**
- Describe them, providing:
- - API type(s):
- - Estimated increase in size: (e.g., new annotation of size 32B)
- - Estimated amount of new objects: (e.g., new Object X for every existing Pod)
+ No

* **Will enabling / using this feature result in increasing time taken by any
operations covered by [existing SLIs/SLOs]?**
- Think about adding additional work or introducing new steps in between
- (e.g. need to do X to start a container), etc. Please describe the details.
+ No

* **Will enabling / using this feature result in non-negligible increase of
resource usage (CPU, RAM, disk, IO, ...) in any components?**
- Things to keep in mind include: additional in-memory state, additional
- non-trivial computations, excessive access to disks (including increased log
- volume), significant amount of data sent and/or received over network, etc.
- This through this both in small and large cases, again with respect to the
- [supported limits].
+ No

### Troubleshooting

@@ -480,20 +435,13 @@ details). For now, we leave it here.
_This section must be completed when targeting beta graduation to a release._

* **How does this feature react if the API server and/or etcd is unavailable?**
+ The feature will not work (though it is a control-plane feature, not a workload one).

* **What are other known failure modes?**
- For each of them, fill in the following information by copying the below template:
- - [Failure mode brief description]
- - Detection: How can it be detected via metrics? Stated another way:
- how can an operator troubleshoot without logging into a master or worker node?
- - Mitigations: What can be done to stop the bleeding, especially for already
- running user workloads?
- - Diagnostics: What are the useful log messages and their required logging
- levels that could help debug the issue?
- Not required until feature graduated to beta.
- - Testing: Are there any tests for failure mode? If not, describe why.
+ n/a

* **What steps should be taken if SLOs are not being met to determine the problem?**
+ n/a

[supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
[existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
@@ -502,11 +450,23 @@ _This section must be completed when targeting beta graduation to a release._

2020-06-30: KEP Proposed.
2020-08-04: KEP marked as implementable.
+ v1.20: Feature graduated to Alpha
+ 2021-01-15: KEP updated to target Beta in v1.21

## Drawbacks

n/a

+ ## Future work
+
+ The above solution doesn't address the extensive relisting case in a
+ setup with a single kube-apiserver. The reason is that we don't send
+ Kubernetes Bookmark events on kube-apiserver shutdown (which would actually be
+ beneficial on its own). However, doing that properly, while ensuring
+ that no requests are dropped in the meantime (even in the single kube-apiserver
+ scenario), isn't trivial and probably deserves its own KEP.
+ As a result, we're leaving this as future work.
+

## Alternatives
### Initialize watch cache from etcd history window