sig-scalability/slos/api_call_latency.md
Lines changed: 14 additions & 11 deletions
@@ -4,12 +4,13 @@

 | Status | SLI | SLO |
 | --- | --- | --- |
-|__Official__| Latency<sup>[1](#footnote1)</sup> of mutating<sup>[2](#footnote2)</sup> API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day <= 1s |
-|__Official__| Latency<sup>[1](#footnote1)</sup> of non-streaming read-only<sup>[3](#footnote3)</sup> API calls for every (resource, scope<sup>[4](#footnote4)</sup>) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day: (a) <= 1s if `scope=resource` (b) <= 30s<sup>[5](#footnote5)</sup> otherwise (if `scope=namespace` or `scope=cluster`) |
+|__Official__| Latency of processing<sup>[1](#footnote1)</sup> mutating<sup>[2](#footnote2)</sup> API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day <= 1s |
+|__Official__| Latency of processing<sup>[1](#footnote1)</sup> non-streaming read-only<sup>[3](#footnote3)</sup> API calls for every (resource, scope<sup>[4](#footnote4)</sup>) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day: (a) <= 1s if `scope=resource` (b) <= 30s<sup>[5](#footnote5)</sup> otherwise (if `scope=namespace` or `scope=cluster`) |

-<a name="footnote1">\[1\]</a> By latency of API call in this doc we mean time
-from the moment when apiserver gets the request to last byte of response sent
-to the user.
+<a name="footnote1">\[1\]</a> The SLI only measures latency incurred by the processing
+time of the request. The processing time of a request is the time from the moment apiserver
+gets the request to the last byte of the response sent to the user, excluding latency incurred by
+webhooks and priority & fairness queue wait times.

 <a name="footnote2">\[2\]</a> By mutating API calls we mean POST, PUT, DELETE
 and PATCH.
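The revised footnote 1 can be read as a subtraction: the latency counted by the SLI is the end-to-end time at the apiserver minus the time attributed to webhooks and to priority & fairness queueing. The following is a minimal sketch of that reading, not how apiserver implements it. The `RequestTiming` type, its field names, and the clamping to zero are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical per-request timings in seconds (names are illustrative)."""

    total: float                  # apiserver gets the request -> last byte of response sent
    webhook: float = 0.0          # time attributed to webhook calls
    apf_queue_wait: float = 0.0   # time spent waiting in priority & fairness queues


def processing_latency(t: RequestTiming) -> float:
    """Latency counted by the SLI: total time minus the excluded components.

    Clamping to zero is an assumption, guarding against inconsistent inputs.
    """
    return max(0.0, t.total - t.webhook - t.apf_queue_wait)


timing = RequestTiming(total=1.5, webhook=0.4, apf_queue_wait=0.3)
print(f"processing latency: {processing_latency(timing):.2f}s")
```

Under this reading, a request that took 1.5s end to end but spent 0.7s in webhooks and queueing contributes about 0.8s to the SLI.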
@@ -35,15 +36,15 @@ that users are fine with listing tens of thousands of objects taking more than
 - As a user of vanilla Kubernetes, I want some guarantee how quickly I get the
 response from an API call.
 - As an administrator of Kubernetes cluster, if I know characteristics of my
-external dependencies of apiserver (e.g custom admission plugins and webhooks)
-I want to be able to provide guarantees for API calls latency to users of my
-cluster.
+external dependencies of apiserver (e.g. custom admission plugins, priority
+& fairness configuration, and webhooks), I want to be able to provide
+guarantees for API calls latency to users of my cluster.

 ### Other notes
 - We obviously can’t give any guarantee in general, because cluster
-administrators are allowed to register custom admission plugins or webhooks,
-which we don’t have any control about and they obviously impact API call
-latencies.
+administrators are allowed to register custom admission plugins, webhooks,
+and priority and fairness configurations, which we don’t have any control
+about and they obviously impact API call latencies.
 - As a result, we define the SLIs to be very generic (no matter how your
 cluster is set up), but we provide SLO only for default installations (where we
 have control over what apiserver is doing). This doesn’t provide a false
@@ -72,6 +73,8 @@ that all `core` components communicate with apiserver using protocol buffers.
 stale data (being served from cache) and the SLO again has to be satisfied
 independently of that. This makes the careful choice of requests in tests
 important.
+- The SLI & SLO exclude latency incurred by factors that are outside our control, specifically
+from webhooks (1.23+) and API priority & fairness queue wait times (1.27+).

 ### TODOs
 - We may consider treating `non-namespaced` resources as a separate bucket in
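The thresholds in the SLO table of this file can be checked mechanically once processing latencies are available. The sketch below is a simplified illustration under stated assumptions. It computes a single nearest-rank 99th percentile over one batch of samples, whereas the SLO is phrased per cluster-day over an SLI measured in 5-minute windows. The function names and the choice of the nearest-rank method are assumptions, not taken from the document.

```python
import math

# Thresholds in seconds, taken from the SLO table for read-only calls.
READ_ONLY_THRESHOLD_SECONDS = {"resource": 1.0, "namespace": 30.0, "cluster": 30.0}
# Threshold in seconds for mutating calls on single objects.
MUTATING_THRESHOLD_SECONDS = 1.0


def percentile_nearest_rank(samples, q):
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def read_only_slo_met(latencies, scope):
    """Check the read-only threshold for one (resource, scope) pair."""
    return percentile_nearest_rank(latencies, 99) <= READ_ONLY_THRESHOLD_SECONDS[scope]


def mutating_slo_met(latencies):
    """Check the mutating threshold for one (resource, verb) pair."""
    return percentile_nearest_rank(latencies, 99) <= MUTATING_THRESHOLD_SECONDS


samples = [0.1] * 98 + [5.0] * 2
print(read_only_slo_met(samples, "resource"), read_only_slo_met(samples, "cluster"))
```

With these samples the 99th percentile is 5.0s, so the check fails for `scope=resource` and passes for `scope=cluster`.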
sig-scalability/slos/slos.md
Lines changed: 2 additions & 2 deletions
@@ -114,8 +114,8 @@ __TODO: Cluster churn should be moved to scalability thresholds.__

 | Status | SLI | SLO | User stories, test scenarios, ... |
 | --- | --- | --- | --- |
-|__Official__| Latency of mutating API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= 1s |[Details](./api_call_latency.md)|
-|__Official__| Latency of non-streaming read-only API calls for every (resource, scope) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> (a) <= 1s if `scope=resource` (b) <= 30s otherwise (if `scope=namespace` or `scope=cluster`) |[Details](./api_call_latency.md)|
+|__Official__| Latency of processing mutating API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= 1s |[Details](./api_call_latency.md)|
+|__Official__| Latency of processing non-streaming read-only API calls for every (resource, scope) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> (a) <= 1s if `scope=resource` (b) <= 30s otherwise (if `scope=namespace` or `scope=cluster`) |[Details](./api_call_latency.md)|
 |__Official__| Startup latency of schedulable stateless pods, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= 5s |[Details](./pod_startup_latency.md)|
 |__WIP__| Startup latency of schedulable stateful pods, excluding time to pull images, run init containers, provision volumes (in delayed binding mode) and unmount/detach volumes (from previous pod if needed), measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= X where X depends on storage provider |[Details](./pod_startup_latency.md)|
 |__WIP__| Latency of programming in-cluster load balancing mechanism (e.g. iptables), measured from when service spec or list of its `Ready` pods change to when it is reflected in load balancing mechanism, measured as 99th percentile over last 5 minutes aggregated across all programmers | In default Kubernetes installation, 99th percentile per cluster-day<sup>[1](#footnote1)</sup> <= X |[Details](./network_programming_latency.md)|