sample-and-watermark replaced with timing histograms

MikeSpreitzer · MikeSpreitzer · commit 9360aa171135 · 2022-11-22T17:21:23.000-05:00
diff --git a/content/en/docs/concepts/cluster-administration/flow-control.md b/content/en/docs/concepts/cluster-administration/flow-control.md
@@ -508,26 +508,15 @@ poorly-behaved workloads that may be harming system health.
   last window's high water mark of number of requests actively being
   served.
 
-* `apiserver_flowcontrol_read_vs_write_request_count_samples` is a
-  histogram vector of observations of the then-current number of
-  requests, broken down by the labels `phase` (which takes on the
-  values `waiting` and `executing`) and `request_kind` (which takes on
-  the values `mutating` and `readOnly`).  The observations are made
-  periodically at a high rate.  Each observed value is a ratio,
-  between 0 and 1, of a number of requests divided by the
-  corresponding limit on the number of requests (queue length limit
-  for waiting and concurrency limit for executing).
-
-* `apiserver_flowcontrol_read_vs_write_request_count_watermarks` is a
-  histogram vector of high or low water marks of the number of
-  requests (divided by the corresponding limit to get a ratio in the
-  range 0 to 1) broken down by the labels `phase` (which takes on the
-  values `waiting` and `executing`) and `request_kind` (which takes on
-  the values `mutating` and `readOnly`); the label `mark` takes on
-  values `high` and `low`.  The water marks are accumulated over
-  windows bounded by the times when an observation was added to
-  `apiserver_flowcontrol_read_vs_write_request_count_samples`.  These
-  water marks show the range of values that occurred between samples.
+* `apiserver_flowcontrol_read_vs_write_current_requests` is a
+  histogram vector of observations, made at the end of every
+  nanosecond, of the number of requests, broken down by the labels
+  `phase` (which takes on the values `waiting` and `executing`) and
+  `request_kind` (which takes on the values `mutating` and
+  `readOnly`).  Each observed value is a ratio, between 0 and 1, of a
+  number of requests divided by the corresponding limit on the number
+  of requests (queue volume limit for waiting and concurrency limit
+  for executing).
 
 * `apiserver_flowcontrol_current_inqueue_requests` is a gauge vector
   holding the instantaneous number of queued (not executing) requests,
@@ -542,52 +531,27 @@ poorly-behaved workloads that may be harming system health.
   holding the instantaneous number of occupied seats, broken down by
   the labels `priority_level` and `flow_schema`.
 
-* `apiserver_flowcontrol_priority_level_request_count_samples` is a
-  histogram vector of observations of the then-current number of
-  requests broken down by the labels `phase` (which takes on the
-  values `waiting` and `executing`) and `priority_level`.  Each
-  histogram gets observations taken periodically, up through the last
-  activity of the relevant sort.  The observations are made at a high
-  rate.  Each observed value is a ratio, between 0 and 1, of a number
-  of requests divided by the corresponding limit on the number of
-  requests (queue length limit for waiting and concurrency limit for
-  executing).
-
-* `apiserver_flowcontrol_priority_level_request_count_watermarks` is a
-  histogram vector of high or low water marks of the number of
-  requests (divided by the corresponding limit to get a ratio in the
-  range 0 to 1) broken down by the labels `phase` (which takes on the
-  values `waiting` and `executing`) and `priority_level`; the label
-  `mark` takes on values `high` and `low`.  The water marks are
-  accumulated over windows bounded by the times when an observation
-  was added to
-  `apiserver_flowcontrol_priority_level_request_count_samples`.  These
-  water marks show the range of values that occurred between samples.
-
-* `apiserver_flowcontrol_priority_level_seat_count_samples` is a
-  histogram vector of observations of the utilization of a priority
-  level's concurrency limit, broken down by `priority_level`.  This
-  utilization is the fraction (number of seats occupied) /
-  (concurrency limit).  This metric considers all stages of execution
-  (both normal and the extra delay at the end of a write to cover for
-  the corresponding notification work) of all requests except WATCHes;
-  for those it considers only the initial stage that delivers
-  notifications of pre-existing objects.  Each histogram in the vector
-  is also labeled with `phase: executing` (there is no seat limit for
-  the waiting phase).  Each histogram gets observations taken
-  periodically, up through the last activity of the relevant sort.
-  The observations
-  are made at a high rate.  
-
-* `apiserver_flowcontrol_priority_level_seat_count_watermarks` is a
-  histogram vector of high or low water marks of the utilization of a
-  priority level's concurrency limit, broken down by `priority_level`
-  and `mark` (which takes on values `high` and `low`).  Each histogram
-  in the vector is also labeled with `phase: executing` (there is no
-  seat limit for the waiting phase).  The water marks are accumulated
-  over windows bounded by the times when an observation was added to
-  `apiserver_flowcontrol_priority_level_seat_count_samples`.  These
-  water marks show the range of values that occurred between samples.
+* `apiserver_flowcontrol_priority_level_request_utilization` is a
+  histogram vector of observations, made at the end of each
+  nanosecond, of the number of requests, broken down by the labels
+  `phase` (which takes on the values `waiting` and `executing`) and
+  `priority_level`.  Each observed value is a ratio, between 0 and 1,
+  of a number of requests divided by the corresponding limit on the
+  number of requests (queue volume limit for waiting and concurrency
+  limit for executing).
+
+* `apiserver_flowcontrol_priority_level_seat_utilization` is a
+  histogram vector of observations, made at the end of each
+  nanosecond, of the utilization of a priority level's concurrency
+  limit, broken down by `priority_level`.  This utilization is the
+  fraction (number of seats occupied) / (concurrency limit).  This
+  metric considers all stages of execution (both normal and the extra
+  delay at the end of a write to cover for the corresponding
+  notification work) of all requests except WATCHes; for those it
+  considers only the initial stage that delivers notifications of
+  pre-existing objects.  Each histogram in the vector is also labeled
+  with `phase: executing` (there is no seat limit for the waiting
+  phase).
 
 * `apiserver_flowcontrol_request_queue_length_after_enqueue` is a
   histogram vector of queue lengths for the queues, broken down by