Adding `api/all` will also include `autoscaling/v2`.
+
+The feature can be disabled by removing the `--runtime-config` entry.

###### What happens if we reenable the feature if it was previously rolled back?
@@ -199,6 +242,18 @@ Even if applying deprecation policies, they may still surprise some users.
<!--
This section must be completed when targeting beta to a release.
-->
+The HPA requires the `metrics.k8s.io` APIs to be available in the cluster to operate. This API is served by the
+Metrics Server. An operator can verify that the Metrics Server is available to provide resource metrics to the HPA by running
+the command `kubectl get apiservices` and checking the status of `v1beta1.metrics.k8s.io` (version subject to change).
+Operators should make sure the Metrics Server is up and running to maintain resource autoscaling.
+
+The v2 HPA also requires the `custom.metrics.k8s.io` and `external.metrics.k8s.io` APIs to retrieve custom and
+external metrics. There is no default implementation of these APIs, so cluster operators must install an "adapter" for
+their metrics backend (e.g. [Prometheus](https://github.com/kubernetes-sigs/prometheus-adapter)).
+
+An operator can verify that the adapter is working properly by running the same `kubectl get apiservices` command and checking the
+`v1beta1.custom.metrics.k8s.io` and `v1beta1.external.metrics.k8s.io` APIs (usually served by the same adapter).
+Care should be taken to ensure that the adapter and the specific metrics backend are available to maintain custom metric autoscaling.

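The `kubectl get apiservices` check described above can also be scripted. The following is a minimal sketch, assuming its input is the JSON produced by `kubectl get apiservices -o json`; the helper name `unavailable_metrics_apis` and the sample payload are hypothetical and only illustrate how the `Available` condition might be read.

```python
import json

# APIService names mentioned above (versions are subject to change).
METRICS_APIS = (
    "v1beta1.metrics.k8s.io",
    "v1beta1.custom.metrics.k8s.io",
    "v1beta1.external.metrics.k8s.io",
)


def unavailable_metrics_apis(apiservices_json, wanted=METRICS_APIS):
    """Return the wanted APIService names that are missing or not Available."""
    items = json.loads(apiservices_json).get("items", [])
    available = set()
    for item in items:
        conditions = item.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Available" and c.get("status") == "True"
               for c in conditions):
            available.add(item["metadata"]["name"])
    return [name for name in wanted if name not in available]


# Hypothetical sample: Metrics Server is healthy, the custom metrics API is
# registered but unavailable, and the external metrics API is not registered.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "v1beta1.metrics.k8s.io"},
         "status": {"conditions": [{"type": "Available", "status": "True"}]}},
        {"metadata": {"name": "v1beta1.custom.metrics.k8s.io"},
         "status": {"conditions": [{"type": "Available", "status": "False"}]}},
    ]
})

print(unavailable_metrics_apis(SAMPLE))
```

An empty list would suggest that all three metrics APIs are registered and reporting as available; any returned name is a candidate for further investigation.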
###### How can an operator determine if the feature is in use by workloads?
@@ -207,6 +262,7 @@ Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->
+All HPA objects are stored in v1 format on disk. They are up-converted to the requested version, and down-converted upon update.

###### How can someone using this feature know that it is working for their instance?
@@ -219,13 +275,28 @@ and operation of this feature.
Recall that end users cannot usually observe component logs or access metrics.
-->

-- [ ] Events
+- [x] Events
  - Event Reason:
-- [ ] API .status
+The event type `Normal`, reason `SuccessfulRescale`, note `New size: N; reason: FOO` indicates autoscaling is operating normally.
+Abnormal events of type `Warning` include reasons such as `FailedRescale` and `FailedComputeMetricsReplicas` and
+will include details about the error in the note.
+- [x] API .status
  - Condition name:
+There are three condition types which indicate the operating status of the HPA. They are `ScalingActive`, `AbleToScale`
+and `ScalingLimited` (see type [comments](https://pkg.go.dev/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerConditionType)).
+Under normal operating circumstances `ScalingActive` and `AbleToScale` should have status `True`, indicating the HPA is
+successfully reconciling the scale. `ScalingLimited` indicates that user configuration is limiting the "ideal" scale with a
+minimum, maximum, rate or delay. The message indicates which limit is the cause.
+It is normal for `ScalingLimited` to switch between `True` and `False` over time.
  - Other field:
-- [ ] Other (treat as last resort)
+- [x] Other (treat as last resort)
  - Details:
+The HPA status includes the current observed metric values, one for each given target. Using these
+values an operator can verify that the HPA is maintaining the desired target for the dominant metric.
+The operator can also see the number of pods the HPA observed under `status.currentReplicas` and the most
+recent recommendation under `status.desiredReplicas`.
+The latest observed generation is echoed back in status so an operator can verify that the HPA is keeping up to date
+with configuration changes.

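The status checks in the list above can be combined into a small script. This is a sketch only, assuming the input is an HPA object parsed from `kubectl get hpa NAME -o json`. The condition type strings follow the `autoscaling/v2` Go API, where the first condition is named `ScalingActive`. The helper name `hpa_health`, the sample object, and its messages are hypothetical, and comparing `status.observedGeneration` with `metadata.generation` is an assumption about how "keeping up to date" would be checked.

```python
def hpa_health(hpa):
    """Summarize the status of an HPA object given as a parsed JSON dict."""
    status = hpa.get("status", {})
    conditions = {c["type"]: c for c in status.get("conditions", [])}

    # ScalingActive and AbleToScale are expected to be "True" in normal operation.
    problems = []
    for cond_type in ("ScalingActive", "AbleToScale"):
        cond = conditions.get(cond_type)
        if cond is None:
            problems.append(f"{cond_type}: condition not reported")
        elif cond.get("status") != "True":
            problems.append(f"{cond_type}: {cond.get('message', '')}")

    # ScalingLimited may legitimately be "True"; its message names the limit.
    limited = conditions.get("ScalingLimited", {})
    limited_by = limited.get("message") if limited.get("status") == "True" else None

    generation = hpa.get("metadata", {}).get("generation")
    return {
        "healthy": not problems,
        "problems": problems,
        "limited_by": limited_by,
        "up_to_date": status.get("observedGeneration") == generation,
        "current_replicas": status.get("currentReplicas"),
        "desired_replicas": status.get("desiredReplicas"),
    }


# Hypothetical HPA object; the messages are illustrative, not real controller output.
SAMPLE_HPA = {
    "metadata": {"generation": 3},
    "status": {
        "observedGeneration": 3,
        "currentReplicas": 4,
        "desiredReplicas": 4,
        "conditions": [
            {"type": "AbleToScale", "status": "True", "message": "ready to scale"},
            {"type": "ScalingActive", "status": "True", "message": "metrics available"},
            {"type": "ScalingLimited", "status": "True", "message": "limited by minimum"},
        ],
    },
}

print(hpa_health(SAMPLE_HPA))
```

A result with `healthy` set to `False` would point at the condition message to read first; a non-empty `limited_by` is informational rather than an error.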
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
@@ -298,6 +369,22 @@ For beta, this section is required: reviewers must answer these questions.
For GA, this section is required: approvers should be able to confirm the
previous answers based on experience in the field.
-->
+The HPA v2 APIs allow users to configure multiple metrics, each with a separate target. A recommendation is calculated
+for each metric and the largest recommendation is used. The more metrics are added to a given HPA, the longer it will
+take to reconcile. The HPA is single-threaded, processing recommendations one at a time. The default reconciliation
+period is 15 seconds. If there is too much work to do, reconciliation will slow down and happen less frequently than
+every 15 seconds. This will cause autoscaling to be less responsive at high scale.
+
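The "largest recommendation wins" rule described above can be sketched as follows. The per-metric formula is the one given in the Kubernetes HPA documentation, `ceil(currentReplicas * currentValue / targetValue)`; tolerance, pod readiness and missing metrics are deliberately ignored here. The function names and the sample metrics are hypothetical.

```python
import math


def recommend_replicas(current_replicas, current_value, target_value):
    """Per-metric recommendation: ceil(currentReplicas * currentValue / targetValue)."""
    return math.ceil(current_replicas * current_value / target_value)


def desired_replicas(current_replicas, metrics):
    """Evaluate every metric in turn and keep the largest recommendation."""
    return max(
        recommend_replicas(current_replicas, m["current"], m["target"])
        for m in metrics
    )


# Hypothetical metrics for one HPA currently running 4 replicas.
METRICS = [
    {"name": "cpu_utilization", "current": 90, "target": 60},
    {"name": "requests_per_second", "current": 250, "target": 100},
]

print(desired_replicas(4, METRICS))  # prints 10
```

Each metric is evaluated separately (6 replicas for the first, 10 for the second), so the work per reconciliation grows with the number of metrics configured.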
+Previously, v1 scaled along two dimensions: the number of HPAs and the number of pods selected by each HPA (linearly).
+Now it will scale with the number of metrics defined in HPAs and the number of pods selected by each metric (linearly).
+
+Additionally, v2 adds a behavior structure which allows the user to configure the rate and delay of scaling up and down.
+Enforcing these constraints requires storing previous recommendations and scaling events in memory. The longer the
+configured interval, the more memory is used. The maximum window allowed is 60 minutes ([code](https://pkg.go.dev/k8s.io/api/autoscaling/v2beta2#HPAScalingRules)),
+so at most 240 recommendations / events (60 minutes at one per 15-second period) are stored per configured metric. Each recommendation is an `int32` and a `time.Time`.
+Each scaling event is an `int32`, a `time.Time` and a `bool` ([code](https://pkg.go.dev/k8s.io/api/autoscaling/v2beta2#HPAScalingRules)),
+so the memory footprint is relatively small.
+It will scale linearly with the number of metrics defined and the size of the HPA's configured window.

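The arithmetic behind the 240-entry bound can be worked through as below. This is a rough estimate only: the per-entry byte sizes are assumptions about a 64-bit Go memory layout (including padding) and do not come from the text above, and the function names are hypothetical.

```python
RECONCILE_PERIOD_SECONDS = 15   # default reconciliation period
MAX_WINDOW_SECONDS = 60 * 60    # maximum configurable window (60 minutes)

# Assumed in-memory sizes on a 64-bit platform, including padding.
RECOMMENDATION_BYTES = 32       # int32 + time.Time
SCALE_EVENT_BYTES = 40          # int32 + time.Time + bool


def max_entries(window_seconds=MAX_WINDOW_SECONDS,
                period_seconds=RECONCILE_PERIOD_SECONDS):
    """Upper bound on stored entries: one per reconciliation within the window."""
    return window_seconds // period_seconds


def worst_case_bytes(num_metrics):
    """Worst-case bytes for recommendations plus scaling events."""
    per_metric = max_entries() * (RECOMMENDATION_BYTES + SCALE_EVENT_BYTES)
    return num_metrics * per_metric


print(max_entries())        # prints 240
print(worst_case_bytes(5))  # prints 86400
```

Under these assumptions, even a full 60-minute window costs on the order of tens of kilobytes for a handful of metrics, and the total grows linearly with both the metric count and the window length.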
###### Will enabling / using this feature result in any new API calls?
@@ -323,6 +410,7 @@ Describe them, providing:
- Supported number of objects per cluster
- Supported number of objects per namespace (for namespace-scoped objects)
-->
+Yes. It will introduce the new `autoscaling/v2` API types.

###### Will enabling / using this feature result in any new calls to the cloud provider?
@@ -395,9 +483,6 @@ For each of them, fill in the following information by copying the below templat
- Testing: Are there any tests for failure mode? If not, describe why.
-->

-
###### What steps should be taken if SLOs are not being met to determine the problem?