Commit dd926d9

Merge pull request #40763 from abrennan89/autoscaleadmin
SRVKS-573: Autoscaling docs improvements
2 parents ce42916 + 3918cfd commit dd926d9

9 files changed, 176 additions & 107 deletions

_topic_maps/_topic_map.yml

Lines changed: 3 additions & 5 deletions
@@ -3200,11 +3200,7 @@ Topics:
 - Name: Serverless applications
   File: serverless-applications
 - Name: Autoscaling
-  File: serverless-autoscaling
-- Name: Scale bounds
-  File: serverless-autoscaling-scale-bounds
-- Name: Concurrency
-  File: serverless-autoscaling-concurrency
+  File: serverless-autoscaling-developer
 - Name: Traffic management
   File: serverless-traffic-management
 - Name: Routing
@@ -3244,6 +3240,8 @@ Topics:
   File: serverless-cluster-admin-serving
 - Name: Configuring the Knative Serving custom resource
   File: knative-serving-CR-config
+- Name: Autoscaling
+  File: serverless-admin-autoscaling
 # Ingress options
 - Name: Integrating Service Mesh with OpenShift Serverless
   File: serverless-ossm-setup

modules/serverless-autoscaling-minscale-kn.adoc

Lines changed: 1 addition & 1 deletion
@@ -11,12 +11,12 @@ You can use the `kn service` command with the `--min-scale` flag to create or mo
 
 * Set the minimum number of replicas for the service by using the `--min-scale` flag:
 +
-.Examples
 [source,terminal]
 ----
 $ kn service create <service_name> --image <image_uri> --min-scale <integer>
 ----
 +
+.Example command
 [source,terminal]
 ----
 $ kn service create example-service --image quay.io/openshift-knative/knative-eventing-sources-event-display:latest --min-scale 2

modules/serverless-enable-scale-to-zero.adoc

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+// Module included in the following assemblies:
+//
+// * serverless/admin_guide/serverless-admin-autoscaling.adoc
+
+[id="serverless-enable-scale-to-zero_{context}"]
+= Enabling scale-to-zero
+
+Cluster administrators can enable or disable scale-to-zero globally for the cluster.
+
+.Prerequisites
+
+* You have installed {ServerlessOperatorName} and Knative Serving on your cluster.
+* You have cluster administrator permissions.
+* You are using the default Knative Pod Autoscaler. The scale-to-zero feature is not available if you are using the Kubernetes Horizontal Pod Autoscaler.
+
+.Procedure
+
+* Modify the `enable-scale-to-zero` spec in the `KnativeServing` CR:
++
+[source,yaml]
+----
+apiVersion: operator.knative.dev/v1alpha1
+kind: KnativeServing
+metadata:
+  name: knative-serving
+spec:
+  config:
+    autoscaler:
+      enable-scale-to-zero: "false" <1>
+----
+<1> The `enable-scale-to-zero` spec can be either `"true"` or `"false"`. If set to `"true"`, scale-to-zero is enabled. If set to `"false"`, applications are scaled down to the configured _minimum scale bound_. The default value is `"true"`.

modules/serverless-scale-to-zero-grace-period.adoc

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+// Module included in the following assemblies:
+//
+// * serverless/admin_guide/serverless-admin-autoscaling.adoc
+
+[id="serverless-scale-to-zero-grace-period_{context}"]
+= Configuring the scale-to-zero grace period
+
+The scale-to-zero grace period is the upper bound on how long Knative waits for the scale-from-zero machinery to be in place before it removes the last replica of an application.
+
+.Prerequisites
+
+* You have installed {ServerlessOperatorName} and Knative Serving on your cluster.
+* You have cluster administrator permissions.
+* You are using the default Knative Pod Autoscaler. The scale-to-zero feature is not available if you are using the Kubernetes Horizontal Pod Autoscaler.
+
+.Procedure
+
+* Modify the `scale-to-zero-grace-period` spec in the `KnativeServing` CR:
++
+[source,yaml]
+----
+apiVersion: operator.knative.dev/v1alpha1
+kind: KnativeServing
+metadata:
+  name: knative-serving
+spec:
+  config:
+    autoscaler:
+      scale-to-zero-grace-period: "30s" <1>
+----
+<1> The grace period time in seconds. The default value is 30 seconds.
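
Both autoscaler settings in this commit sit under the same `spec.config.autoscaler` block, so they can presumably be set together in a single `KnativeServing` CR. The fragment below is a minimal sketch under that assumption. The `60s` value is an arbitrary illustration, not a recommendation from the source.

```yaml
apiVersion: operator.knative.dev/v1alpha1
kind: KnativeServing
metadata:
  name: knative-serving
spec:
  config:
    autoscaler:
      # Keep scale-to-zero enabled (the documented default).
      enable-scale-to-zero: "true"
      # Illustrative value only; the documented default is 30 seconds.
      scale-to-zero-grace-period: "60s"
```

The grace period only matters while scale-to-zero is enabled, because it governs the removal of the last replica.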

serverless/admin_guide/serverless-admin-autoscaling.adoc

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+[id="serverless-admin-autoscaling"]
+= Autoscaling
+include::modules/common-attributes.adoc[]
+include::modules/serverless-document-attributes.adoc[]
+:context: serverless-admin-autoscaling
+
+toc::[]
+
+As a cluster administrator, you can set global and per-namespace default configurations for autoscaling features by modifying the `KnativeServing` custom resource (CR). Changes to the CR are propagated to the relevant config maps.
+
+include::modules/serverless-enable-scale-to-zero.adoc[leveloffset=+1]
+include::modules/serverless-scale-to-zero-grace-period.adoc[leveloffset=+1]
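
The assembly above states that changes to the `KnativeServing` CR propagate to the relevant config maps. As a hedged illustration, setting `enable-scale-to-zero: "false"` in the CR would be expected to surface in the `config-autoscaler` config map roughly as shown here. The `knative-serving` namespace is an assumption that this commit does not state, and a real config map carries additional keys and metadata that are omitted.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving   # assumed namespace; not stated in this commit
data:
  # Values mirror what was set under spec.config.autoscaler in the CR.
  enable-scale-to-zero: "false"
  scale-to-zero-grace-period: "30s"
```

The values should be changed in the CR rather than in the config map, since the CR is what the assembly describes as the source of these settings.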

serverless/develop/serverless-autoscaling-concurrency.adoc

Lines changed: 0 additions & 14 deletions
This file was deleted.

serverless/develop/serverless-autoscaling-developer.adoc

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+[id="serverless-autoscaling-developer"]
+= Autoscaling
+include::modules/common-attributes.adoc[]
+include::modules/serverless-document-attributes.adoc[]
+:context: serverless-autoscaling-developer
+
+toc::[]
+
+Knative Serving provides automatic scaling, or _autoscaling_, for applications to match incoming demand. For example, if an application is receiving no traffic and scale-to-zero is enabled, Knative Serving scales the application down to zero replicas. If scale-to-zero is disabled, the application is scaled down to the xref:../../serverless/develop/serverless-autoscaling-developer.adoc#serverless-autoscaling-developer-minscale[minimum number of replicas specified for applications on the cluster]. Replicas can also be scaled up to meet demand if traffic to the application increases.
+
+If Knative autoscaling is enabled for your cluster, you can configure concurrency and scale bounds for your application.
+
+[NOTE]
+====
+Any limits or targets set in the revision template are measured against a single instance of your application. For example, setting the `target` annotation to `50` configures the autoscaler to scale the application so that each replica handles 50 requests at a time.
+====
+
+[id="serverless-autoscaling-developer-scale-bounds"]
+== Scale bounds
+
+Scale bounds determine the minimum and maximum numbers of replicas that can serve an application at any given time.
+
+You can set scale bounds for an application to help prevent cold starts or control computing costs.
+
+[id="serverless-autoscaling-developer-minscale"]
+=== Minimum scale bounds
+
+The minimum number of replicas that can serve an application is determined by the `minScale` annotation.
+
+The `minScale` value defaults to `0` replicas if the following conditions are met:
+
+* The `minScale` annotation is not set
+* Scale-to-zero is enabled
+* The `KPA` autoscaler class is used
+
+If scale-to-zero is not enabled, the `minScale` value defaults to `1`.
+
+// TODO: Document KPA if supported, link to docs about setting class
+
+// TODO:
+// Add info / links about enabling and disabling autoscaling (admin docs)
+// if `enable-scale-to-zero` is set to `false` in the `config-autoscaler` config map.
+
+.Example service spec with the `minScale` annotation
+[source,yaml]
+----
+apiVersion: serving.knative.dev/v1
+kind: Service
+metadata:
+  name: example-service
+  namespace: default
+spec:
+  template:
+    metadata:
+      annotations:
+        autoscaling.knative.dev/minScale: "0"
+...
+----
+
+include::modules/serverless-autoscaling-minscale-kn.adoc[leveloffset=+3]
+
+[id="serverless-autoscaling-developer-maxscale"]
+=== Maximum scale bounds
+
+The maximum number of replicas that can serve an application is determined by the `maxScale` annotation. If the `maxScale` annotation is not set, there is no upper limit for the number of replicas created.
+
+.Example service spec with the `maxScale` annotation
+[source,yaml]
+----
+apiVersion: serving.knative.dev/v1
+kind: Service
+metadata:
+  name: example-service
+  namespace: default
+spec:
+  template:
+    metadata:
+      annotations:
+        autoscaling.knative.dev/maxScale: "10"
+...
+----
+
+include::modules/serverless-autoscaling-maxscale-kn.adoc[leveloffset=+3]
+
+[id="serverless-autoscaling-developer-concurrency"]
+== Concurrency
+
+Concurrency determines the number of simultaneous requests that can be processed by each replica of an application at any given time.
+
+include::modules/serverless-concurrency-limits.adoc[leveloffset=+2]
+include::modules/serverless-concurrency-limits-configure-soft.adoc[leveloffset=+2]
+include::modules/serverless-concurrency-limits-configure-hard.adoc[leveloffset=+2]
+include::modules/serverless-target-utilization.adoc[leveloffset=+2]
+
+[id="additional-resources_serverless-autoscaling-developer"]
+== Additional resources
+
+* Cluster administrators can enable or disable scale-to-zero for the cluster. For more information, see xref:../../serverless/admin_guide/serverless-admin-autoscaling.adoc#serverless-enable-scale-to-zero_serverless-admin-autoscaling[Enabling scale-to-zero].
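
The developer assembly above documents the `minScale`, `maxScale`, and `target` annotations separately. The manifest below is a minimal sketch of how they might be combined on one service. The specific values are illustrative assumptions, and the image is borrowed from the `kn service create` example elsewhere in this commit.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Keep at least one replica running to help prevent cold starts.
        autoscaling.knative.dev/minScale: "1"
        # Cap the replica count to help control computing costs.
        autoscaling.knative.dev/maxScale: "10"
        # Per-replica target, as described in the NOTE on limits and targets.
        autoscaling.knative.dev/target: "50"
    spec:
      containers:
        - image: quay.io/openshift-knative/knative-eventing-sources-event-display:latest
```

With these values the autoscaler would be expected to keep the service between 1 and 10 replicas. Because `minScale` is `1`, the service does not scale to zero even if scale-to-zero is enabled for the cluster.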

serverless/develop/serverless-autoscaling-scale-bounds.adoc

Lines changed: 0 additions & 71 deletions
This file was deleted.

serverless/develop/serverless-autoscaling.adoc

Lines changed: 0 additions & 16 deletions
This file was deleted.
