
Commit 58fa259

Merge pull request #38284 from abrennan89/SRVKS-573
SRVKS-573: Updating autoscaling docs
2 parents 8cde50a + 510b974, commit 58fa259

13 files changed: +253 -163 lines

_topic_map.yml

Lines changed: 11 additions & 3 deletions
@@ -3189,9 +3189,6 @@ Topics:
 # Knative services
 - Name: Serverless applications
   File: serverless-applications
-# Autoscaling
-- Name: Configuring Knative Serving autoscaling
-  File: configuring-knative-serving-autoscaling
 - Name: Traffic management
   File: serverless-traffic-management
 - Name: Cluster logging with OpenShift Serverless
@@ -3208,6 +3205,17 @@ Topics:
 - Name: Metrics
   File: serverless-serving-metrics
 #
+# Autoscaling
+- Name: Autoscaling
+  Dir: autoscaling
+  Topics:
+  - Name: About autoscaling
+    File: serverless-autoscaling
+  - Name: Scale bounds
+    File: serverless-autoscaling-scale-bounds
+  - Name: Concurrency
+    File: serverless-autoscaling-concurrency
+#
 # Knative Eventing
 - Name: Knative Eventing
   Dir: knative_eventing

modules/configuring-scale-bounds-knative.adoc

Lines changed: 0 additions & 32 deletions
This file was deleted.

modules/knative-serving-concurrent-autoscaling-requests.adoc

Lines changed: 0 additions & 84 deletions
This file was deleted.
Lines changed: 19 additions & 0 deletions

[id="serverless-autoscaling-maxscale-kn_{context}"]
= Setting the maxScale annotation by using the Knative CLI

You can use the `kn service` command with the `--max-scale` flag to create or modify the `maxScale` value for a service.

.Procedure

* Set the maximum number of pods for the service by using the `--max-scale` flag:
+
[source,terminal]
----
$ kn service create <service_name> --image <image_uri> --max-scale <integer>
----
+
.Example command
[source,terminal]
----
$ kn service create example-service --image quay.io/openshift-knative/knative-eventing-sources-event-display:latest --max-scale 10
----
Lines changed: 21 additions & 0 deletions

[id="serverless-autoscaling-minscale_{context}"]
= Setting the minScale annotation by using the Knative CLI

You can use the `kn service` command with the `--min-scale` flag to create or modify the `minScale` value for a service.

.Procedure

* Set the minimum number of pods for the service by using the `--min-scale` flag:
+
.Examples
[source,terminal]
----
$ kn service create <service_name> --image <image_uri> --min-scale <integer>
----
+
[source,terminal]
----
$ kn service create example-service --image quay.io/openshift-knative/knative-eventing-sources-event-display:latest --min-scale 2
----

// TODO: Check if it can be used with update and other service commands.
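The minScale and maxScale annotations bound the replica count that the autoscaler is allowed to choose. As a minimal sketch (the function name and the treatment of `0` as "no upper bound" are illustrative assumptions, not Knative source code), the clamping works like this:

```python
def bounded_replicas(desired: int, min_scale: int, max_scale: int) -> int:
    """Clamp the autoscaler's desired pod count to [min_scale, max_scale].

    Illustrative sketch: max_scale of 0 is treated as "no upper bound",
    mirroring an unset maxScale annotation.
    """
    if max_scale > 0:
        desired = min(desired, max_scale)
    return max(desired, min_scale)

print(bounded_replicas(desired=14, min_scale=2, max_scale=10))  # 10
print(bounded_replicas(desired=0, min_scale=2, max_scale=10))   # 2
```

With `--min-scale 2`, the service never scales to zero pods even when idle; with `--max-scale 10`, a traffic spike cannot push it past ten pods.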
Lines changed: 42 additions & 0 deletions

[id="serverless-concurrency-limits-configure-hard_{context}"]
= Configuring a hard concurrency limit

You can specify a hard concurrency limit for your Knative service by modifying the `containerConcurrency` spec, or by using the `kn service` command with the correct flags.

// However, a default value can be set for the Revision's containerConcurrency field in config-defaults.yaml.
// add note about this for admins to see? Need more details about config-defaults though

.Procedure

* Optional: Set the `containerConcurrency` spec for your Knative service in the spec of the `Service` custom resource:
+
.Example service spec
[source,yaml]
----
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  template:
    spec:
      containerConcurrency: 50
----
+
The default value is `0`, which means that there is no limit on the number of requests that are permitted to flow into one pod of the service at a time.
+
A value greater than `0` specifies the exact number of requests that are permitted to flow into one pod of the service at a time. This example enables a hard concurrency limit of 50 requests at a time.

* Optional: Use the `kn service` command to specify the `--concurrency-limit` flag:
+
[source,terminal]
----
$ kn service create <service_name> --image <image_uri> --concurrency-limit <integer>
----
+
.Example command to create a service with a concurrency limit of 50 requests
[source,terminal]
----
$ kn service create example-service --image quay.io/openshift-knative/knative-eventing-sources-event-display:latest --concurrency-limit 50
----
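The effect of `containerConcurrency` can be pictured as a per-pod semaphore: at most the configured number of requests execute at once, and surplus requests wait for capacity. The class below is a toy model written for this document (the names are hypothetical), not Knative's actual queue-proxy implementation:

```python
import threading

class HardLimitPod:
    """Toy model of containerConcurrency: at most `limit` requests run in
    one pod at a time; surplus callers block until capacity frees up."""

    def __init__(self, limit: int):
        # limit > 0 enforces the bound; 0 models "unlimited" (the default).
        self._sem = threading.Semaphore(limit) if limit > 0 else None

    def handle(self, work):
        if self._sem is None:
            return work()          # no hard limit configured
        with self._sem:            # blocks when `limit` requests are in flight
            return work()
```

For example, `HardLimitPod(50)` would let 50 calls to `handle` run concurrently and make the 51st wait, which is why a low hard limit can add latency under bursty load.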
Lines changed: 36 additions & 0 deletions

[id="serverless-concurrency-limits-configure-soft_{context}"]
= Configuring a soft concurrency target

You can specify a soft concurrency target for your Knative service by setting the `autoscaling.knative.dev/target` annotation in the spec, or by using the `kn service` command with the correct flags.

.Procedure

* Optional: Set the `autoscaling.knative.dev/target` annotation for your Knative service in the spec of the `Service` custom resource:
+
.Example service spec
[source,yaml]
----
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "200"
----

* Optional: Use the `kn service` command to specify the `--concurrency-target` flag:
+
[source,terminal]
----
$ kn service create <service_name> --image <image_uri> --concurrency-target <integer>
----
+
.Example command to create a service with a concurrency target of 50 requests
[source,terminal]
----
$ kn service create example-service --image quay.io/openshift-knative/knative-eventing-sources-event-display:latest --concurrency-target 50
----
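The soft target drives the scaling decision: roughly, the autoscaler provisions enough pods that each serves about the target number of concurrent requests. A minimal sketch of that arithmetic follows; the real Knative autoscaler additionally averages concurrency over time windows and has panic-mode behavior, so this is an illustration, not the actual algorithm:

```python
import math

def desired_pods(observed_concurrency: float, target: float, min_pods: int = 1) -> int:
    """Rough sketch of concurrency-based scaling: provision enough pods so
    that each one serves about `target` concurrent requests."""
    if observed_concurrency <= 0:
        return min_pods
    return max(min_pods, math.ceil(observed_concurrency / target))

print(desired_pods(observed_concurrency=800, target=200))  # 4
```

With a target of 200, a burst of 800 concurrent requests would steer the service toward four pods; because the target is soft, each pod may briefly serve more than 200 requests while the scale-up happens.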
Lines changed: 17 additions & 0 deletions

[id="serverless-concurrency-limits_{context}"]
= Concurrency limits and targets

Concurrency can be configured as either a _soft limit_ or a _hard limit_:

* A soft limit is a targeted request limit, rather than a strictly enforced bound. For example, if there is a sudden burst of traffic, the soft limit target can be exceeded.

* A hard limit is a strictly enforced upper bound on requests. If concurrency reaches the hard limit, surplus requests are buffered and must wait until there is enough free capacity to execute them.
+
[IMPORTANT]
====
Using a hard limit configuration is only recommended if there is a clear use case for it with your application. Specifying a low hard limit might negatively affect the throughput and latency of an application, and might cause cold starts.
====

Adding a soft target and a hard limit means that the autoscaler targets the soft target number of concurrent requests, but imposes the hard limit value as the maximum number of requests.

If the hard limit value is less than the soft limit value, the soft limit value is tuned down, because there is no need to target more requests than the number that can actually be handled.
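The tuning-down rule in the last paragraph can be sketched in a few lines; the function name is hypothetical, and `0` models an unset hard limit, matching the `containerConcurrency` default:

```python
def effective_target(soft_target: float, hard_limit: int) -> float:
    """If a hard limit (containerConcurrency > 0) is lower than the soft
    target, tune the target down to the hard limit: a pod can never serve
    more concurrent requests than the hard limit allows."""
    if hard_limit > 0:
        return min(soft_target, hard_limit)
    return soft_target

print(effective_target(soft_target=200, hard_limit=50))  # 50.0 is not printed; prints 50
```

So a service with a soft target of 200 and a hard limit of 50 is effectively autoscaled against a target of 50 concurrent requests per pod.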

modules/serverless-workflow-autoscaling-kn.adoc

Lines changed: 0 additions & 24 deletions
This file was deleted.
Lines changed: 13 additions & 0 deletions

[id="serverless-autoscaling-concurrency"]
= Concurrency
include::modules/common-attributes.adoc[]
include::modules/serverless-document-attributes.adoc[]
:context: serverless-autoscaling-concurrency

toc::[]

Concurrency determines the number of simultaneous requests that can be processed by each pod of an application at any given time.

include::modules/serverless-concurrency-limits.adoc[leveloffset=+1]
include::modules/serverless-concurrency-limits-configure-soft.adoc[leveloffset=+2]
include::modules/serverless-concurrency-limits-configure-hard.adoc[leveloffset=+2]
