docs/user-guide/reference/autoscaling.md (+83 −14)
# Autoscaling

Numaflow [Pipeline](../../core-concepts/pipeline.md) and [MonoVertex](../../core-concepts/monovertex.md) are both able to run with `Horizontal Pod Autoscaling` and `Vertical Pod Autoscaling`.

## Horizontal Pod Autoscaling

### Numaflow Autoscaling

Numaflow provides `0 - N` autoscaling capability out of the box. It is available for all [MonoVertices](../../core-concepts/monovertex.md) and [Pipeline](../../core-concepts/pipeline.md) [vertices](../../core-concepts/vertex.md), including `UDF`, `Sink`, and most of the [`Source`](../sources/overview.md) types (please check each source for more details).

Numaflow autoscaling is enabled by default; some parameters can be fine-tuned to achieve better results.
```yaml
# A Pipeline example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  vertices:
    - name: my-vertex
      scale:
        disabled: false # Optional, defaults to false.
        min: 0 # Optional, minimum replicas, defaults to 0.
        max: 20 # Optional, maximum replicas, defaults to 50.
        lookbackSeconds: 120 # Optional, defaults to 120.
        scaleUpCooldownSeconds: 90 # Optional, defaults to 90.
        scaleDownCooldownSeconds: 90 # Optional, defaults to 90.
        zeroReplicaSleepSeconds: 120 # Optional, defaults to 120.
        targetProcessingSeconds: 20 # Optional, defaults to 20.
        targetBufferAvailability: 50 # Optional, defaults to 50.
        replicasPerScaleUp: 2 # Optional, defaults to 2.
        replicasPerScaleDown: 2 # Optional, defaults to 2.
---
# A MonoVertex example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: MonoVertex
metadata:
  name: my-mvtx
spec:
  scale:
    disabled: false # Optional, defaults to false.
    min: 0 # Optional, minimum replicas, defaults to 0.
    max: 20 # Optional, maximum replicas, defaults to 50.
    lookbackSeconds: 120 # Optional, defaults to 120.
    scaleUpCooldownSeconds: 90 # Optional, defaults to 90.
    scaleDownCooldownSeconds: 90 # Optional, defaults to 90.
    zeroReplicaSleepSeconds: 120 # Optional, defaults to 120.
    targetProcessingSeconds: 20 # Optional, defaults to 20.
    replicasPerScaleUp: 2 # Optional, defaults to 2.
    replicasPerScaleDown: 2 # Optional, defaults to 2.
```
- `disabled` - Whether to disable Numaflow autoscaling, defaults to `false`.
- `min` - Minimum replicas; any integer >= 0 is valid. Defaults to `0`, which means the workload can be scaled down to 0.
- `max` - Maximum replicas; a positive integer that should not be less than `min`, defaults to `50`. If `max` and `min` are the same, that will be the fixed replica number.
- `lookbackSeconds` - How many seconds to look back for the average processing rate (tps) and pending-messages calculation, defaults to `120`. Rate and pending-messages metrics are critical for autoscaling, so you might need to tune this parameter a bit to see better results. For example, if your data source only has 1 minute of input in every 5 minutes and you don't want the vertices to be scaled down to `0`, you need to increase `lookbackSeconds` to cover the 5-minute window, so that the calculated average rate and pending messages are not `0` during the silent period, which prevents scaling down to 0. The maximum value allowed is `600`. On top of this, dynamic lookback adjustment tunes this parameter based on realtime processing data.
- `scaleUpCooldownSeconds` - After a scaling operation, how many seconds to wait for the same Vertex or MonoVertex if the follow-up operation is a scale up, defaults to `90`. Please make sure this time is longer than it takes a pod to become `Running` and start processing, because the autoscaling algorithm divides the TPS by the number of pods even if a pod is not yet `Running`.
- `scaleDownCooldownSeconds` - After a scaling operation, how many seconds to wait for the same Vertex or MonoVertex if the follow-up operation is a scale down, defaults to `90`.
- `zeroReplicaSleepSeconds` - After scaling a Source Vertex (or MonoVertex) down to `0` replicas, how many seconds to wait before scaling up to 1 replica to peek, defaults to `120`. The Numaflow autoscaler periodically scales up a source vertex (or MonoVertex) pod to "peek" at the incoming data; this is the period of time to wait before peeking.
- `targetProcessingSeconds` - Tunes the aggressiveness of autoscaling for source vertices (or MonoVertices); it measures how fast you want the vertex to process all of the pending messages, defaults to `20`. It is only effective for MonoVertices and the `Source` vertices of a Pipeline that support autoscaling; typically, increasing the value leads to a lower processing rate and thus fewer replicas.
- `targetBufferAvailability` - [[Pipeline](../../core-concepts/pipeline.md) only] Targeted buffer availability in percentage, defaults to `50`. It is only effective for the `UDF` and `Sink` vertices of a Pipeline; it determines how aggressive the autoscaling is, and increasing the value will bring up more replicas.
- `replicasPerScaleUp` - Maximum replica change in one scale-up operation, defaults to `2`. For example, if the current replica number is 3 and the calculated desired replica number is 8, instead of scaling up to 8 it only scales to 5.
- `replicasPerScaleDown` - Maximum replica change in one scale-down operation, defaults to `2`. For example, if the current replica number is 9 and the calculated desired replica number is 4, instead of scaling down to 4 it only scales to 7.
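To make the interaction of these parameters concrete, here is an illustrative Python sketch, not Numaflow's actual controller code: the function name and the desired-replica formula (pending messages divided by per-replica capacity over `targetProcessingSeconds`) are assumptions for illustration only, and the real autoscaler also uses lookback windows, cooldowns, and buffer availability.

```python
def next_replicas(current, pending, tps_per_replica,
                  target_processing_seconds=20,
                  min_replicas=0, max_replicas=50,
                  per_scale_up=2, per_scale_down=2):
    """Roughly estimate the next replica count for a source-like workload.

    Illustrative sketch only, not Numaflow's implementation.
    """
    if pending == 0:
        desired = min_replicas  # nothing to do; may scale to zero
    else:
        # Enough replicas to drain `pending` within target_processing_seconds.
        per_replica_capacity = tps_per_replica * target_processing_seconds
        desired = -(-pending // per_replica_capacity)  # ceiling division
    # One operation moves at most per_scale_up / per_scale_down replicas.
    if desired > current:
        desired = min(desired, current + per_scale_up)
    elif desired < current:
        desired = max(desired, current - per_scale_down)
    # Always stay within [min, max].
    return max(min_replicas, min(max_replicas, desired))

# With 3 replicas running and 8 desired, only 5 are requested,
# matching the replicasPerScaleUp example above.
print(next_replicas(current=3, pending=10000, tps_per_replica=25))  # -> 5
```

Under these assumptions, the clamping explains why scaling converges over several reconciliation cycles rather than jumping straight to the computed value.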
To disable Numaflow autoscaling, set `disabled: true` as follows.

```yaml
# A Pipeline example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  vertices:
    - name: my-vertex
      scale:
        disabled: true
---
# A MonoVertex example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: MonoVertex
metadata:
  name: my-mvtx
spec:
  scale:
    disabled: true
```
**Notes**

Numaflow autoscaling does not apply to reduce vertices of a Pipeline, or to source vertices that have no way to calculate their pending messages.

- Generator
- HTTP

For User-defined Sources, if the function `Pending()` returns a negative value, …

[Kubernetes HPA](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) is supported in Numaflow for any type of Vertex. To use HPA, remember to point the `scaleTargetRef` to the vertex as below, and disable Numaflow autoscaling in your Pipeline spec.
```yaml
# A Pipeline example.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  ... ...
spec:
  ... ...
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: Vertex
    name: my-vertex
---
# A MonoVertex example.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-mvtx-hpa
spec:
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: MonoVertex
    name: my-mvtx
```
With the configuration above, Kubernetes HPA controller will keep the target utilization of the pods of the Vertex at 50%.
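For reference, the Kubernetes HPA controller derives the desired replica count from the ratio of observed to target utilization. A quick sketch of that standard formula (this is generic Kubernetes behavior, not Numaflow-specific code):

```python
import math

def hpa_desired_replicas(current_replicas, current_utilization,
                         target_utilization=50):
    # Kubernetes HPA scaling rule:
    #   desired = ceil(current * currentMetric / targetMetric)
    return math.ceil(current_replicas * current_utilization / target_utilization)

# Two pods running at 100% CPU against a 50% target scale to four pods.
print(hpa_desired_replicas(2, 100, 50))  # -> 4
```

This is why a target of 50% roughly doubles replicas when pods saturate, subject to the `minReplicas`/`maxReplicas` bounds in the spec.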
Third party autoscaling tools like [KEDA](https://keda.sh/) are also supported in Numaflow.

To use KEDA for vertex autoscaling, same as with Kubernetes HPA, point the `scaleTargetRef` to your vertex, and disable Numaflow autoscaling in your Pipeline spec.
```yaml
# A Pipeline example.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  ... ...
spec:
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: Vertex
    name: my-vertex
  ... ...
---
# A MonoVertex example.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-keda-scaler
spec:
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: MonoVertex
    name: my-mvtx
  ... ...
```
## Vertical Pod Autoscaling

`Vertical Pod Autoscaling` can be achieved by setting the `targetRef` to `Vertex` objects as follows.
docs/user-guide/reference/conditional-forwarding.md (+2 −2)
# Conditional Forwarding

In a [pipeline](../../core-concepts/pipeline.md), after processing the data, conditional forwarding can be done based on the `tags` returned in the result. Below is a list of the logic operations that can be applied to tags.

- **and** - forwards the message if all of the specified tags are present in the Message's tags.
- **or** - forwards the message if at least one of the specified tags is present in the Message's tags.
- **not** - forwards the message if none of the specified tags are present in the Message's tags.
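The three operators can be sketched in a few lines of Python; `should_forward` is a hypothetical helper for illustration, not part of Numaflow's API.

```python
def should_forward(operator, condition_tags, message_tags):
    """Decide whether to forward a message on an edge, given the edge's
    tag condition. Illustrative sketch of the and/or/not semantics above."""
    condition = set(condition_tags)
    tags = set(message_tags)
    if operator == "and":   # all specified tags present
        return condition <= tags
    if operator == "or":    # at least one specified tag present
        return bool(condition & tags)
    if operator == "not":   # none of the specified tags present
        return not (condition & tags)
    raise ValueError(f"unknown operator: {operator}")

print(should_forward("and", ["a", "b"], ["a", "b", "c"]))  # True
print(should_forward("or", ["a", "x"], ["b", "c"]))        # False
print(should_forward("not", ["a"], ["b", "c"]))            # True
```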
docs/user-guide/reference/join-vertex.md (+10 −8)
# Joins and Cycles

Numaflow [Pipeline](../../core-concepts/pipeline.md) Edges can be defined such that multiple Vertices can forward messages to a single vertex.

### Quick Start

Please see the following examples:

## Why do we need JOIN

### Without JOIN

Without JOIN, Numaflow only allowed users to build [pipelines](../../core-concepts/pipeline.md) where [vertices](../../core-concepts/vertex.md) could read from only _one_ previous vertex. This meant that Numaflow could only support simple pipelines or tree-like pipelines. Supporting pipelines that had to read from multiple sources or UDFs was cumbersome and required creating redundant vertices.

### With JOIN

Join vertices give users the flexibility to read from multiple sources, process data from multiple UDFs, and even write to a single sink. The Pipeline Spec doesn't change at all with JOIN; you can now create multiple Edges that have the same "To" Vertex, which would have otherwise been prohibited.
To achieve higher throughput (> 10K but < 30K tps), users can create [pipelines](../../core-concepts/pipeline.md) with multi-partitioned edges. Multi-partitioned edges are only supported for pipelines with JetStream as the [Inter-Step Buffer](../../core-concepts/inter-step-buffer.md). Please ensure that JetStream is provisioned with enough nodes to support the higher throughput.

Since partitions are owned by the vertex reading the data, to create a multi-partitioned edge we need to configure the vertex reading the data (the to-vertex) to have multiple partitions.

The following snippet shows how to configure a vertex (in this case, the `cat` vertex) to have multiple partitions, enabling it to read at a higher throughput.
```yaml
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  vertices:
    - name: cat
      partitions: 3
      udf:
        builtin:
          name: cat # A built-in UDF which simply cats the message
```
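Conceptually, a multi-partitioned edge behaves like several independent buffers that can be written to and read from in parallel. The sketch below assumes hash-based partition selection purely for illustration; it is not Numaflow's actual partitioning logic.

```python
from collections import defaultdict
from zlib import crc32

def pick_partition(key: str, partitions: int) -> int:
    # Stable hash, so messages with the same key land in the same partition.
    return crc32(key.encode()) % partitions

# Distribute some messages across the 3 partitions of the `cat` vertex.
buffers = defaultdict(list)
for i in range(9):
    msg = f"msg-{i}"
    buffers[pick_partition(msg, 3)].append(msg)

# The reading vertex's pods can consume the partitions concurrently,
# which is what raises the achievable read throughput.
print({p: len(v) for p, v in sorted(buffers.items())})
```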
Similar to [pipeline tuning](./pipeline-tuning.md), certain parameters can be fine-tuned for data processing with a [MonoVertex](../../core-concepts/monovertex.md).

Each [MonoVertex](../../core-concepts/monovertex.md) keeps running a cycle of reading data from the data source, processing the data, and writing to a sink. There are some parameters that can be adjusted for this cycle.

- `readBatchSize` - How many messages to read in each cycle, defaults to `500`. It works together with `readTimeout` during a read operation; the read concludes when either limit is reached first.
- `readTimeout` - Read timeout from the source, defaults to `1s`.
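The interplay of `readBatchSize` and `readTimeout` can be sketched as a read loop that ends at whichever limit is hit first. This is a simplified model, not Numaflow's source implementation; the `source` callable and its return-`None`-when-empty contract are hypothetical stand-ins.

```python
import time

def read_batch(source, read_batch_size=500, read_timeout=1.0,
               clock=time.monotonic):
    """Read up to `read_batch_size` messages, giving up after
    `read_timeout` seconds; whichever limit is hit first ends the cycle."""
    batch = []
    deadline = clock() + read_timeout
    while len(batch) < read_batch_size and clock() < deadline:
        msg = source()  # hypothetical non-blocking read; None when empty
        if msg is None:
            continue
        batch.append(msg)
    return batch

# With a fast in-memory source, the batch-size limit ends the cycle first.
it = iter(range(10_000))
batch = read_batch(lambda: next(it), read_batch_size=500, read_timeout=1.0)
print(len(batch))  # -> 500
```

A slow or bursty source would instead hit the timeout and return a smaller batch, which is why the two limits are tuned together.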
These parameters can be customized under `spec.limits` as below.