docs/user-guide/reference/autoscaling.md (+83 −14)
# Autoscaling

Numaflow [Pipeline](../../core-concepts/pipeline.md) and [MonoVertex](../../core-concepts/monovertex.md) are both able to run with `Horizontal Pod Autoscaling` and `Vertical Pod Autoscaling`.

## Horizontal Pod Autoscaling

### Numaflow Autoscaling

Numaflow provides `0 - N` autoscaling capability out of the box. It is available for all [MonoVertices](../../core-concepts/monovertex.md) and [Pipeline](../../core-concepts/pipeline.md) [vertices](../../core-concepts/vertex.md), including `UDF`, `Sink`, and most of the [`Source`](../sources/overview.md) types (please check each source for more details).

Numaflow autoscaling is enabled by default; some parameters can be fine-tuned to achieve better results.
```yaml
# A Pipeline example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  vertices:
    - name: my-vertex
      scale:
        disabled: false # Optional, defaults to false.
        min: 0 # Optional, minimum replicas, defaults to 0.
        max: 20 # Optional, maximum replicas, defaults to 50.
        lookbackSeconds: 120 # Optional, defaults to 120.
        scaleUpCooldownSeconds: 90 # Optional, defaults to 90.
        scaleDownCooldownSeconds: 90 # Optional, defaults to 90.
        zeroReplicaSleepSeconds: 120 # Optional, defaults to 120.
        targetProcessingSeconds: 20 # Optional, defaults to 20.
        targetBufferAvailability: 50 # Optional, defaults to 50.
        replicasPerScaleUp: 2 # Optional, defaults to 2.
        replicasPerScaleDown: 2 # Optional, defaults to 2.
---
# A MonoVertex example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: MonoVertex
metadata:
  name: my-mvtx
spec:
  scale:
    disabled: false # Optional, defaults to false.
    min: 0 # Optional, minimum replicas, defaults to 0.
    max: 20 # Optional, maximum replicas, defaults to 50.
    lookbackSeconds: 120 # Optional, defaults to 120.
    scaleUpCooldownSeconds: 90 # Optional, defaults to 90.
    scaleDownCooldownSeconds: 90 # Optional, defaults to 90.
    zeroReplicaSleepSeconds: 120 # Optional, defaults to 120.
    targetProcessingSeconds: 20 # Optional, defaults to 20.
    replicasPerScaleUp: 2 # Optional, defaults to 2.
    replicasPerScaleDown: 2 # Optional, defaults to 2.
```
- `disabled` - Whether to disable Numaflow autoscaling, defaults to `false`.
- `min` - Minimum replicas; any integer >= 0 is valid. Defaults to `0`, which means the workload can be scaled down to 0.
- `max` - Maximum replicas; a positive integer that should not be less than `min`, defaults to `50`. If `max` and `min` are the same, that will be the fixed replica number.
- `lookbackSeconds` - How many seconds to look back for the average processing rate (tps) and pending-messages calculation, defaults to `120`. Rate and pending-messages metrics are critical for autoscaling, so you might need to tune this parameter a bit to see better results. For example, if your data source only has 1 minute of input in every 5 minutes and you don't want the vertices to be scaled down to `0`, you need to increase `lookbackSeconds` to cover the 5-minute window, so that the calculated average rate and pending messages are not `0` during the silent period, which prevents scaling down to 0. The maximum value allowed is `600`. On top of this, dynamic lookback adjustment tunes this parameter based on realtime processing data.
- `scaleUpCooldownSeconds` - After a scaling operation, how many seconds to wait for the same Vertex or MonoVertex if the follow-up operation is a scale up, defaults to `90`. Please make sure this time is longer than it takes a pod to become `Running` and start processing, because the autoscaling algorithm divides the TPS by the number of pods even if a pod is not yet `Running`.
- `scaleDownCooldownSeconds` - After a scaling operation, how many seconds to wait for the same Vertex or MonoVertex if the follow-up operation is a scale down, defaults to `90`.
- `zeroReplicaSleepSeconds` - After scaling a Source Vertex (or MonoVertex) down to `0` replicas, how many seconds to wait before scaling up to 1 replica to peek, defaults to `120`. The Numaflow autoscaler periodically scales up a source vertex (or MonoVertex) pod to "peek" at the incoming data; this is the period of time to wait before peeking.
- `targetProcessingSeconds` - Tunes the aggressiveness of autoscaling for source vertices (or MonoVertices); it measures how fast you want the vertex to process all of the pending messages, defaults to `20`. It is only effective for MonoVertices and the `Source` vertices of a Pipeline that support autoscaling; typically, increasing the value leads to a lower processing rate and thus fewer replicas.
- `targetBufferAvailability` - [[Pipeline](../../core-concepts/pipeline.md) only] Targeted buffer availability in percentage, defaults to `50`. It is only effective for the `UDF` and `Sink` vertices of a Pipeline; it determines how aggressive the autoscaling is, and increasing the value will bring up more replicas.
- `replicasPerScaleUp` - Maximum replica change in one scale-up operation, defaults to `2`. For example, if the current replica number is 3 and the calculated desired replica number is 8, instead of scaling up to 8 it only scales to 5.
- `replicasPerScaleDown` - Maximum replica change in one scale-down operation, defaults to `2`. For example, if the current replica number is 9 and the calculated desired replica number is 4, instead of scaling down to 4 it only scales to 7.
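To make the interaction of these parameters concrete, here is an illustrative Python sketch, not Numaflow's actual controller code: the function name and the desired-replica formula (pending messages divided by per-replica capacity over `targetProcessingSeconds`) are assumptions for illustration only, and the real autoscaler also uses lookback windows, cooldowns, and buffer availability.

```python
def next_replicas(current, pending, tps_per_replica,
                  target_processing_seconds=20,
                  min_replicas=0, max_replicas=50,
                  per_scale_up=2, per_scale_down=2):
    """Roughly estimate the next replica count for a source-like workload.

    Illustrative sketch only, not Numaflow's implementation.
    """
    if pending == 0:
        desired = min_replicas  # nothing to do; may scale to zero
    else:
        # Enough replicas to drain `pending` within target_processing_seconds.
        per_replica_capacity = tps_per_replica * target_processing_seconds
        desired = -(-pending // per_replica_capacity)  # ceiling division
    # One operation moves at most per_scale_up / per_scale_down replicas.
    if desired > current:
        desired = min(desired, current + per_scale_up)
    elif desired < current:
        desired = max(desired, current - per_scale_down)
    # Always stay within [min, max].
    return max(min_replicas, min(max_replicas, desired))

# With 3 replicas running and 8 desired, only 5 are requested,
# matching the replicasPerScaleUp example above.
print(next_replicas(current=3, pending=10000, tps_per_replica=25))  # -> 5
```

Under these assumptions, the clamping explains why scaling converges over several reconciliation cycles rather than jumping straight to the computed value.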
To disable Numaflow autoscaling, set `disabled: true` as follows.

```yaml
# A Pipeline example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  vertices:
    - name: my-vertex
      scale:
        disabled: true
---
# A MonoVertex example.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: MonoVertex
metadata:
  name: my-mvtx
spec:
  scale:
    disabled: true
```
**Notes**

Numaflow autoscaling does not apply to reduce vertices of a Pipeline, or to source vertices that have no way to calculate their pending messages.

- Generator
- HTTP

For User-defined Sources, if the function `Pending()` returns a negative value, …

[Kubernetes HPA](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) is supported in Numaflow for any type of Vertex. To use HPA, remember to point the `scaleTargetRef` to the vertex as below, and disable Numaflow autoscaling in your Pipeline spec.
```yaml
# A Pipeline example.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  ... ...
spec:
  ... ...
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: Vertex
    name: my-vertex
---
# A MonoVertex example.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-mvtx-hpa
spec:
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: MonoVertex
    name: my-mvtx
```
With the configuration above, Kubernetes HPA controller will keep the target utilization of the pods of the Vertex at 50%.
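For reference, the Kubernetes HPA controller derives the desired replica count from the ratio of observed to target utilization. A quick sketch of that standard formula (this is generic Kubernetes behavior, not Numaflow-specific code):

```python
import math

def hpa_desired_replicas(current_replicas, current_utilization,
                         target_utilization=50):
    # Kubernetes HPA scaling rule:
    #   desired = ceil(current * currentMetric / targetMetric)
    return math.ceil(current_replicas * current_utilization / target_utilization)

# Two pods running at 100% CPU against a 50% target scale to four pods.
print(hpa_desired_replicas(2, 100, 50))  # -> 4
```

This is why a target of 50% roughly doubles replicas when pods saturate, subject to the `minReplicas`/`maxReplicas` bounds in the spec.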
Third party autoscaling tools like [KEDA](https://keda.sh/) are also supported in Numaflow.

To use KEDA for vertex autoscaling, same as with Kubernetes HPA, point the `scaleTargetRef` to your vertex, and disable Numaflow autoscaling in your Pipeline spec.
```yaml
# A Pipeline example.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  ... ...
spec:
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: Vertex
    name: my-vertex
  ... ...
---
# A MonoVertex example.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-keda-scaler
spec:
  scaleTargetRef:
    apiVersion: numaflow.numaproj.io/v1alpha1
    kind: MonoVertex
    name: my-mvtx
  ... ...
```
## Vertical Pod Autoscaling

`Vertical Pod Autoscaling` can be achieved by setting the `targetRef` to `Vertex` objects as follows.
docs/user-guide/reference/conditional-forwarding.md (+2 −2)
# Conditional Forwarding

In a [pipeline](../../core-concepts/pipeline.md), after processing the data, conditional forwarding can be done based on the `tags` returned in the result. Below is a list of the logic operations that can be applied to tags.

- **and** - forwards the message if all of the specified tags are present in the Message's tags.
- **or** - forwards the message if at least one of the specified tags is present in the Message's tags.
- **not** - forwards the message if none of the specified tags are present in the Message's tags.
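The three operators can be sketched in a few lines of Python; `should_forward` is a hypothetical helper for illustration, not part of Numaflow's API.

```python
def should_forward(operator, condition_tags, message_tags):
    """Decide whether to forward a message on an edge, given the edge's
    tag condition. Illustrative sketch of the and/or/not semantics above."""
    condition = set(condition_tags)
    tags = set(message_tags)
    if operator == "and":   # all specified tags present
        return condition <= tags
    if operator == "or":    # at least one specified tag present
        return bool(condition & tags)
    if operator == "not":   # none of the specified tags present
        return not (condition & tags)
    raise ValueError(f"unknown operator: {operator}")

print(should_forward("and", ["a", "b"], ["a", "b", "c"]))  # True
print(should_forward("or", ["a", "x"], ["b", "c"]))        # False
print(should_forward("not", ["a"], ["b", "c"]))            # True
```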
docs/user-guide/reference/join-vertex.md (+10 −8)
# Joins and Cycles

Numaflow [Pipeline](../../core-concepts/pipeline.md) Edges can be defined such that multiple Vertices can forward messages to a single vertex.

### Quick Start

Please see the following examples:

## Why do we need JOIN

### Without JOIN

Without JOIN, Numaflow only allowed users to build [pipelines](../../core-concepts/pipeline.md) where [vertices](../../core-concepts/vertex.md) could read from only _one_ previous vertex. This meant that Numaflow could only support simple pipelines or tree-like pipelines. Supporting pipelines that had to read from multiple sources or UDFs was cumbersome and required creating redundant vertices.

### With JOIN

Join vertices give users the flexibility to read from multiple sources, process data from multiple UDFs, and even write to a single sink. The Pipeline Spec doesn't change at all with JOIN; you can now create multiple Edges that have the same "To" Vertex, which would have otherwise been prohibited.
To achieve higher throughput (> 10K but < 30K tps), users can create [pipelines](../../core-concepts/pipeline.md) with multi-partitioned edges. Multi-partitioned edges are only supported for pipelines with JetStream as the [Inter-Step Buffer](../../core-concepts/inter-step-buffer.md). Please ensure that JetStream is provisioned with enough nodes to support the higher throughput.

Since partitions are owned by the vertex reading the data, to create a multi-partitioned edge we need to configure the vertex reading the data (the to-vertex) to have multiple partitions.

The following snippet shows how to configure a vertex (in this case, the `cat` vertex) to have multiple partitions, enabling it to read at a higher throughput.
```yaml
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  vertices:
    - name: cat
      partitions: 3
      udf:
        builtin:
          name: cat # A built-in UDF which simply cats the message
```
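Conceptually, a multi-partitioned edge behaves like several independent buffers that can be written to and read from in parallel. The sketch below assumes hash-based partition selection purely for illustration; it is not Numaflow's actual partitioning logic.

```python
from collections import defaultdict
from zlib import crc32

def pick_partition(key: str, partitions: int) -> int:
    # Stable hash, so messages with the same key land in the same partition.
    return crc32(key.encode()) % partitions

# Distribute some messages across the 3 partitions of the `cat` vertex.
buffers = defaultdict(list)
for i in range(9):
    msg = f"msg-{i}"
    buffers[pick_partition(msg, 3)].append(msg)

# The reading vertex's pods can consume the partitions concurrently,
# which is what raises the achievable read throughput.
print({p: len(v) for p, v in sorted(buffers.items())})
```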
Similar to [pipeline tuning](./pipeline-tuning.md), certain parameters can be fine-tuned for data processing with a [MonoVertex](../../core-concepts/monovertex.md).

Each [MonoVertex](../../core-concepts/monovertex.md) keeps running a cycle of reading data from the data source, processing the data, and writing to a sink. There are some parameters that can be adjusted for this cycle.

- `readBatchSize` - How many messages to read in each cycle, defaults to `500`. It works together with `readTimeout` during a read operation; the read concludes when either limit is reached first.
- `readTimeout` - Read timeout from the source, defaults to `1s`.
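The interplay of `readBatchSize` and `readTimeout` can be sketched as a read loop that ends at whichever limit is hit first. This is a simplified model, not Numaflow's source implementation; the `source` callable and its return-`None`-when-empty contract are hypothetical stand-ins.

```python
import time

def read_batch(source, read_batch_size=500, read_timeout=1.0,
               clock=time.monotonic):
    """Read up to `read_batch_size` messages, giving up after
    `read_timeout` seconds; whichever limit is hit first ends the cycle."""
    batch = []
    deadline = clock() + read_timeout
    while len(batch) < read_batch_size and clock() < deadline:
        msg = source()  # hypothetical non-blocking read; None when empty
        if msg is None:
            continue
        batch.append(msg)
    return batch

# With a fast in-memory source, the batch-size limit ends the cycle first.
it = iter(range(10_000))
batch = read_batch(lambda: next(it), read_batch_size=500, read_timeout=1.0)
print(len(batch))  # -> 500
```

A slow or bursty source would instead hit the timeout and return a smaller batch, which is why the two limits are tuned together.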
These parameters can be customized under `spec.limits` as below.