From ffd246e6b12d6d2c25b05c6cd20f034e8b8e8f5f Mon Sep 17 00:00:00 2001
From: Saylor Berman
Date: Fri, 8 Aug 2025 11:44:52 -0600
Subject: [PATCH 1/3] NGF: Update scaling documentation and API
Updates the NGF scaling doc for the new capabilities being added in 2.1: Horizontal Pod Autoscaling and configurable worker connections. Also adds the updated API reference for these changes.
Finally, adds some missing conditions to the compatibility doc.
---
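Reviewer note: the three `NginxProxy` fragments added to scaling.md can be read together as one resource. The manifest below is a minimal sketch, not the authoritative spec. The `apiVersion` is an assumption (check the API reference for your release), the resource name and namespace are taken from the `kubectl edit` command in the doc, and all numeric values are illustrative only.

```yaml
# Sketch only: combines the fields documented in this patch.
# apiVersion is an assumption; confirm it against content/ngf/reference/api.md.
apiVersion: gateway.nginx.org/v1alpha2
kind: NginxProxy
metadata:
  name: ngf-proxy-config      # name used in the doc's kubectl edit example
  namespace: nginx-gateway    # default control plane namespace
spec:
  workerConnections: 4096     # default is 1024
  kubernetes:
    deployment:
      replicas: 2             # used until the HPA is running
      autoscaling:
        enable: true
        minReplicas: 2        # illustrative value
        maxReplicas: 10
        targetCPUUtilizationPercentage: 75   # illustrative value
```

Raising `workerConnections` increases per-instance capacity, while `replicas` and `autoscaling` add instances, so the two approaches can be combined.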
content/ngf/how-to/scaling.md | 56 ++-
.../ngf/overview/gateway-api-compatibility.md | 2 +
content/ngf/reference/api.md | 435 +++++++++++++++++-
3 files changed, 468 insertions(+), 25 deletions(-)
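Reviewer note: for the "manually define the HPA and deploy it" path on the control plane, a standard `autoscaling/v2` manifest along these lines should work. This is a hedged sketch: the HPA name is hypothetical, the target deployment name is taken from the `kubectl scale` example in the doc, and the replica counts and CPU target are illustrative. CPU-based scaling also assumes a metrics source such as metrics-server and CPU requests set on the control plane pods.

```yaml
# Sketch only: a manually defined HPA for the control plane deployment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ngf-control-plane-hpa       # hypothetical name
  namespace: nginx-gateway          # default control plane namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ngf-nginx-gateway-fabric  # deployment name from the kubectl scale example
  minReplicas: 2                    # illustrative value
  maxReplicas: 5                    # illustrative value
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70    # illustrative value
```

Saved to a file, the manifest can be applied with `kubectl apply -f <file>`.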
diff --git a/content/ngf/how-to/scaling.md b/content/ngf/how-to/scaling.md
index 8e9961798..4f08e58a0 100644
--- a/content/ngf/how-to/scaling.md
+++ b/content/ngf/how-to/scaling.md
@@ -16,36 +16,60 @@ It provides guidance on how to scale each plane effectively, and when you should
The data plane is the NGINX deployment that handles user traffic to backend applications. Every Gateway object created provisions its own NGINX deployment and configuration.
-You have two options for scaling the data plane:
+You have multiple options for scaling the data plane:
+- Increasing the number of [worker connections](https://nginx.org/en/docs/ngx_core_module.html#worker_connections) for an existing deployment
- Increasing the number of replicas for an existing deployment
- Creating a new Gateway for a new data plane
-#### When to increase replicas or create a new Gateway
+#### When to increase worker connections, replicas, or create a new Gateway
-Understanding when to increase replicas or create a new Gateway is key to managing traffic effectively.
+Understanding when to increase worker connections, replicas, or create a new Gateway is key to managing traffic effectively.
-Increasing data plane replicas is ideal when you need to handle more traffic without changing the configuration.
+Increasing worker connections or replicas is ideal when you need to handle more traffic without changing the overall routing configuration. Setting the worker connections field allows a single nginx data plane instance to handle more connections without needing to scale the replicas. However, scaling the replicas can be beneficial to reduce single points of failure.
-For example, if you're routing traffic to `api.example.com` and notice an increase in load, you can scale the replicas from 1 to 5 to better distribute the traffic and reduce latency.
+Scaling replicas can be done manually or automatically using a [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) (HPA).
-All replicas will share the same configuration from the Gateway used to set up the data plane, simplifying configuration management.
+To update worker connections (default: 1024), static replicas, or enable autoscaling, you can edit the `NginxProxy` resource:
-There are two ways to modify the number of replicas for an NGINX deployment:
+```shell
+kubectl edit nginxproxies.gateway.nginx.org ngf-proxy-config -n nginx-gateway
+```
-First, at the time of installation you can modify the field `nginx.replicas` in the `values.yaml` or add the `--set nginx.replicas=` flag to the `helm install` command:
+{{< note >}}The NginxProxy resource in this example lives in the control plane namespace (default: `nginx-gateway`) and applies to the GatewayClass, but you can also define one per Gateway. See the [Data plane configuration]({{< ref "/ngf/how-to/data-plane-configuration.md" >}}) document for more information. {{< /note >}}
-```shell
-helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginx.replicas=5
+- Worker connections is set using the `workerConnections` field:
+
+```yaml
+spec:
+ workerConnections: 4096
```
-Secondly, you can update the `NginxProxy` resource while NGINX is running to modify the `kubernetes.deployment.replicas` field and scale the data plane deployment dynamically:
+- Replicas are set using the `kubernetes.deployment.replicas` field:
-```shell
-kubectl edit nginxproxies.gateway.nginx.org ngf-proxy-config -n nginx-gateway
+```yaml
+spec:
+ kubernetes:
+ deployment:
+ replicas: 3
```
-The alternate way to scale the data plane is by creating a new Gateway. This is beneficial when you need distinct configurations, isolation, or separate policies.
+- Autoscaling can be enabled using the `kubernetes.deployment.autoscaling` field. The default `replicas` value will be used until the Horizontal Pod Autoscaler is running.
+
+```yaml
+spec:
+ kubernetes:
+ deployment:
+ autoscaling:
+ enable: true
+ maxReplicas: 10
+```
+
+See the `NginxProxy` section of the [API reference]({{< ref "/ngf/reference/api.md" >}}) for the full specification.
+
+You can also set all of these fields at installation time in the [Helm values](https://github.com/nginx/nginx-gateway-fabric/blob/main/charts/nginx-gateway-fabric/values.yaml).
+
+An alternate way to scale the data plane is by creating a new Gateway. This is beneficial when you need distinct configurations, isolation, or separate policies.
For example, if you're routing traffic to a new domain `admin.example.com` and require a different TLS certificate, stricter rate limits, or separate authentication policies, creating a new Gateway could be a good approach.
@@ -60,7 +84,9 @@ Scaling the control plane can be beneficial in the following scenarios:
1. _Higher availability_ - When a control plane pod crashes, runs out of memory, or goes down during an upgrade, it can interrupt configuration delivery. By scaling to multiple replicas, another pod can quickly step in and take over, keeping things running smoothly with minimal downtime.
1. _Faster configuration distribution_ - As the number of connected NGINX instances grows, a single control plane pod may become a bottleneck in handling connections or streaming configuration updates. Scaling the control plane improves concurrency and responsiveness when delivering configuration over gRPC.
-To scale the control plane, use the `kubectl scale` command on the control plane deployment to increase or decrease the number of replicas. For example, the following command scales the control plane deployment to 3 replicas:
+To automatically scale the control plane, you can create a [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) (HPA) in the control plane namespace (default: `nginx-gateway`). At installation time, the [NGINX Gateway Fabric Helm chart](https://github.com/nginx/nginx-gateway-fabric/blob/main/charts/nginx-gateway-fabric/values.yaml) lets you set the HPA configuration in the `nginxGateway.autoscaling` section, which provisions an HPA for you. If NGINX Gateway Fabric is already running, you can define the HPA manually and deploy it.
+
+To manually scale the control plane, use the `kubectl scale` command on the control plane deployment to increase or decrease the number of replicas. For example, the following command scales the control plane deployment to 3 replicas:
```shell
kubectl scale deployment -n nginx-gateway ngf-nginx-gateway-fabric --replicas 3
diff --git a/content/ngf/overview/gateway-api-compatibility.md b/content/ngf/overview/gateway-api-compatibility.md
index bda158664..0b77c3cc3 100644
--- a/content/ngf/overview/gateway-api-compatibility.md
+++ b/content/ngf/overview/gateway-api-compatibility.md
@@ -136,9 +136,11 @@ See the [controller]({{< ref "/ngf/reference/cli-help.md#controller">}}) command
- `ResolvedRefs/True/ResolvedRefs`
- `ResolvedRefs/False/InvalidCertificateRef`
- `ResolvedRefs/False/InvalidRouteKinds`
+ - `ResolvedRefs/False/RefNotPermitted`
- `Conflicted/True/ProtocolConflict`
- `Conflicted/True/HostnameConflict`
- `Conflicted/False/NoConflicts`
+ - `OverlappingTLSConfig/True/OverlappingHostnames`
### HTTPRoute
diff --git a/content/ngf/reference/api.md b/content/ngf/reference/api.md
index d88e6111b..47984b6a3 100644
--- a/content/ngf/reference/api.md
+++ b/content/ngf/reference/api.md
@@ -1827,6 +1827,23 @@ If not specified, or set to false, http2 will be enabled for all servers.
+disableSNIHostValidation
+
+bool
+
+ |
+
+(Optional)
+ DisableSNIHostValidation disables the validation that ensures the SNI hostname
+matches the Host header in HTTPS requests. When disabled, HTTPS connections can
+be reused for requests to different hostnames covered by the same certificate.
+This resolves HTTP/2 connection coalescing issues with wildcard certificates but
+introduces security risks as described in Gateway API GEP-3567.
+If not specified, defaults to false (validation enabled).
+ |
+
+
+
kubernetes
@@ -1839,6 +1856,19 @@ KubernetesSpec
Kubernetes contains the configuration for the NGINX Deployment and Service Kubernetes objects.
|
+
+
+workerConnections
+
+int32
+
+ |
+
+(Optional)
+ WorkerConnections specifies the maximum number of simultaneous connections that can be opened by a worker process.
+Default is 1024.
+ |
+
@@ -1988,6 +2018,114 @@ sigs.k8s.io/gateway-api/apis/v1alpha2.PolicyStatus
+AutoscalingSpec
+
+
+
+(Appears on:
+DeploymentSpec)
+
+
+
+ AutoscalingSpec is the configuration for Horizontal Pod Autoscaling.
+
+
+
+
+| Field |
+Description |
+
+
+
+
+
+behavior
+
+
+Kubernetes autoscaling/v2.HorizontalPodAutoscalerBehavior
+
+
+ |
+
+(Optional)
+ Behavior configures the scaling behavior of the target
+in both Up and Down directions (scaleUp and scaleDown fields respectively).
+If not set, the default HPAScalingRules for scale up and scale down are used.
+ |
+
+
+
+targetCPUUtilizationPercentage
+
+int32
+
+ |
+
+(Optional)
+ Target CPU utilization percentage for the HPA.
+ |
+
+
+
+targetMemoryUtilizationPercentage
+
+int32
+
+ |
+
+(Optional)
+ Target memory utilization percentage for the HPA.
+ |
+
+
+
+minReplicas
+
+int32
+
+ |
+
+(Optional)
+ Minimum number of replicas.
+ |
+
+
+
+metrics
+
+
+[]Kubernetes autoscaling/v2.MetricSpec
+
+
+ |
+
+(Optional)
+ Metrics configures additional metrics options.
+ |
+
+
+
+maxReplicas
+
+int32
+
+ |
+
+ Maximum number of replicas.
+ |
+
+
+
+enable
+
+bool
+
+ |
+
+ Enable or disable Horizontal Pod Autoscaler.
+ |
+
+
+
ContainerSpec
@@ -2065,6 +2203,34 @@ until the action is complete, unless the container process fails, in which case
+readinessProbe
+
+
+ReadinessProbeSpec
+
+
+ |
+
+(Optional)
+ ReadinessProbe defines the readiness probe for the NGINX container.
+ |
+
+
+
+hostPorts
+
+
+[]HostPort
+
+
+ |
+
+(Optional)
+ HostPorts are the list of ports to expose on the host.
+ |
+
+
+
volumeMounts
@@ -2099,6 +2265,20 @@ until the action is complete, unless the container process fails, in which case
+container
+
+
+ContainerSpec
+
+
+ |
+
+(Optional)
+ Container defines container fields for the NGINX container.
+ |
+
+
+
pod
@@ -2113,16 +2293,16 @@ PodSpec
|
-container
+patches
-
-ContainerSpec
+
+[]Patch
|
(Optional)
- Container defines container fields for the NGINX container.
+Patches are custom patches to apply to the NGINX DaemonSet.
|
@@ -2159,6 +2339,20 @@ int32
|
+autoscaling
+
+
+AutoscalingSpec
+
+
+ |
+
+(Optional)
+ Autoscaling defines the configuration for Horizontal Pod Autoscaling.
+ |
+
+
+
pod
@@ -2185,6 +2379,20 @@ ContainerSpec
Container defines container fields for the NGINX container.
|
+
+
+patches
+
+
+[]Patch
+
+
+ |
+
+(Optional)
+ Patches are custom patches to apply to the NGINX Deployment.
+ |
+
DisableTelemetryFeature
@@ -2238,6 +2446,48 @@ routing only to endpoints on the same node as the traffic was received on
+HostPort
+
+
+
+(Appears on:
+ContainerSpec)
+
+
+
+ HostPort exposes an NGINX container port on the host.
+
+
+
+
+| Field |
+Description |
+
+
+
+
+
+port
+
+int32
+
+ |
+
+ Port to expose on the host.
+ |
+
+
+
+containerPort
+
+int32
+
+ |
+
+ ContainerPort is the port on the NGINX container to map to the HostPort.
+ |
+
+
+
IPFamilyType
(string alias)
@@ -2749,6 +2999,23 @@ If not specified, or set to false, http2 will be enabled for all servers.
+disableSNIHostValidation
+
+bool
+
+ |
+
+(Optional)
+ DisableSNIHostValidation disables the validation that ensures the SNI hostname
+matches the Host header in HTTPS requests. When disabled, HTTPS connections can
+be reused for requests to different hostnames covered by the same certificate.
+This resolves HTTP/2 connection coalescing issues with wildcard certificates but
+introduces security risks as described in Gateway API GEP-3567.
+If not specified, defaults to false (validation enabled).
+ |
+
+
+
kubernetes
@@ -2761,6 +3028,19 @@ KubernetesSpec
Kubernetes contains the configuration for the NGINX Deployment and Service Kubernetes objects.
|
+
+
+workerConnections
+
+int32
+
+ |
+
+(Optional)
+ WorkerConnections specifies the maximum number of simultaneous connections that can be opened by a worker process.
+Default is 1024.
+ |
+
NodePort
@@ -2791,9 +3071,7 @@ int32
|
- Port is the NodePort to expose.
-kubebuilder:validation:Minimum=1
-kubebuilder:validation:Maximum=65535
+Port is the NodePort to expose.
|
@@ -2804,9 +3082,7 @@ int32
|
- ListenerPort is the Gateway listener port that this NodePort maps to.
-kubebuilder:validation:Minimum=1
-kubebuilder:validation:Maximum=65535
+ListenerPort is the Gateway listener port that this NodePort maps to.
|
@@ -2862,6 +3138,84 @@ be unique across all targetRef entries in the ObservabilityPolicy.
+Patch
+
+
+
+(Appears on:
+DaemonSetSpec,
+DeploymentSpec,
+ServiceSpec)
+
+
+
+ Patch defines a patch to apply to a Kubernetes object.
+
+
+
+
+| Field |
+Description |
+
+
+
+
+
+type
+
+
+PatchType
+
+
+ |
+
+(Optional)
+ Type is the type of patch. Defaults to StrategicMerge.
+ |
+
+
+
+value
+
+k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON
+
+ |
+
+(Optional)
+ Value is the patch data as raw JSON.
+For StrategicMerge and Merge patches, this should be a JSON object.
+For JSONPatch patches, this should be a JSON array of patch operations.
+ |
+
+
+
+PatchType
+(string alias)
+
+
+(Appears on:
+Patch)
+
+
+
+ PatchType specifies the type of patch.
+
+
+
+
+| Value |
+Description |
+
+
+"JSONPatch" |
+PatchTypeJSONPatch uses JSON patch (RFC 6902).
+ |
+
+"Merge" |
+PatchTypeMerge uses merge patch (RFC 7386).
+ |
+
+"StrategicMerge" |
+PatchTypeStrategicMerge uses strategic merge patch.
+ |
+
+
PodSpec
@@ -3003,6 +3357,53 @@ image isn’t present.
+ReadinessProbeSpec
+
+
+
+(Appears on:
+ContainerSpec)
+
+
+
+ ReadinessProbeSpec defines the configuration for the NGINX readiness probe.
+
+
+
+
+| Field |
+Description |
+
+
+
+
+
+port
+
+int32
+
+ |
+
+(Optional)
+ Port is the port on which the readiness endpoint is exposed.
+If not specified, the default port is 8081.
+ |
+
+
+
+initialDelaySeconds
+
+int32
+
+ |
+
+(Optional)
+ InitialDelaySeconds is the number of seconds after the container has
+started before the readiness probe is initiated.
+If not specified, the default is 3 seconds.
+ |
+
+
+
RewriteClientIP
@@ -3286,6 +3687,20 @@ Each NodePort MUST map to a Gateway listener port, otherwise it will be ignored.
The default NodePort range enforced by Kubernetes is 30000-32767.
+
+
+patches
+
+
+[]Patch
+
+
+ |
+
+(Optional)
+ Patches are custom patches to apply to the NGINX Service.
+ |
+
ServiceType
From 566243be6cfd03d15191dc423fda806c7cf2e3a4 Mon Sep 17 00:00:00 2001
From: Saylor Berman
Date: Fri, 8 Aug 2025 12:54:17 -0600
Subject: [PATCH 2/3] Code review
---
content/ngf/how-to/scaling.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/content/ngf/how-to/scaling.md b/content/ngf/how-to/scaling.md
index 4f08e58a0..60e739837 100644
--- a/content/ngf/how-to/scaling.md
+++ b/content/ngf/how-to/scaling.md
@@ -26,11 +26,11 @@ You have multiple options for scaling the data plane:
Understanding when to increase worker connections, replicas, or create a new Gateway is key to managing traffic effectively.
-Increasing worker connections or replicas is ideal when you need to handle more traffic without changing the overall routing configuration. Setting the worker connections field allows a single nginx data plane instance to handle more connections without needing to scale the replicas. However, scaling the replicas can be beneficial to reduce single points of failure.
+Increasing worker connections or replicas is ideal when you need to handle more traffic without changing the overall routing configuration. Setting the worker connections field allows a single NGINX data plane instance to handle more connections without needing to scale the replicas. However, scaling the replicas can be beneficial to reduce single points of failure.
Scaling replicas can be done manually or automatically using a [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) (HPA).
-To update worker connections (default: 1024), static replicas, or enable autoscaling, you can edit the `NginxProxy` resource:
+To update worker connections (default: 1024), replicas, or enable autoscaling, you can edit the `NginxProxy` resource:
```shell
kubectl edit nginxproxies.gateway.nginx.org ngf-proxy-config -n nginx-gateway
From 33e33f20affbfb0b8211ac28f450441d834e56d7 Mon Sep 17 00:00:00 2001
From: Saylor Berman
Date: Mon, 11 Aug 2025 08:17:20 -0600
Subject: [PATCH 3/3] Change callout type
---
content/ngf/how-to/scaling.md | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/content/ngf/how-to/scaling.md b/content/ngf/how-to/scaling.md
index 60e739837..b316cf1d0 100644
--- a/content/ngf/how-to/scaling.md
+++ b/content/ngf/how-to/scaling.md
@@ -36,7 +36,11 @@ To update worker connections (default: 1024), replicas, or enable autoscaling, y
kubectl edit nginxproxies.gateway.nginx.org ngf-proxy-config -n nginx-gateway
```
-{{< note >}}The NginxProxy resource in this example lives in the control plane namespace (default: `nginx-gateway`) and applies to the GatewayClass, but you can also define one per Gateway. See the [Data plane configuration]({{< ref "/ngf/how-to/data-plane-configuration.md" >}}) document for more information. {{< /note >}}
+{{< call-out "note" >}}
+
+The NginxProxy resource in this example lives in the control plane namespace (default: `nginx-gateway`) and applies to the GatewayClass, but you can also define one per Gateway. See the [Data plane configuration]({{< ref "/ngf/how-to/data-plane-configuration.md" >}}) document for more information.
+
+{{< /call-out >}}
- Worker connections is set using the `workerConnections` field: