NGF: Scaling control plane and data plane pods #490
Merged: salonichf5 merged 9 commits into nginx:ngf-feature-cp-dp-split from salonichf5:docs/scaling-ngf on May 8, 2025
Commits
3915a25 docs: scaling control plane and data plane pods (salonichf5)
e61a1dc Update content/ngf/how-to/scaling.md (salonichf5)
3744e0a Apply suggestions from code review (salonichf5)
4d1c847 update based on reviews (salonichf5)
540a795 Apply suggestions from code review (salonichf5)
639e36c fix pod name (salonichf5)
eb1ad77 fix formatting issues (salonichf5)
1499b66 Update content/ngf/how-to/scaling.md (salonichf5)
375b1a2 fix space (salonichf5)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
--- | ||
title: Scaling control plane and data plane | ||
weight: 700 | ||
toc: true | ||
type: how-to | ||
product: NGF | ||
docs: DOCS-0000 | ||
--- | ||
|
||
Scaling the control plane and scaling the data plane each come with their own trade-offs. This guide walks you through how to scale each component effectively and helps you decide when to scale the data plane versus creating a new gateway, based on your traffic patterns. | ||
|
||
|
||
--- | ||
|
||
|
||
### Scaling the data plane | ||
|
||
The data plane consists of a single container running both the agent and NGINX processes. The agent receives configuration from the control plane over a streaming RPC. | ||
|
||
Each gateway you create provisions a new NGINX deployment with its own configuration. There are two ways to scale data plane deployments: increase the number of replicas for the data plane pod, or create a new gateway to provision a new data plane. | ||
|
||
|
||
#### When to create a new gateway versus scale data plane replicas | ||
|
||
|
||
When using NGINX Gateway Fabric, understanding when to scale the data plane versus when to create a new gateway is key to managing traffic effectively. | ||
|
||
|
||
Scaling data plane replicas is ideal when you need to handle more traffic without changing the configuration. For example, if you're routing traffic to `api.example.com` and notice an increase in load, you can scale the replicas from 1 to 5 to better distribute the traffic and reduce latency. All replicas will share the same configuration from the gateway used to set up the data plane, making configuration management easy. However, this approach can be limiting if you need to customize configurations for different use cases. Additionally, a fault in the configuration can affect all replicas, creating a potential single point of failure for that gateway. | ||
|
||
|
||
You can increase the number of replicas for an NGINX deployment by modifying the `nginx.replicas` field in `values.yaml` or by adding the `--set nginx.replicas=` flag to the `helm install` command. For example: | ||
|
||
|
||
```text | ||
helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginx.replicas=5 | ||
``` | ||
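The `values.yaml` route mentioned above can be sketched as follows. This is a minimal example, assuming a hypothetical override file named `values-replicas.yaml`; only the `nginx.replicas` key comes from this guide, and the `helm install` line is left as a comment because it needs cluster access.

```shell
# Write a minimal Helm values override; the file name is an assumption.
cat > values-replicas.yaml <<'EOF'
nginx:
  replicas: 5
EOF

# Pass the file to Helm instead of using --set (requires cluster access):
# helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway -f values-replicas.yaml
```

Keeping the replica count in a values file rather than a `--set` flag makes the setting easier to review and reuse across installs.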
|
||
Creating a new gateway is beneficial when you need distinct configurations, isolation, or separate policies. For example, if you're routing traffic to a new domain `admin.example.com` and require a different TLS certificate, stricter rate limits, or separate authentication policies, creating a new gateway is the right approach. It allows safe experimentation with isolated configurations and makes it easier to enforce security boundaries and apply specific routing rules. However, this comes with increased resource overhead, since each new gateway creates a new deployment, and managing multiple configurations can become complex if they are not well organized. | ||
|
||
|
||
--- | ||
|
||
|
||
### Scaling the control plane | ||
|
||
The control plane is a Kubernetes deployment with a single container running the controller. It communicates with the agent (data plane) over gRPC to deliver configurations. With leader election enabled, the control plane can be scaled horizontally by running multiple replicas, although only the pod holding the leader lease can actively manage configuration status updates. | ||
|
||
|
||
Scaling the control plane can be beneficial in the following scenarios: | ||
|
||
1. *Higher Availability* - When the control plane pod crashes, runs out of memory, or goes down during an upgrade, it can interrupt configuration delivery. By scaling to multiple replicas, leader election makes sure another pod can quickly step in and take over, keeping things running smoothly with minimal downtime. | ||
|
||
2. *Faster Configuration Distribution* - As the number of connected NGINX instances grows, a single control plane pod may become a bottleneck in handling connections or streaming configuration updates. Scaling the control plane improves concurrency and responsiveness when delivering configuration over gRPC. | ||
3. *Improved Resilience* - Running multiple control plane replicas provides fault tolerance. Even if the leader fails, another replica can quickly take over the leader lease, preventing disruptions in config management and status updates. | ||
|
||
|
||
To scale the control plane, use the `kubectl scale` command on the control plane deployment to increase or decrease the number of replicas. For example, the following command scales the control plane deployment to 3 replicas: | ||
|
||
```text | ||
kubectl scale deployment -n nginx-gateway ngf-nginx-gateway-fabric --replicas 3 | ||
``` | ||
|
||
#### Known risks when scaling the control plane | ||
|
||
|
||
When scaling the control plane, it's important to understand how status updates are handled for data plane pods. | ||
|
||
NGINX instances receive configurations from the control plane pods. However, only the leader control plane pod is allowed to write status updates to gateway resources. If an NGINX instance connects to a non-leader pod, the resource status may not be updated, which can result in configurations not being applied correctly. To prevent this, make sure leader election is stable, monitor for frequent leader changes, and avoid scaling the control plane pods excessively unless needed. | ||
|
||
|
||
To identify which control plane pod currently holds the leader election lease, retrieve the leases in the same namespace as the control plane pods. For example: | ||
|
||
|
||
```text | ||
kubectl get leases -n nginx-gateway | ||
``` | ||
|
||
|
||
The current leader lease is held by the pod `ngf-nginx-gateway-fabric-b45ffc8d6-d9z2g_2ef81ced-f19d-41a0-9fcd-a68d89380d10`: | ||
|
||
|
||
```text | ||
NAME                                       HOLDER                                                                          AGE | ||
ngf-nginx-gateway-fabric-leader-election   ngf-nginx-gateway-fabric-b45ffc8d6-d9z2g_2ef81ced-f19d-41a0-9fcd-a68d89380d10   16d | ||
``` | ||
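To go from the lease holder to a pod name in a script, the holder identity can be split at the underscore. The sketch below assumes the holder string always has the form `<pod-name>_<uuid>`, as in the output above; `pod_from_holder` is a hypothetical helper name, and the `kubectl` query is shown only as a comment because it needs cluster access.

```shell
# Hypothetical helper: strip the "_<uuid>" suffix from a lease holder identity.
pod_from_holder() {
  printf '%s\n' "${1%%_*}"
}

# Example with the holder from the output above:
pod_from_holder "ngf-nginx-gateway-fabric-b45ffc8d6-d9z2g_2ef81ced-f19d-41a0-9fcd-a68d89380d10"

# Against a live cluster, the holder could be read from the Lease object, for example:
# holder=$(kubectl get lease ngf-nginx-gateway-fabric-leader-election -n nginx-gateway -o jsonpath='{.spec.holderIdentity}')
```

The resulting pod name can then be passed to commands such as `kubectl logs` to inspect the current leader.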
|
||
|
||
--- |