+ We are introducing the new percentage capacity metric `nginxaas.capacity.percentage`, which estimates the load on your deployment more accurately than the old consumed NCUs metric. The new capacity metric expresses the capacity consumed as a percentage of the deployment's total capacity. Please update any alerts and monitoring of deployment performance to use the new percentage capacity metric. The consumed NCUs metric is now on the path to deprecation. See [Scaling guidance]({{< relref "/nginxaas-azure/quickstart/scaling.md">}}) for more details.
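For teams that alert on absolute consumed NCUs today, the equivalent percentage threshold is simply the old NCU threshold divided by the provisioned capacity. A minimal sketch of that conversion, assuming the percentage metric is consumed capacity expressed relative to provisioned NCUs (the helper name is illustrative, not part of the service):

```python
def ncu_threshold_to_percentage(ncu_threshold: float, provisioned_ncus: float) -> float:
    """Convert an alert threshold on ncu.consumed into an equivalent
    threshold on nginxaas.capacity.percentage, assuming the percentage
    metric is consumed capacity relative to provisioned NCUs."""
    return ncu_threshold / provisioned_ncus * 100.0

# Example: an alert that fired at 14 consumed NCUs on a 20 NCU deployment
# becomes an alert at 70% capacity.
print(ncu_threshold_to_percentage(14, 20))  # 70.0
```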
## February 10, 2025
- {{% icon-feature %}} **NGINXaaS Load Balancer for Kubernetes is now Generally Available**
| ncu.provisioned || count | The number of successfully provisioned NCUs during the aggregation interval. During scaling events, this may lag behind `ncu.requested` as the system works to achieve the request. Available for Standard plan deployments. | deployment |
| ncu.requested || count | The requested number of NCUs during the aggregation interval. Describes the goal state of the system. Available for Standard plan deployments. | deployment |
- |ncu.consumed|| count | The estimated number of NCUs used to handle the current traffic. This may burst above the `ncu.provisioned`. This can be used to guide scaling out or in to match your workload. See [Scaling Guidance]({{< relref "/nginxaas-azure/quickstart/scaling.md#iterative-approach" >}}) for details. Available for Standard plan deployments. | deployment |
+ |nginxaas.capacity.percentage|| count | The percentage of the deployment's total capacity being used. This may burst above 100%. This can be used to guide scaling out or in to match your workload. See [Scaling Guidance]({{< relref "/nginxaas-azure/quickstart/scaling.md#iterative-approach" >}}) for details. Available for Standard plan deployments. | deployment |
| system.worker_connections | pid process_name | count | The number of nginx worker connections used on the dataplane. This metric is one of the factors which determines the deployment's consumed NCU value. | deployment |
| nginxaas.certificates | name status | count | The number of certificates added to the NGINXaaS deployment dimensioned by the name of the certificate and its status. Refer to [Certificate Health]({{< relref "/nginxaas-azure/getting-started/ssl-tls-certificates/overview.md#monitor-certificates" >}}) to learn more about the status dimension. | deployment |
| nginxaas.maxmind | status | count | The status of any MaxMind license in use for downloading geoip2 databases. Refer to [License Health]({{< relref "/nginxaas-azure/quickstart/geoip2.md#monitoring" >}}) to learn more about the status dimension. | deployment |
{{</bootstrap-table>}}
+ {{< note >}}The `ncu.consumed` metric is now deprecated and is on the path to retirement. Please change any alerting on this metric to use the new Capacity Percentage metric.{{< /note >}}
| system.interface.total_bytes| interface | count | System Interface Total Bytes, sum of bytes_sent and bytes_rcvd. | deployment |
| system.interface.egress_throughput| interface | count | System Interface Egress Throughput, i.e., bytes sent per second. | deployment |
+ | system.listener_backlog.max| interface | count | The fullness (expressed as a fraction) of the fullest backlog queue for a listen address. | deployment |
+ | system.listener_backlog.length| interface | count | The number of items in a specific backlog queue, labelled by listen address. | deployment |
+ | system.listener_backlog.queue_limit| interface | count | The capacity of a specific backlog queue, labelled by listen address. | deployment |
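Read together, the three backlog metrics relate as a simple ratio: the fullness of a listen address's backlog is its length over its queue limit, and `system.listener_backlog.max` reports the worst such ratio. A small illustrative sketch of that relationship, assuming fullness is derived as length divided by limit (the sample data is made up):

```python
def worst_backlog_fullness(lengths: dict[str, int], limits: dict[str, int]) -> float:
    """Return the fullness (0.0-1.0) of the fullest backlog queue, mirroring
    the description of system.listener_backlog.max: the highest length/limit
    ratio across listen addresses."""
    return max(lengths[addr] / limits[addr] for addr in lengths)

# Example: two listen addresses with different backlog pressure.
lengths = {"0.0.0.0:80": 12, "0.0.0.0:443": 480}
limits = {"0.0.0.0:80": 511, "0.0.0.0:443": 511}
print(worst_backlog_fullness(lengths, limits))  # ~0.94, the :443 listener is nearly full
```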
`content/nginxaas-azure/quickstart/scaling.md` (30 additions, 28 deletions)
@@ -15,12 +15,12 @@ An NGINXaaS deployment can be scaled out to increase the capacity (increasing th
In this document you will learn:
- * What an NGINX Capacity Unit (NCU) is
- * How to manually scale your deployment
- * How to enable autoscaling on your deployment
- * What capacity restrictions apply for your Marketplace plan
- * How to monitor capacity usage
- * How to estimate the amount of capacity to provision
+ - What an NGINX Capacity Unit (NCU) is
+ - How to manually scale your deployment
+ - How to enable autoscaling on your deployment
+ - What capacity restrictions apply for your Marketplace plan
+ - How to monitor capacity usage
+ - How to estimate the amount of capacity to provision
## NGINX Capacity Unit (NCU)
@@ -50,11 +50,11 @@ To enable autoscaling using the Azure Portal,
### Scaling rules
- NGINXaaS automatically adjusts the number of NCUs based on "scaling rules." A scaling rule defines when to scale, what direction to scale, and how much to scale. NGINXaaS will evaluate the following scaling rules, in order, based on consumed and provisioned NCU metrics.
+ NGINXaaS automatically adjusts the number of NCUs based on "scaling rules." A scaling rule defines when to scale, in which direction, and by how much. NGINXaaS evaluates the following scaling rules, in order, based on the capacity percentage metric and the provisioned NCU metric.
- -*Moderate Increase Rule*: Over the last 5 minutes, if the average consumed NCUs is greater than or equal to 70% of the average provisioned NCUs, increase capacity by 20%.
- -*Urgent Increase Rule*: Over the last minute, if the number of consumed NCUs is greater than or equal to 85% of the number of provisioned NCUs, increase capacity by 20%.
- -*Decrease Rule*: Over the last 10 minutes, if the average consumed NCUs is less than or equal to 60% of the average provisioned NCUs, decrease capacity by 10%.
+ - *Moderate Increase Rule*: Over the last 5 minutes, if the average capacity consumed is greater than or equal to 70% of the provisioned capacity, increase capacity by 20%.
+ - *Urgent Increase Rule*: Over the last minute, if the capacity consumed is greater than or equal to 85% of the provisioned capacity, increase capacity by 20%.
+ - *Decrease Rule*: Over the last 10 minutes, if the average capacity consumed is less than or equal to 60% of the provisioned capacity, decrease capacity by 10%.
To avoid creating a loop between scaling rules, NGINXaaS will not apply a scaling rule if it predicts that doing so would immediately trigger an opposing rule. For example, if the "Urgent Increase Rule" is triggered by a sudden spike in traffic, but the new capacity would cause the "Decrease Rule" to trigger immediately afterwards, the autoscaler will not increase capacity. This prevents the deployment's capacity from increasing and decreasing erratically.
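The rule set above can be read as a small decision procedure over recent capacity-percentage samples. The sketch below is illustrative only; the function shape, the window handling, and the way loop prevention is approximated are assumptions, not the service's actual implementation:

```python
from statistics import mean

# Thresholds from the scaling rules above, expressed as capacity percentages.
MODERATE_THRESHOLD = 70   # average over the last 5 minutes
URGENT_THRESHOLD = 85     # over the last minute
DECREASE_THRESHOLD = 60   # average over the last 10 minutes

def evaluate_scaling(samples_1m, samples_5m, samples_10m, provisioned_ncus):
    """Return a new NCU target following the documented rules, or the current
    value if no rule applies. samples_* are capacity-percentage readings
    covering the stated windows."""
    if mean(samples_5m) >= MODERATE_THRESHOLD or mean(samples_1m) >= URGENT_THRESHOLD:
        proposed = provisioned_ncus * 1.20  # increase capacity by 20%
        # Rough loop prevention: skip the increase if the workload would then
        # sit at or below the decrease threshold on the larger capacity.
        projected = mean(samples_5m) * provisioned_ncus / proposed
        if projected <= DECREASE_THRESHOLD:
            return provisioned_ncus
        return proposed
    if mean(samples_10m) <= DECREASE_THRESHOLD:
        return provisioned_ncus * 0.90  # decrease capacity by 10%
    return provisioned_ncus
```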
@@ -63,6 +63,7 @@ To avoid creating a loop between scaling rules, NGINXaaS will not apply a scalin
The following table outlines constraints on the specified capacity based on the chosen Marketplace plan, including the minimum capacity required for a deployment to be highly available, the maximum capacity, and what value the capacity must be a multiple of. By default, an NGINXaaS for Azure deployment will be created with the corresponding minimum capacity.
@@ -78,51 +79,52 @@ The following table outlines constraints on the specified capacity based on the
NGINXaaS provides metrics for visibility into the current and historical capacity values. These metrics, in the `NGINXaaS Statistics` namespace, include:
- * NCUs Requested: `ncu.requested` -- how many NCUs have been requested using the API. This is the goal state of the system at that point in time.
- * NCUs Provisioned: `ncu.provisioned` -- how many NCUs have been successfully provisioned by the service.
-   * This is the basis for [billing]({{< relref "/nginxaas-azure/billing/overview.md" >}}).
-   * This may differ from `ncu.requested` temporarily during scale-out/scale-in events or during automatic remediation for a hardware failure.
- * NCUs Consumed: `ncu.consumed` -- how many NCUs the current workload is using.
-   * If this is under 60% of the provisioned capacity, consider scaling in to reduce costs. If this is over 70% of the provisioned capacity, consider scaling out; otherwise, requests may fail or take longer than expected. Alternatively, enable autoscaling, so your deployment can automatically scale based on the consumed and provisioned capacity.
-   * This value may burst higher than `ncu.requested` due to variation in provisioned hardware. You will still only be billed for the minimum of `ncu.requested` and `ncu.provisioned`.
+ - NCUs Requested: `ncu.requested` -- how many NCUs have been requested using the API. This is the goal state of the system at that point in time.
+ - NCUs Provisioned: `ncu.provisioned` -- how many NCUs have been successfully provisioned by the service.
+   - This is the basis for [billing]({{< relref "/nginxaas-azure/billing/overview.md" >}}).
+   - This may differ from `ncu.requested` temporarily during scale-out/scale-in events or during automatic remediation for a hardware failure.
+ - Capacity Percentage: `nginxaas.capacity.percentage` -- the percentage of the deployment's total capacity that the current workload is using.
+   - If this is under 60%, consider scaling in to reduce costs. If this is over 70%, consider scaling out; otherwise, requests may fail or take longer than expected. Alternatively, enable autoscaling so your deployment can automatically scale based on the amount of capacity consumed.
+   - This value may burst above 100% due to variation in provisioned hardware. You will still only be billed for the minimum of `ncu.requested` and `ncu.provisioned`.
See the [Metrics Catalog]({{< relref "/nginxaas-azure/monitoring/metrics-catalog.md" >}}) for a reference of all metrics.
{{< note >}}These metrics aren't visible unless enabled, see how to [Enable Monitoring]({{< relref "/nginxaas-azure/monitoring/enable-monitoring.md" >}}) for details.{{< /note >}}
+ {{< note >}}The NCUs Consumed metric is now deprecated and is on the path to retirement. Please change any alerting on this metric to use the new Capacity Percentage metric.{{< /note >}}
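Taken at face value, the guidance above maps an observed capacity percentage onto one of three outcomes, and billing takes the lower of requested and provisioned NCUs. A small illustrative sketch (the function names are mine, not part of the service):

```python
def scaling_recommendation(capacity_percentage: float) -> str:
    """Apply the manual-scaling guidance: under 60% consider scaling in,
    over 70% consider scaling out (or enable autoscaling)."""
    if capacity_percentage > 70:
        return "consider scaling out, or enable autoscaling; requests may fail or slow down"
    if capacity_percentage < 60:
        return "consider scaling in to reduce costs"
    return "no change suggested"

def billed_ncus(requested: float, provisioned: float) -> float:
    """You are only billed for the minimum of ncu.requested and ncu.provisioned."""
    return min(requested, provisioned)

print(scaling_recommendation(55))  # consider scaling in to reduce costs
print(billed_ncus(20, 22))         # 20
```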
## Estimating how many NCUs to provision
To calculate how many NCUs to provision, take the highest value across the parameters that make up an NCU:
- * CPU
- * Bandwidth
- * Concurrent connections
+ - CPU
+ - Bandwidth
+ - Concurrent connections
Example 1: "I need to support 2,000 concurrent connections but only 4 Mbps of traffic. I need 52 ACUs." You would need `Max(52/20, 4/60, 2000/400)` = `Max(2.6, 0.07, 5)` = At least 5 NCUs.
Example 2: "I don't know any of these yet!" Either start with the minimum and [adjust capacity](#adjusting-capacity) with the [iterative approach](#iterative-approach) described below, or [enable autoscaling](#autoscaling).
- In addition to the maximum capacity needed, we recommend adding a 10% to 20% buffer of additional NCUs to account for unexpected spikes in traffic. Monitor the [NCUs Consumed metric](#metrics) over time to determine your peak usage levels and adjust your requested capacity accordingly.
+ In addition to the maximum capacity needed, we recommend adding a 10% to 20% buffer of additional capacity to account for unexpected spikes in traffic. Monitor the [Capacity Percentage metric](#metrics) over time to determine your peak usage levels and adjust your requested capacity accordingly.
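Example 1 above can be written out as a small calculation. The sketch below uses the per-NCU figures implied by that example (20 ACUs, 60 Mbps, and 400 concurrent connections per NCU); treat those constants as illustrative and confirm them against the NCU definition for your plan:

```python
import math

# Per-NCU capacities implied by Example 1 above; confirm against the NCU
# definition before relying on them.
ACUS_PER_NCU = 20
MBPS_PER_NCU = 60
CONNECTIONS_PER_NCU = 400

def estimate_ncus(acus: float, mbps: float, connections: float) -> int:
    """Take the highest requirement across CPU, bandwidth, and concurrent
    connections, as in Example 1."""
    return math.ceil(max(acus / ACUS_PER_NCU,
                         mbps / MBPS_PER_NCU,
                         connections / CONNECTIONS_PER_NCU))

# Example 1: 52 ACUs, 4 Mbps, 2,000 concurrent connections -> at least 5 NCUs.
print(estimate_ncus(52, 4, 2000))  # 5
```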
### Iterative approach
1. Make an estimate by either:
-    * using the [Usage and Cost Estimator]({{< relref "/nginxaas-azure/billing/usage-and-cost-estimator.md" >}})
-    * compare to a [reference workload](#reference-workloads)
- 2. Observe the `ncu.consumed` [metric](#metrics) in Azure Monitor of your workload
+    - using the [Usage and Cost Estimator]({{< relref "/nginxaas-azure/billing/usage-and-cost-estimator.md" >}})
+    - comparing to a [reference workload](#reference-workloads)
+ 2. Observe the `nginxaas.capacity.percentage` [metric](#metrics) for your workload in Azure Monitor
3. Decide what headroom factor you wish to have
- 4. Multiply the headroom factor by the consumed NCUs to get the target NCUs.
+ 4. Multiply the headroom factor by the current capacity percentage and by the provisioned NCUs to get the target NCUs.
5. [Adjust capacity](#adjusting-capacity) to the target NCUs
6. Repeat from step 2 -- it is always good to check back after making a change
*Example*:
1. I am really unsure what size I need, so I just specified the default capacity, `20NCUs`.
- 2. I observe that my `ncu.consumed` is currently at `18NCUs`.
+ 2. I observe that my `nginxaas.capacity.percentage` is currently at `90%`.
3. This is early-morning traffic. I think midday traffic could be 3x what it is now.
- 4. `18 * 3 = 54` is my target capacity.
+ 4. `90% * 3 = 270%`, and `2.7 * 20 NCUs = 54 NCUs`, so 54 NCUs is my target capacity.
5. I can see that I need to scale by multiples of 10 so I'm going to scale out to `60NCUs`.
- 6. At midday I can see that I overestimated the traffic I would be getting and it was still a busy day. We peaked at `41NCUs`, let me scale in to `50NCUs` to reduce my cost.
+ 6. At midday I can see that I overestimated the traffic I would be getting, but it was still a busy day. We peaked at `68%` of capacity, so let me scale in to `50NCUs` to reduce my cost.
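The arithmetic in steps 2 through 5 of this example can be captured in a couple of lines. The sketch below is illustrative; the scale multiple of 10 comes from this example's plan, and the helper name and the 1.2x buffer in the second call are my own choices within the 10% to 20% guidance above:

```python
import math

def target_ncus(capacity_percentage: float, provisioned_ncus: float,
                headroom_factor: float, scale_multiple: int = 10) -> int:
    """Turn an observed capacity percentage into a target NCU count, then
    round up to the plan's scale multiple (10 in this example)."""
    raw_target = capacity_percentage / 100 * headroom_factor * provisioned_ncus
    return math.ceil(raw_target / scale_multiple) * scale_multiple

# Steps 2-5: 90% of 20 NCUs, expecting 3x traffic at midday.
print(target_ncus(90, 20, 3))    # 60, matching the scale-out to 60NCUs
# Step 6: a 68% midday peak on 60 NCUs with a 1.2x buffer suggests:
print(target_ncus(68, 60, 1.2))  # 50, matching the scale-in to 50NCUs
```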