|
| 1 | +--- |
| 2 | +title: Metrics and alerts for Azure NAT Gateway |
| 3 | +titleSuffix: Azure NAT Gateway |
| 4 | +description: Get started learning about Azure Monitor metrics and alerts available for monitoring Azure NAT Gateway. |
| 5 | +author: asudbring |
| 6 | +ms.service: nat-gateway |
| 7 | +ms.topic: how-to |
| 8 | +ms.date: 04/29/2024 |
| 9 | +ms.author: allensu |
| 10 | +# Customer intent: As an IT administrator, I want to understand available Azure Monitor metrics and alerts for Virtual Network NAT. |
| 11 | +--- |
| 12 | +# What are Azure NAT Gateway metrics and alerts? |
| 13 | + |
| 14 | +This article provides an overview of all NAT gateway metrics and diagnostic capabilities, along with general guidance on how to use metrics and alerts to monitor, manage, and [troubleshoot](troubleshoot-nat.md) your NAT gateway resource. |
| 15 | + |
| 16 | +Azure NAT Gateway provides the following diagnostic capabilities: |
| 17 | + |
| 18 | +- Multi-dimensional metrics and alerts through Azure Monitor. You can use these metrics to monitor and manage your NAT gateway and to assist you in troubleshooting issues. |
| 19 | + |
| 20 | +- Network Insights: Azure Monitor Insights provides you with visual tools to view, monitor, and assist you in diagnosing issues with your NAT gateway resource. Insights provide you with a topological map of your Azure setup and metrics dashboards. |
| 21 | + |
| 22 | +:::image type="content" source="./media/nat-gateway-resource/nat-gateway-deployment.png" alt-text="Diagram of a NAT gateway resource with virtual machines."::: |
| 23 | + |
| 24 | +*Figure: Azure NAT Gateway for outbound to Internet* |
| 25 | + |
| 26 | +## Metrics overview |
| 27 | + |
| 28 | +NAT gateway provides the following multi-dimensional metrics in Azure Monitor: |
| 29 | + |
| 30 | +| Metric | Description | Recommended aggregation | Dimensions | |
| 31 | +|---|---|---|---| |
| 32 | +| Bytes | Bytes processed inbound and outbound | Sum | **Direction (In; Out)**, **Protocol (6 TCP; 17 UDP)** | |
| 33 | +| Packets | Packets processed inbound and outbound | Sum | **Direction (In; Out)**, **Protocol (6 TCP; 17 UDP)** | |
| 34 | +| Dropped Packets | Packets dropped by the NAT gateway | Sum | / | |
| 35 | +| SNAT Connection Count | Number of new SNAT connections over a given interval of time | Sum | **Connection State (Attempted, Failed)**, **Protocol (6 TCP; 17 UDP)** | |
| 36 | +| Total SNAT Connection Count | Total number of active SNAT connections | Sum | **Protocol (6 TCP; 17 UDP)** | |
| 37 | +| Datapath Availability | Availability of the data path of the NAT gateway. Used to determine whether the NAT gateway endpoints are available for outbound traffic flow. | Avg | **Availability (0, 100)** | |
| 38 | + |
| 39 | +>[!NOTE] |
| 40 | +> Count aggregation is not recommended for any of the NAT gateway metrics. Count aggregation adds up the number of metric values and not the metric values themselves. Use Sum aggregation instead to get the best representation of data values for connection count, bytes, and packets metrics. |
| 41 | +> |
| 42 | +> Use Average aggregation to best represent health data for the datapath availability metric. |
| 43 | +> |
| 44 | +> For information about aggregation types, see [aggregation types](/azure/azure-monitor/essentials/metrics-aggregation-explained#aggregation-types). |
| 45 | + |
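
The difference between Count and Sum aggregation described in the note above can be illustrated with a minimal sketch. The sample values are hypothetical and are not taken from a real NAT gateway; the snippet only shows why Count reports how many metric values were recorded while Sum reports their total.

```python
# Hypothetical per-minute SNAT Connection Count samples (illustrative values only).
samples = [120, 95, 0, 240, 60]

# Count aggregation: how many metric values were recorded in the interval.
count_aggregation = len(samples)

# Sum aggregation: the total of the metric values themselves.
sum_aggregation = sum(samples)

print(f"Count: {count_aggregation}")
print(f"Sum:   {sum_aggregation}")
```

With these samples, Count reflects only the number of data points, so it would stay the same even if every value doubled, which is why Sum is the better fit for connection count, bytes, and packets.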
| 46 | +## Where to find my NAT gateway metrics |
| 47 | + |
| 48 | +NAT gateway metrics can be found in the following locations in the Azure portal. |
| 49 | + |
| 50 | +- **Metrics** page under **Monitoring** from a NAT gateway's resource page. |
| 51 | + |
| 52 | +- **Insights** page under **Monitoring** from a NAT gateway's resource page. |
| 53 | + |
| 54 | + :::image type="content" source="./media/nat-metrics/nat-insights-metrics.png" alt-text="Screenshot of the insights and metrics options in NAT gateway overview."::: |
| 55 | + |
| 56 | +- Azure Monitor page under **Metrics**. |
| 57 | + |
| 58 | + :::image type="content" source="./media/nat-metrics/azure-monitor.png" alt-text="Screenshot of the metrics section of Azure Monitor."::: |
| 59 | + |
| 60 | +To view any one of your metrics for a given NAT gateway resource: |
| 61 | + |
| 62 | +1. Select the NAT gateway resource you would like to monitor. |
| 63 | + |
| 64 | +1. In the **Metric** drop-down menu, select one of the provided metrics. |
| 65 | + |
| 66 | +1. In the **Aggregation** drop-down menu, select the recommended aggregation listed in the [metrics overview](#metrics-overview) table. |
| 67 | + |
| 68 | + :::image type="content" source="./media/nat-metrics/nat-metrics-1.png" alt-text="Screenshot of the metrics set up in NAT gateway resource."::: |
| 69 | + |
| 70 | +1. To adjust the time frame over which the chosen metric is presented on the metrics graph or to adjust how frequently the chosen metric is measured, select the **Time** window in the top right corner of the metrics page and make your adjustments. |
| 71 | + |
| 72 | + :::image type="content" source="./media/nat-metrics/nat-metrics-2.png" alt-text="Screenshot of the metrics time setup configuration in NAT gateway resource."::: |
| 73 | + |
| 74 | +## How to use NAT gateway metrics |
| 75 | + |
| 76 | +The following sections detail how to use each NAT gateway metric to monitor, manage, and troubleshoot your NAT gateway resource. |
| 77 | + |
| 78 | +### Bytes |
| 79 | + |
| 80 | +The **Bytes** metric shows you the amount of data going outbound through NAT gateway and returning inbound in response to an outbound connection. |
| 81 | + |
| 82 | +Use this metric to: |
| 83 | + |
| 84 | +- View the amount of data being processed through NAT gateway to connect outbound or return inbound. |
| 85 | + |
| 86 | +To view the amount of data passing through NAT gateway: |
| 87 | + |
| 88 | +1. Select the NAT gateway resource you would like to monitor. |
| 89 | + |
| 90 | +1. In the **Metric** drop-down menu, select the **Bytes** metric. |
| 91 | + |
| 92 | +1. In the **Aggregation** drop-down menu, select **Sum**. |
| 93 | + |
| 94 | +1. Select **Add filter**. |
| 95 | + |
| 96 | +1. In the **Property** drop-down menu, select **Direction (Out | In)**. |
| 97 | + |
| 98 | +1. In the **Values** drop-down menu, select **Out**, **In**, or both. |
| 99 | + |
| 100 | +1. To see data processed inbound or outbound as their own individual lines in the metric graph, select **Apply splitting**. |
| 101 | + |
| 102 | +1. In the **Values** drop-down menu, select **Direction (Out | In)**. |
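
Conceptually, the filter and splitting steps above restrict the data to the chosen **Direction** values and then total each value separately. The following is a rough sketch of that idea only; the sample rows and the `split_and_sum` helper are hypothetical and do not represent how the Azure portal implements filtering or splitting.

```python
from collections import defaultdict

# Hypothetical Bytes samples: (timestamp, direction, bytes). Illustrative values only.
samples = [
    ("10:00", "Out", 1500),
    ("10:00", "In", 4200),
    ("10:01", "Out", 800),
    ("10:01", "In", 2600),
]


def split_and_sum(rows, wanted=("Out", "In")):
    """Keep rows whose direction is in `wanted`, then sum bytes per direction."""
    totals = defaultdict(int)
    for _, direction, value in rows:
        if direction in wanted:
            totals[direction] += value
    return dict(totals)


# Both directions, one total per direction (one line each on a split chart).
print(split_and_sum(samples))

# Filtered to outbound traffic only.
print(split_and_sum(samples, wanted=("Out",)))
```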
| 103 | + |
| 104 | +### Packets |
| 105 | + |
| 106 | +The packets metric shows you the number of data packets passing through NAT gateway. |
| 107 | + |
| 108 | +Use this metric to: |
| 109 | + |
| 110 | +- Verify that traffic is passing outbound or returning inbound through NAT gateway. |
| 111 | + |
| 112 | +- View the amount of traffic going outbound through NAT gateway or returning inbound. |
| 113 | + |
| 114 | +To view the number of packets sent in one or both directions through NAT gateway, follow the same steps as in the [Bytes](#bytes) section, selecting the **Packets** metric instead. |
| 115 | + |
| 116 | +### Dropped packets |
| 117 | + |
| 118 | +The dropped packets metric shows you the number of data packets dropped by NAT gateway when traffic goes outbound or returns inbound in response to an outbound connection. |
| 119 | + |
| 120 | +Use this metric to: |
| 121 | + |
| 122 | +- Check whether periods of dropped packets coincide with periods of failed SNAT connections shown by the [SNAT Connection Count](#snat-connection-count) metric. |
| 123 | + |
| 124 | +- Help determine if you're experiencing a pattern of failed outbound connections or SNAT port exhaustion. |
| 125 | + |
| 126 | +Possible reasons for dropped packets: |
| 127 | + |
| 128 | +- Outbound connectivity failure can cause packets to drop. Connectivity failure can happen for various reasons. See the [NAT gateway connectivity troubleshooting guide](/azure/nat-gateway/troubleshoot-nat-connectivity) to help you further diagnose. |
| 129 | + |
| 130 | +### SNAT connection count |
| 131 | + |
| 132 | +The SNAT connection count metric shows you the number of new SNAT connections within a specified time frame. This metric can be filtered by **Attempted** and **Failed** connection states. A failed connection volume greater than zero can indicate SNAT port exhaustion. |
| 133 | + |
| 134 | +Use this metric to: |
| 135 | + |
| 136 | +- Evaluate the health of your outbound connections. |
| 137 | + |
| 138 | +- Help diagnose if your NAT gateway is experiencing SNAT port exhaustion. |
| 139 | + |
| 140 | +- Determine if you're experiencing a pattern of failed outbound connections. |
| 141 | + |
| 142 | +To view the connection state of your connections: |
| 143 | + |
| 144 | +1. Select the NAT gateway resource you would like to monitor. |
| 145 | + |
| 146 | +1. In the **Metric** drop-down menu, select the **SNAT Connection Count** metric. |
| 147 | + |
| 148 | +1. In the **Aggregation** drop-down menu, select **Sum**. |
| 149 | + |
| 150 | +1. Select **Add filter**. |
| 151 | + |
| 152 | +1. In the **Property** drop-down menu, select **Connection State**. |
| 153 | + |
| 154 | +1. In the **Values** drop-down menu, select **Attempted**, **Failed**, or both. |
| 155 | + |
| 156 | +1. To see attempted and failed connections as their own individual lines in the metric graph, select **Apply splitting**. |
| 157 | + |
| 158 | +1. In the **Values** drop-down menu, select **Connection State**. |
| 159 | + |
| 160 | + :::image type="content" source="./media/nat-metrics/nat-metrics-3.png" alt-text="Screenshot of the metrics configuration."::: |
| 161 | + |
| 162 | +### Total SNAT connection count |
| 163 | + |
| 164 | +The **Total SNAT connection count** metric shows you the total number of active SNAT connections passing through NAT gateway. |
| 165 | + |
| 166 | +You can use this metric to: |
| 167 | + |
| 168 | +- Evaluate the volume of connections passing through NAT gateway. |
| 169 | + |
| 170 | +- Determine if you're nearing the connection limit of NAT gateway. |
| 171 | + |
| 172 | +- Help assess if you're experiencing a pattern of failed outbound connections. |
| 173 | + |
| 174 | +Possible reasons for failed connections: |
| 175 | + |
| 176 | +- A pattern of failed connections can happen for various reasons. See the [NAT gateway connectivity troubleshooting guide](/azure/nat-gateway/troubleshoot-nat-connectivity) to help you further diagnose. |
| 177 | + |
| 178 | +>[!NOTE] |
| 179 | +> When NAT gateway is attached to a subnet and public IP address, the Azure platform verifies NAT gateway is healthy by conducting health checks. These health checks appear in NAT gateway's SNAT Connection Count metrics. The amount of health check related connections may vary as the health check service is optimized, but is negligible and doesn’t impact NAT gateway’s ability to connect outbound. |
| 180 | + |
| 181 | +### Datapath availability |
| 182 | + |
| 183 | +The datapath availability metric measures the health of the NAT gateway resource over time. This metric indicates if NAT gateway is available for directing outbound traffic to the internet. This metric is a reflection of the health of the Azure infrastructure. |
| 184 | + |
| 185 | +You can use this metric to: |
| 186 | + |
| 187 | +- Monitor the availability of NAT gateway. |
| 188 | + |
| 189 | +- Investigate the platform where your NAT gateway is deployed and determine if it’s healthy. |
| 190 | + |
| 191 | +- Isolate whether an event is related to your NAT gateway or to the underlying data plane. |
| 192 | + |
| 193 | +Possible reasons for a drop in data path availability include: |
| 194 | + |
| 195 | +- An infrastructure outage. |
| 196 | + |
| 197 | +- No healthy VMs are available in the subnet configured with your NAT gateway. For more information, see the [NAT gateway connectivity troubleshooting guide](/azure/nat-gateway/troubleshoot-nat-connectivity). |
| 198 | + |
| 199 | +## Alerts |
| 200 | + |
| 201 | +Alerts can be configured in Azure Monitor for all NAT gateway metrics. These alerts proactively notify you when important conditions are found in your monitoring data. They allow you to identify and address potential issues with NAT gateway. |
| 202 | + |
| 203 | +For more information about how metric alerts work, see [Azure Monitor Metric Alerts](../azure-monitor/alerts/alerts-metric-overview.md). The following guidance describes how to configure some common and recommended types of alerts for your NAT gateway. |
| 204 | + |
| 205 | +### Alerts for datapath availability degradation |
| 206 | + |
| 207 | +Set up an alert on datapath availability to help you detect issues with the health of NAT gateway. |
| 208 | + |
| 209 | +The recommended guidance is to alert on NAT gateway’s datapath availability when it drops below 90% over a 15-minute period. A drop below this threshold indicates that the NAT gateway resource is in a degraded state. |
| 210 | + |
| 211 | +To set up a datapath availability alert, follow these steps: |
| 212 | + |
| 213 | +1. From the NAT gateway resource page, select **Alerts**. |
| 214 | + |
| 215 | +1. Select **Create alert rule**. |
| 216 | + |
| 217 | +1. From the signal list, select **Datapath Availability**. |
| 218 | + |
| 219 | +1. From the **Operator** drop-down menu, select **Less than**. |
| 220 | + |
| 221 | +1. From the **Aggregation type** drop-down menu, select **Average**. |
| 222 | + |
| 223 | +1. In the **Threshold value** box, enter **90%**. |
| 224 | + |
| 225 | +1. From the **Unit** drop-down menu, select **Count**. |
| 226 | + |
| 227 | +1. From the **Aggregation granularity (Period)** drop-down menu, select **15 minutes**. |
| 228 | + |
| 229 | +1. Create an **Action** for your alert by providing a name, notification type, and type of action that is performed when the alert is triggered. |
| 230 | + |
| 231 | +1. Before deploying your action, **test the action group**. |
| 232 | + |
| 233 | +1. Select **Create** to create the alert rule. |
| 234 | + |
| 235 | +>[!NOTE] |
| 236 | +>Aggregation granularity is the period of time over which the datapath availability is measured to determine if it has dropped below the threshold value. |
| 237 | +>Setting the aggregation granularity to less than 5 minutes may trigger false positive alerts that detect noise in the datapath. |
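
The alert condition configured above (average datapath availability below 90 over a 15-minute window) can be sketched as follows. This is a simplified illustration, not the Azure Monitor evaluation engine; the per-minute sample lists and the `is_degraded` helper are hypothetical.

```python
from statistics import mean

THRESHOLD_PERCENT = 90  # threshold from the guidance above
WINDOW_MINUTES = 15     # aggregation granularity from the guidance above


def is_degraded(per_minute_availability):
    """Return True when the average of the most recent window is below the threshold."""
    window = per_minute_availability[-WINDOW_MINUTES:]
    return mean(window) < THRESHOLD_PERCENT


# Hypothetical per-minute availability samples (each 0 or 100).
healthy = [100] * 15
brief_dip = [100] * 14 + [0]      # one bad minute in the window
sustained = [100] * 12 + [0] * 3  # three bad minutes in the window

for name, series in [("healthy", healthy), ("brief_dip", brief_dip), ("sustained", sustained)]:
    print(name, is_degraded(series))
```

In this sketch a single bad minute does not pull the 15-minute average below the threshold, while a sustained drop does. A much shorter window would react to the single bad minute, which is consistent with the note's caution about noise at small granularities.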
| 238 | + |
| 239 | +### Alerts for SNAT port exhaustion |
| 240 | + |
| 241 | +Set up an alert on the **SNAT connection count** metric to notify you of connection failures on your NAT gateway. A failed connection volume greater than zero can indicate that you reached the connection limit on your NAT gateway or that you hit SNAT port exhaustion. Investigate further to determine the root cause of these failures. |
| 242 | + |
| 243 | +To create the alert, use the following steps: |
| 244 | + |
| 245 | +1. From the NAT gateway resource page, select **Alerts**. |
| 246 | + |
| 247 | +1. Select **Create alert rule**. |
| 248 | + |
| 249 | +1. From the signal list, select **SNAT Connection Count**. |
| 250 | + |
| 251 | +1. From the **Aggregation type** drop-down menu, select **Total**. |
| 252 | + |
| 253 | +1. From the **Operator** drop-down menu, select **Greater than**. |
| 254 | + |
| 255 | +1. From the **Unit** drop-down menu, select **Count**. |
| 256 | + |
| 257 | +1. In the **Threshold value** box, enter 0. |
| 258 | + |
| 259 | +1. In the Split by dimensions section, select **Connection State** under Dimension name. |
| 260 | + |
| 261 | +1. Under Dimension values, select **Failed** connections. |
| 262 | + |
| 263 | +1. From the When to evaluate section, select **1 minute** under the **Check every** drop-down menu. |
| 264 | + |
| 265 | +1. For the lookback period, select **5 minutes** from the drop-down menu options. |
| 266 | + |
| 267 | +1. Create an **Action** for your alert by providing a name, notification type, and type of action that is performed when the alert is triggered. |
| 268 | + |
| 269 | +1. Before deploying your action, **test the action group**. |
| 270 | + |
| 271 | +1. Select **Create** to create the alert rule. |
| 272 | + |
| 273 | +>[!NOTE] |
| 274 | +>SNAT port exhaustion on your NAT gateway resource is uncommon. If you see SNAT port exhaustion, check if NAT gateway's idle timeout timer is set higher than the default amount of 4 minutes. A long idle timeout timer setting can cause SNAT ports to be in hold down for longer, which results in exhausting SNAT port inventory sooner. You can also scale your NAT gateway with additional public IPs to increase NAT gateway's overall SNAT port inventory. To troubleshoot these kinds of issues, refer to the [NAT gateway connectivity troubleshooting guide](/azure/nat-gateway/troubleshoot-nat-connectivity#snat-exhaustion-due-to-nat-gateway-configuration). |
| 275 | + |
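
The SNAT connection failure alert described above sums **Failed** connections over a 5-minute lookback period and fires when the total is greater than zero. A minimal sketch of that logic follows; the per-minute data shape and the helper names are hypothetical and are used only for illustration.

```python
LOOKBACK_MINUTES = 5  # lookback period from the steps above
THRESHOLD = 0         # threshold value from the steps above


def failed_total(per_minute, lookback=LOOKBACK_MINUTES):
    """Sum the Failed connection counts over the most recent `lookback` minutes."""
    recent = per_minute[-lookback:]
    return sum(minute.get("Failed", 0) for minute in recent)


def should_fire(per_minute):
    """Return True when the summed Failed count exceeds the threshold."""
    return failed_total(per_minute) > THRESHOLD


# Hypothetical per-minute samples split by the Connection State dimension.
quiet = [{"Attempted": 50, "Failed": 0}] * 5
failing = quiet[:3] + [{"Attempted": 70, "Failed": 4}, {"Attempted": 65, "Failed": 1}]

print(should_fire(quiet))
print(should_fire(failing), failed_total(failing))
```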
| 276 | +### Alerts for NAT gateway resource health |
| 277 | + |
| 278 | +[Azure Resource Health](/azure/service-health/overview) provides information on the health state of your NAT gateway resource. The resource health of your NAT gateway is evaluated by measuring the datapath availability of your NAT gateway endpoint. You can set up alerts to notify you when the health state of your NAT gateway resource changes. To learn more about NAT gateway resource health and setting up alerts, see: |
| 279 | + |
| 280 | +* [Azure NAT Gateway Resource Health](/azure/nat-gateway/resource-health) |
| 281 | + |
| 282 | +* [NAT Gateway Resource Health Alerts](/azure/nat-gateway/resource-health#resource-health-alerts) |
| 283 | + |
| 284 | +* [How to create Resource Health Alerts in the Azure portal](/azure/service-health/resource-health-alert-monitor-guide) |
| 285 | + |
| 286 | +## Network Insights |
| 287 | + |
| 288 | +[Azure Monitor Network Insights](../network-watcher/network-insights-overview.md) allows you to visualize your Azure infrastructure setup and to review all metrics for your NAT gateway resource from a preconfigured metrics dashboard. These visual tools help you diagnose and troubleshoot any issues with your NAT gateway resource. |
| 289 | + |
| 290 | +### View the topology of your Azure architectural setup |
| 291 | + |
| 292 | +To view a topological map of your setup in Azure: |
| 293 | + |
| 294 | +1. From your NAT gateway’s resource page, select **Insights** from the **Monitoring** section. |
| 295 | + |
| 296 | +1. On the landing page for **Insights**, there's a topology map of your NAT gateway setup. This map shows the relationship between the different components of your network (subnets, virtual machines, public IP addresses). |
| 297 | + |
| 298 | +1. Hover over any component in the topology map to view configuration information. |
| 299 | + |
| 300 | + :::image type="content" source="./media/nat-metrics/nat-insights.png" alt-text="Screenshot of the Insights section of NAT gateway."::: |
| 301 | + |
| 302 | +### View all NAT gateway metrics in a dashboard |
| 303 | + |
| 304 | +The metrics dashboard can be used to better understand the performance and health of your NAT gateway resource. The metrics dashboard shows a view of all metrics for NAT gateway on a single page. |
| 305 | + |
| 306 | +- All NAT gateway metrics can be viewed in a dashboard when selecting **Show Metrics Pane**. |
| 307 | + |
| 308 | + :::image type="content" source="./media/nat-metrics/nat-metrics-pane.png" alt-text="Screenshot of the show metrics pane."::: |
| 309 | + |
| 310 | +- A full page view of all NAT gateway metrics can be viewed when selecting **View Detailed Metrics**. |
| 311 | + |
| 312 | + :::image type="content" source="./media/nat-metrics/detailed-metrics.png" alt-text="Screenshot of the view detailed metrics."::: |
| 313 | + |
| 314 | +For more information on what each metric is showing you and how to analyze these metrics, see [How to use NAT gateway metrics](#how-to-use-nat-gateway-metrics). |
| 315 | + |
| 316 | +## Metrics FAQ |
| 317 | + |
| 318 | +### What type of metrics are available for NAT gateway? |
| 319 | + |
| 320 | +The NAT gateway supports [multi-dimensional metrics](/azure/azure-monitor/essentials/data-platform-metrics#multi-dimensional-metrics). You can filter the multi-dimensional metrics by different dimensions to gain greater insight into the provided data. The [SNAT connection count](#snat-connection-count) metric allows you to filter the connections by Attempted and Failed connections, enabling you to distinguish between different types of connections made by the NAT gateway. |
| 321 | + |
| 322 | +Refer to the dimensions column in the [metrics overview](#metrics-overview) table to see which dimensions are available for each NAT gateway metric. |
| 323 | + |
| 324 | +### How do I store NAT gateway metrics long-term? |
| 325 | + |
| 326 | +All [platform metrics are stored](/azure/azure-monitor/essentials/data-platform-metrics#retention-of-metrics) for 93 days. If you require long term access to your NAT gateway metrics data, NAT gateway metrics can be retrieved by using the [metrics REST API](/rest/api/monitor/metrics/list). For more information on how to use the API, see the [Azure monitoring REST API walkthrough](/azure/azure-monitor/essentials/rest-api-walkthrough). |
| 327 | + |
| 328 | +>[!NOTE] |
| 329 | +>Diagnostic Settings [doesn’t support the export of multi-dimensional metrics](/azure/azure-monitor/reference/supported-metrics/metrics-index#exporting-platform-metrics-to-other-locations) to another location, such as Azure Storage and Log Analytics. |
| 330 | +> |
| 331 | +>To retrieve NAT gateway metrics, use the metrics REST API. |
| 332 | + |
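
As a starting point for retrieving metrics through the metrics REST API, the following sketch builds a request URL without sending it. Several values are assumptions to verify against the REST API reference before use: the `2018-01-01` API version, the REST metric name `SNATConnectionCount`, and the dimension name `ConnectionState`. The resource ID is a placeholder, and authentication (a bearer token in the request header) is not shown.

```python
from urllib.parse import quote, urlencode

# Placeholder resource ID; replace with your own NAT gateway resource ID.
RESOURCE_ID = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/example-rg"
    "/providers/Microsoft.Network/natGateways/example-nat"
)


def build_metrics_url(resource_id, metric_name, aggregation, timespan,
                      interval="PT1M", dimension_filter=None):
    """Build a metrics REST API request URL (assumed api-version 2018-01-01)."""
    params = {
        "api-version": "2018-01-01",  # assumption: confirm in the REST API reference
        "metricnames": metric_name,
        "aggregation": aggregation,
        "timespan": timespan,
        "interval": interval,
    }
    if dimension_filter:
        params["$filter"] = dimension_filter
    query = urlencode(params, quote_via=quote)
    return (
        "https://management.azure.com"
        f"{resource_id}/providers/Microsoft.Insights/metrics?{query}"
    )


url = build_metrics_url(
    RESOURCE_ID,
    metric_name="SNATConnectionCount",  # assumed REST name for SNAT Connection Count
    aggregation="Total",                # Total corresponds to Sum in the portal
    timespan="2024-04-29T00:00:00Z/2024-04-29T01:00:00Z",
    dimension_filter="ConnectionState eq 'Failed'",  # assumed dimension name
)
print(url)
```

Running a request like this on a schedule and writing the response to your own storage is one way to keep metric data beyond the platform retention period.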
| 333 | +### How do I interpret metrics charts? |
| 334 | + |
| 335 | +Refer to [troubleshooting metrics charts](/azure/azure-monitor/essentials/metrics-troubleshoot) if you run into issues with creating, customizing, or interpreting charts in Azure metrics explorer. |
| 336 | + |
| 337 | +## Next steps |
| 338 | + |
| 339 | +* Learn about [Azure NAT Gateway](nat-overview.md) |
| 340 | +* Learn about [NAT gateway resource](nat-gateway-resource.md) |
| 341 | +* Learn about [Azure Monitor](../azure-monitor/overview.md) |
| 342 | +* Learn about [troubleshooting NAT gateway resources](troubleshoot-nat.md) |
| 343 | +* Learn about [troubleshooting NAT gateway connectivity](/azure/nat-gateway/troubleshoot-nat-connectivity) |