|
| 1 | +--- |
| 2 | +title: Azure ExpressRoute Gateway Resiliency Validation (preview) |
| 3 | +description: This article helps you understand the Azure ExpressRoute Gateway Resiliency Validation feature and how to use it. |
| 4 | +services: expressroute |
| 5 | +author: duongau |
| 6 | +ms.service: azure-expressroute |
| 7 | +ms.topic: conceptual |
| 8 | +ms.date: 03/24/2025 |
| 9 | +ms.author: duau |
| 10 | +ms.custom: ai-usage |
| 11 | +--- |
| 12 | + |
| 13 | +# Azure ExpressRoute Gateway Resiliency Validation (preview) |
| 14 | + |
| 15 | +Ensuring uninterrupted connectivity to Azure workloads through ExpressRoute is essential for maintaining business continuity. We're committed to providing you with new capabilities to help maintain a resilient network. The *gateway resiliency validation* feature assesses how resilient your network is by testing a failure scenario and validating the failover mechanisms. By proactively testing your network resiliency, you can ensure that your workloads remain available and can recover quickly from disruptions. |
| 16 | + |
| 17 | +Another key aspect of this feature is the ability to identify misconfigurations and provide insights about your ExpressRoute connections from the ExpressRoute gateway perspective. This proactive approach allows you to validate the network behavior before major changes are implemented while also ensuring that your network is prepared for unexpected events. |
| 18 | + |
| 19 | +> [!IMPORTANT] |
| 20 | +> **Azure ExpressRoute Gateway Resiliency Validation** is currently in PREVIEW. |
| 21 | +> See the [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/) for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability. |
| 22 | +
|
| 23 | +## Key features |
| 24 | + |
| 25 | +- **Simulate circuit failover** - The connection between the selected gateway and the selected ExpressRoute circuit is temporarily disconnected to simulate a failover from one peering location to another. |
| 26 | +- **Route redundancy** - Insights into duplicate routes are provided for all prefixes received from the selected peering location. |
| 27 | +- **Traffic visualization** - Visualize traffic going through the ExpressRoute gateway and all connections to an ExpressRoute circuit during testing. |
| 28 | +- **Test history** - View detailed information about previously conducted tests. |
| 29 | + |
| 30 | +### Common use cases |
| 31 | + |
| 32 | +- Identify and resolve potential problems in your network to improve the overall reliability and resiliency of your network infrastructure. |
| 33 | + |
| 34 | +- Support high availability and disaster recovery (HA/DR) procedures and migration validation. Testing helps ensure that your systems are prepared for unplanned events and helps maintain seamless operations by validating maintenance behavior at the workload level. |
| 35 | + |
| 36 | +- Serves as a prerequisite for migrating from one ExpressRoute peering location to another, ensuring network resiliency before implementing major changes. |
| 37 | + |
| 38 | +### Limitations |
| 39 | + |
| 40 | +- The Resiliency Validation feature is available only for ExpressRoute gateways connected to ExpressRoute circuits in at least two distinct peering locations. |
| 41 | +- The **Route List** tab can only be refreshed once per hour. |
| 42 | +- This feature isn't supported for Virtual WAN or ExpressRoute Metro. |
| 43 | +- You can't run the Resiliency Validation test if there are any ongoing tests or if any of the circuits are currently undergoing maintenance. |
| 44 | + |
| 45 | +## Prerequisites |
| 46 | + |
| 47 | +- To participate in the preview, contact the [**ExpressRoute PM**](mailto:[email protected]) team. |
| 48 | +- Ensure that you have an ExpressRoute circuit in at least two distinct peering locations and an ExpressRoute gateway connected to those circuits. |
| 49 | + |
| 50 | +## Using the gateway resiliency validation |
| 51 | + |
| 52 | +The gateway resiliency validation can be accessed from any ExpressRoute gateway resource by navigating to the **Monitoring** section in the left-hand menu. |
| 53 | + |
| 54 | +:::image type="content" source="media/resiliency-validation/resiliency-validation.png" alt-text="Screenshot of the Resiliency Validation feature, accessible under the monitoring section in the left-hand menu of the ExpressRoute gateway resource."::: |
| 55 | + |
| 56 | +The dashboard provides a detailed overview of all ExpressRoute circuits connected to the ExpressRoute virtual network gateway, categorized by peering location. It displays the most recent test status, the timestamp of the last test conducted, the results of the latest test, and an action button to initiate a new test. |
| 57 | + |
| 58 | +> [!WARNING] |
| 59 | +> During the test, the ExpressRoute circuit disconnects from the ExpressRoute gateway, causing a temporary loss of connectivity for nonredundant routes. Ensure your routing policies are configured to support traffic failover. |
| 60 | +
|
| 61 | +### Starting the test |
| 62 | + |
| 63 | +1. Navigate to the desired peering location and select the **Start new test** button. |
| 64 | + |
| 65 | +1. Review the autopopulated configuration, which includes: |
| 66 | + |
| 67 | + - Gateway name |
| 68 | + - Peering location |
| 69 | + - Route redundancy information |
| 70 | + - Traffic details |
| 71 | + - Status of all connections to the ExpressRoute gateway |
| 72 | + |
| 73 | +1. Ensure that all critical routes are marked as redundant by reviewing the **Route List** tab. |
| 74 | + |
| 75 | + :::image type="content" source="media/resiliency-validation/route-list.png" alt-text="Screenshot showing the Route List tab with details of redundant and nonredundant routes."::: |
| 76 | + |
| 77 | +1. Confirm that the circuits listed on this page aren't undergoing maintenance by selecting the first checkbox. |
| 78 | + |
| 79 | +1. Acknowledge that you reviewed the **Route List** tab and that all critical routes are marked as redundant by selecting the second checkbox. |
| 80 | + |
| 81 | +1. Enter the name of the gateway to confirm that you're aware of the potential effect of the test on your network. |
| 82 | + |
| 83 | +1. Select **Start Simulation** to initiate the test. |
| 84 | + |
| 85 | + :::image type="content" source="media/resiliency-validation/start-test.png" alt-text="Screenshot showing the Resiliency Validation testing page."::: |
| 86 | + |
| 87 | +1. The resiliency validation status shows as **In progress**. |
| 88 | + |
| 89 | +### During the test |
| 90 | + |
| 91 | +1. Navigate to the **Test Status** tab to validate connectivity to your Azure workloads through each redundant connection. Review the traffic flow graph for the ExpressRoute gateway, which displays the average bits per second traffic flow. The tab also provides ingress and egress traffic information for connected and disconnected peering locations. |
| 92 | + |
| 93 | + :::image type="content" source="media/resiliency-validation/test-status.png" alt-text="Screenshot of the traffic flow graph for an ExpressRoute gateway and the traffic data on the connections to the gateway."::: |
| 94 | + |
| 95 | +1. Validate connectivity from your on-premises network to your Azure workloads through the redundant connection by sending data packets. Tools like [iPerf](https://iperf.fr/) can be used for this purpose. |
| 96 | + |
| 97 | +1. Select the **Stop Simulation** button to end the test. When prompted, confirm whether the test completed successfully. |
| 98 | + |
| 99 | +1. After you confirm, connectivity is restored for all connections to the ExpressRoute gateway. |
| 100 | + |
| 101 | +1. You can view the test result by selecting **View** under the *Test History* column on the dashboard for the selected peering location. |
| 102 | + |
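As a lightweight complement to the iPerf check described in the steps above, the following Python sketch probes a workload over TCP once per second and reports the longest observed gap in reachability. It isn't part of the feature and it swaps in a plain TCP connect for iPerf. The function names, the target address `10.0.0.4`, and port `22` are hypothetical placeholders chosen for illustration.

```python
import socket
import time
from typing import List, Tuple


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def monitor(host: str, port: int, duration: float = 60.0,
            interval: float = 1.0) -> List[Tuple[float, bool]]:
    """Probe the target repeatedly and return (timestamp, reachable) samples."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), tcp_probe(host, port)))
        time.sleep(interval)
    return samples


def longest_outage(samples: List[Tuple[float, bool]]) -> float:
    """Return the longest span, in seconds, from the first failed probe of a
    run of failures to the next successful probe.

    An outage that has not recovered by the last sample is not counted.
    """
    longest = 0.0
    outage_start = None
    for timestamp, reachable in samples:
        if not reachable and outage_start is None:
            outage_start = timestamp
        elif reachable and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    return longest


# Example usage (hypothetical workload address and port):
#   samples = monitor("10.0.0.4", 22, duration=120.0)
#   print(f"Longest outage: {longest_outage(samples):.1f} s")
```

Run the monitor from an on-premises host shortly before you select **Start Simulation**, then compare the reported gap with the failover time cited in the frequently asked questions. The result is only as precise as the probe interval and timeout.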
| 103 | +## Frequently asked questions |
| 104 | + |
| 105 | +1. Can I control the gateway resiliency validation tests by means other than the Azure portal? |
| 106 | + |
| 107 | + Yes, you can use the REST API to start and stop the gateway resiliency validation tests. |
| 108 | + |
| 109 | +2. What happens if I don't terminate a test? |
| 110 | + |
| 111 | + The test continues to run indefinitely until you stop it. |
| 112 | + |
| 113 | +3. What metrics or alerts are available to monitor during the test? |
| 114 | + |
| 115 | + The purpose of configuring redundant connections is to ensure network resilience during outages. If a single circuit is utilized at more than 50% of its bandwidth, the remaining circuit might not be able to absorb the combined load after a failover, and packet drops might occur. During validation tests, the **Test Status** tab helps you monitor traffic through the connections. If you configured [alerts](monitor-expressroute.md#alerts), expect them to fire during the test, which gives you an opportunity to validate their effectiveness. |
| 116 | + |
| 117 | + For more information, see [Circuit utilization](monitor-expressroute-reference.md#category-circuit-traffic) or [Connection traffic](monitor-expressroute-reference.md#category-traffic) for metrics you can set up alerts on. |
| 118 | + |
| 119 | +4. Can I control traffic on demand using the gateway resiliency validation tool? |
| 120 | + |
| 121 | + Yes, the gateway resiliency validation tool allows you to control traffic on demand. This is useful for testing different traffic scenarios and ensuring your network can handle various failovers. It can also be used to validate connectivity after successful site migrations before disconnecting the redundant circuit. |
| 122 | + |
| 123 | +5. Are there specific role-based access control (RBAC) policies for this feature? |
| 124 | + |
| 125 | + Yes, there are specific RBAC policies to ensure that only authorized users with contributor access to the gateway can initiate downtime. |
| 126 | + |
| 127 | +6. When can I run this feature in a Virtual WAN setup or other resiliency models? |
| 128 | + |
| 129 | + This feature isn't currently supported for Virtual WAN or ExpressRoute Metro. For feedback or other requests, contact the [**ExpressRoute PM**](mailto:[email protected]) team. |
| 130 | + |
| 131 | +7. Does this feature work with FastPath and Private Link? |
| 132 | + |
| 133 | + For FastPath, although the data path bypasses the gateway, the gateway still manages control plane activities like route management. During a disconnect between the ExpressRoute circuit and the ExpressRoute gateway, routes are withdrawn from the gateway. However, connectivity for the failover connection to FastPath and Private Link is maintained during the failover. |
| 134 | + |
| 135 | +8. Is packet loss expected during this activity? |
| 136 | + |
| 137 | + During the failover simulation, a brief connectivity disruption occurs as BGP (Border Gateway Protocol) reestablishes. Performance tests using iPerf on TCP (Transmission Control Protocol) up to 500 Mbps show no packet loss. However, in a real outage scenario, some packet loss occurs until the traffic successfully fails over. |
| 138 | + |
| 139 | +9. How long does it take to fail over? |
| 140 | + |
| 141 | + Once the simulation starts, it can take up to 15 seconds for the traffic to fail over. |
| 142 | + |
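The 50% utilization guidance in question 3 comes down to simple arithmetic, which the following Python sketch illustrates. It assumes that traffic from the disconnected circuit is spread evenly across the surviving circuits. That assumption is a simplification, because actual distribution depends on your routing policies. The function name, circuit names, and traffic figures are hypothetical.

```python
from typing import Dict


def utilization_after_failover(traffic_mbps: Dict[str, float],
                               bandwidth_mbps: Dict[str, float],
                               failed: str) -> Dict[str, float]:
    """Estimate each surviving circuit's utilization (1.0 = 100%) after
    the `failed` circuit is disconnected.

    Assumption: the failed circuit's traffic is split evenly among survivors.
    """
    survivors = [name for name in traffic_mbps if name != failed]
    if not survivors:
        raise ValueError("no surviving circuit to carry the traffic")
    extra = traffic_mbps[failed] / len(survivors)
    return {
        name: (traffic_mbps[name] + extra) / bandwidth_mbps[name]
        for name in survivors
    }


# Two hypothetical 1-Gbps circuits, each carrying 600 Mbps (60% utilized).
result = utilization_after_failover(
    traffic_mbps={"circuit-a": 600.0, "circuit-b": 600.0},
    bandwidth_mbps={"circuit-a": 1000.0, "circuit-b": 1000.0},
    failed="circuit-a",
)
for name, utilization in result.items():
    status = "over capacity" if utilization > 1.0 else "within capacity"
    print(f"{name}: {utilization:.0%} ({status})")
```

In this example the surviving circuit would need to carry 1,200 Mbps on a 1,000-Mbps link, so packet drops are likely. With each circuit at 40% instead, the survivor would land at 80% and stay within capacity.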
| 143 | +## Next steps |
| 144 | + |
| 145 | +- Learn more about the [ExpressRoute gateway](expressroute-about-virtual-network-gateways.md) and how to [monitor ExpressRoute circuits](monitor-expressroute.md). |
| 146 | +- Learn about [ExpressRoute Resiliency Insights](resiliency-insights.md). |