|
| 1 | +--- |
| 2 | +title: Azure ExpressRoute Gateway Resiliency Validation (preview) |
| 3 | +description: This article helps you understand the Azure ExpressRoute Gateway Resiliency Validation feature and how to use it. |
| 4 | +services: expressroute |
| 5 | +author: duongau |
| 6 | +ms.service: azure-expressroute |
| 7 | +ms.topic: conceptual |
| 8 | +ms.date: 03/31/2025 |
| 9 | +ms.author: duau |
| 10 | +ms.custom: ai-usage |
| 11 | +--- |
| 12 | + |
| 13 | +# Azure ExpressRoute Gateway Resiliency Validation (preview) |
| 14 | + |
| 15 | +Resiliency validation assesses the resiliency of network connectivity for ExpressRoute-enabled workloads. It lets you perform site failovers for your virtual network gateway so that you can evaluate network resiliency during site outages and validate your setup during migrations by testing how well your failover mechanisms work. Testing proactively helps you confirm continuous connectivity to your Azure workloads and the robustness of your connections. |
| 16 | + |
| 17 | +> [!IMPORTANT] |
| 18 | +> **Azure ExpressRoute Resiliency Validation** is currently in PREVIEW. |
| 19 | +> See the [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/) for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability. |
| 20 | +
|
| 21 | +## Key features |
| 22 | + |
| 23 | +- **Simulate circuit failover** - The connection between the gateway and the selected ExpressRoute circuit is temporarily disconnected to simulate a failover from one peering location to another. |
| 24 | +- **Route redundancy** - Insights into duplicate routes are provided for all prefixes received from the selected peering location. |
| 25 | +- **Traffic visualization** - Visualize traffic on the ExpressRoute gateway and all connections associated with it during testing. |
| 26 | +- **Test history** - Detailed information about previously conducted tests. |
| 27 | + |
| 28 | +### Common use cases |
| 29 | + |
| 30 | +- Identify and resolve potential problems in your network to improve the overall reliability and resiliency of your network infrastructure. |
| 31 | + |
| 32 | +- Support high availability and disaster recovery (HA/DR) procedures and migration validation. Validating maintenance behavior at the workload level helps confirm that your systems are prepared for unplanned events and can maintain operations. |
| 33 | + |
| 34 | +- Verify network resiliency before you migrate from one ExpressRoute peering location to another or implement other major changes. |
| 35 | + |
| 36 | +### Limitations |
| 37 | + |
| 38 | +- The Resiliency Validation feature is available only for ExpressRoute gateways connected to ExpressRoute circuits in at least two distinct peering locations. |
| 39 | +- The **Route List** tab can only be refreshed once per hour. |
| 40 | +- This feature isn't supported for Virtual WAN or ExpressRoute Metro. |
| 41 | +- You can't run the Resiliency Validation test if there are any ongoing tests or if any of the circuits are currently undergoing maintenance. |
| 42 | + |
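The limitations above can be read as a pre-check before starting a test. The following is a minimal sketch of that logic only, not an Azure API: the `Circuit` class, its fields, and the `blocking_reasons` function are hypothetical names introduced for illustration, and you would populate them from your own inventory of circuits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circuit:
    # Hypothetical record describing one circuit connected to the gateway.
    name: str
    peering_location: str
    under_maintenance: bool = False


def blocking_reasons(circuits, test_in_progress=False):
    """Return the reasons a test can't start; an empty list means no blockers."""
    reasons = []
    # The gateway needs circuits in at least two distinct peering locations.
    locations = {c.peering_location for c in circuits}
    if len(locations) < 2:
        reasons.append("circuits span fewer than two distinct peering locations")
    # Only one test can run at a time.
    if test_in_progress:
        reasons.append("another test is already in progress")
    # No connected circuit can be undergoing maintenance.
    for c in circuits:
        if c.under_maintenance:
            reasons.append(f"circuit {c.name} is undergoing maintenance")
    return reasons
```

For example, two circuits in the same peering location produce one blocking reason, while two circuits in different locations with no maintenance and no running test produce none.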
| 43 | +## Prerequisites |
| 44 | + |
| 45 | +- To participate in the preview, contact the [**Azure ExpressRoute**](mailto:[email protected]) team. |
| 46 | +- Ensure that you have an ExpressRoute circuit in at least two distinct peering locations and an ExpressRoute virtual network gateway connected to those circuits. |
| 47 | + |
| 48 | +## Using the gateway resiliency validation |
| 49 | + |
| 50 | +The gateway resiliency validation can be accessed from any ExpressRoute gateway resource by navigating to the **Monitoring** section in the left-hand menu. |
| 51 | + |
| 52 | +:::image type="content" source="media/resiliency-validation/resiliency-validation.png" alt-text="Screenshot of the Resiliency Validation feature, accessible under the monitoring section in the left-hand menu of the ExpressRoute gateway resource."::: |
| 53 | + |
| 54 | +The dashboard provides a detailed overview of all ExpressRoute circuits connected to the ExpressRoute virtual network gateway, categorized by peering location. It displays the most recent test status, the timestamp of the last test conducted, the results of the latest test, and an action button to initiate a new test. |
| 55 | + |
| 56 | +> [!IMPORTANT] |
| 57 | +> - During the test, the ExpressRoute virtual network gateway disconnects from the target ExpressRoute circuit, causing a temporary loss of connectivity for nonredundant routes. Ensure your routing policies are configured to support traffic failover. |
| 58 | +> - The targeted ExpressRoute circuit maintains connectivity to other ExpressRoute virtual network gateways, and the gateway doing the test maintains connectivity to other ExpressRoute circuits. |
| 59 | +
|
| 60 | +### Starting the test |
| 61 | + |
| 62 | +1. Navigate to the desired peering location and select the **Start new test** button. |
| 63 | + |
| 64 | +1. Review the autopopulated configuration, which includes: |
| 65 | + |
| 66 | + - Gateway name |
| 67 | + - Peering location |
| 68 | + - Route redundancy information |
| 69 | + - Traffic details |
| 70 | + - Status of all connections to the ExpressRoute gateway |
| 71 | + |
| 72 | +1. Ensure that all critical routes are marked as redundant by reviewing the **Route List** tab. |
| 73 | + |
| 74 | + :::image type="content" source="media/resiliency-validation/route-list.png" alt-text="Screenshot showing the Route List tab with details of redundant and nonredundant routes."::: |
| 75 | + |
| 76 | +1. Confirm that the circuits listed on this page aren't undergoing maintenance by selecting the first checkbox. |
| 77 | + |
| 78 | +1. Acknowledge that you reviewed the **Route List** tab and that all critical routes are marked as redundant by selecting the second checkbox. |
| 79 | + |
| 80 | +1. Enter the name of the gateway to confirm that you're aware of the potential effect of the test on your network. |
| 81 | + |
| 82 | +1. Select **Start Simulation** to initiate the test. |
| 83 | + |
| 84 | + :::image type="content" source="media/resiliency-validation/start-test.png" alt-text="Screenshot showing the Resiliency Validation testing page."::: |
| 85 | + |
| 86 | +1. The resiliency validation status shows as **In progress**. |
| 87 | + |
| 88 | +### During the test |
| 89 | + |
| 90 | +1. Navigate to the **Test Status** tab to validate connectivity to your Azure workloads through each redundant connection. Review the traffic flow graph for the ExpressRoute gateway, which displays the average bits per second traffic flow. The tab also provides ingress and egress traffic information for connected and disconnected peering locations. |
| 91 | + |
| 92 | + :::image type="content" source="media/resiliency-validation/test-status.png" alt-text="Screenshot showing the traffic flow graph for an ExpressRoute gateway and traffic data for connections to the gateway."::: |
| 93 | + |
| 94 | + > [!NOTE] |
| 95 | + > Traffic metrics are updated every minute and displayed in the **Test Status** tab. Allow up to 5 minutes for the metrics to appear after initiating the test. |
| 96 | +
|
| 97 | +1. Validate connectivity from your on-premises network to your Azure workloads through the redundant connection by sending data packets. Tools like [iPerf](https://iperf.fr/) can be used for this purpose. |
| 98 | + |
| 99 | +1. Select the **Stop Simulation** button to end the test. When prompted, confirm whether the test completed successfully and select the failover peering location. |
| 100 | + |
| 101 | +1. Once confirmed, connectivity is restored for all connections to the ExpressRoute gateway. |
| 102 | + |
| 103 | +1. You can view the test report by selecting **View** under the *Test History* column on the dashboard for the selected peering location. |
| 104 | + |
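As a lightweight complement to iPerf during the connectivity validation step, you could probe a workload endpoint over TCP at a fixed interval and record how long it was unreachable. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the host and port you pass in are assumptions about your own environment.

```python
import socket
import time


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def collect_samples(host, port, duration_s=60.0, interval_s=1.0):
    """Probe repeatedly and return a list of (elapsed_seconds, reachable)."""
    samples = []
    start = time.monotonic()
    while (now := time.monotonic()) - start < duration_s:
        samples.append((now - start, tcp_probe(host, port)))
        time.sleep(interval_s)
    return samples


def longest_outage(samples):
    """Return the longest span, in seconds, from a failed probe to the next success."""
    longest = 0.0
    outage_start = None
    for t, ok in samples:
        if not ok and outage_start is None:
            outage_start = t  # first failed probe of a new outage
        elif ok and outage_start is not None:
            longest = max(longest, t - outage_start)
            outage_start = None
    # An outage still open at the end counts up to the last sample.
    if outage_start is not None and samples:
        longest = max(longest, samples[-1][0] - outage_start)
    return longest
```

Start `collect_samples` from an on-premises host shortly before you select **Start Simulation**, using the private IP address and port of one of your own Azure workloads, then pass the result to `longest_outage`. The measured value is only as precise as the probe interval and timeout you choose.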
| 105 | +## Frequently asked questions |
| 106 | + |
| 107 | +1. Why can't I see the Resiliency Validation feature in my ExpressRoute virtual network gateway? |
| 108 | + |
| 109 | + - The Resiliency Validation feature is currently in preview. To gain access, contact the [Azure ExpressRoute team](mailto:[email protected]) for onboarding. |
| 110 | + - This feature is only available for ExpressRoute virtual network gateways configured in a Max Resiliency model. It isn't supported for Virtual WAN ExpressRoute gateways. |
| 111 | + - You must have Contributor-level authorization to access this feature. |
| 112 | + |
| 113 | +1. Why doesn't the Route List show the latest routes? |
| 114 | + |
| 115 | + The **Route List** tab has a polling interval of 1 hour, so the pane can't be refreshed until 1 hour after the last update. |
| 116 | + |
| 117 | +1. Does the feature support Microsoft Peering or VPN connectivity? |
| 118 | + |
| 119 | + No, the Resiliency Validation feature supports only ExpressRoute Private Peering connectivity. It doesn't support Microsoft Peering or VPN connectivity. |
| 120 | + |
| 121 | +1. Can I control the gateway validation tests from outside the Azure portal? |
| 122 | + |
| 123 | + Yes, you can use the REST API to start and stop the gateway resiliency validation tests. |
| 124 | + |
| 125 | +1. What happens if I don't terminate a test? |
| 126 | + |
| 127 | + The test continues to run indefinitely. |
| 128 | + |
| 129 | +1. What metrics or alerts can I monitor during the resiliency validation test? |
| 130 | + |
| 131 | + Configure redundant connections so that your network stays resilient during outages. During a failover, packet drops might occur if the traffic shifted to the backup circuit exceeds that circuit's bandwidth. Use [Circuit QoS](monitor-expressroute-reference.md#category-circuit-qos) metrics to monitor packet drops caused by rate limiting. The **Test Status** tab in the Resiliency Validation feature also provides traffic monitoring for the connections. Configure your alerts before the test so that you can validate their effectiveness while it runs. |
| 132 | + |
| 133 | +1. Can I control traffic on demand using the gateway resiliency validation tool? |
| 134 | + |
| 135 | + Yes, if the routes are advertised redundantly through circuits in different peering locations, the gateway resiliency validation tool allows you to control traffic on demand by failing traffic over to connections in an alternative site. |
| 136 | + |
| 137 | +1. Does this feature support FastPath and Private Link? |
| 138 | + |
| 139 | + For FastPath, while the data path bypasses the gateway, the gateway still handles control plane activities such as route management. During a disconnect between the ExpressRoute circuit and the gateway, routes are withdrawn from the affected circuit. However, if redundant circuits are properly configured, connectivity for failover connections to FastPath and Private Link is maintained during the failover. |
| 140 | + |
| 141 | +1. Is packet loss expected during a failover simulation? |
| 142 | + |
| 143 | + A brief connectivity disruption occurs during the failover simulation as BGP (Border Gateway Protocol) reconverges. Performance tests using iPerf on TCP (up to 500 Mbps) show no packet loss during the simulation. However, in an actual outage scenario, some packet loss can occur until traffic successfully fails over. |
| 144 | + |
| 145 | +1. How long does a failover take? |
| 146 | + |
| 147 | + Once the simulation begins, traffic failover typically completes within 15 seconds. |
| 148 | + |
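The bandwidth point in the monitoring answer above comes down to simple arithmetic: after failover, the surviving circuit carries its own traffic plus the traffic shifted from the disconnected circuit. The following is a minimal sketch of that check; the function name and the example figures are hypothetical, and the inputs are assumed to come from your own traffic metrics and provisioned circuit bandwidth.

```python
def failover_utilization(surviving_mbps, shifted_mbps, capacity_mbps):
    """Return the expected utilization of the surviving circuit after failover.

    A result above 1.0 means the combined traffic exceeds the circuit's
    bandwidth, so packet drops from rate limiting might occur.
    """
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    return (surviving_mbps + shifted_mbps) / capacity_mbps


# Example with made-up numbers: 400 Mbps already on the surviving circuit,
# 300 Mbps shifted from the disconnected circuit, 1,000 Mbps of bandwidth.
ratio = failover_utilization(400, 300, 1000)
print(f"Expected utilization after failover: {ratio:.0%}")
```

Running this estimate with your peak traffic figures before the test indicates whether the Circuit QoS drop metrics are likely to show activity during the simulation.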
| 149 | +## Next steps |
| 150 | + |
| 151 | +- Learn more about the [ExpressRoute gateway](expressroute-about-virtual-network-gateways.md) and how to [monitor ExpressRoute circuits](monitor-expressroute.md). |
| 152 | +- Learn about [ExpressRoute Resiliency Insights](resiliency-insights.md). |