Commit af4a35c

Merge pull request #103223 from rambk/BackUpVPN
Create ExpressRoute VPN Backup Doc
2 parents 6225c8b + c1840e0 commit af4a35c

4 files changed

+303
-0
lines changed

articles/expressroute/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -70,6 +70,8 @@
7070
items:
7171
- name: Private Peering
7272
href: designing-for-disaster-recovery-with-expressroute-privatepeering.md
73+
- name: Using VPN as a backup
74+
href: use-s2s-vpn-as-backup-for-expressroute-privatepeering.md
7375
- name: How-to guides
7476
items:
7577
- name: Create and modify a circuit
Lines changed: 301 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,301 @@
1+
---
2+
title: 'Using S2S VPN as a backup for Azure ExpressRoute Private Peering | Microsoft Docs'
3+
description: This page provides architectural recommendations for backing up Azure ExpressRoute private peering with S2S VPN.
4+
services: networking
5+
author: rambk
6+
7+
ms.service: expressroute
8+
ms.topic: article
9+
ms.date: 02/05/2020
10+
ms.author: rambala
11+
12+
---
13+
14+
# Using S2S VPN as a backup for ExpressRoute private peering
15+
16+
In the article titled [Designing for disaster recovery with ExpressRoute private peering][DR-PP], we discussed the need for a backup connectivity solution for ExpressRoute private peering and how to use geo-redundant ExpressRoute circuits for that purpose. In this article, let us consider how to use and maintain a site-to-site (S2S) VPN as a backup for ExpressRoute private peering.
17+
18+
Unlike geo-redundant ExpressRoute circuits, you can use the ExpressRoute-VPN disaster recovery combination only in active-passive mode. A major challenge of using any backup network connectivity in passive mode is that the passive connection often fails alongside the primary connection. The common reason for such failures is a lack of active maintenance. Therefore, in this article let's focus on how to verify and actively maintain S2S VPN connectivity that is backing up an ExpressRoute private peering.
19+
20+
>[!NOTE]
21+
>When a given route is advertised via both ExpressRoute and VPN, Azure prefers routing over ExpressRoute.
22+
>
23+
24+
In this article, let's see how to verify the connectivity from both the Azure perspective and the customer-side network edge perspective. The ability to validate from either end helps regardless of whether you manage the customer-side network devices that peer with the Microsoft network entities.
25+
26+
## Example topology
27+
28+
In our setup, we have an on-premises network connected to an Azure hub VNet via both an ExpressRoute circuit and an S2S VPN connection. The Azure hub VNet is in turn peered to a spoke VNet, as shown in the diagram below:
29+
30+
[![1]][1]
31+
32+
In the setup, the ExpressRoute circuit is terminated on a pair of "Customer Edge" (CE) routers on-premises. The on-premises LAN is connected to the CE routers via a pair of firewalls that operate in leader-follower mode. The S2S VPN is terminated directly on the firewalls.
33+
34+
The following table lists the key IP prefixes of the topology:
35+
36+
| **Entity** | **Prefix** |
37+
| --- | --- |
38+
| On-premises LAN | 10.1.11.0/25 |
39+
| Azure Hub VNet | 10.17.11.0/25 |
40+
| Azure spoke VNet | 10.17.11.128/26 |
41+
| On-premises test server | 10.1.11.10 |
42+
| Spoke VNet test VM | 10.17.11.132 |
43+
| ExpressRoute primary connection p2p subnet | 192.168.11.16/30 |
44+
| ExpressRoute secondary connection p2p subnet | 192.168.11.20/30 |
45+
| VPN gateway primary BGP peer IP | 10.17.11.76 |
46+
| VPN gateway secondary BGP peer IP | 10.17.11.77 |
47+
| On-premises firewall VPN BGP peer IP | 192.168.11.88 |
48+
| Primary CE router i/f towards firewall IP | 192.168.11.0/31 |
49+
| Firewall i/f towards primary CE router IP | 192.168.11.1/31 |
50+
| Secondary CE router i/f towards firewall IP | 192.168.11.2/31 |
51+
| Firewall i/f towards secondary CE router IP | 192.168.11.3/31 |
52+
53+
54+
The following table lists the ASNs of the topology:
55+
56+
| **Autonomous system** | **ASN** |
57+
| --- | --- |
58+
| On-premises | 65020 |
59+
| Microsoft Enterprise Edge | 12076 |
60+
| Virtual Network GW (ExR) | 65515 |
61+
| Virtual Network GW (VPN) | 65515 |
62+
63+
## High availability without asymmetricity
64+
65+
### Configuring for high availability
66+
67+
[Configure ExpressRoute and Site-to-Site coexisting connections][Conf-CoExist] discusses how to configure coexisting ExpressRoute circuit and S2S VPN connections. As we discussed in [Designing for high availability with ExpressRoute][HA], to improve ExpressRoute high availability, our setup maintains network redundancy (avoids a single point of failure) all the way up to the endpoints. Also, both the primary and secondary connections of the ExpressRoute circuit are configured to operate in active-active mode by advertising the on-premises prefixes the same way through both connections.
68+
69+
The on-premises route advertisement of the primary CE router through the primary connection of the ExpressRoute circuit is shown below (Junos commands):
70+
71+
user@SEA-MX03-01> show route advertising-protocol bgp 192.168.11.18
72+
73+
Cust11.inet.0: 8 destinations, 8 routes (7 active, 0 holddown, 1 hidden)
74+
Prefix Nexthop MED Lclpref AS path
75+
* 10.1.11.0/25 Self I
76+
77+
The on-premises route advertisement of the secondary CE router through the secondary connection of the ExpressRoute circuit is shown below (Junos commands):
78+
79+
user@SEA-MX03-02> show route advertising-protocol bgp 192.168.11.22
80+
81+
Cust11.inet.0: 8 destinations, 8 routes (7 active, 0 holddown, 1 hidden)
82+
Prefix Nexthop MED Lclpref AS path
83+
* 10.1.11.0/25 Self I
84+
85+
To improve the high availability of the backup connection, the S2S VPN is also configured in active-active mode. The Azure VPN gateway configuration is shown below. Note that the VPN configuration also lists the BGP peer IP addresses of the gateway, 10.17.11.76 and 10.17.11.77.
86+
87+
[![2]][2]
88+
89+
The on-premises route is advertised by the firewalls to the primary and secondary BGP peers of the VPN gateway. The route advertisements are shown below (Junos):
90+
91+
user@SEA-SRX42-01> show route advertising-protocol bgp 10.17.11.76
92+
93+
Cust11.inet.0: 14 destinations, 21 routes (14 active, 0 holddown, 0 hidden)
94+
Prefix Nexthop MED Lclpref AS path
95+
* 10.1.11.0/25 Self I
96+
97+
{primary:node0}
98+
user@SEA-SRX42-01> show route advertising-protocol bgp 10.17.11.77
99+
100+
Cust11.inet.0: 14 destinations, 21 routes (14 active, 0 holddown, 0 hidden)
101+
Prefix Nexthop MED Lclpref AS path
102+
* 10.1.11.0/25 Self I
103+
104+
>[!NOTE]
105+
>Configuring the S2S VPN in active-active mode not only provides high availability for your disaster recovery backup network connectivity, but also provides higher throughput for the backup connectivity. In other words, configuring S2S VPN in active-active mode is recommended because it forces the creation of multiple underlying tunnels.
106+
>
107+
108+
### Configuring for symmetric traffic flow
109+
110+
We noted that when a given on-premises route is advertised via both ExpressRoute and S2S VPN, Azure prefers the ExpressRoute path. To force Azure to prefer the S2S VPN path over the coexisting ExpressRoute, you need to advertise more specific routes (a longer prefix, that is, a longer subnet mask) via the VPN connection. Our objective here is to use the VPN connections as a backup only. So, the default path selection behavior of Azure is in line with our objective.
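The effect of advertising more specific routes can be illustrated with a minimal longest-prefix-match sketch. It is not Azure's routing implementation; the `select_route` helper and the two /26 advertisements over VPN are hypothetical, chosen only to show why a longer prefix would win over the /25 advertised via ExpressRoute.

```python
import ipaddress


def select_route(destination, routes):
    """Return the (prefix, path) pair whose prefix is the longest match.

    `routes` is a list of (prefix, path_label) tuples. Only longest-prefix
    matching is modeled; tie-breaking between equal prefixes is not.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, path in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network, path))
    if not matches:
        return None
    network, path = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), path


# Hypothetical advertisements: the /25 over ExpressRoute (as in this setup)
# and two more specific /26 halves over VPN (NOT what this setup does).
routes = [
    ("10.1.11.0/25", "ExpressRoute"),
    ("10.1.11.0/26", "VPN"),
    ("10.1.11.64/26", "VPN"),
]

print(select_route("10.1.11.10", routes))      # the /26 over VPN wins
print(select_route("10.1.11.10", routes[:1]))  # only the /25: ExpressRoute
```

In the setup described in this article, the same /25 is advertised over both paths, so no more specific VPN route exists and the default ExpressRoute preference applies.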
111+
112+
It is our responsibility to ensure that the traffic destined to Azure from on-premises also prefers ExpressRoute path over S2S VPN. The default local preference of the CE routers and firewalls in our on-premises setup is 100. So, by configuring the local preference of the routes received through the ExpressRoute private peerings greater than 100 (say 150), we can make the traffic destined to Azure prefer ExpressRoute circuit in the steady state.
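As a rough illustration of why a local preference of 150 steers traffic to ExpressRoute, the sketch below models only the first steps of BGP best-path selection (highest local preference, then shortest AS path). The `Path` class and `best_path` function are hypothetical names, and real routers apply further tie-breakers that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    via: str
    local_pref: int
    as_path: tuple


def best_path(paths):
    """Pick the path with the highest local preference.

    Ties fall back to the shortest AS path, then to list order.
    This is a simplification of the full BGP decision process.
    """
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))


# Candidate paths towards an Azure prefix, using the values from this setup.
paths = [
    Path("ExpressRoute primary", 150, (12076,)),
    Path("ExpressRoute secondary", 150, (12076,)),
    Path("VPN st0.118", 100, (65515,)),
    Path("VPN st0.119", 100, (65515,)),
]

print(best_path(paths).via)  # an ExpressRoute path in the steady state

# If both ExpressRoute paths are withdrawn, a VPN path becomes the best path.
vpn_only = [p for p in paths if not p.via.startswith("ExpressRoute")]
print(best_path(vpn_only).via)
```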
113+
114+
The BGP configuration of the primary CE router that terminates the primary connection of the ExpressRoute circuit is shown below. Note the value of the local preference of the routes advertised over the iBGP session is configured to be 150. Similarly, we need to ensure the local preference of the secondary CE router that terminates the secondary connection of the ExpressRoute circuit is also configured to be 150.
115+
116+
user@SEA-MX03-01> show configuration routing-instances Cust11
117+
description "Customer 11 VRF";
118+
instance-type virtual-router;
119+
interface xe-0/0/0:0.110;
120+
interface ae0.11;
121+
protocols {
122+
bgp {
123+
group ibgp {
124+
type internal;
125+
local-preference 150;
126+
neighbor 192.168.11.1;
127+
}
128+
group ebgp {
129+
peer-as 12076;
130+
bfd-liveness-detection {
131+
minimum-interval 300;
132+
multiplier 3;
133+
}
134+
neighbor 192.168.11.18;
135+
}
136+
}
137+
}
138+
139+
The routing table of the on-premises firewalls (shown below) confirms that, in the steady state, the preferred path for on-premises traffic destined to Azure is over ExpressRoute.
140+
141+
user@SEA-SRX42-01> show route table Cust11.inet.0 10.17.11.0/24
142+
143+
Cust11.inet.0: 14 destinations, 21 routes (14 active, 0 holddown, 0 hidden)
144+
+ = Active Route, - = Last Active, * = Both
145+
146+
10.17.11.0/25 *[BGP/170] 2d 00:34:04, localpref 150
147+
AS path: 12076 I, validation-state: unverified
148+
> to 192.168.11.0 via reth1.11
149+
to 192.168.11.2 via reth2.11
150+
[BGP/170] 2d 00:34:01, localpref 150
151+
AS path: 12076 I, validation-state: unverified
152+
> to 192.168.11.2 via reth2.11
153+
[BGP/170] 2d 21:12:13, localpref 100, from 10.17.11.76
154+
AS path: 65515 I, validation-state: unverified
155+
> via st0.118
156+
[BGP/170] 2d 00:41:51, localpref 100, from 10.17.11.77
157+
AS path: 65515 I, validation-state: unverified
158+
> via st0.119
159+
10.17.11.76/32 *[Static/5] 2d 21:12:16
160+
> via st0.118
161+
10.17.11.77/32 *[Static/5] 2d 00:41:56
162+
> via st0.119
163+
10.17.11.128/26 *[BGP/170] 2d 00:34:04, localpref 150
164+
AS path: 12076 I, validation-state: unverified
165+
> to 192.168.11.0 via reth1.11
166+
to 192.168.11.2 via reth2.11
167+
[BGP/170] 2d 00:34:01, localpref 150
168+
AS path: 12076 I, validation-state: unverified
169+
> to 192.168.11.2 via reth2.11
170+
[BGP/170] 2d 21:12:13, localpref 100, from 10.17.11.76
171+
AS path: 65515 I, validation-state: unverified
172+
> via st0.118
173+
[BGP/170] 2d 00:41:51, localpref 100, from 10.17.11.77
174+
AS path: 65515 I, validation-state: unverified
175+
> via st0.119
176+
177+
In the above route table, for the hub and spoke VNet routes (10.17.11.0/25 and 10.17.11.128/26), we see that the ExpressRoute circuit is preferred over the VPN connections. The next hops 192.168.11.0 and 192.168.11.2 are the IP addresses of the CE router interfaces towards the firewalls, as listed in the prefix table.
178+
179+
## Validation of route exchange over S2S VPN
180+
181+
Earlier in this article, we verified on-premises route advertisement of the firewalls to the primary and secondary BGP peers of the VPN gateway. Additionally, let's confirm Azure routes received by the firewalls from the primary and secondary BGP peers of the VPN gateway.
182+
183+
user@SEA-SRX42-01> show route receive-protocol bgp 10.17.11.76 table Cust11.inet.0
184+
185+
Cust11.inet.0: 14 destinations, 21 routes (14 active, 0 holddown, 0 hidden)
186+
Prefix Nexthop MED Lclpref AS path
187+
10.17.11.0/25 10.17.11.76 65515 I
188+
10.17.11.128/26 10.17.11.76 65515 I
189+
190+
{primary:node0}
191+
user@SEA-SRX42-01> show route receive-protocol bgp 10.17.11.77 table Cust11.inet.0
192+
193+
Cust11.inet.0: 14 destinations, 21 routes (14 active, 0 holddown, 0 hidden)
194+
Prefix Nexthop MED Lclpref AS path
195+
10.17.11.0/25 10.17.11.77 65515 I
196+
10.17.11.128/26 10.17.11.77 65515 I
197+
198+
Similarly, let's verify the on-premises network route prefixes received by the Azure VPN gateway.
199+
200+
PS C:\Users\user> Get-AzVirtualNetworkGatewayLearnedRoute -ResourceGroupName SEA-Cust11 -VirtualNetworkGatewayName SEA-Cust11-VNet01-gw-vpn | where {$_.Network -eq "10.1.11.0/25"} | select Network, NextHop, AsPath, Weight
201+
202+
Network NextHop AsPath Weight
203+
------- ------- ------ ------
204+
10.1.11.0/25 192.168.11.88 65020 32768
205+
10.1.11.0/25 10.17.11.76 65020 32768
206+
10.1.11.0/25 10.17.11.69 12076-65020 32769
207+
10.1.11.0/25 10.17.11.69 12076-65020 32769
208+
10.1.11.0/25 192.168.11.88 65020 32768
209+
10.1.11.0/25 10.17.11.77 65020 32768
210+
10.1.11.0/25 10.17.11.69 12076-65020 32769
211+
10.1.11.0/25 10.17.11.69 12076-65020 32769
212+
213+
As seen above, the VPN gateway has routes received by both its primary and secondary BGP peers. It also has visibility of the routes received via the primary and secondary ExpressRoute connections (the ones whose AS path begins with 12076). To confirm the routes received via the VPN connections, we need to know the on-premises BGP peer IP of the connections. In our setup, it is 192.168.11.88, and we do see the routes received from it.
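The check described above can be sketched in a few lines. This is only an illustration over rows copied from the learned-route output; the helper names `learned_via_expressroute` and `learned_from_peer` are hypothetical, and the rows are hard-coded rather than fetched from Azure.

```python
MSEE_ASN = "12076"  # Microsoft Enterprise Edge ASN, from the ASN table


def learned_via_expressroute(route):
    """True if the first ASN in the AS path is the MSEE ASN."""
    return route["AsPath"].split("-")[0] == MSEE_ASN


def learned_from_peer(routes, peer_ip):
    """Return the routes whose next hop is the given BGP peer IP."""
    return [r for r in routes if r["NextHop"] == peer_ip]


# Rows copied from the Get-AzVirtualNetworkGatewayLearnedRoute output above.
learned = [
    {"Network": "10.1.11.0/25", "NextHop": "192.168.11.88", "AsPath": "65020"},
    {"Network": "10.1.11.0/25", "NextHop": "10.17.11.76", "AsPath": "65020"},
    {"Network": "10.1.11.0/25", "NextHop": "10.17.11.69", "AsPath": "12076-65020"},
    {"Network": "10.1.11.0/25", "NextHop": "10.17.11.69", "AsPath": "12076-65020"},
    {"Network": "10.1.11.0/25", "NextHop": "192.168.11.88", "AsPath": "65020"},
    {"Network": "10.1.11.0/25", "NextHop": "10.17.11.77", "AsPath": "65020"},
    {"Network": "10.1.11.0/25", "NextHop": "10.17.11.69", "AsPath": "12076-65020"},
    {"Network": "10.1.11.0/25", "NextHop": "10.17.11.69", "AsPath": "12076-65020"},
]

from_firewall = learned_from_peer(learned, "192.168.11.88")
over_expressroute = [r for r in learned if learned_via_expressroute(r)]

print(len(from_firewall), "routes learned from the on-premises VPN peer")
print(len(over_expressroute), "routes learned via ExpressRoute")
```

An empty result for the on-premises peer IP would suggest that the VPN BGP session is not exchanging routes.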
214+
215+
Next, let's verify the routes advertised by the Azure VPN gateway to the on-premises firewall BGP peer (192.168.11.88).
216+
217+
PS C:\Users\user> Get-AzVirtualNetworkGatewayAdvertisedRoute -Peer 192.168.11.88 -ResourceGroupName SEA-Cust11 -VirtualNetworkGatewayName SEA-Cust11-VNet01-gw-vpn | select Network, NextHop, AsPath, Weight
218+
219+
Network NextHop AsPath Weight
220+
------- ------- ------ ------
221+
10.17.11.0/25 10.17.11.76 65515 0
222+
10.17.11.128/26 10.17.11.76 65515 0
223+
10.17.11.0/25 10.17.11.77 65515 0
224+
10.17.11.128/26 10.17.11.77 65515 0
225+
226+
227+
Failure to see route exchanges indicates a connection failure. See [Troubleshooting: An Azure site-to-site VPN connection cannot connect and stops working][VPN Troubleshoot] for help with troubleshooting the VPN connection.
228+
229+
## Testing failover
230+
231+
Now that we have confirmed successful route exchanges over the VPN connection (control plane), we are set to switch traffic (data plane) from the ExpressRoute connectivity to the VPN connectivity.
232+
233+
>[!NOTE]
234+
>In production environments, failover testing must be done during a scheduled network maintenance window because it can disrupt service.
235+
>
236+
237+
Before switching the traffic, let's trace the current path in our setup from the on-premises test server to the test VM in the spoke VNet.
238+
239+
C:\Users\PathLabUser>tracert 10.17.11.132
240+
241+
Tracing route to 10.17.11.132 over a maximum of 30 hops
242+
243+
1 <1 ms <1 ms <1 ms 10.1.11.1
244+
2 <1 ms <1 ms 11 ms 192.168.11.0
245+
3 <1 ms <1 ms <1 ms 192.168.11.18
246+
4 * * * Request timed out.
247+
5 6 ms 6 ms 5 ms 10.17.11.132
248+
249+
Trace complete.
250+
251+
The primary and secondary ExpressRoute point-to-point connection subnets of our setup are, respectively, 192.168.11.16/30 and 192.168.11.20/30. In the above trace route, hop 3 is 192.168.11.18, which is the interface IP of the primary MSEE. The presence of the MSEE interface confirms that, as expected, our current path is over ExpressRoute.
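The same reasoning can be expressed as a small check over a list of traceroute hops. This is a sketch that assumes the hop addresses have already been extracted from the `tracert` output; `path_uses_expressroute` is a hypothetical helper, and timed-out hops are represented as `None`.

```python
import ipaddress

# ExpressRoute point-to-point subnets from the prefix table of this setup.
EXPRESSROUTE_P2P = [
    ipaddress.ip_network("192.168.11.16/30"),
    ipaddress.ip_network("192.168.11.20/30"),
]


def path_uses_expressroute(hops):
    """True if any responding hop falls inside an ExpressRoute p2p subnet."""
    for hop in hops:
        if hop is None:  # hop that timed out ("* * *")
            continue
        address = ipaddress.ip_address(hop)
        if any(address in subnet for subnet in EXPRESSROUTE_P2P):
            return True
    return False


# Hops copied from the steady-state traceroute above.
steady_state = ["10.1.11.1", "192.168.11.0", "192.168.11.18", None, "10.17.11.132"]
print(path_uses_expressroute(steady_state))

# A hop list with no address in those subnets would not match.
no_msee_hop = ["10.1.11.1", None, "10.17.11.132"]
print(path_uses_expressroute(no_msee_hop))
```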
252+
253+
As described in [Reset ExpressRoute circuit peerings][RST], let's use the following PowerShell commands to disable both the primary and secondary peering of the ExpressRoute circuit.
254+
255+
$ckt = Get-AzExpressRouteCircuit -Name "expressroute name" -ResourceGroupName "SEA-Cust11"
256+
$ckt.Peerings[0].State = "Disabled"
257+
Set-AzExpressRouteCircuit -ExpressRouteCircuit $ckt
258+
259+
The failover switch time depends on the BGP convergence time. In our setup, the failover switch takes a few seconds (less than 10). After the switch, repeating the traceroute shows the following path:
260+
261+
C:\Users\PathLabUser>tracert 10.17.11.132
262+
263+
Tracing route to 10.17.11.132 over a maximum of 30 hops
264+
265+
1 <1 ms <1 ms <1 ms 10.1.11.1
266+
2 * * * Request timed out.
267+
3 6 ms 7 ms 9 ms 10.17.11.132
268+
269+
Trace complete.
270+
271+
The traceroute result confirms that the backup connection via S2S VPN is active and can provide service continuity if both the primary and secondary ExpressRoute connections fail. To complete the failover testing, let's re-enable the ExpressRoute connections and normalize the traffic flow, using the following set of commands.
272+
273+
$ckt = Get-AzExpressRouteCircuit -Name "expressroute name" -ResourceGroupName "SEA-Cust11"
274+
$ckt.Peerings[0].State = "Enabled"
275+
Set-AzExpressRouteCircuit -ExpressRouteCircuit $ckt
276+
277+
To confirm the traffic is switched back to ExpressRoute, repeat the traceroute and ensure that it is going through the ExpressRoute private peering.
278+
279+
## Next steps
280+
281+
ExpressRoute is designed for high availability with no single point of failure within the Microsoft network. Still, an ExpressRoute circuit is confined to a single geographical region and to a service provider. S2S VPN can be a good passive backup solution for disaster recovery of an ExpressRoute circuit. For a dependable passive backup connection, regular maintenance of the passive configuration and periodic validation of the connection are important. It is essential not to let the VPN configuration become stale, and to periodically (say, every quarter) repeat the validation and failover test steps described in this article during a maintenance window.
282+
283+
To enable monitoring and alerts based on VPN gateway metrics, see [Set up alerts on VPN Gateway metrics][VPN-alerts].
284+
285+
To expedite BGP convergence following an ExpressRoute failure, [Configure BFD over ExpressRoute][BFD].
286+
287+
<!--Image References-->
288+
[1]: ./media/use-s2s-vpn-as-backup-for-expressroute-privatepeering/topology.png "topology under consideration"
289+
[2]: ./media/use-s2s-vpn-as-backup-for-expressroute-privatepeering/vpn-gw-config.png "VPN GW configuration"
290+
291+
<!--Link References-->
292+
[DR-PP]: https://docs.microsoft.com/azure/expressroute/designing-for-disaster-recovery-with-expressroute-privatepeering
293+
[Conf-CoExist]: https://docs.microsoft.com/azure/expressroute/expressroute-howto-coexist-resource-manager
294+
[HA]: https://docs.microsoft.com/azure/expressroute/designing-for-high-availability-with-expressroute
295+
[VPN Troubleshoot]: https://docs.microsoft.com/azure/vpn-gateway/vpn-gateway-troubleshoot-site-to-site-cannot-connect
296+
[VPN-alerts]: https://docs.microsoft.com/azure/vpn-gateway/vpn-gateway-howto-setup-alerts-virtual-network-gateway-metric
297+
[BFD]: https://docs.microsoft.com/azure/expressroute/expressroute-bfd
298+
[RST]: https://docs.microsoft.com/azure/expressroute/expressroute-howto-reset-peering
299+
300+
301+
