
Commit 7c2bb0f

Fix formatting and update metadata date.
Co-authored-by: Rene van den Bedem <[email protected]>
Co-authored-by: Ricky Perez <[email protected]>
1 parent ac0dae3 commit 7c2bb0f

File tree

1 file changed: +58 −60 lines changed

articles/azure-vmware/azure-vmware-solution-nsx-scale-and-performance-recommendations-for-vmware-hcx.md

Lines changed: 58 additions & 60 deletions
@@ -3,7 +3,7 @@ title: NSX Scale and Performance Recommendations for VMware HCX
description: Learn about the default NSX Topology in Azure VMware Solution and recommended practices to mitigate performance issues around HCX migration use cases.
ms.topic: how-to
ms.service: azure-vmware
ms.date: 12/19/2024
ms.custom: engagement-fy25
---

@@ -15,99 +15,97 @@ In this article, learn about the default NSX topology in Azure VMware Solution,

The Azure VMware Solution NSX default topology has the following configuration:

* Three-node NSX Manager cluster.

* NSX Edge and Gateway for north-bound traffic:

  * Two Large Form Factor NSX Edges, deployed in an NSX Edge cluster.

  * A default NSX Tier-0 Gateway in Active/Active mode.

  * A default NSX Tier-1 Gateway in Active/Standby mode.

* A default HCX-UPLINK segment connected to the default Tier-1 Gateway.

Customers typically host their application workloads by creating new NSX segments and attaching them to the default Tier-1 Gateway. Additionally, customers with an HCX migration use case use the default HCX-uplink segment, which is also connected to the default Tier-1 Gateway.

The default NSX topology for Azure VMware Solution, where all traffic exits through the default Tier-1 Gateway, may not be optimal for some customer traffic flows and throughput requirements.

### Potential Challenge

Here are some potential challenges and the recommended configurations to optimize NSX Edge data path resources.

* All north-bound network traffic (migrations, L2 extensions, and VM traffic outbound of Azure VMware Solution) uses the default Tier-1 Gateway, which is in Active/Standby mode.

* In the default Active/Standby mode, the Tier-1 Gateway only uses the Active Edge VM for all north-bound traffic.

* The second, standby Edge VM is not used for north-bound traffic.

* Depending on the throughput requirements and flows, this could create a bottleneck on the Active Edge VM.

### Recommended Practices

It is possible to change the NSX north-bound network connectivity to distribute traffic evenly across both Edge VMs. Creating additional Tier-1 Gateways and distributing the NSX segments across them spreads traffic evenly across the Edge VMs. For an HCX migration use case, the recommendation is to move HCX Layer 2 (L2) Extension and migration traffic to a newly created Tier-1 Gateway so that it uses NSX Edge resources optimally.

To make the Active Edge for a given Tier-1 Gateway predictable, create the additional Tier-1 Gateway with the High Availability (HA) Mode set to Active/Standby and the Failover mode set to Preemptive. This configuration allows you to select a different active Edge VM than the one in use by the default Tier-1 Gateway, which naturally splits north-bound traffic across multiple Tier-1 Gateways so that both NSX Edges are optimally utilized, avoiding the potential bottleneck of the default NSX topology.

:::image type="content" source="media/nsxt/default-nsx-topology.png" alt-text="Diagram showing the default NSX topology in Azure VMware Solution." border="false" lightbox="media/nsxt/default-nsx-topology.png":::

### NSX Edge performance characteristics

Each NSX Edge virtual machine (Edge VM) can support up to approximately 20 Gbps, depending on the number of flows, packet size, and services enabled on the NSX gateways. Each Large form factor Edge VM has four Data Plane Development Kit (DPDK) enabled CPU cores, and each DPDK core can process up to ~5 Gbps of traffic, depending on flow hashing, packet size, and services enabled on the NSX gateway. For more information on NSX Edge performance, see the VMware NSX-T Reference Design Guide, section 8.6.2.
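As a rough illustration of the arithmetic behind these figures, the per-Edge ceiling is simply the DPDK core count multiplied by the ~5 Gbps per-core estimate quoted above. The helper function below is a hypothetical sketch of that back-of-the-envelope calculation, not an NSX tool; real throughput varies with flow hashing, packet size, and enabled services.

```python
# Rough NSX Edge throughput estimate based on the figures above:
# each DPDK core handles up to ~5 Gbps, so an Edge VM's ceiling is
# approximately cores * 5 Gbps. Actual results vary with flow
# hashing, packet size, and services enabled on the gateway.

def edge_throughput_gbps(dpdk_cores: int, gbps_per_core: float = 5.0) -> float:
    """Upper-bound estimate of one Edge VM's north-bound throughput."""
    return dpdk_cores * gbps_per_core

large = edge_throughput_gbps(4)    # Large form factor: 4 DPDK cores
xlarge = edge_throughput_gbps(8)   # X-Large form factor: 8 DPDK cores

print(f"Large Edge VM:   ~{large:.0f} Gbps")   # ~20 Gbps
print(f"X-Large Edge VM: ~{xlarge:.0f} Gbps")  # ~40 Gbps
```

By the same arithmetic, a single heavy flow pinned to one DPDK core is capped near ~5 Gbps regardless of form factor, which is why scale-up alone may not resolve a bottleneck caused by one large flow.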

## Monitor, Identify, and Fix potential Edge data path Performance Bottlenecks

Using the built-in NSX alarm framework is recommended to monitor and identify key NSX Edge performance metrics.

### How to Monitor and Identify NSX Edge Data Path Resource Constraints

The following critical NSX Edge alarms identify NSX Edge data path resource constraints:

1. Edge NIC Out of Transmit/Receive buffer.

2. Edge Datapath CPU high.

3. Edge Datapath NIC throughput high.

:::image type="content" source="media/nsxt/nsx-edge-critical-alerts.png" alt-text="Diagram showing NSX Edge health critical alerts." border="false" lightbox="media/nsxt/nsx-edge-critical-alerts.png":::
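If you export the alarm list from NSX Manager (for example, via its alarms API), a small script can flag just the three critical Edge data path alarms listed above. The sketch below uses illustrative sample records and field names (`name`, `status`, `node`), not the exact NSX alarm payload schema.

```python
# Filter an exported alarm list down to the critical NSX Edge data path
# alarms named above. Records and field names here are illustrative
# samples, not the literal NSX alarm schema.

EDGE_DATAPATH_ALARMS = {
    "Edge NIC Out of Transmit/Receive buffer",
    "Edge Datapath CPU high",
    "Edge Datapath NIC throughput high",
}

def edge_datapath_alarms(alarms: list) -> list:
    """Return only open alarms that indicate an Edge data path constraint."""
    return [
        a for a in alarms
        if a["name"] in EDGE_DATAPATH_ALARMS and a["status"] == "OPEN"
    ]

sample = [
    {"name": "Edge Datapath CPU high", "status": "OPEN", "node": "edge-01"},
    {"name": "Certificate expiring", "status": "OPEN", "node": "mgr-01"},
    {"name": "Edge Datapath NIC throughput high", "status": "RESOLVED", "node": "edge-02"},
]

for alarm in edge_datapath_alarms(sample):
    print(f"{alarm['node']}: {alarm['name']}")  # edge-01: Edge Datapath CPU high
```

Filtering on open alarms only avoids re-triaging constraints that were already resolved, for example after a failover or a topology change.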

## How to fix the NSX Edge resource constraints

To validate the issue, check the historic and real-time traffic throughput at the time of the alarm to confirm the correlation.

:::image type="content" source="media/nsxt/nsx-edge-performance-charts.png" alt-text="Diagram showing NSX Edge VM performance charts." border="false" lightbox="media/nsxt/nsx-edge-performance-charts.png":::

To mitigate the issue, here are a few options to consider.

Mitigation options:

1. Edge Scale-UP: Scaling the NSX Edges up from the Large (four DPDK CPU cores) to the X-Large (eight DPDK CPU cores) form factor could resolve part of the issue.

    * Edge Scale-UP provides additional CPU and memory for data path packet processing.

    * Edge Scale-UP may not help if you have one or more heavy flows, for example, HCX Network Extension (NE) to Network Extension (NE) traffic, because such a flow could be pinned to a single DPDK CPU core.

2. Tier-1 Gateway Topology Change: Change the Azure VMware Solution NSX default Tier-1 Gateway topology to use multiple Tier-1 Gateways, splitting the traffic across multiple Edge VMs.

    * More details are in the next section, with an example of the HCX migration use case.

3. Edge Scale-OUT: If you have a large number of hosts and workloads in the SDDC, NSX Edge Scale-OUT (from two Edges to four Edges) could be an option to add NSX Edge data path resources.

    * However, NSX Edge Scale-OUT is effective only with a change in the NSX default Tier-1 Gateway topology to distribute the traffic optimally across all four Edge VMs. More details are in the next section, with an example of the HCX migration use case.

### Default settings and configuration recommendations for NSX Edge data path performance

Here are a few configuration recommendations to mitigate NSX Edge VM performance challenges.

1. By default, Edge VMs are part of the Azure VMware Solution management resource pool on vCenter Server. All appliances in the management resource pool have dedicated compute resources assigned.

2. By default, Edge VMs are hosted on different hosts, with anti-affinity rules applied to avoid placing multiple heavy packet processing workloads on the same host.

3. Disable the Tier-1 Gateway Firewall if it is not required, to gain packet processing capacity. (By default, the Tier-1 Gateway Firewall is enabled.)

4. Verify that NSX Edge VMs and HCX Network Extension (NE) appliances are on separate hosts, to avoid placing multiple heavy packet processing workloads on the same host.

5. For the HCX migration use case, verify that the HCX Network Extension (NE) and HCX Interconnect (IX) appliances have CPU reserved. Reserving CPU allows HCX to process the HCX migration traffic optimally. (By default, these appliances have no CPU reservations.)

@@ -123,7 +121,7 @@ Given the nature of HCX use case traffic pattern and default Azure VMware Soluti
In general, creating additional Tier-1 Gateways and distributing segments across them helps to mitigate a potential NSX Edge data path bottleneck. The steps outlined show how to create a new Tier-1 Gateway and move an HCX uplink segment to it, which allows you to separate HCX traffic from workload VM traffic.

:::image type="content" source="media/nsxt/nsx-traffic-flow-additional-tier-1-gateway.png" alt-text="Diagram showing NSX traffic flow in Azure VMware Solution with an additional Tier-1 gateway." border="false" lightbox="media/nsxt/nsx-traffic-flow-additional-tier-1-gateway.png":::

### Detailed Steps (Mitigate Edge VM bottleneck)

@@ -133,34 +131,34 @@ The creation of an additional Tier-1 Gateway can help mitigate potential Edge VM

Distributed Only Option:

1. No Edge Cluster can be selected.

2. All connected Segments and Service Ports must be advertised.

3. No stateful services are available in the Distributed Only option.

:::image type="content" source="media/nsxt/nsx-tier-1-gateway-distributed-only.png" alt-text="Diagram showing NSX Tier-1 gateway distributed only option." border="false" lightbox="media/nsxt/nsx-tier-1-gateway-distributed-only.png":::

>[!IMPORTANT]
>In Distributed Only High Availability (HA) Mode, traffic is distributed across all Edge VMs. Workload traffic and migration traffic may traverse the Active Edge at the same time.

Active/Standby Option:

1. Select the **Edge Cluster**.

2. For **Auto Allocate Edges**, select **No**.

3. Select the **Edge VM** that is not currently active as the preferred option.

4. For the **Fail Over** setting, select **Preemptive**. This setting ensures that traffic always fails back to the preferred Edge VM selected in step 3.

5. Select **All Connected Segments and Service Ports** to be advertised.

6. Select **Save**.

An Active/Standby configuration with a preferred Edge VM defined allows you to force traffic to the Edge VM that is not the Active Edge on the default Tier-1 Gateway. If the Edge cluster is scaled out to four Edges, creating the new Tier-1 Gateway and selecting Edge VM 03 and Edge VM 04 may be a better option to isolate HCX traffic completely.

:::image type="content" source="media/nsxt/nsx-tier-1-gateway-active-standby.png" alt-text="Diagram showing NSX Tier-1 gateway active standby option." border="false" lightbox="media/nsxt/nsx-tier-1-gateway-active-standby.png":::

>[!NOTE]
>Microsoft recommends the Active/Standby HA Mode when additional Tier-1 Gateways are created. This allows customers to separate workload and migration traffic across different Edge VMs.
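The Active/Standby selections above amount to a small piece of desired-state configuration. The sketch below captures those choices as plain data with a consistency check; the gateway name, Edge VM name, and key names are illustrative placeholders, not the literal NSX Policy API schema.

```python
# Desired settings for the additional Tier-1 Gateway, as chosen in the
# steps above. All names and keys are illustrative, not the NSX Policy
# API schema.

new_tier1 = {
    "display_name": "Tier-1-HCX",      # hypothetical gateway name
    "ha_mode": "ACTIVE_STANDBY",
    "failover_mode": "PREEMPTIVE",     # traffic fails back to the preferred Edge
    "auto_allocate_edges": False,
    "preferred_edge": "edge-vm-02",    # the Edge NOT active on the default Tier-1
    "advertise": ["CONNECTED_SEGMENTS", "SERVICE_PORTS"],
}

def validate(cfg: dict) -> None:
    """Check that the choices above are internally consistent."""
    # Preemptive failover only makes the Active Edge predictable when a
    # specific Edge VM is chosen, so auto-allocation must be off.
    assert cfg["ha_mode"] == "ACTIVE_STANDBY"
    assert cfg["failover_mode"] == "PREEMPTIVE"
    assert not cfg["auto_allocate_edges"] and cfg["preferred_edge"]

validate(new_tier1)
print("Tier-1 configuration is consistent")
```

Recording the intended topology this way makes it easy to review which Edge VM each Tier-1 Gateway should prefer before applying the change in the NSX UI.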
@@ -174,42 +172,42 @@ Select the newly created Tier-1 Gateway when creating your new NSX Segment.
>[!NOTE]
>When creating a new NSX Segment, customers can utilize the Azure VMware Solution reserved IP space. For example, a new segment can be created with an IP range of 10.18.75.129/26, assuming the IP space 10.18.72.0/22 was used to create the Azure VMware Solution Private Cloud.

:::image type="content" source="media/nsxt/nsx-segment-creation.png" alt-text="Diagram showing the creation of an NSX segment." border="false" lightbox="media/nsxt/nsx-segment-creation.png":::
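The addressing in the note above can be sanity-checked with Python's standard `ipaddress` module: the segment gateway 10.18.75.129/26 implies the subnet 10.18.75.128/26, which must fall inside the /22 block used for the private cloud.

```python
import ipaddress

# The /22 block used to create the private cloud (from the note above).
private_cloud = ipaddress.ip_network("10.18.72.0/22")

# Segment gateway 10.18.75.129/26 -> the segment's subnet is 10.18.75.128/26.
segment = ipaddress.ip_interface("10.18.75.129/26").network

print(segment)                           # 10.18.75.128/26
print(segment.subnet_of(private_cloud))  # True: the segment fits in the /22
print(segment.num_addresses - 2)         # 62 usable host addresses
```

Running this kind of check before creating a segment avoids overlap mistakes when carving additional uplink or workload segments out of the reserved range.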

## Create an HCX Network Profile

For detailed steps on how to create an HCX network profile, see [HCX Network Profile](configure-vmware-hcx.md#create-network-profiles).

1. Navigate to the HCX portal, select **Interconnect**, and then select **Network Profile**.

2. Select **Create Network Profile**.

3. Select **NSX Network**, and choose the newly created **HCX Uplink segment**.

4. Add the desired **IP Pool range**.

5. (Optional) Select **HCX Uplink** as the HCX Traffic Type.

6. Select **Create**.

:::image type="content" source="media/hcx/hcx-uplink-network-profile.png" alt-text="Diagram showing the creation of an HCX network profile." border="false" lightbox="media/nsxt/hcx-uplink-network-profile.png":::

Once the new HCX Uplink Network Profile is created, update the existing Service Mesh and edit the default uplink profile with the newly created Network Profile.

:::image type="content" source="media/hcx/hcx-service-mesh-edit.png" alt-text="Diagram showing how to edit an existing HCX service mesh." border="false" lightbox="media/nsxt/hcx-service-mesh-edit.png":::

7. Select the existing **Service Mesh** and select **Edit**.

8. Edit the default Uplink with the newly created Network Profile.

9. Select **Service Mesh Change**.

:::image type="content" source="media/hcx/hcx-in-service-mode.png" alt-text="Diagram showing how to edit In-Service Mode on an HCX Network Extension appliance." border="false" lightbox="media/nsxt/hcx-in-service-mode.png":::

>[!NOTE]
>In-Service Mode of the HCX Network Extension appliances should be considered to reduce downtime during this Service Mesh edit.

10. Select **Finish**.

>[!IMPORTANT]
>Downtime varies depending on the Service Mesh change created. It is recommended to allocate 5 minutes of downtime for these changes to take effect.
