
Commit f740b27

Merge pull request #110612 from erichrt/patch-8
Make individual metric examples expandable
2 parents dc800f4 + b9deb65


articles/load-balancer/load-balancer-standard-diagnostics.md

Lines changed: 20 additions & 6 deletions
```diff
@@ -81,6 +81,7 @@ To configure alerts:
 ### <a name = "DiagnosticScenarios"></a>Common diagnostic scenarios and recommended views
 
 #### Is the data path up and available for my load balancer VIP?
+<details><summary>Expand</summary>
 
 The VIP availability metric describes the health of the data path within the region to the compute host where your VMs are located. The metric is a reflection of the health of the Azure infrastructure. You can use the metric to:
 - Monitor the external availability of your service
```
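
The scenario this hunk wraps can also be checked programmatically. Below is a minimal sketch using the `azure-monitor-query` Python SDK; the resource ID is a placeholder, and `VipAvailability` is assumed to be the Azure Monitor name for the Data Path Availability metric (verify with `az monitor metrics list-definitions --resource <lb-id>`).

```python
# Minimal sketch: Data Path Availability for a Standard Load Balancer,
# averaged over 5-minute bins for the last hour (Average is the
# recommended aggregation for this metric).
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

# Placeholder resource ID -- substitute your own.
LB_ID = (
    "/subscriptions/<sub>/resourceGroups/<rg>"
    "/providers/Microsoft.Network/loadBalancers/<lb>"
)

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    LB_ID,
    metric_names=["VipAvailability"],  # assumed metric name
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.AVERAGE],
)
for metric in result.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(point.timestamp, point.average)
```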
```diff
@@ -108,9 +109,11 @@ VIP availability fails for the following reasons:
 For diagnostic purposes, you can use the [Data Path Availability metric together with the health probe status](#vipavailabilityandhealthprobes).
 
 Use **Average** as the aggregation for most scenarios.
+</details>
 
 #### Are the back-end instances for my VIP responding to probes?
-
+<details>
+<summary>Expand</summary>
 The health probe status metric describes the health of your application deployment, as determined by the health probe that you configure for your load balancer. The load balancer uses the status of the health probe to determine where to send new flows. Health probes originate from an Azure infrastructure address and are visible within the guest OS of the VM.
 
 To get the health probe status for your Standard Load Balancer resources:
```
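
A similar sketch for the health probe scenario, flagging intervals where probe health drops below 100 percent; `DipAvailability` is assumed to be the metric name behind the health probe status.

```python
# Minimal sketch: flag bins where health probe status (DIP availability)
# dips below 100%, i.e. some backend instances are failing their probes.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

LB_ID = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/loadBalancers/<lb>"  # placeholder

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    LB_ID,
    metric_names=["DipAvailability"],  # assumed metric name
    timespan=timedelta(hours=24),
    granularity=timedelta(minutes=15),
    aggregations=[MetricAggregationType.AVERAGE],
)
for point in result.metrics[0].timeseries[0].data:
    if point.average is not None and point.average < 100:
        print(f"{point.timestamp}: probe health {point.average:.1f}%")
```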
```diff
@@ -122,9 +125,11 @@ Health probes fail for the following reasons:
 - Your probe is not permitted by the Network Security Group, the VM's guest OS firewall, or the application layer filters.
 
 Use **Average** as the aggregation for most scenarios.
+</details>
 
 #### How do I check my outbound connection statistics?
-
+<details>
+<summary>Expand</summary>
 The SNAT connections metric describes the volume of successful and failed connections for [outbound flows](https://aka.ms/lboutbound).
 
 A failed connections volume of greater than zero indicates SNAT port exhaustion. You must investigate further to determine what may be causing these failures. SNAT port exhaustion manifests as a failure to establish an [outbound flow](https://aka.ms/lboutbound). Review the article about outbound connections to understand the scenarios and mechanisms at work, and to learn how to mitigate and design to avoid SNAT port exhaustion.
```
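
A sketch for pulling failed SNAT connection counts; both the `SnatConnectionCount` metric name and the `ConnectionState` dimension used in the filter are assumptions to verify against your metric definitions.

```python
# Minimal sketch: total failed SNAT connections per 5-minute bin; a
# sustained nonzero count points at SNAT port exhaustion.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

LB_ID = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/loadBalancers/<lb>"  # placeholder

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    LB_ID,
    metric_names=["SnatConnectionCount"],  # assumed metric name
    timespan=timedelta(hours=6),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.TOTAL],
    filter="ConnectionState eq 'failed'",  # assumed dimension and value
)
for point in result.metrics[0].timeseries[0].data:
    if point.total:
        print(f"{point.timestamp}: {point.total:.0f} failed connections")
```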
```diff
@@ -136,10 +141,12 @@ To get SNAT connection statistics:
 ![SNAT connection](./media/load-balancer-standard-diagnostics/LBMetrics-SNATConnection.png)
 
 *Figure: Load Balancer SNAT connection count*
+</details>
 
 
 #### How do I check my SNAT port usage and allocation?
-
+<details>
+<summary>Expand</summary>
 The SNAT Usage metric indicates how many unique flows are established between an internet source and a backend VM or virtual machine scale set that is behind a load balancer and does not have a public IP address. By comparing this with the SNAT Allocation metric, you can determine if your service is experiencing or at risk of SNAT exhaustion and resulting outbound flow failure.
 
 If your metrics indicate risk of [outbound flow](https://aka.ms/lboutbound) failure, reference the article and take steps to mitigate this to ensure service health.
```
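
A sketch that compares the two metrics this section describes; `UsedSnatPorts` and `AllocatedSnatPorts` are assumed metric names.

```python
# Minimal sketch: percentage of allocated SNAT ports in use, per bin.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

LB_ID = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/loadBalancers/<lb>"  # placeholder

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    LB_ID,
    metric_names=["UsedSnatPorts", "AllocatedSnatPorts"],  # assumed names
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.AVERAGE],
)
by_name = {m.name: m for m in result.metrics}
used = by_name["UsedSnatPorts"].timeseries[0].data
allocated = by_name["AllocatedSnatPorts"].timeseries[0].data
for u, a in zip(used, allocated):
    if u.average and a.average:
        print(f"{u.timestamp}: {100 * u.average / a.average:.0f}% of ports used")
```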
```diff
@@ -161,20 +168,24 @@ To view SNAT port usage and allocation:
 ![SNAT usage by backend instance](./media/load-balancer-standard-diagnostics/snat-usage-split.png)
 
 *Figure: TCP SNAT port usage per backend instance*
+</details>
 
 #### How do I check inbound/outbound connection attempts for my service?
-
+<details>
+<summary>Expand</summary>
 The SYN packets metric describes the volume of TCP SYN packets that have arrived or were sent (for [outbound flows](https://aka.ms/lboutbound)) and that are associated with a specific front end. You can use this metric to understand TCP connection attempts to your service.
 
 Use **Total** as the aggregation for most scenarios.
 
 ![SYN connection](./media/load-balancer-standard-diagnostics/LBMetrics-SYNCount.png)
 
 *Figure: Load Balancer SYN count*
+</details>
 
 
 #### How do I check my network bandwidth consumption?
-
+<details>
+<summary>Expand</summary>
 The bytes and packet counters metric describes the volume of bytes and packets that are sent or received by your service on a per-front-end basis.
 
 Use **Total** as the aggregation for most scenarios.
```
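
A sketch for the SYN packets scenario, using **Total** as the aggregation as recommended above; `SYNCount` is an assumed metric name.

```python
# Minimal sketch: TCP SYN packets per bin (connection attempts), using
# Total as the aggregation.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

LB_ID = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/loadBalancers/<lb>"  # placeholder

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    LB_ID,
    metric_names=["SYNCount"],  # assumed metric name
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=1),
    aggregations=[MetricAggregationType.TOTAL],
)
for point in result.metrics[0].timeseries[0].data:
    print(point.timestamp, point.total)
```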
```diff
@@ -188,9 +199,11 @@ To get byte or packet count statistics:
 ![Byte count](./media/load-balancer-standard-diagnostics/LBMetrics-ByteCount.png)
 
 *Figure: Load Balancer byte count*
+</details>
 
 #### <a name = "vipavailabilityandhealthprobes"></a>How do I diagnose my load balancer deployment?
-
+<details>
+<summary>Expand</summary>
 By using a combination of the VIP availability and health probe metrics on a single chart, you can identify where to look for the problem and resolve it. You can gain assurance that Azure is working correctly and use this knowledge to conclusively determine that the configuration or application is the root cause.
 
 You can use health probe metrics to understand how Azure views the health of your deployment as per the configuration you have provided. Looking at health probes is always a great first step in monitoring or determining a cause.
```
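
A sketch for the bandwidth scenario; `ByteCount` is an assumed metric name, and `PacketCount` can be queried the same way.

```python
# Minimal sketch: bytes processed per bin, summed (Total aggregation);
# swap in "PacketCount" for packets.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

LB_ID = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/loadBalancers/<lb>"  # placeholder

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    LB_ID,
    metric_names=["ByteCount"],  # assumed metric name
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.TOTAL],
)
for point in result.metrics[0].timeseries[0].data:
    if point.total is not None:
        print(f"{point.timestamp}: {point.total / 1e6:.1f} MB")
```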
```diff
@@ -206,6 +219,7 @@ The chart displays the following information:
 - The health probe status (DIP availability), indicated by the purple trace, is at 0 percent at the beginning of the chart. The circled area in green highlights where the health probe status (DIP availability) became healthy, and at which point the customer's deployment was able to accept new flows.
 
 The chart allows customers to troubleshoot the deployment on their own without having to guess or ask support whether other issues are occurring. The service was unavailable because health probes were failing due to either a misconfiguration or a failed application.
+</details>
 
 ## <a name = "ResourceHealth"></a>Resource health status
 
```
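
Finally, a sketch that mirrors the combined chart this section describes, pulling both availability metrics in one query (metric names assumed as before). If the data path is healthy while probe health is low, the root cause is likely the configuration or application rather than the Azure infrastructure, which is exactly the reasoning the section walks through.

```python
# Minimal sketch: data path availability vs. health probe status in one
# query, mirroring the single-chart diagnostic view.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

LB_ID = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/loadBalancers/<lb>"  # placeholder

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    LB_ID,
    metric_names=["VipAvailability", "DipAvailability"],  # assumed names
    timespan=timedelta(hours=6),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.AVERAGE],
)
by_name = {m.name: m for m in result.metrics}
vip = by_name["VipAvailability"].timeseries[0].data
dip = by_name["DipAvailability"].timeseries[0].data
for v, d in zip(vip, dip):
    print(f"{v.timestamp}: VIP {v.average} / DIP {d.average}")
```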
