articles/vpn-gateway/troubleshoot-vpn-with-azure-diagnostics.md
39 additions & 31 deletions
@@ -21,29 +21,29 @@ The following logs are available* in Azure:
 |--- | --- |
 |**GatewayDiagnosticLog**| Contains diagnostic logs for gateway configuration events, primary changes, and maintenance events. |
 |**TunnelDiagnosticLog**| Contains tunnel state change events. Tunnel connect/disconnect events have a summarized reason for the state change if applicable. |
-|**RouteDiagnosticLog**| Logs changes to static routes and BGP events that occur on the gateway. |
-|**IKEDiagnosticLog**| Logs IKE control messages and events on the gateway. |
+|**RouteDiagnosticLog**| Logs changes to static routes and BGP (Border Gateway Protocol) events that occur on the gateway. |
+|**IKEDiagnosticLog**| Logs IKE (Internet Key Exchange) control messages and events on the gateway. |
 |**P2SDiagnosticLog**| Logs point-to-site control messages and events on the gateway. |

 *for Policy Based gateways, only GatewayDiagnosticLog and RouteDiagnosticLog are available.

-Notice that there are several columns available in these tables. In this article, we are only presenting the most relevant ones for easier log consumption.
+Notice that there are several columns available in these tables. In this article, we're only presenting the most relevant ones for easier log consumption.

 ## <a name="setup"></a>Set up logging

 Follow this procedure to learn how to set up diagnostic log events from Azure VPN Gateway using Azure Log Analytics:

-1. Create a Log Analytics Workspace using [this article](../azure-monitor/logs/quick-create-workspace.md).
+1. Create a new Log Analytics Workspace using the steps found in [create a Log Analytics Workspace](../azure-monitor/logs/quick-create-workspace.md).

-2. Find your VPN gateway on the Monitor > Diagnostics settings blade.
+2. Locate your VPN gateway on the **Monitor > Diagnostics** settings page.

-:::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step2.png " alt-text="Screenshot of the Diagnostic settings blade." lightbox="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step2.png":::
+:::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step2.png " alt-text="Screenshot of the Diagnostic settings page." lightbox="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step2.png":::

-3. Select the gateway and click on "Add Diagnostic Setting".
+3. Select the VPN gateway and then select **Add Diagnostic Setting**.

 :::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step3.png " alt-text="Screenshot of the Add diagnostic setting interface." lightbox="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step3.png":::

-4. Fill in the diagnostic setting name, select all the log categories and choose the Log Analytics Workspace.
+4. Input the **Diagnostic setting name**, choose all the **Log categories** and select the appropriate **Log Analytics Workspace**.

 :::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step4.png " alt-text="Detailed screenshot of the Add diagnostic setting properties." lightbox="./media/troubleshoot-vpn-with-azure-diagnostics/setup_step4.png":::

@@ -63,25 +63,26 @@ AzureDiagnostics
 | sort by TimeGenerated asc
 ```

-This query on **GatewayDiagnosticLog** will show you multiple columns.
+This query on **GatewayDiagnosticLog** shows you multiple columns.

 |***Name***|***Description***|
 |--- | --- |
 |**TimeGenerated**| the timestamp of each event, in UTC timezone.|
 |**OperationName**|the event that happened. It can be either of *SetGatewayConfiguration, SetConnectionConfiguration, HostMaintenanceEvent, GatewayTenantPrimaryChanged, MigrateCustomerSubscription, GatewayResourceMove, ValidateGatewayConfiguration*.|
 |**Message**| the detail of what operation is happening, and lists successful/failure results.|

-The example below shows the activity logged when a new configuration was applied:
+The following example shows the activity logged when a new configuration was applied:

 :::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/image-26-set-gateway.png" alt-text="Example of a Set Gateway Operation seen in GatewayDiagnosticLog.":::


-Notice that a SetGatewayConfiguration will be logged every time some configuration is modified both on a VPN Gateway or a Local Network Gateway.
-Cross referencing the results from the **GatewayDiagnosticLog** table with those of the **TunnelDiagnosticLog** table can help us determine if a tunnel connectivity failure has started at the same time as a configuration was changed, or a maintenance took place. If so, we have a great pointer towards the possible root cause.
+Notice that a **SetGatewayConfiguration** gets logged every time a configuration is modified both on a VPN Gateway or a Local Network Gateway.
+
+Comparing the results from the **GatewayDiagnosticLog** table with the results of the **TunnelDiagnosticLog** table can help determine if a tunnel connectivity failure happened during a configuration change or maintenance activity. If so, it provides a significant indication towards the potential root cause.
-The **TunnelDiagnosticLog** table is very useful to inspect the historical connectivity statuses of the tunnel.
+The **TunnelDiagnosticLog** table is useful to inspect the historical connectivity statuses of the tunnel.

 Here you have a sample query as reference.

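The comparison between **GatewayDiagnosticLog** and **TunnelDiagnosticLog** described in the hunk above can be sketched offline. The following Python example is only an illustration, assuming the two query results were exported as rows with `TimeGenerated` and `OperationName` fields; the sample rows, the `correlate` helper, and the five-minute window are hypothetical and not part of the article.

```python
from datetime import datetime, timedelta

# Hypothetical exported rows; only TimeGenerated and OperationName are taken from the article.
gateway_events = [
    {"TimeGenerated": datetime(2024, 1, 10, 8, 0, 5), "OperationName": "SetGatewayConfiguration"},
    {"TimeGenerated": datetime(2024, 1, 11, 2, 30, 0), "OperationName": "HostMaintenanceEvent"},
]
tunnel_events = [
    {"TimeGenerated": datetime(2024, 1, 10, 8, 0, 40), "OperationName": "TunnelDisconnected"},
    {"TimeGenerated": datetime(2024, 1, 12, 14, 0, 0), "OperationName": "TunnelDisconnected"},
]


def correlate(gateway_events, tunnel_events, window=timedelta(minutes=5)):
    """Pair each tunnel disconnect with the gateway events logged within `window` of it."""
    matches = []
    for tunnel in tunnel_events:
        if tunnel["OperationName"] != "TunnelDisconnected":
            continue
        nearby = [
            gateway["OperationName"]
            for gateway in gateway_events
            if abs(gateway["TimeGenerated"] - tunnel["TimeGenerated"]) <= window
        ]
        matches.append((tunnel["TimeGenerated"], nearby))
    return matches


results = correlate(gateway_events, tunnel_events)
for timestamp, nearby in results:
    print(timestamp.isoformat(), nearby or "no gateway event nearby")
```

A disconnect with a nearby configuration or maintenance event is a candidate for that root cause; a disconnect with none points elsewhere. The window size is a judgment call and should be tuned to how quickly events are logged in your environment.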
@@ -93,14 +94,14 @@ AzureDiagnostics
 | sort by TimeGenerated asc
 ```

-This query on **TunnelDiagnosticLog** will show you multiple columns.
+This query on **TunnelDiagnosticLog** shows you multiple columns.


 |***Name***|***Description***|
 |--- | --- |
 |**TimeGenerated**| the timestamp of each event, in UTC timezone.|
 |**OperationName**| the event that happened. It can be either *TunnelConnected* or *TunnelDisconnected*.|
-|**remoteIP\_s**| the IP address of the on-premises VPN device. In real world scenarios, it is useful to filter by the IP address of the relevant on-premises device shall there be more than one.|
+|**remoteIP\_s**| the IP address of the on-premises VPN device. In real world scenarios, it's useful to filter by the IP address of the relevant on-premises device should there be more than one.|
 |**Instance\_s**| the gateway role instance that triggered the event. It can be either GatewayTenantWorker\_IN\_0 or GatewayTenantWorker\_IN\_1, which are the names of the two instances of the gateway.|
 |**Resource**| indicates the name of the VPN gateway. |
 |**ResourceGroup**| indicates the resource group where the gateway is.|
@@ -111,14 +112,14 @@ Example output:
 :::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/image-16-tunnel-connected.png" alt-text="Example of a Tunnel Connected Event seen in TunnelDiagnosticLog.":::


-The **TunnelDiagnosticLog** is very useful to troubleshoot past events about unexpected VPN disconnections. Its lightweight nature offers the possibility to analyze large time ranges over several days with little effort.
+The **TunnelDiagnosticLog** is useful to troubleshoot past events about unexpected VPN disconnections. Its lightweight nature offers the possibility to analyze large time ranges over several days with little effort.
 After you identify the timestamp of a disconnection, you can switch to the more detailed analysis of the **IKEDiagnosticLog** table to dig deeper into the reason for the disconnection, should it be IPsec related.


 Some troubleshooting tips:
-- If you see a disconnection event on one gateway instance, followed by a connection event on the **different** gateway instance in a few seconds, you are looking at a gateway failover. This is usually an expected behavior due to maintenance on a gateway instance. To learn more about this behavior, see [About Azure VPN gateway redundancy](./vpn-gateway-highlyavailable.md#activestandby).
-- The same behavior will be observed if you intentionally run a Gateway Reset on the Azure side - which causes a reboot of the active gateway instance. To learn more about this behavior, see [Reset a VPN Gateway](./reset-gateway.md).
-- If you see a disconnection event on one gateway instance, followed by a connection event on the **same** gateway instance in a few seconds, you may be looking at a network glitch causing a DPD timeout, or a disconnection erroneously sent by the on-premises device.
+- If you observe a disconnection event on one gateway instance, followed by a connection event on a different gateway instance within a few seconds, it indicates a gateway failover. Such an event typically arises due to maintenance on a gateway instance. To learn more about this behavior, see [About Azure VPN gateway redundancy](./vpn-gateway-highlyavailable.md#activestandby).
+- The same behavior is observed if you intentionally run a **Gateway Reset** on the Azure side - which causes a reboot of the active gateway instance. To learn more about this behavior, see [Reset a VPN Gateway](./reset-gateway.md).
+- If you see a disconnection event on one gateway instance, followed by a connection event on the **same** gateway instance in a few seconds, you might be looking at a network glitch causing a DPD timeout, or a disconnection erroneously sent by the on-premises device.
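The troubleshooting tips above amount to a small decision rule: a disconnect followed within a few seconds by a connect on a different instance suggests a failover, and on the same instance suggests a network glitch or a disconnection sent by the on-premises device. A minimal Python sketch of that rule follows. It assumes the **TunnelDiagnosticLog** rows for a single tunnel were exported in ascending time order; the `Instance` key, the sample rows, the `classify_reconnects` helper, and the 30-second threshold are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical rows for one tunnel, sorted by TimeGenerated ascending.
events = [
    {"TimeGenerated": datetime(2024, 1, 10, 10, 0, 0), "OperationName": "TunnelDisconnected", "Instance": "GatewayTenantWorker_IN_0"},
    {"TimeGenerated": datetime(2024, 1, 10, 10, 0, 4), "OperationName": "TunnelConnected", "Instance": "GatewayTenantWorker_IN_1"},
    {"TimeGenerated": datetime(2024, 1, 10, 12, 0, 0), "OperationName": "TunnelDisconnected", "Instance": "GatewayTenantWorker_IN_1"},
    {"TimeGenerated": datetime(2024, 1, 10, 12, 0, 3), "OperationName": "TunnelConnected", "Instance": "GatewayTenantWorker_IN_1"},
]


def classify_reconnects(events, max_gap=timedelta(seconds=30)):
    """Label each disconnect that is followed by a connect within `max_gap`."""
    labels = []
    for previous, current in zip(events, events[1:]):
        if previous["OperationName"] != "TunnelDisconnected":
            continue
        if current["OperationName"] != "TunnelConnected":
            continue
        if current["TimeGenerated"] - previous["TimeGenerated"] > max_gap:
            continue
        # A different instance taking over suggests a failover (maintenance or gateway reset).
        if current["Instance"] != previous["Instance"]:
            kind = "failover"
        else:
            kind = "same-instance reconnect"
        labels.append((previous["TimeGenerated"], kind))
    return labels


labels = classify_reconnects(events)
for timestamp, kind in labels:
    print(timestamp.isoformat(), kind)
```

When a gateway terminates several tunnels, filter the rows by the remote IP of the relevant on-premises device first, otherwise events from different tunnels could be paired by mistake.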
-This query on **RouteDiagnosticLog** will show you multiple columns.
+This query on **RouteDiagnosticLog** shows you multiple columns.

 |***Name***|***Description***|
 |--- | --- |
 |**TimeGenerated**| the timestamp of each event, in UTC timezone.|
 |**OperationName**| the event that happened. Can be either of *StaticRouteUpdate, BgpRouteUpdate, BgpConnectedEvent, BgpDisconnectedEvent*.|
 |**Message**| the detail of what operation is happening.|

-The output will show useful information about BGP peers connected/disconnected and routes exchanged.
+The output shows useful information about BGP peers connected/disconnected and routes exchanged.

 Example:

@@ -150,7 +151,7 @@ Example:

 ## <a name="IKEDiagnosticLog"></a>IKEDiagnosticLog

-The **IKEDiagnosticLog** table offers verbose debug logging for IKE/IPsec. This is very useful to review when troubleshooting disconnections, or failure to connect VPN scenarios.
+The **IKEDiagnosticLog** table offers verbose debug logging for IKE/IPsec. This is useful to review when troubleshooting disconnections, or failure to connect VPN scenarios.

 Here you have a sample query as reference.

@@ -164,24 +165,24 @@ AzureDiagnostics
 | sort by TimeGenerated asc
 ```

-This query on **IKEDiagnosticLog** will show you multiple columns.
+This query on **IKEDiagnosticLog** shows you multiple columns.


 |***Name***|***Description***|
 |--- | --- |
 |**TimeGenerated**| the timestamp of each event, in UTC timezone.|
-|**RemoteIP**| the IP address of the on-premises VPN device. In real world scenarios, it is useful to filter by the IP address of the relevant on-premises device shall there be more than one. |
-|**LocalIP**| the IP address of the VPN Gateway we are troubleshooting. In real world scenarios, it is useful to filter by the IP address of the relevant VPN gateway shall there be more than one in your subscription. |
-|**Event**| contains a diagnostic message useful for troubleshooting. They usually start with a keyword and refer to the actions performed by the Azure Gateway: **\[SEND\]** indicates an event caused by an IPSec packet sent by the Azure Gateway. **\[RECEIVED\]** indicates an event in consequence of a packet received from on-premises device.**\[LOCAL\]** indicates an action taken locally by the Azure Gateway. |
+|**RemoteIP**| the IP address of the on-premises VPN device. In real world scenarios, it's useful to filter by the IP address of the relevant on-premises device should there be more than one. |
+|**LocalIP**| the IP address of the VPN Gateway we're troubleshooting. In real world scenarios, it's useful to filter by the IP address of the relevant VPN gateway should there be more than one in your subscription. |
+|**Event**| contains a diagnostic message useful for troubleshooting. They usually start with a keyword and refer to the actions performed by the Azure Gateway: **\[SEND\]** indicates an event caused by an IPsec packet sent by the Azure Gateway. **\[RECEIVED\]** indicates an event in consequence of a packet received from the on-premises device. **\[LOCAL\]** indicates an action taken locally by the Azure Gateway. |


-Notice how RemoteIP, LocalIP, and Event columns are not present in the original column list on AzureDiagnostics database, but are added to the query by parsing the output of the "Message" column to simplify its analysis.
+Notice how RemoteIP, LocalIP, and Event columns aren't present in the original column list on AzureDiagnostics database, but are added to the query by parsing the output of the "Message" column to simplify its analysis.

 Troubleshooting tips:

 - In order to identify the start of an IPsec negotiation, you need to find the initial SA\_INIT message. Such a message could be sent by either side of the tunnel. Whoever sends the first packet is called "initiator" in IPsec terminology, while the other side becomes the "responder". The first SA\_INIT message is always the one where rCookie = 0.

-- If the IPsec tunnel fails to establish, Azure will keep retrying every few seconds. For this reason, troubleshooting "VPN down" issues is very convenient on IKEdiagnosticLog because you do not have to wait for a specific time to reproduce the issue. Also, the failure will in theory always be the same every time we try so you could just zoom into one "sample" failing negotiation at any time.
+- If the IPsec tunnel fails to establish, Azure keeps retrying every few seconds. For this reason, troubleshooting "VPN down" issues is convenient on IKEDiagnosticLog because you don't have to wait for a specific time to reproduce the issue. Also, the failure will in theory always be the same every time we try so you could just zoom into one "sample" failing negotiation at any time.

 - The SA\_INIT contains the IPsec parameters that the peer wants to use for this IPsec negotiation.
 The official document
@@ -208,13 +209,20 @@ This query on **P2SDiagnosticLog** will show you multiple columns.
 |**OperationName**| the event that happened. Will be *P2SLogEvent*.|
 |**Message**| the detail of what operation is happening.|

-The output will show all of the Point to Site settings that the gateway has applied, as well as the IPsec policies in place.
+The output shows all of the Point to Site settings that the gateway has applied, and the IPsec policies in place.

 :::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/image-28-p2s-log-event.png" alt-text="Example of Point to Site connection seen in P2SDiagnosticLog.":::

-Also, whenever a client will connect via IKEv2 or OpenVPN Point to Site, the table will log packet activity, EAP/RADIUS conversations and successful/failure results by user.
+Additionally, when a client establishes a connection using OpenVPN and Microsoft Entra ID authentication for point-to-site, the table records packet activity as follows:
 :::image type="content" source="./media/troubleshoot-vpn-with-azure-diagnostics/image-29-eap.png" alt-text="Example of EAP authentication seen in P2SDiagnosticLog.":::
+> [!NOTE]
+> In the point-to-site log, the username is partially obscured. The first octet of the client user IP is substituted with a `0`.
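Because the note says the first octet of the client IP is replaced with `0`, a known client address has to be masked the same way before it can be compared with a logged value. The Python sketch below illustrates this under stated assumptions: the logged address is a plain dotted IPv4 string, the sample addresses are documentation addresses, and both helper names are hypothetical.

```python
def mask_first_octet(ip):
    """Return the IPv4 address with its first octet replaced by 0, as the note describes."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not a dotted IPv4 address: {ip!r}")
    return ".".join(["0"] + octets[1:])


def matches_logged_ip(known_client_ip, logged_ip):
    """Compare a known client IP with a masked IP taken from the point-to-site log."""
    return mask_first_octet(known_client_ip) == logged_ip


print(mask_first_octet("203.0.113.25"))
print(matches_logged_ip("203.0.113.25", "0.0.113.25"))
```

Two clients whose addresses differ only in the first octet can't be told apart this way, so treat a match as a hint rather than proof.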