
Commit bcc37fc

authored
Merge pull request #114713 from spelluru/ehubtroubleshoot0511
Event Hubs - Troubleshooting
2 parents fec0bc6 + eb4818c commit bcc37fc

3 files changed

+138
-70
lines changed

articles/event-hubs/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -212,6 +212,8 @@
212212
- name: Troubleshoot
213213
items:
214214
- name: Troubleshooting guide
215+
href: troubleshooting-guide.md
216+
- name: .NET exceptions
215217
href: event-hubs-messaging-exceptions.md
216218
- name: Resource Manager exceptions
217219
href: resource-manager-exceptions.md

articles/event-hubs/event-hubs-messaging-exceptions.md

Lines changed: 56 additions & 70 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Troubleshooting guide - Azure Event Hubs | Microsoft Docs
2+
title: Azure Event Hubs - exceptions
33
description: This article provides a list of Azure Event Hubs messaging exceptions and suggested actions.
44
services: event-hubs
55
documentationcenter: na
@@ -17,22 +17,49 @@ ms.author: shvija
1717

1818
---
1919

20-
# Troubleshooting guide for Azure Event Hubs
21-
This article provides some of the .NET exceptions generated by Event Hubs .NET Framework APIs and also other tips for troubleshooting issues.
22-
23-
## Event Hubs messaging exceptions - .NET
20+
# Event Hubs messaging exceptions - .NET
2421
This section lists the .NET exceptions generated by .NET Framework APIs.
2522

26-
### Exception categories
27-
28-
The Event Hubs .NET APIs generate exceptions that can fall into the following categories, along with the associated action you can take to try to fix them.
29-
30-
1. User coding error: [System.ArgumentException](https://msdn.microsoft.com/library/system.argumentexception.aspx), [System.InvalidOperationException](https://msdn.microsoft.com/library/system.invalidoperationexception.aspx), [System.OperationCanceledException](https://msdn.microsoft.com/library/system.operationcanceledexception.aspx), [System.Runtime.Serialization.SerializationException](https://msdn.microsoft.com/library/system.runtime.serialization.serializationexception.aspx). General action: try to fix the code before proceeding.
31-
2. Setup/configuration error: [Microsoft.ServiceBus.Messaging.MessagingEntityNotFoundException](/dotnet/api/microsoft.servicebus.messaging.messagingentitynotfoundexception), [Microsoft.Azure.EventHubs.MessagingEntityNotFoundException](/dotnet/api/microsoft.azure.eventhubs.messagingentitynotfoundexception), [System.UnauthorizedAccessException](https://msdn.microsoft.com/library/system.unauthorizedaccessexception.aspx). General action: review your configuration and change if necessary.
32-
3. Transient exceptions: [Microsoft.ServiceBus.Messaging.MessagingException](/dotnet/api/microsoft.servicebus.messaging.messagingexception), [Microsoft.ServiceBus.Messaging.ServerBusyException](#serverbusyexception), [Microsoft.Azure.EventHubs.ServerBusyException](#serverbusyexception), [Microsoft.ServiceBus.Messaging.MessagingCommunicationException](/dotnet/api/microsoft.servicebus.messaging.messagingcommunicationexception). General action: retry the operation or notify users.
33-
4. Other exceptions: [System.Transactions.TransactionException](https://msdn.microsoft.com/library/system.transactions.transactionexception.aspx), [System.TimeoutException](#timeoutexception), [Microsoft.ServiceBus.Messaging.MessageLockLostException](/dotnet/api/microsoft.servicebus.messaging.messagelocklostexception), [Microsoft.ServiceBus.Messaging.SessionLockLostException](/dotnet/api/microsoft.servicebus.messaging.sessionlocklostexception). General action: specific to the exception type; refer to the table in the following section.
34-
35-
### Exception types
23+
## Exception categories
24+
25+
The Event Hubs .NET APIs generate exceptions that can fall into the following categories, along with the associated action you can take to try to fix them:
26+
27+
- User coding error:
28+
29+
- [System.ArgumentException](https://msdn.microsoft.com/library/system.argumentexception.aspx)
30+
- [System.InvalidOperationException](https://msdn.microsoft.com/library/system.invalidoperationexception.aspx)
31+
- [System.OperationCanceledException](https://msdn.microsoft.com/library/system.operationcanceledexception.aspx)
32+
- [System.Runtime.Serialization.SerializationException](https://msdn.microsoft.com/library/system.runtime.serialization.serializationexception.aspx)
33+
34+
General action: Try to fix the code before proceeding.
35+
36+
- Setup/configuration error:
37+
38+
- [Microsoft.ServiceBus.Messaging.MessagingEntityNotFoundException](/dotnet/api/microsoft.servicebus.messaging.messagingentitynotfoundexception)
39+
- [Microsoft.Azure.EventHubs.MessagingEntityNotFoundException](/dotnet/api/microsoft.azure.eventhubs.messagingentitynotfoundexception)
40+
- [System.UnauthorizedAccessException](https://msdn.microsoft.com/library/system.unauthorizedaccessexception.aspx)
41+
42+
General action: Review your configuration and change if necessary.
43+
44+
- Transient exceptions:
45+
46+
- [Microsoft.ServiceBus.Messaging.MessagingException](/dotnet/api/microsoft.servicebus.messaging.messagingexception)
47+
- [Microsoft.ServiceBus.Messaging.ServerBusyException](#serverbusyexception)
48+
- [Microsoft.Azure.EventHubs.ServerBusyException](#serverbusyexception)
49+
- [Microsoft.ServiceBus.Messaging.MessagingCommunicationException](/dotnet/api/microsoft.servicebus.messaging.messagingcommunicationexception)
50+
51+
General action: Retry the operation or notify users.
52+
53+
- Other exceptions:
54+
55+
- [System.Transactions.TransactionException](https://msdn.microsoft.com/library/system.transactions.transactionexception.aspx)
56+
- [System.TimeoutException](#timeoutexception)
57+
- [Microsoft.ServiceBus.Messaging.MessageLockLostException](/dotnet/api/microsoft.servicebus.messaging.messagelocklostexception)
58+
- [Microsoft.ServiceBus.Messaging.SessionLockLostException](/dotnet/api/microsoft.servicebus.messaging.sessionlocklostexception)
59+
60+
General action: Specific to the exception type; refer to the table in the following section.
61+
62+
## Exception types
3663
The following table lists messaging exception types, their causes, and suggested actions you can take.
3764

3865
| Exception Type | Description/Cause/Examples | Suggested Action | Note on automatic/immediate retry |
@@ -51,93 +78,52 @@ The following table lists messaging exception types, and their causes, and notes
5178
| [MessagingEntityDisabledException](/dotnet/api/microsoft.servicebus.messaging.messagingentitydisabledexception) | Request for a runtime operation on a disabled entity. |Activate the entity. | Retry might help if the entity has been activated in the interim. |
5279
| [Microsoft.ServiceBus.Messaging MessageSizeExceededException](/dotnet/api/microsoft.servicebus.messaging.messagesizeexceededexception) <br /><br/> [Microsoft.Azure.EventHubs MessageSizeExceededException](/dotnet/api/microsoft.azure.eventhubs.messagesizeexceededexception) | A message payload exceeds the 1-MB limit. This 1-MB limit is for the total message, which can include system properties and any .NET overhead. | Reduce the size of the message payload, then retry the operation. |Retry will not help. |
5380

54-
### QuotaExceededException
81+
## QuotaExceededException
5582
[QuotaExceededException](/dotnet/api/microsoft.servicebus.messaging.quotaexceededexception) indicates that a quota for a specific entity has been exceeded.
5683

5784
This exception can happen if the maximum number of receivers (5) has already been opened on a per-consumer group level.
5885

59-
#### Event Hubs
86+
### Event Hubs
6087
Event Hubs has a limit of 20 consumer groups per Event Hub. When you attempt to create more, you receive a [QuotaExceededException](/dotnet/api/microsoft.servicebus.messaging.quotaexceededexception).
6188

62-
### TimeoutException
89+
## TimeoutException
6390
A [TimeoutException](https://msdn.microsoft.com/library/system.timeoutexception.aspx) indicates that a user-initiated operation is taking longer than the operation timeout.
6491

6592
For Event Hubs, the timeout is specified either as part of the connection string, or through [ServiceBusConnectionStringBuilder](/dotnet/api/microsoft.servicebus.servicebusconnectionstringbuilder). The error message itself might vary, but it always contains the timeout value specified for the current operation.
6693

67-
#### Common causes
94+
### Common causes
6895
There are two common causes for this error: incorrect configuration, or a transient service error.
6996

70-
1. **Incorrect configuration**
97+
- **Incorrect configuration**
7198
The operation timeout might be too small for the operational condition. The default value for the operation timeout in the client SDK is 60 seconds. Check to see if your code has the value set to something too small. The condition of the network and CPU usage can affect the time it takes for a particular operation to complete, so the operation timeout should not be set to a small value.
72-
2. **Transient service error**
99+
- **Transient service error**
73100
Sometimes the Event Hubs service can experience delays in processing requests; for example, during periods of high traffic. In such cases, you can retry your operation after a delay, until the operation is successful. If the same operation still fails after multiple attempts, visit the [Azure service status site](https://azure.microsoft.com/status/) to see if there are any known service outages.
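
The "retry after a delay" advice can be sketched as follows. This is a minimal illustration in Python, not the SDK's built-in retry policy: `TransientServiceError`, `retry_with_backoff`, and the delay values are hypothetical names and defaults chosen for the example, and exponential backoff with jitter is one common choice rather than something the article prescribes.

```python
import random
import time


class TransientServiceError(Exception):
    """Hypothetical stand-in for a transient error such as ServerBusyException."""


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientServiceError:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            # Double the delay on each attempt, capped at max_delay,
            # and add up to 10% jitter so clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter exists only so the waiting can be replaced in tests. If the final attempt still fails, the exception propagates, which is the point at which checking the service status site becomes relevant.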
74101

75-
### ServerBusyException
102+
## ServerBusyException
76103

77104
A [Microsoft.ServiceBus.Messaging.ServerBusyException](/dotnet/api/microsoft.servicebus.messaging.serverbusyexception) or [Microsoft.Azure.EventHubs.ServerBusyException](/dotnet/api/microsoft.azure.eventhubs.serverbusyexception) indicates that a server is overloaded. There are two relevant error codes for this exception.
78105

79-
#### Error code 50002
80-
106+
### Error code 50002
81107
This error can occur for one of two reasons:
82108

83-
1. The load isn't evenly distributed across all partitions on the event hub, and one partition hits the local throughput unit limitation.
109+
- The load isn't evenly distributed across all partitions on the event hub, and one partition hits the local throughput unit limitation.
84110

85-
Resolution: Revising the partition distribution strategy or trying [EventHubClient.Send(eventDataWithOutPartitionKey)](/dotnet/api/microsoft.servicebus.messaging.eventhubclient) might help.
111+
**Resolution**: Revising the partition distribution strategy or trying [EventHubClient.Send(eventDataWithOutPartitionKey)](/dotnet/api/microsoft.servicebus.messaging.eventhubclient) might help.
86112

87-
2. The Event Hubs namespace doesn't have sufficient throughput units (you can check the **Metrics** screen in the Event Hubs namespace window in the [Azure portal](https://portal.azure.com) to confirm). The portal shows aggregated (1 minute) information, but we measure the throughput in real time – so it's only an estimate.
113+
- The Event Hubs namespace doesn't have sufficient throughput units (you can check the **Metrics** screen in the Event Hubs namespace window in the [Azure portal](https://portal.azure.com) to confirm). The portal shows aggregated (1 minute) information, but we measure the throughput in real time – so it's only an estimate.
88114

89-
Resolution: Increasing the throughput units on the namespace can help. You can do this operation on the portal, in the **Scale** window of the Event Hubs namespace screen. Or, you can use [Auto-inflate](event-hubs-auto-inflate.md).
115+
**Resolution**: Increasing the throughput units on the namespace can help. You can do this operation on the portal, in the **Scale** window of the Event Hubs namespace screen. Or, you can use [Auto-inflate](event-hubs-auto-inflate.md).
90116

91-
#### Error code 50001
117+
### Error code 50001
92118

93119
This error should rarely occur. It happens when the container running code for your namespace is low on CPU – not more than a few seconds before the Event Hubs load balancer begins.
94120

95-
#### Limit on calls to the GetRuntimeInformation method
96-
Azure Event Hubs supports up to 50 calls per second to the GetRuntimeInfo per second. You may receive an exception similar to the following one once the limit is reached:
121+
**Resolution**: Limit calls to the GetRuntimeInformation method. Azure Event Hubs supports up to 50 calls per second to GetRuntimeInfo. You may receive an exception similar to the following one once the limit is reached:
97122

98123
```
99124
ExceptionId: 00000000000-00000-0000-a48a-9c908fbe84f6-ServerBusyException: The request was terminated because the namespace 75248:aaa-default-eventhub-ns-prodb2b is being throttled. Error code : 50001. Please wait 10 seconds and try again.
100125
```
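
A client that logs or reacts to this message might extract the error code and the suggested wait time. The following Python sketch assumes the message keeps the wording shown in the sample above, which is not a documented contract. `parse_throttle_hint` is a hypothetical helper, not part of any Event Hubs SDK.

```python
import re

# Assumption: the text matches the sample message,
# "... Error code : 50001. Please wait 10 seconds and try again."
_THROTTLE_PATTERN = re.compile(
    r"Error code\s*:\s*(?P<code>\d+)\.\s*Please wait (?P<seconds>\d+) seconds"
)


def parse_throttle_hint(message):
    """Return (error_code, wait_seconds), or None if the message does not match."""
    match = _THROTTLE_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group("code")), int(match.group("seconds"))
```

Because the wording may change, a caller should treat a `None` result as "no hint available" and fall back to its normal retry delay.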
101126

102-
## Connectivity, certificate, or timeout issues
103-
The following steps may help you with troubleshooting connectivity/certificate/timeout issues for all services under *.servicebus.windows.net.
104-
105-
- Browse to or [wget](https://www.gnu.org/software/wget/) `https://<yournamespacename>.servicebus.windows.net/`. It helps with checking whether you have IP filtering or virtual network or certificate chain issues (most common when using java SDK).
106-
107-
An example of successful message:
108-
109-
```xml
110-
<feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Publicly Listed Services</title><subtitle type="text">This is the list of publicly-listed services currently available.</subtitle><id>uuid:27fcd1e2-3a99-44b1-8f1e-3e92b52f0171;id=30</id><updated>2019-12-27T13:11:47Z</updated><generator>Service Bus 1.1</generator></feed>
111-
```
112-
113-
An example of failure error message:
114-
115-
```json
116-
<Error>
117-
<Code>400</Code>
118-
<Detail>
119-
Bad Request. To know more visit https://aka.ms/sbResourceMgrExceptions. . TrackingId:b786d4d1-cbaf-47a8-a3d1-be689cda2a98_G22, SystemTracker:NoSystemTracker, Timestamp:2019-12-27T13:12:40
120-
</Detail>
121-
</Error>
122-
```
123-
- Run the following command to check if any port is blocked on the firewall. Ports used are 443 (HTTPS), 5671 (AMQP) and 9093 (Kafka). Depending on the library you use, other ports are also used. Here is the sample command that check whether the 5671 port is blocked.
124-
125-
```powershell
126-
tnc <yournamespacename>.servicebus.windows.net -port 5671
127-
```
128-
129-
On Linux:
130-
131-
```shell
132-
telnet <yournamespacename>.servicebus.windows.net 5671
133-
```
134-
- When there are intermittent connectivity issues, run the following command to check if there are any dropped packets. This command will try to establish 25 different TCP connections every 1 second with the service. Then, you can check how many of them succeeded/failed and also see TCP connection latency. You can download the `psping` tool from [here](/sysinternals/downloads/psping).
135-
136-
```shell
137-
.\psping.exe -n 25 -i 1 -q <yournamespacename>.servicebus.windows.net:5671 -nobanner
138-
```
139-
You can use equivalent commands if you're using other tools such as `tnc`, `ping`, and so on.
140-
- Obtain a network trace if the previous steps don't help and analyze it using tools such as [Wireshark](https://www.wireshark.org/). Contact [Microsoft Support](https://support.microsoft.com/) if needed.
141127

142128
## Next steps
143129

Lines changed: 80 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,80 @@
1+
---
2+
title: Troubleshooting guide - Azure Event Hubs | Microsoft Docs
3+
description: This article provides a list of Azure Event Hubs messaging exceptions and suggested actions.
4+
services: event-hubs
5+
documentationcenter: na
6+
author: ShubhaVijayasarathy
7+
manager: timlt
8+
9+
ms.service: event-hubs
10+
ms.devlang: na
11+
ms.topic: article
12+
ms.tgt_pltfrm: na
13+
ms.workload: na
14+
ms.custom: seodec18
15+
ms.date: 01/16/2020
16+
ms.author: shvija
17+
18+
---
19+
20+
# Azure Event Hubs - Troubleshooting guide
21+
This article provides troubleshooting tips and recommendations for a few issues that you may see when using Azure Event Hubs.
22+
23+
## Connectivity, certificate, or timeout issues
24+
The following steps may help you with troubleshooting connectivity/certificate/timeout issues for all services under *.servicebus.windows.net.
25+
26+
- Browse to or [wget](https://www.gnu.org/software/wget/) `https://<yournamespacename>.servicebus.windows.net/`. This helps you check whether you have IP filtering, virtual network, or certificate chain issues (most common when using the Java SDK).
27+
28+
An example of a successful message:
29+
30+
```xml
31+
<feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Publicly Listed Services</title><subtitle type="text">This is the list of publicly-listed services currently available.</subtitle><id>uuid:27fcd1e2-3a99-44b1-8f1e-3e92b52f0171;id=30</id><updated>2019-12-27T13:11:47Z</updated><generator>Service Bus 1.1</generator></feed>
32+
```
33+
34+
An example of a failure error message:
35+
36+
```xml
37+
<Error>
38+
<Code>400</Code>
39+
<Detail>
40+
Bad Request. To know more visit https://aka.ms/sbResourceMgrExceptions. . TrackingId:b786d4d1-cbaf-47a8-a3d1-be689cda2a98_G22, SystemTracker:NoSystemTracker, Timestamp:2019-12-27T13:12:40
41+
</Detail>
42+
</Error>
43+
```
44+
- Run the following command to check whether any port is blocked by the firewall. The ports used are 443 (HTTPS), 5671 (AMQP), and 9093 (Kafka). Depending on the library you use, other ports are also used. Here is a sample command that checks whether port 5671 is blocked.
45+
46+
```powershell
47+
tnc <yournamespacename>.servicebus.windows.net -port 5671
48+
```
49+
50+
On Linux:
51+
52+
```shell
53+
telnet <yournamespacename>.servicebus.windows.net 5671
54+
```
55+
- When there are intermittent connectivity issues, run the following command to check whether any packets are dropped. This command tries to establish 25 TCP connections with the service, one every second. You can then check how many of them succeeded or failed and see the TCP connection latency. You can download the `psping` tool from the [PsPing page](/sysinternals/downloads/psping).
56+
57+
```shell
58+
.\psping.exe -n 25 -i 1 -q <yournamespacename>.servicebus.windows.net:5671 -nobanner
59+
```
60+
You can use equivalent commands if you're using other tools such as `tnc`, `ping`, and so on.
61+
- Obtain a network trace if the previous steps don't help and analyze it using tools such as [Wireshark](https://www.wireshark.org/). Contact [Microsoft Support](https://support.microsoft.com/) if needed.
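
Where `tnc`, `telnet`, or `psping` are not available, the port check in the steps above can be approximated with a short script. This is a sketch in Python under stated assumptions: `check_port` and `check_ports` are hypothetical helper names, the 5-second timeout is arbitrary, and a successful TCP connect only shows that the port is reachable, not that TLS or authentication will succeed.

```python
import socket
import time


def check_port(host, port, timeout=5.0):
    """Return the TCP connect latency in seconds, or None if the connect fails."""
    start = time.monotonic()
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return None


def check_ports(host, ports=(443, 5671, 9093), timeout=5.0):
    """Check the HTTPS, AMQP, and Kafka ports listed above for one host."""
    return {port: check_port(host, port, timeout) for port in ports}
```

Calling `check_ports` with your namespace host name (the `<yournamespacename>.servicebus.windows.net` placeholder used above) returns a dictionary keyed by port. A `None` value suggests that the port is blocked or the host is unreachable.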
62+
63+
## Issues that may occur with service upgrades/restarts
64+
Backend service upgrades and restarts may affect your applications in the following ways:
65+
66+
- Requests may be momentarily throttled.
67+
- There may be a drop in incoming messages/requests.
68+
- The log file may contain error messages.
69+
- The applications may be disconnected from the service for a few seconds.
70+
71+
If the application code uses the SDK, the retry policy is already built in and active. The application reconnects without significant impact to the application or workflow.
72+
73+
74+
## Next steps
75+
76+
You can learn more about Event Hubs by visiting the following links:
77+
78+
* [Event Hubs overview](event-hubs-what-is-event-hubs.md)
79+
* [Create an Event Hub](event-hubs-create.md)
80+
* [Event Hubs FAQ](event-hubs-faq.md)
