description: This article provides an overview of the Capture feature that allows you to capture events streaming through Azure Event Hubs.
ms.topic: article
-ms.date: 02/16/2021
+ms.date: 05/31/2022
---
# Capture events through Azure Event Hubs in Azure Blob Storage or Azure Data Lake Storage
@@ -27,6 +27,9 @@ Event Hubs Capture enables you to specify your own Azure Blob storage account an
Captured data is written in [Apache Avro][Apache Avro] format: a compact, fast, binary format that provides rich data structures with inline schema. This format is widely used in the Hadoop ecosystem, Stream Analytics, and Azure Data Factory. More information about working with Avro is available later in this article.
+> [!NOTE]
+> When you use the no-code editor in the Azure portal, you can capture streaming data in Event Hubs in an Azure Data Lake Storage Gen2 account in the **Parquet** format. For more information, see [How to: capture data from Event Hubs in Parquet format](../stream-analytics/capture-event-hub-data-parquet.md?toc=%2Fazure%2Fevent-hubs%2Ftoc.json) and [Tutorial: capture Event Hubs data in Parquet format and analyze with Azure Synapse Analytics](../stream-analytics/event-hubs-parquet-capture-tutorial.md?toc=%2Fazure%2Fevent-hubs%2Ftoc.json).
+
### Capture windowing
Event Hubs Capture enables you to set up a window to control capturing. This window is a minimum size and time configuration with a "first wins policy," meaning that the first trigger encountered causes a capture operation. If you have a fifteen-minute, 100 MB capture window and send 1 MB per second, the size window triggers before the time window. Each partition captures independently and writes a completed block blob at the time of capture, named for the time at which the capture interval was encountered. The storage naming convention is as follows:
@@ -41,13 +44,13 @@ The date values are padded with zeroes; an example filename might be:
-In the event that your Azure storage blob is temporarily unavailable, Event Hubs Capture will retain your data for the data retention period configured on your event hub and back fill the data once your storage account is available again.
+If your Azure storage blob is temporarily unavailable, Event Hubs Capture will retain your data for the data retention period configured on your event hub and back fill the data once your storage account is available again.
### Scaling throughput units or processing units
In the standard tier of Event Hubs, the traffic is controlled by [throughput units](event-hubs-scalability.md#throughput-units) and in the premium tier Event Hubs, it's controlled by [processing units](event-hubs-scalability.md#processing-units). Event Hubs Capture copies data directly from the internal Event Hubs storage, bypassing throughput unit or processing unit egress quotas and saving your egress for other processing readers, such as Stream Analytics or Spark.
-Once configured, Event Hubs Capture runs automatically when you send your first event, and continues running. To make it easier for your downstream processing to know that the process is working, Event Hubs writes empty files when there is no data. This process provides a predictable cadence and marker that can feed your batch processors.
+Once configured, Event Hubs Capture runs automatically when you send your first event, and continues running. To make it easier for your downstream processing to know that the process is working, Event Hubs writes empty files when there's no data. This process provides a predictable cadence and marker that can feed your batch processors.
## Setting up Event Hubs Capture
@@ -124,7 +127,7 @@ Apache Avro has complete Getting Started guides for [Java][Java] and [Python][Py
Event Hubs Capture is metered similarly to [throughput units](event-hubs-scalability.md#throughput-units) (standard tier) or [processing units](event-hubs-scalability.md#processing-units) (in premium tier): as an hourly charge. The charge is directly proportional to the number of throughput units or processing units purchased for the namespace. As throughput units or processing units are increased and decreased, Event Hubs Capture meters increase and decrease to provide matching performance. The meters occur in tandem. For pricing details, see [Event Hubs pricing](https://azure.microsoft.com/pricing/details/event-hubs/).
-Capture does not consume egress quota as it is billed separately.
+Capture doesn't consume egress quota as it is billed separately.
articles/service-bus-messaging/duplicate-detection.md
3 additions & 3 deletions
@@ -2,12 +2,12 @@
title: Azure Service Bus duplicate message detection | Microsoft Docs
description: This article explains how you can detect duplicates in Azure Service Bus messages. The duplicate message can be ignored and dropped.
ms.topic: article
-ms.date: 04/19/2021
+ms.date: 05/31/2022
---
# Duplicate detection
-If an application fails due to a fatal error immediately after it sends a message, and the restarted application instance erroneously believes that the prior message delivery did not occur, a subsequent send causes the same message to appear in the system twice.
+If an application fails due to a fatal error immediately after it sends a message, and the restarted application instance erroneously believes that the prior message delivery didn't occur, a subsequent send causes the same message to appear in the system twice.
It's also possible for an error at the client or network level to occur a moment earlier, and for a sent message to be committed into the queue, with the acknowledgment not successfully returned to the client. This scenario leaves the client in doubt about the outcome of the send operation.
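As a rough illustration of the in-doubt scenario, the following sketch retries a send while reusing one message ID, so that a broker with duplicate detection enabled can drop the repeat. The `InDoubtSender` class, the `sendAsync` delegate, and the use of `TimeoutException` as the in-doubt signal are assumptions made for this example; they aren't part of the Service Bus client API.

```csharp
using System;
using System.Threading.Tasks;

public static class InDoubtSender
{
    // Retries a send whose outcome is in doubt, always reusing the same message ID.
    // `sendAsync` is a hypothetical delegate standing in for a real client's send call.
    // Returns the number of attempts that were made.
    public static async Task<int> SendWithStableIdAsync(
        string messageId, Func<string, Task> sendAsync, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await sendAsync(messageId);
                return attempt;
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // Outcome unknown: the message may or may not have been committed.
                // Resending with the same ID lets the broker discard a duplicate.
            }
        }
    }
}
```

The key point is that the ID is chosen once, before the first attempt, and isn't regenerated on retry. After the last allowed attempt, the exception propagates to the caller.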
@@ -44,7 +44,7 @@ Keeping the window small means that fewer message-ids must be retained and match
## Next steps
You can enable duplicate message detection using Azure portal, PowerShell, CLI, Resource Manager template, .NET, Java, Python, and JavaScript. For more information, see [Enable duplicate message detection](enable-duplicate-detection.md).
-In scenarios where client code is unable to resubmit a message with the same *MessageId* as before, it is important to design messages that can be safely reprocessed. This [blog post about idempotence](https://particular.net/blog/what-does-idempotent-mean) describes various techniques for how to do that.
+In scenarios where client code is unable to resubmit a message with the same *MessageId* as before, it's important to design messages that can be safely reprocessed. This [blog post about idempotence](https://particular.net/blog/what-does-idempotent-mean) describes various techniques for how to do that.
Try the samples in the language of your choice to explore Azure Service Bus features.
articles/service-bus-messaging/message-transfers-locks-settlement.md
12 additions & 12 deletions
@@ -2,7 +2,7 @@
title: Azure Service Bus message transfers, locks, and settlement
description: This article provides an overview of Azure Service Bus message transfers, locks, and settlement operations.
ms.topic: article
-ms.date: 04/12/2021
+ms.date: 05/31/2022
ms.devlang: csharp
ms.custom: devx-track-csharp
---
@@ -21,7 +21,7 @@ Using any of the supported Service Bus API clients, send operations into Service
If the message is rejected by Service Bus, the rejection contains an error indicator and text with a **tracking-id** in it. The rejection also includes information about whether the operation can be retried with any expectation of success. In the client, this information is turned into an exception and raised to the caller of the send operation. If the message has been accepted, the operation silently completes.
-When using the AMQP protocol, which is the exclusive protocol for the .NET Standard, Java, JavaScript, Python, and Go clients, and [an option for the .NET Framework client](service-bus-amqp-dotnet.md), message transfers and settlements are pipelined and asynchronous. We recommend that you use the asynchronous programming model API variants.
+When you use the AMQP protocol, which is the exclusive protocol for the .NET Standard, Java, JavaScript, Python, and Go clients, and [an option for the .NET Framework client](service-bus-amqp-dotnet.md), message transfers and settlements are pipelined and asynchronous. We recommend that you use the asynchronous programming model API variants.
A sender can put several messages on the wire in rapid succession without having to wait for each message to be acknowledged, as would otherwise be the case with the SBMP protocol or with HTTP 1.1. Those asynchronous send operations complete as the respective messages are accepted and stored. On partitioned entities, or when send operations to different entities overlap, the completions might also occur out of the original send order.
@@ -41,7 +41,7 @@ for (int i = 0; i < 100; i++)
If the application starts the 10 asynchronous send operations in immediate succession and awaits their respective completion separately, the round-trip time for those 10 send operations overlaps. The 10 messages are transferred in immediate succession, potentially even sharing TCP frames, and the overall transfer duration largely depends on the network-related time it takes to get the messages transferred to the broker.
-Making the same assumptions as for the prior loop, the total overlapped execution time for the following loop might stay well under one second:
+With the same assumptions as for the prior loop, the total overlapped execution time for the following loop might stay well under one second:
```csharp
var tasks = new List<Task>();
@@ -52,9 +52,9 @@ for (int i = 0; i < 100; i++)
await Task.WhenAll(tasks);
```
-It is important to note that all asynchronous programming models use some form of memory-based, hidden work queue that holds pending operations. When the send API returns, the send task is queued up in that work queue but the protocol gesture only commences once it is the task's turn to run. For code that tends to push bursts of messages and where reliability is a concern, care should be taken that not too many messages are put "in flight" at once, because all sent messages take up memory until they have factually been put onto the wire.
+It's important to note that all asynchronous programming models use some form of memory-based, hidden work queue that holds pending operations. When the send API returns, the send task is queued up in that work queue but the protocol gesture only commences once it's the task's turn to run. For code that tends to push bursts of messages and where reliability is a concern, care should be taken that not too many messages are put "in flight" at once, because all sent messages take up memory until they have factually been put onto the wire.

-Semaphores, as shown in the following code snippet in C#, are synchronization objects that enable such application-level throttling when needed. This use of a semaphore allows for at most 10 messages to be in flight at once. One of the 10 available semaphore locks is taken before the send and it is released as the send completes. The 11th pass through the loop waits until at least one of the prior sends has completed, and then makes its lock available:
+Semaphores, as shown in the following code snippet in C#, are synchronization objects that enable such application-level throttling when needed. This use of a semaphore allows for at most 10 messages to be in flight at once. One of the 10 available semaphore locks is taken before the send and it's released as the send completes. The 11th pass through the loop waits until at least one of the prior sends has completed, and then makes its lock available:
```csharp
var semaphore = new SemaphoreSlim(10);
@@ -79,7 +79,7 @@ for (int i = 0; i < 100; i++)
}
```
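
The snippet above is shown only in part. A self-contained sketch of the same throttling pattern might look like the following. `ThrottledSender`, `FakeSendAsync`, and the peak counter are hypothetical names added for illustration; a real application would pass its own send call instead of the simulated delay.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledSender
{
    // Hypothetical stand-in for a real asynchronous send; it only simulates latency.
    public static Task FakeSendAsync(int messageNumber) => Task.Delay(10);

    // Sends `count` messages while keeping at most `maxInFlight` sends pending.
    // Returns the highest number of sends observed in flight at once.
    public static async Task<int> SendAllAsync(
        int count, int maxInFlight, Func<int, Task> sendAsync)
    {
        using var semaphore = new SemaphoreSlim(maxInFlight);
        var tasks = new List<Task>();
        var gate = new object();
        int inFlight = 0, peak = 0;

        for (int i = 0; i < count; i++)
        {
            // Waits asynchronously once maxInFlight sends are pending.
            await semaphore.WaitAsync();
            lock (gate) { inFlight++; peak = Math.Max(peak, inFlight); }
            tasks.Add(SendOneAsync(i));
        }

        await Task.WhenAll(tasks);
        return peak;

        async Task SendOneAsync(int n)
        {
            try { await sendAsync(n); }
            finally
            {
                // Release the slot whether the send succeeded or failed.
                lock (gate) { inFlight--; }
                semaphore.Release();
            }
        }
    }
}
```

Releasing the semaphore in a `finally` block keeps a failed send from permanently consuming one of the slots. The returned peak is only there to make the limit observable; it never exceeds `maxInFlight`.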
-With a low-level AMQP client, Service Bus also accepts "pre-settled" transfers. A pre-settled transfer is a fire-and-forget operation for which the outcome, either way, is not reported back to the client and the message is considered settled when sent. The lack of feedback to the client also means that there is no actionable data available for diagnostics, which means that this mode does not qualify for help via Azure support.
+With a low-level AMQP client, Service Bus also accepts "pre-settled" transfers. A pre-settled transfer is a fire-and-forget operation for which the outcome, either way, isn't reported back to the client and the message is considered settled when sent. The lack of feedback to the client also means that there's no actionable data available for diagnostics, which means that this mode doesn't qualify for help via Azure support.
## Settling receive operations
@@ -89,11 +89,11 @@ For receive operations, the Service Bus API clients enable two different explici
The **receive-and-delete** mode tells the broker to consider all messages it sends to the receiving client as settled when sent. That means that the message is considered consumed as soon as the broker has put it onto the wire. If the message transfer fails, the message is lost.
-The upside of this mode is that the receiver does not need to take further action on the message and is also not slowed by waiting for the outcome of the settlement. If the data contained in the individual messages have low value and/or are only meaningful for a very short time, this mode is a reasonable choice.
+The upside of this mode is that the receiver doesn't need to take further action on the message and is also not slowed by waiting for the outcome of the settlement. If the data contained in the individual messages have low value and/or are only meaningful for a very short time, this mode is a reasonable choice.
### PeekLock
-The **peek-lock** mode tells the broker that the receiving client wants to settle received messages explicitly. The message is made available for the receiver to process, while held under an exclusive lock in the service so that other, competing receivers cannot see it. The duration of the lock is initially defined at the queue or subscription level and can be extended by the client owning the lock. For details about renewing locks, see the [Renew locks](#renew-locks) section in this article.
+The **peek-lock** mode tells the broker that the receiving client wants to settle received messages explicitly. The message is made available for the receiver to process, while held under an exclusive lock in the service so that other, competing receivers can't see it. The duration of the lock is initially defined at the queue or subscription level and can be extended by the client owning the lock. For details about renewing locks, see the [Renew locks](#renew-locks) section in this article.
When a message is locked, other clients receiving from the same queue or subscription can take on locks and retrieve the next available messages not under active lock. When the lock on a message is explicitly released or when the lock expires, the message pops back up at or near the front of the retrieval order for redelivery.
@@ -103,15 +103,15 @@ The receiving client initiates settlement of a received message with a positive
When the receiving client fails to process a message but wants the message to be redelivered, it can explicitly ask for the message to be released and unlocked instantly by calling the `Abandon` API for the message or it can do nothing and let the lock elapse.
-If a receiving client fails to process a message and knows that redelivering the message and retrying the operation will not help, it can reject the message, which moves it into the dead-letter queue by calling the `DeadLetter` API on the message, which also allows setting a custom property including a reason code that can be retrieved with the message from the dead-letter queue.
+If a receiving client fails to process a message and knows that redelivering the message and retrying the operation won't help, it can reject the message by calling the `DeadLetter` API on the message, which moves it into the dead-letter queue. The API also allows setting a custom property, including a reason code, that can be retrieved with the message from the dead-letter queue.
A special case of settlement is deferral, which is discussed in a [separate article](message-deferral.md).
-The `Complete`, `Deadletter`, or `RenewLock` operations may fail due to network issues, if the held lock has expired, or there are other service-side conditions that prevent settlement. In one of the latter cases, the service sends a negative acknowledgment that surfaces as an exception in the API clients. If the reason is a broken network connection, the lock is dropped since Service Bus does not support recovery of existing AMQP links on a different connection.
+The `Complete`, `Deadletter`, or `RenewLock` operations may fail due to network issues, if the held lock has expired, or there are other service-side conditions that prevent settlement. In one of the latter cases, the service sends a negative acknowledgment that surfaces as an exception in the API clients. If the reason is a broken network connection, the lock is dropped since Service Bus doesn't support recovery of existing AMQP links on a different connection.
-If `Complete` fails, which occurs typically at the very end of message handling and in some cases after minutes of processing work, the receiving application can decide whether it preserves the state of the work and ignores the same message when it is delivered a second time, or whether it tosses out the work result and retries as the message is redelivered.
+If `Complete` fails, which occurs typically at the very end of message handling and in some cases after minutes of processing work, the receiving application can decide whether it preserves the state of the work and ignores the same message when it's delivered a second time, or whether it tosses out the work result and retries as the message is redelivered.
-The typical mechanism for identifying duplicate message deliveries is by checking the message-id, which can and should be set by the sender to a unique value, possibly aligned with an identifier from the originating process. A job scheduler would likely set the message-id to the identifier of the job it is trying to assign to a worker with the given worker, and the worker would ignore the second occurrence of the job assignment if that job is already done.
+The typical mechanism for identifying duplicate message deliveries is by checking the message-id, which can and should be set by the sender to a unique value, possibly aligned with an identifier from the originating process. A job scheduler would likely set the message-id to the identifier of the job it's trying to assign to a given worker, and the worker would ignore the second occurrence of the job assignment if that job is already done.
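
A minimal sketch of that worker-side check follows, assuming an in-memory set of completed job identifiers. `JobWorker` and `Handle` are hypothetical names, not part of any Service Bus API, and a production worker would need durable storage so that the record survives a restart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical worker-side bookkeeping for redelivered job assignments.
public sealed class JobWorker
{
    // Message IDs of jobs whose work has already finished.
    private readonly HashSet<string> _completedJobs = new HashSet<string>();

    // Runs `doWork` unless the job identified by `messageId` is already done.
    // Returns true if the work ran, false if the delivery was a duplicate.
    public bool Handle(string messageId, Action doWork)
    {
        if (_completedJobs.Contains(messageId))
        {
            return false; // Second occurrence: ignore and settle the message.
        }

        doWork();
        _completedJobs.Add(messageId); // Recorded only after the work succeeds.
        return true;
    }
}
```

Because the ID is recorded only after `doWork` returns, a job that throws partway through is retried on redelivery rather than skipped.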
> [!IMPORTANT]
> It is important to note that the lock that PeekLock acquires on the message is volatile and may be lost in the following conditions