Commit 33da568

Merge pull request #102724 from itechedit/three-event-hubs-articles
edit pass: Three event-hubs articles
2 parents 619f3c5 + e24cddb commit 33da568

File tree: 3 files changed, +150 −136 lines


articles/event-hubs/get-started-capture-python-v2.md

Lines changed: 52 additions & 50 deletions
@@ -1,6 +1,6 @@
 ---
-title: Read Azure Event Hubs captured data from Python app | Microsoft Docs
-description: This article shows you how to write Python code to capture data that's sent to an event hub and read the captured event data from an Azure Storage.
+title: Read Azure Event Hubs captured data from a Python app | Microsoft Docs
+description: This article shows you how to write Python code to capture data that's sent to an event hub and read the captured event data from an Azure storage account.
 services: event-hubs
 documentationcenter: ''
 author: spelluru
@@ -16,13 +16,14 @@ ms.author: spelluru
 
 ---
 
-# Capture Event Hubs data in Azure Storage and read it using Python
-You can use configure an event hub so that the data sent to an event hub is captured in an Azure Storage or Azure Data Lake Storage. This article shows you how to use write Python code to send events to an event hub and read the captured data from an Azure blob storage. For more information about this feature, see the [Event Hubs Capture feature overview](event-hubs-capture-overview.md).
+# Capture Event Hubs data in Azure Storage and read it by using Python
 
-This sample uses the [Azure Python SDK](https://azure.microsoft.com/develop/python/) to demonstrate the Capture feature. The sender.py program sends simulated environmental telemetry to Event Hubs in JSON format. The event hub is configured to use the Capture feature to write this data to Blob storage in batches. The capturereader.py app reads these blobs and creates an append file per device. The app then writes the data into .csv files.
+You can configure an event hub so that the data that's sent to an event hub is captured in an Azure storage account or Azure Data Lake Storage. This article shows you how to write Python code to send events to an event hub and read the captured data from Azure Blob storage. For more information about this feature, see [Event Hubs Capture feature overview](event-hubs-capture-overview.md).
+
+This quickstart uses the [Azure Python SDK](https://azure.microsoft.com/develop/python/) to demonstrate the Capture feature. The *sender.py* app sends simulated environmental telemetry to event hubs in JSON format. The event hub is configured to use the Capture feature to write this data to Blob storage in batches. The *capturereader.py* app reads these blobs and creates an append file for each device. The app then writes the data into CSV files.
 
 > [!IMPORTANT]
-> This quickstart uses version 5 of the Azure Event Hubs Python SDK. For a quick start that uses the old version 1 of the Python SDK, see [this article](event-hubs-capture-python.md). If you are using version 1 of the SDK, we recommend that you migrate your code to the latest version. For details, see the [migration guide](https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/eventhub/azure-eventhub/migration_guide.md).
+> This quickstart uses version 5 of the Azure Event Hubs Python SDK. For a quickstart that uses version 1 of the Python SDK, see [Quickstart: Event Hubs Capture walkthrough - Python](event-hubs-capture-python.md). If you're using version 1 of the SDK, we recommend that you migrate your code to the latest version. For more information, see the [migration guide](https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/eventhub/azure-eventhub/migration_guide.md).
 
 In this quickstart, you:
 
@@ -35,23 +36,24 @@ In this quickstart, you:
 
 ## Prerequisites
 
-- Python 2.7, and 3.5 or later, with `pip` installed and updated.
-- An Azure subscription. If you don't have one, [create a free account](https://azure.microsoft.com/free/) before you begin.
-- [Create an Event Hubs namespace and an event hub in the namespace](event-hubs-create.md). Note down the name of the Event Hubs namespace, name of the event hub, and the primary access key for the namespace. Get the access key by following instructions from the article: [Get connection string](event-hubs-get-connection-string.md#get-connection-string-from-the-portal). The default key name is: **RootManageSharedAccessKey**. You don't need the connection string for the tutorial. You just need the primary key.
-- Follow these steps to create an **Azure Storage account** and a **blob container**:
-    1. [Create an Azure Storage account](../storage/common/storage-quickstart-create-account.md?tabs=azure-portal).
-    2. [Create a blob container in the storage](../storage/blobs/storage-quickstart-blobs-portal.md#create-a-container).
-    3. [Get the connection string to the storage account](../storage/common/storage-configure-connection-string.md#view-and-copy-a-connection-string).
+- Python 2.7, and 3.5 or later, with PIP installed and updated.
+- An Azure subscription. If you don't have one, [create a free account](https://azure.microsoft.com/free/) before you begin.
+- An active Event Hubs namespace and event hub.
+  [Create an Event Hubs namespace and an event hub in the namespace](event-hubs-create.md). Record the name of the Event Hubs namespace, the name of the event hub, and the primary access key for the namespace. To get the access key, see [Get an Event Hubs connection string](event-hubs-get-connection-string.md#get-connection-string-from-the-portal). The default key name is *RootManageSharedAccessKey*. For this quickstart, you need only the primary key. You don't need the connection string.
+- An Azure storage account, a blob container in the storage account, and a connection string to the storage account. If you don't have these items, do the following:
+    1. [Create an Azure storage account](../storage/common/storage-quickstart-create-account.md?tabs=azure-portal)
+    1. [Create a blob container in the storage account](../storage/blobs/storage-quickstart-blobs-portal.md#create-a-container)
+    1. [Get the connection string to the storage account](../storage/common/storage-configure-connection-string.md#view-and-copy-a-connection-string)
 
-    Note down **connection string** and the **container name**. You will use them later in the code.
-- Enable **Capture** feature for the event hub by following instructions from: [Enable Event Hubs Capture using the Azure portal](event-hubs-capture-enable-through-portal.md). Select the storage account and the blob container you created in the previous step. You can also enable the feature when creating an event hub.
+    Be sure to record the connection string and container name for later use in this quickstart.
+- Enable the Capture feature for the event hub. To do so, follow the instructions in [Enable Event Hubs Capture using the Azure portal](event-hubs-capture-enable-through-portal.md). Select the storage account and the blob container you created in the preceding step. You can also enable the feature when you create an event hub.
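As background on the connection string the prerequisites ask you to record: an Azure storage connection string is a semicolon-separated list of `key=value` pairs, so the standard library can split it apart if you ever need a single field. A minimal sketch; the account name and key below are made-up placeholders, not values from this quickstart:

```python
# Sketch: splitting an Azure storage connection string into its parts.
# The AccountName and AccountKey values are made-up placeholders.
conn = ("DefaultEndpointsProtocol=https;AccountName=examplestorage;"
        "AccountKey=PLACEHOLDER_KEY==;EndpointSuffix=core.windows.net")
# Split on ';' into pairs, then on the FIRST '=' only, because base64
# account keys can themselves end in '=' padding.
parts = dict(kv.split('=', 1) for kv in conn.split(';'))
print(parts['AccountName'])  # examplestorage
print(parts['AccountKey'])   # PLACEHOLDER_KEY==
```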
 
 ## Create a Python script to send events to your event hub
-In this section, you create a Python script that sends 200 events (10 devices * 20 events) to an event hub. These events are sample environmental reading sent in JSON format.
+In this section, you create a Python script that sends 200 events (10 devices * 20 events) to an event hub. These events are a sample environmental reading that's sent in JSON format.
 
 1. Open your favorite Python editor, such as [Visual Studio Code][Visual Studio Code].
-2. Create a script called **sender.py**.
-3. Paste the following code into sender.py. See the code comments for details.
+2. Create a script called *sender.py*.
+3. Paste the following code into *sender.py*.
 
     ```python
     import time
@@ -63,39 +65,39 @@ In this section, you create a Python script that sends 200 events (10 devices *
 
     from azure.eventhub import EventHubProducerClient, EventData
 
-    # this scripts simulates production of events for 10 devices
+    # This script simulates the production of events for 10 devices.
     devices = []
     for x in range(0, 10):
         devices.append(str(uuid.uuid4()))
 
-    # create a producer client to produce/publish events to the event hub
+    # Create a producer client to produce and publish events to the event hub.
     producer = EventHubProducerClient.from_connection_string(conn_str="EVENT HUBS NAMESAPCE CONNECTION STRING", eventhub_name="EVENT HUB NAME")
 
-    for y in range(0,20):    # for each device, produce 20 events
-        event_data_batch = producer.create_batch() # create a batch. you will add events to the batch later.
+    for y in range(0,20):    # For each device, produce 20 events.
+        event_data_batch = producer.create_batch() # Create a batch. You will add events to the batch later.
         for dev in devices:
-            # create a dummy reading
+            # Create a dummy reading.
            reading = {'id': dev, 'timestamp': str(datetime.datetime.utcnow()), 'uv': random.random(), 'temperature': random.randint(70, 100), 'humidity': random.randint(70, 100)}
-            s = json.dumps(reading) # convert reading into a JSON string
-            event_data_batch.add(EventData(s)) # add event data to the batch
-        producer.send_batch(event_data_batch) # send the batch of events to the event hub
+            s = json.dumps(reading) # Convert the reading into a JSON string.
+            event_data_batch.add(EventData(s)) # Add event data to the batch.
+        producer.send_batch(event_data_batch) # Send the batch of events to the event hub.
 
-    # close the producer
+    # Close the producer.
     producer.close()
     ```
-4. Replace the following values in the scripts:
-    1. Replace `EVENT HUBS NAMESPACE CONNECTION STRING` with the connection string for your Event Hubs namespace.
-    2. Replace `EVENT HUB NAME` with the name of your event hub.
-5. Run the script to send events to the event hub.
-6. In the Azure portal, you can verify that the event hub has received the messages. Switch to **Messages** view in the **Metrics** section. Refresh the page to update the chart. It may take a few seconds for it to show that the messages have been received.
+4. Replace the following values in the scripts:
+    * Replace `EVENT HUBS NAMESPACE CONNECTION STRING` with the connection string for your Event Hubs namespace.
+    * Replace `EVENT HUB NAME` with the name of your event hub.
+5. Run the script to send events to the event hub.
+6. In the Azure portal, you can verify that the event hub has received the messages. Switch to **Messages** view in the **Metrics** section. Refresh the page to update the chart. It might take a few seconds for the page to display that the messages have been received.
 
     [![Verify that the event hub received the messages](./media/get-started-capture-python-v2/messages-portal.png)](./media/get-started-capture-python-v2/messages-portal.png#lightbox)
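For reference, the simulated reading that *sender.py* batches can be built with the standard library alone, which is a convenient way to inspect the JSON payload shape before wiring up a real namespace. A minimal sketch (no Event Hubs connection needed):

```python
# Standalone sketch of the telemetry payload that sender.py sends.
# Only standard-library modules are used; no Event Hubs connection is made.
import datetime
import json
import random
import uuid

def make_reading(device_id):
    # Mirrors the fields used in the article's sender.py script.
    return {
        'id': device_id,
        'timestamp': str(datetime.datetime.utcnow()),
        'uv': random.random(),
        'temperature': random.randint(70, 100),
        'humidity': random.randint(70, 100),
    }

reading = make_reading(str(uuid.uuid4()))
payload = json.dumps(reading)  # This JSON string becomes the EventData body.
print(sorted(json.loads(payload).keys()))
# ['humidity', 'id', 'temperature', 'timestamp', 'uv']
```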
 
 ## Create a Python script to read your Capture files
-In this example, the captured data is stored in Azure Blob Storage. The script in this section reads the capture data files from your Azure Storage and generates CSV files for you to easily open and view the contents. You will see 10 files in the current working directory of the application. These files will contain the environmental readings for the 10 devices.
+In this example, the captured data is stored in Azure Blob storage. The script in this section reads the captured data files from your Azure storage account and generates CSV files for you to easily open and view. You will see 10 files in the current working directory of the application. These files will contain the environmental readings for the 10 devices.
 
-1. In your Python editor, create a script called **capturereader.py**. This script reads the captured files and creates a file per device to write the data only for that device.
-2. Paste the following code into capturereader.py. See the code comments for details.
+1. In your Python editor, create a script called *capturereader.py*. This script reads the captured files and creates a file for each device to write the data only for that device.
+2. Paste the following code into *capturereader.py*.
 
     ```python
     import os
@@ -131,28 +133,28 @@ In this example, the captured data is stored in Azure Blob Storage. The script i
 
     def startProcessing():
         print('Processor started using path: ' + os.getcwd())
-        # create a blob container client
+        # Create a blob container client.
         container = ContainerClient.from_connection_string("AZURE STORAGE CONNECTION STRING", container_name="BLOB CONTAINER NAME")
-        blob_list = container.list_blobs() # list all the blobs in the container
+        blob_list = container.list_blobs() # List all the blobs in the container.
         for blob in blob_list:
-            #content_length == 508 is an empty file, so only process content_length > 508 (skip empty files)
+            # Content_length == 508 is an empty file, so process only content_length > 508 (skip empty files).
             if blob.size > 508:
                 print('Downloaded a non empty blob: ' + blob.name)
-                # create a blob client for the blob
+                # Create a blob client for the blob.
                 blob_client = ContainerClient.get_blob_client(container, blob=blob.name)
-                # construct a file name based on the blob name
+                # Construct a file name based on the blob name.
                 cleanName = str.replace(blob.name, '/', '_')
                 cleanName = os.getcwd() + '\\' + cleanName
-                with open(cleanName, "wb+") as my_file: # open the file to write. create if it doesn't exist.
-                    my_file.write(blob_client.download_blob().readall()) # write blob contents into the file
-                processBlob2(cleanName) # convert the file into a CSV file
-                os.remove(cleanName) # remove the original downloaded file
-                # delete the blob from the container after it's read
+                with open(cleanName, "wb+") as my_file: # Open the file to write. Create it if it doesn't exist.
+                    my_file.write(blob_client.download_blob().readall()) # Write blob contents into the file.
+                processBlob2(cleanName) # Convert the file into a CSV file.
+                os.remove(cleanName) # Remove the original downloaded file.
+                # Delete the blob from the container after it's read.
                 container.delete_blob(blob.name)
 
     startProcessing()
     ```
-4. Replace `<AZURE STORAGE CONNECTION STRING>` with the connection string for your Azure Storage account. The name of container you created in this tutorial is: **capture**. If you used a different name for the container, replace `capture` with the name of the container in the storage account.
+3. Replace `AZURE STORAGE CONNECTION STRING` with the connection string for your Azure storage account. The name of the container you created in this quickstart is *capture*. If you used a different name for the container, replace *capture* with the name of the container in the storage account.
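The blob-name flattening that *capturereader.py* performs (replacing `/` so each capture blob maps to a single local file name) can be sketched portably. This version uses `os.path.join` instead of the script's hard-coded `'\\'` separator; that substitution is an adjustment for non-Windows systems, not the article's exact code, and the blob name below is a made-up example:

```python
# Sketch of capturereader.py's blob-name flattening, with os.path.join
# substituted for the hard-coded backslash so it also runs outside Windows.
import os

def local_name_for_blob(blob_name, workdir):
    clean = blob_name.replace('/', '_')  # Flatten the blob's virtual folder path.
    return os.path.join(workdir, clean)

# Made-up capture blob name in the namespace/hub/partition/date layout.
name = local_name_for_blob('ns/hub/0/2020/01/01/12/00/00.avro', '/tmp')
print(name)  # On POSIX: /tmp/ns_hub_0_2020_01_01_12_00_00.avro
```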
 
 ## Run the scripts
 1. Open a command prompt that has Python in its path, and then run these commands to install Python prerequisite packages:
@@ -162,23 +164,23 @@ In this example, the captured data is stored in Azure Blob Storage. The script i
     pip install azure-eventhub
     pip install avro-python3
     ```
-2. Change your directory to wherever you saved sender.py and capturereader.py, and run this command:
+2. Change your directory to the directory where you saved *sender.py* and *capturereader.py*, and run this command:
 
     ```
     python sender.py
     ```
 
     This command starts a new Python process to run the sender.
-3. Wait a few minutes for the capture to run. Then type the following command into your original command window:
+3. Wait a few minutes for the capture to run, and then enter the following command in your original command window:
 
     ```
     python capturereader.py
     ```
 
-    This capture processor uses the local directory to download all the blobs from the storage account/container. It processes any that are not empty, and it writes the results as .csv files into the local directory.
+    This capture processor uses the local directory to download all the blobs from the storage account and container. It processes any that are not empty, and it writes the results as CSV files into the local directory.
 
 ## Next steps
-Check out Python samples on the GitHub [here](https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/eventhub/azure-eventhub/samples).
+Check out [Python samples on GitHub](https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/eventhub/azure-eventhub/samples).
 
 
 [Azure portal]: https://portal.azure.com/
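The `processBlob2` helper that converts each downloaded blob into CSV rows is outside the hunks shown in this diff, so its exact logic is not visible here. As a hedged sketch of the per-device CSV appending the article describes (the function name, field order, and stream handling below are assumptions based on the sender's reading fields, not the committed code):

```python
# Hypothetical sketch of per-device CSV output, along the lines the article
# describes for capturereader.py; processBlob2 itself is not shown in this
# diff, so the field order and file handling here are assumptions.
import csv
import io
import json

def append_readings_to_csv(readings, files):
    # readings: JSON strings like the ones sender.py produces.
    # files: maps a device id to an open, writable text stream for that device.
    for raw in readings:
        r = json.loads(raw)
        writer = csv.writer(files[r['id']])
        writer.writerow([r['timestamp'], r['uv'], r['temperature'], r['humidity']])

# Usage with an in-memory stream standing in for a per-device CSV file.
buf = io.StringIO()
append_readings_to_csv(
    ['{"id": "dev1", "timestamp": "2020-01-01 12:00:00",'
     ' "uv": 0.5, "temperature": 80, "humidity": 75}'],
    {'dev1': buf})
print(buf.getvalue().strip())  # 2020-01-01 12:00:00,0.5,80,75
```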
