
Commit 975e808

Fixed the code for BlobServiceClient
1 parent 81caaff commit 975e808

File tree

1 file changed: +16 −3 lines


articles/batch/tutorial-run-python-batch-azure-data-factory.md

Lines changed: 16 additions & 3 deletions
@@ -3,7 +3,8 @@ title: 'Tutorial: Run a Batch job through Azure Data Factory'
 description: Learn how to use Batch Explorer, Azure Storage Explorer, and a Python script to run a Batch workload through an Azure Data Factory pipeline.
 ms.devlang: python
 ms.topic: tutorial
-ms.date: 03/01/2024
+ms.date: 12/23/2024
+ai-usage: ai-assisted
 ms.custom: mvc, devx-track-python
 ---

@@ -82,8 +83,10 @@ Paste the connection string into the following script, replacing the `<storage-a
 
 ``` python
 # Load libraries
-from azure.storage.blob import BlobClient
+# from azure.storage.blob import BlobClient
+from azure.storage.blob import BlobServiceClient
 import pandas as pd
+import io
 
 # Define parameters
 connectionString = "<storage-account-connection-string>"
@@ -93,8 +96,16 @@ outputBlobName = "iris_setosa.csv"
 # Establish connection with the blob storage account
 blob = BlobClient.from_connection_string(conn_str=connectionString, container_name=containerName, blob_name=outputBlobName)
 
+# Initialize the BlobServiceClient: connect to Azure Blob Storage, download the
+# iris.csv blob, and load it into a pandas DataFrame for further processing
+blob_service_client = BlobServiceClient.from_connection_string(conn_str=connectionString)
+blob_client = blob_service_client.get_blob_client(container=containerName, blob=outputBlobName)
+
+# Download the blob content
+blob_data = blob_client.download_blob().readall()
+
 # Load iris dataset from the task node
-df = pd.read_csv("iris.csv")
+# df = pd.read_csv("iris.csv")
+df = pd.read_csv(io.BytesIO(blob_data))
 
 # Take a subset of the records
 df = df[df['Species'] == "setosa"]
@@ -106,6 +117,8 @@ with open(outputBlobName, "rb") as data:
     blob.upload_blob(data, overwrite=True)
 ```
 
+For more information on working with Azure Blob Storage, refer to the [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
+
 Run the script locally to test and validate functionality.
 
 ``` bash
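The core of this change swaps a local `pd.read_csv("iris.csv")` for parsing the bytes returned by `blob_client.download_blob().readall()`. A minimal sketch of that bytes-to-DataFrame pattern, with the Azure download simulated by an in-memory payload so it runs without a storage account (the sample CSV rows are made up for illustration):

```python
import io

import pandas as pd

# Simulated blob payload: in the patched script this comes from
# blob_client.download_blob().readall()
blob_data = b"Sepal.Length,Species\n5.1,setosa\n7.0,versicolor\n4.9,setosa\n"

# Wrap the raw bytes in a file-like buffer so pandas can parse them
df = pd.read_csv(io.BytesIO(blob_data))

# Take the same subset the tutorial script does
df = df[df["Species"] == "setosa"]

print(len(df))  # 2 setosa rows remain
```

In the real script, `blob_data` holds the downloaded blob content; everything from `pd.read_csv(io.BytesIO(blob_data))` onward is identical, which is why only the load step needed to change in this commit.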

0 commit comments
