Commit fdf91f7

committed
ETL2
1 parent 55f0143 commit fdf91f7

1 file changed

+7
-2
lines changed
articles/hdinsight/hdinsight-sales-insights-etl.md

Lines changed: 7 additions & 2 deletions
@@ -7,7 +7,7 @@ ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: tutorial
 ms.custom: hdinsightactive
-ms.date: 03/27/2020
+ms.date: 04/15/2020
 ---
 
 # Tutorial: Create an end-to-end data pipeline to derive sales insights in Azure HDInsight
@@ -88,6 +88,8 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
 ./scripts/resources.sh $resourceGroup LOCATION
 ```
 
+If you're not sure which region to specify, you can retrieve a list of supported regions for your subscription with the [az account list-locations](https://docs.microsoft.com/cli/azure/account?view=azure-cli-latest#az-account-list-locations) command.
+
 The command will deploy the following resources:
 
 * An Azure Blob storage account. This account will hold the company sales data.
@@ -152,7 +154,7 @@ This data factory will have one pipeline with two activities:
 * The first activity will copy the data from Azure Blob storage to the Data Lake Storage Gen 2 storage account to mimic data ingestion.
 * The second activity will transform the data in the Spark cluster. The script transforms the data by removing unwanted columns. It also appends a new column that calculates the revenue that a single transaction generates.
 
-To set up your Azure Data Factory pipeline, execute the following command:
+To set up your Azure Data Factory pipeline, execute the command below. You should still be at the `hdinsight-sales-insights-etl` directory.
 
 ```bash
 blobStorageName=$(cat resourcesoutputs_storage.json | jq -r '.properties.outputs.blobStorageName.value')
@@ -186,6 +188,9 @@ To trigger the pipeline, you can either:
 * Trigger the Data Factory pipeline in PowerShell. Replace `RESOURCEGROUP`, and `DataFactoryName` with the appropriate values, then run the following commands:
 
 ```powershell
+# If you have multiple subscriptions, set the one to use
+# Select-AzSubscription -SubscriptionId "<SUBSCRIPTIONID>"
+
 $resourceGroup="RESOURCEGROUP"
 $dataFactory="DataFactoryName"
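
Reviewer note on the second hunk: the added sentence points readers to `az account list-locations` without showing how to use its output. A minimal sketch of one way to check a region name before passing it to `./scripts/resources.sh` follows. The helper name `is_supported_region` is hypothetical and not part of the tutorial; the sketch assumes the Azure CLI is installed and already logged in.

```shell
# Hypothetical helper (not part of the tutorial): succeeds when the given
# short region name (for example "westus2") appears in the list that
# `az account list-locations` returns for the current subscription.
is_supported_region() {
  # --query "[].name" keeps only the short region names; tsv prints one per line.
  # grep -F (fixed string) -x (whole line) -q (quiet) does an exact match.
  az account list-locations --query "[].name" --output tsv | grep -Fxq -- "$1"
}
```

For example, `is_supported_region westus2 && ./scripts/resources.sh "$resourceGroup" westus2` would run the deployment script only when the region name is recognized.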
