articles/hdinsight/hdinsight-sales-insights-etl.md (+7 −2: 7 additions, 2 deletions)
````diff
@@ -7,7 +7,7 @@ ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: tutorial
 ms.custom: hdinsightactive
-ms.date: 03/27/2020
+ms.date: 04/15/2020
 ---
 
 # Tutorial: Create an end-to-end data pipeline to derive sales insights in Azure HDInsight
````
````diff
@@ -88,6 +88,8 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
 ./scripts/resources.sh $resourceGroup LOCATION
 ```
 
+If you're not sure which region to specify, you can retrieve a list of supported regions for your subscription with the [az account list-locations](https://docs.microsoft.com/cli/azure/account?view=azure-cli-latest#az-account-list-locations) command.
+
 The command will deploy the following resources:
 
 * An Azure Blob storage account. This account will hold the company sales data.
````
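The region lookup suggested by the added note can be sketched as follows (an illustration, not part of the diff; it assumes the Azure CLI is installed and you've already run `az login`, and `westus2` is only an example value):

```shell
# Print the region names available to your subscription, one per line.
az account list-locations --query "[].name" --output tsv

# Pass your chosen region (for example, westus2) to the resource script:
# ./scripts/resources.sh $resourceGroup westus2
```

The `--query "[].name"` filter is standard JMESPath, so `--output tsv` yields plain region names that are easy to scan or pipe into other commands.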
````diff
@@ -152,7 +154,7 @@ This data factory will have one pipeline with two activities:
 * The first activity will copy the data from Azure Blob storage to the Data Lake Storage Gen 2 storage account to mimic data ingestion.
 * The second activity will transform the data in the Spark cluster. The script transforms the data by removing unwanted columns. It also appends a new column that calculates the revenue that a single transaction generates.
 
-To set up your Azure Data Factory pipeline, execute the following command:
+To set up your Azure Data Factory pipeline, execute the command below. You should still be at the `hdinsight-sales-insights-etl` directory.
````
````diff
@@ -186,6 +188,9 @@ To trigger the pipeline, you can either:
 * Trigger the Data Factory pipeline in PowerShell. Replace `RESOURCEGROUP`, and `DataFactoryName` with the appropriate values, then run the following commands:
 
     ```powershell
+    # If you have multiple subscriptions, set the one to use
````
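For comparison, the same subscription selection and pipeline trigger can be sketched with the Azure CLI instead of PowerShell. This is an assumption, not part of the tutorial: `az datafactory` requires the `datafactory` CLI extension, and `PIPELINENAME` is a hypothetical placeholder, since this excerpt doesn't show the pipeline's actual name.

```shell
# If you have multiple subscriptions, set the one to use.
az account set --subscription "SUBSCRIPTIONID"

# Trigger a pipeline run. Replace RESOURCEGROUP and DataFactoryName with the
# values from the tutorial; PIPELINENAME is a hypothetical placeholder.
az datafactory pipeline create-run \
    --resource-group RESOURCEGROUP \
    --factory-name DataFactoryName \
    --name PIPELINENAME
```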