**articles/data-factory/v1/data-factory-build-your-first-pipeline-using-arm.md** (+20 −13 lines changed)
@@ -300,24 +300,30 @@ Create a JSON file named **ADFTutorialARM-Parameters.json** that contains parameters

    ```

    > [!IMPORTANT]
    > You may have separate parameter JSON files for development, testing, and production environments that you can use with the same Data Factory JSON template. By using a PowerShell script, you can automate deploying Data Factory entities in these environments.

## Create data factory

1. Start **Azure PowerShell** and run the following command:

    * Run the following command and enter the user name and password that you use to sign in to the Azure portal.

      ```powershell
      Connect-AzAccount
      ```
    * Run the following command to view all the subscriptions for this account.

      ```powershell
      Get-AzSubscription
      ```
    * Run the following command to select the subscription that you want to work with. This subscription should be the same as the one you used in the Azure portal.
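      The selection command itself is elided from this hunk. As a sketch, `Set-AzContext` is one common way to do this; the subscription name below is a placeholder, not a value from this tutorial:

      ```powershell
      # Placeholder name; substitute your own subscription.
      Set-AzContext -Subscription "<NameOfAzureSubscription>"
      ```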
2. Run the following command to deploy Data Factory entities using the Resource Manager template you created in Step 1.

    ```powershell
@@ -560,17 +566,18 @@ You define a pipeline that transforms data by running a Hive script on an on-demand HDInsight cluster
    ```

## Reuse the template

In the tutorial, you created a template for defining Data Factory entities and a template for passing values for parameters. To use the same template to deploy Data Factory entities to different environments, you create a parameter file for each environment and use it when deploying to that environment.

Notice that the first command uses the parameter file for the development environment, the second one for the test environment, and the third one for the production environment.
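The three deployment commands themselves are elided from this hunk. A hedged sketch, assuming the template file is named ADFTutorialARM.json and the parameter files follow the naming used earlier in this article (all three file names are assumptions):

```powershell
# Hypothetical file names for the dev, test, and production parameter files.
New-AzResourceGroupDeployment -Name ADFDeployDev -ResourceGroupName ADFTutorialResourceGroup `
    -TemplateFile .\ADFTutorialARM.json -TemplateParameterFile .\ADFTutorialARM-Parameters-Dev.json
New-AzResourceGroupDeployment -Name ADFDeployTest -ResourceGroupName ADFTutorialResourceGroup `
    -TemplateFile .\ADFTutorialARM.json -TemplateParameterFile .\ADFTutorialARM-Parameters-Test.json
New-AzResourceGroupDeployment -Name ADFDeployProd -ResourceGroupName ADFTutorialResourceGroup `
    -TemplateFile .\ADFTutorialARM.json -TemplateParameterFile .\ADFTutorialARM-Parameters-Prod.json
```

Only the `-TemplateParameterFile` argument changes between environments; the template itself stays identical.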
You can also reuse the template to perform repeated tasks. For example, you need to create many data factories with one or more pipelines that implement the same logic but each data factory uses different Azure storage and Azure SQL Database accounts. In this scenario, you use the same template in the same environment (dev, test, or production) with different parameter files to create data factories.
**articles/data-factory/v1/data-factory-build-your-first-pipeline-using-powershell.md** (+56 −36 lines changed)
@@ -43,27 +43,36 @@ The pipeline in this tutorial has one activity: **HDInsight Hive activity**.

In this step, you use Azure PowerShell to create an Azure Data Factory named **FirstDataFactoryPSH**. A data factory can have one or more pipelines. A pipeline can have one or more activities in it. For example, a Copy activity to copy data from a source to a destination data store, and an HDInsight Hive activity to run a Hive script to transform input data. Let's start by creating the data factory in this step.

1. Start Azure PowerShell and run the following command. Keep Azure PowerShell open until the end of this tutorial. If you close and reopen it, you need to run these commands again.

    * Run the following command and enter the user name and password that you use to sign in to the Azure portal.

      ```powershell
      Connect-AzAccount
      ```
    * Run the following command to view all the subscriptions for this account.

      ```powershell
      Get-AzSubscription
      ```
    * Run the following command to select the subscription that you want to work with. This subscription should be the same as the one you used in the Azure portal.

    Some of the steps in this tutorial assume that you use the resource group named ADFTutorialResourceGroup. If you use a different resource group, use it in place of ADFTutorialResourceGroup in this tutorial.

3. Run the **New-AzDataFactory** cmdlet that creates a data factory named **FirstDataFactoryPSH**.
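    The invocation itself is not shown in this hunk. A minimal sketch, using the resource group named above; the "West US" region is an assumption, not stated here:

    ```powershell
    # Create the resource group first if it doesn't exist yet (region is an assumption).
    New-AzResourceGroup -Name ADFTutorialResourceGroup -Location "West US"
    # Create the data factory in that resource group.
    New-AzDataFactory -ResourceGroupName ADFTutorialResourceGroup -Name FirstDataFactoryPSH -Location "West US"
    ```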
You can run the following command to confirm that the Data Factory provider is registered:

    ```powershell
    Get-AzResourceProvider
    ```

* Sign in to the [Azure portal](https://portal.azure.com) with your Azure subscription and navigate to a Data Factory blade, or create a data factory in the Azure portal. This action automatically registers the provider for you.
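Alternatively, you can check and register the provider explicitly from PowerShell. A sketch using the standard Az resource-provider cmdlets (not shown in this tutorial's hunks):

```powershell
# Check only the Data Factory provider instead of listing every provider.
Get-AzResourceProvider -ProviderNamespace Microsoft.DataFactory
# Register it if its RegistrationState is not "Registered".
Register-AzResourceProvider -ProviderNamespace Microsoft.DataFactory
```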
Before creating a pipeline, you need to create a few Data Factory entities first. You first create linked services to link data stores and computes to your data factory, define input and output datasets to represent input/output data in linked data stores, and then create the pipeline with an activity that uses these datasets.
@@ -112,19 +123,21 @@ In this step, you link your Azure Storage account to your data factory.

2. In Azure PowerShell, switch to the ADFGetStarted folder.
3. You can use the **New-AzDataFactoryLinkedService** cmdlet to create a linked service. This cmdlet and the other Data Factory cmdlets you use in this tutorial require you to pass values for the *ResourceGroupName* and *DataFactoryName* parameters. Alternatively, you can use **Get-AzDataFactory** to get a **DataFactory** object and pass the object without typing *ResourceGroupName* and *DataFactoryName* each time you run a cmdlet. Run the following command to assign the output of the **Get-AzDataFactory** cmdlet to a **$df** variable.

    If you hadn't run the **Get-AzDataFactory** cmdlet and assigned the output to the **$df** variable, you would have to specify values for the *ResourceGroupName* and *DataFactoryName* parameters as follows.

    If you close Azure PowerShell in the middle of the tutorial, you have to run the **Get-AzDataFactory** cmdlet the next time you start Azure PowerShell to complete the tutorial.
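    The actual commands are elided from this hunk. A sketch of both styles; the linked-service file name StorageLinkedService.json is hypothetical:

    ```powershell
    # Cache the DataFactory object once...
    $df = Get-AzDataFactory -ResourceGroupName ADFTutorialResourceGroup -Name FirstDataFactoryPSH
    # ...then pass it to later cmdlets:
    New-AzDataFactoryLinkedService $df -File .\StorageLinkedService.json
    # Without $df, you would pass both names explicitly each time:
    New-AzDataFactoryLinkedService -ResourceGroupName ADFTutorialResourceGroup -DataFactoryName FirstDataFactoryPSH -File .\StorageLinkedService.json
    ```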
### Create Azure HDInsight linked service

@@ -165,9 +178,9 @@ In this step, you link an on-demand HDInsight cluster to your data factory.

    See [On-demand HDInsight Linked Service](data-factory-compute-linked-services.md#azure-hdinsight-on-demand-linked-service) for details.

2. Run the **New-AzDataFactoryLinkedService** cmdlet that creates the linked service called HDInsightOnDemandLinkedService.
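    The command itself is elided here. A sketch, assuming the JSON definition is saved as HDInsightOnDemandLinkedService.json (an assumed file name) and `$df` holds the DataFactory object from the earlier step:

    ```powershell
    New-AzDataFactoryLinkedService $df -File .\HDInsightOnDemandLinkedService.json
    ```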
Now, you create the output dataset to represent the output data stored in the Azure Blob storage.

1. Create a JSON file named **OutputTable.json** in the **C:\ADFGetStarted** folder with the following content (the body of the `properties` object is elided between hunks):

    ```json
    {
        "name": "AzureBlobOutput",
        "properties": {
            ...
        }
    }
    ```

    The JSON defines a dataset named **AzureBlobOutput**, which represents output data for an activity in the pipeline. In addition, it specifies that the results are stored in the blob container called **adfgetstarted** and the folder called **partitioneddata**. The **availability** section specifies that the output dataset is produced on a monthly basis.

2. Run the following command in Azure PowerShell to create the Data Factory dataset:
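    The command is elided from this hunk; given the file created in step 1, it is presumably of this shape:

    ```powershell
    New-AzDataFactoryDataset $df -File .\OutputTable.json
    ```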
@@ -319,27 +334,30 @@ In this step, you create your first pipeline with a **HDInsightHive** activity.

2. Confirm that you see the **input.log** file in the **adfgetstarted/inputdata** folder in the Azure blob storage, and run the following command to deploy the pipeline. Since the **start** and **end** times are set in the past and **isPaused** is set to false, the pipeline (the activity in the pipeline) runs immediately after you deploy.
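    The deployment command is elided here. A sketch, assuming the pipeline JSON is saved as MyFirstPipelinePSH.json (an assumed file name) and `$df` holds the DataFactory object:

    ```powershell
    New-AzDataFactoryPipeline $df -File .\MyFirstPipelinePSH.json
    ```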
    ```
    Id                : 0f6334f2-d56c-4d48-b427-d4f0fb4ef883_635268096000000000_635292288000000000_AzureBlobOutput
    ResourceGroupName : ADFTutorialResourceGroup
    DataFactoryName   : FirstDataFactoryPSH
    ```
@@ -378,6 +397,7 @@ In this step, you use Azure PowerShell to monitor what's going on in the data factory

    PipelineName : MyFirstPipeline
    Type         : Script
    ```

    You can keep running this cmdlet until you see the slice in the **Ready** state or the **Failed** state. When the slice is in the Ready state, check the **partitioneddata** folder in the **adfgetstarted** container in your blob storage for the output data. Creation of an on-demand HDInsight cluster usually takes some time.
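    A sketch of such a polling check, assuming the V1 **Get-AzDataFactorySlice** cmdlet and a placeholder start time matching the dataset's active period (both are assumptions, not values shown in this hunk):

    ```powershell
    # Placeholder start time; use your pipeline's actual active period.
    Get-AzDataFactorySlice $df -DatasetName AzureBlobOutput -StartDateTime 2014-01-01
    ```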
**articles/data-factory/v1/data-factory-copy-activity-tutorial-using-azure-resource-manager-template.md** (+7 −5 lines changed)
@@ -580,17 +580,19 @@ You define a pipeline that copies data from the Azure blob dataset to the Azure SQL dataset
    ```

## Reuse the template

In the tutorial, you created a template for defining Data Factory entities and a template for passing values for parameters. The pipeline copies data from an Azure Storage account to Azure SQL Database, specified via parameters. To use the same template to deploy Data Factory entities to different environments, you create a parameter file for each environment and use it when deploying to that environment.