
Commit aceb9ad

Merge pull request #111101 from Samantha-Yu/adfupdate0413
Addressed feedback & Updated UI changes
2 parents 92f8fec + 5b48f98

9 files changed: 35 additions and 34 deletions
[8 image files changed; previews omitted]

articles/data-factory/tutorial-copy-data-portal.md

Lines changed: 35 additions & 34 deletions
@@ -1,6 +1,6 @@
 ---
 title: Use the Azure portal to create a data factory pipeline
-description: This tutorial provides step-by-step instructions for using the Azure portal to create a data factory with a pipeline. The pipeline uses the copy activity to copy data from Azure Blob storage to a SQL database.
+description: This tutorial provides step-by-step instructions for using the Azure portal to create a data factory with a pipeline. The pipeline uses the copy activity to copy data from Azure Blob storage to an Azure SQL database.
 services: data-factory
 documentationcenter: ''
 author: linda33wj
@@ -11,11 +11,11 @@ ms.service: data-factory
 ms.workload: data-services
 ms.topic: tutorial
 ms.custom: seo-lt-2019
-ms.date: 06/21/2018
+ms.date: 04/13/2020
 ms.author: jingwang
 ---
 # Copy data from Azure Blob storage to a SQL database by using Azure Data Factory
-In this tutorial, you create a data factory by using the Azure Data Factory user interface (UI). The pipeline in this data factory copies data from Azure Blob storage to a SQL database. The configuration pattern in this tutorial applies to copying from a file-based data store to a relational data store. For a list of data stores supported as sources and sinks, see the [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats) table.
+In this tutorial, you create a data factory by using the Azure Data Factory user interface (UI). The pipeline in this data factory copies data from Azure Blob storage to an Azure SQL database. The configuration pattern in this tutorial applies to copying from a file-based data store to a relational data store. For a list of data stores supported as sources and sinks, see the [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats) table.
 
 > [!NOTE]
 > - If you're new to Data Factory, see [Introduction to Azure Data Factory](introduction.md).
@@ -33,7 +33,7 @@ In this tutorial, you perform the following steps:
 ## Prerequisites
 * **Azure subscription**. If you don't have an Azure subscription, create a [free Azure account](https://azure.microsoft.com/free/) before you begin.
 * **Azure storage account**. You use Blob storage as a *source* data store. If you don't have a storage account, see [Create an Azure storage account](../storage/common/storage-account-create.md) for steps to create one.
-* **Azure SQL Database**. You use the database as a *sink* data store. If you don't have a SQL database, see [Create a SQL database](../sql-database/sql-database-get-started-portal.md) for steps to create one.
+* **Azure SQL Database**. You use the database as a *sink* data store. If you don't have an Azure SQL database, see [Create a SQL database](../sql-database/sql-database-get-started-portal.md) for steps to create one.
 
 ### Create a blob and a SQL table
 
@@ -43,10 +43,11 @@ Now, prepare your Blob storage and SQL database for the tutorial by performing t
 
 1. Launch Notepad. Copy the following text, and save it as an **emp.txt** file on your disk:
 
-   ```
+   ```
+   FirstName,LastName
    John,Doe
    Jane,Doe
-   ```
+   ```
 
 1. Create a container named **adftutorial** in your Blob storage. Create a folder named **input** in this container. Then, upload the **emp.txt** file to the **input** folder. Use the Azure portal or tools such as [Azure Storage Explorer](https://storageexplorer.com/) to do these tasks.
 
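As an aside for readers following along: the container-and-upload step above can also be scripted rather than done through the portal or Storage Explorer. A minimal sketch with the azure-storage-blob Python package (not part of this commit; the connection string is a placeholder you supply):

```python
from azure.storage.blob import BlobServiceClient

# Placeholder: copy this from your storage account's "Access keys" blade.
conn_str = "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net"

service = BlobServiceClient.from_connection_string(conn_str)

# Create the adftutorial container. Blob "folders" are just name prefixes,
# so uploading to input/emp.txt creates the input folder implicitly.
service.create_container("adftutorial")
with open("emp.txt", "rb") as data:
    service.get_blob_client("adftutorial", "input/emp.txt").upload_blob(data, overwrite=True)
```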
@@ -72,10 +73,7 @@ Now, prepare your Blob storage and SQL database for the tutorial by performing t
 In this step, you create a data factory and start the Data Factory UI to create a pipeline in the data factory.
 
 1. Open **Microsoft Edge** or **Google Chrome**. Currently, Data Factory UI is supported only in Microsoft Edge and Google Chrome web browsers.
-2. On the left menu, select **Create a resource** > **Analytics** > **Data Factory**:
-
-   ![Data Factory selection in the "New" pane](./media/doc-common-process/new-azure-data-factory-menu.png)
-
+2. On the left menu, select **Create a resource** > **Analytics** > **Data Factory**.
 3. On the **New data factory** page, under **Name**, enter **ADFTutorialDataFactory**.
 
    The name of the Azure data factory must be *globally unique*. If you receive an error message about the name value, enter a different name for the data factory (for example, yournameADFTutorialDataFactory). For naming rules for Data Factory artifacts, see [Data Factory naming rules](naming-rules.md).
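The factory created by the portal steps in this hunk can likewise be created programmatically. A sketch using the azure-identity and azure-mgmt-datafactory Python packages (not part of the commit; the subscription ID, resource group, and region are placeholders):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# The factory name must be globally unique, as the step above notes.
client.factories.create_or_update(
    "<resource-group>", "ADFTutorialDataFactory", Factory(location="eastus")
)
```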
@@ -116,33 +114,38 @@ In this tutorial, you start with creating the pipeline. Then you create linked s
 
 ### Configure source
 
+>[!TIP]
+>In this tutorial, you use *Account key* as the authentication type for your source data store, but you can choose other supported authentication methods: *SAS URI*, *Service Principal*, and *Managed Identity* if needed. Refer to the corresponding sections in [this article](https://docs.microsoft.com/azure/data-factory/connector-azure-blob-storage#linked-service-properties) for details.
+>To store secrets for data stores securely, it's also recommended to use an Azure Key Vault. Refer to [this article](https://docs.microsoft.com/azure/data-factory/store-credentials-in-key-vault) for detailed illustrations.
+
 1. Go to the **Source** tab. Select **+ New** to create a source dataset.
 
 1. In the **New Dataset** dialog box, select **Azure Blob Storage**, and then select **Continue**. The source data is in Blob storage, so you select **Azure Blob Storage** for the source dataset.
 
 1. In the **Select Format** dialog box, choose the format type of your data, and then select **Continue**.
 
-   ![Data format type](./media/doc-common-process/select-data-format.png)
-
-1. In the **Set Properties** dialog box, enter **SourceBlobDataset** for Name. Next to the **Linked service** text box, select **+ New**.
+1. In the **Set Properties** dialog box, enter **SourceBlobDataset** for Name. Select the checkbox for **First row as header**. Under the **Linked service** text box, select **+ New**.
 
-1. In the **New Linked Service (Azure Blob Storage)** dialog box, enter **AzureStorageLinkedService** as name, select your storage account from the **Storage account name** list. Test connection, then select **Finish** to deploy the linked service.
+1. In the **New Linked Service (Azure Blob Storage)** dialog box, enter **AzureStorageLinkedService** as the name, select your storage account from the **Storage account name** list, test the connection, and then select **Create** to deploy the linked service.
 
 1. After the linked service is created, you're navigated back to the **Set properties** page. Next to **File path**, select **Browse**.
 
-1. Navigate to the **adftutorial/input** folder, select the **emp.txt** file, and then select **Finish**.
+1. Navigate to the **adftutorial/input** folder, select the **emp.txt** file, and then select **OK**.
 
-1. It automatically navigates to the pipeline page. In **Source** tab, confirm that **SourceBlobDataset** is selected. To preview data on this page, select **Preview data**.
+1. Select **OK**. It automatically navigates to the pipeline page. In the **Source** tab, confirm that **SourceBlobDataset** is selected. To preview data on this page, select **Preview data**.
 
   ![Source dataset](./media/tutorial-copy-data-portal/source-dataset-selected.png)
 
 ### Configure sink
+>[!TIP]
+>In this tutorial, you use *SQL authentication* as the authentication type for your sink data store, but you can choose other supported authentication methods: *Service Principal* and *Managed Identity* if needed. Refer to the corresponding sections in [this article](https://docs.microsoft.com/azure/data-factory/connector-azure-sql-database#linked-service-properties) for details.
+>To store secrets for data stores securely, it's also recommended to use an Azure Key Vault. Refer to [this article](https://docs.microsoft.com/azure/data-factory/store-credentials-in-key-vault) for detailed illustrations.
 
 1. Go to the **Sink** tab, and select **+ New** to create a sink dataset.
 
 1. In the **New Dataset** dialog box, input "SQL" in the search box to filter the connectors, select **Azure SQL Database**, and then select **Continue**. In this tutorial, you copy data to a SQL database.
 
-1. In the **Set Properties** dialog box, enter **OutputSqlDataset** for Name. Next to the **Linked service** text box, select **+ New**. A dataset must be associated with a linked service. The linked service has the connection string that Data Factory uses to connect to the SQL database at runtime. The dataset specifies the container, folder, and the file (optional) to which the data is copied.
+1. In the **Set Properties** dialog box, enter **OutputSqlDataset** for Name. From the **Linked service** dropdown list, select **+ New**. A dataset must be associated with a linked service. The linked service has the connection string that Data Factory uses to connect to the SQL database at runtime. The dataset specifies the container, folder, and the file (optional) to which the data is copied.
 
 1. In the **New Linked Service (Azure SQL Database)** dialog box, take the following steps:
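Both tips added in this hunk recommend Key Vault for secret storage. As a hedged illustration of what that looks like from code (not part of the commit; the vault URL and secret name are placeholders), using the azure-keyvault-secrets Python package:

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Store the storage connection string in a vault; the linked service can then
# reference the secret instead of embedding the credential.
client = SecretClient(
    vault_url="https://<your-vault>.vault.azure.net",
    credential=DefaultAzureCredential(),
)
client.set_secret("StorageConnectionString", "<connection-string>")
```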
@@ -158,18 +161,17 @@ In this tutorial, you start with creating the pipeline. Then you create linked s
 
    f. Select **Test connection** to test the connection.
 
-   g. Select **Finish** to deploy the linked service.
+   g. Select **Create** to deploy the linked service.
 
   ![Save new linked service](./media/tutorial-copy-data-portal/new-azure-sql-linked-service-window.png)
 
-1. It automatically navigates to the **Set Properties** dialog box. In **Table**, select **[dbo].[emp]**. Then select **Finish**.
+1. It automatically navigates to the **Set Properties** dialog box. In **Table**, select **[dbo].[emp]**. Then select **OK**.
 
 1. Go to the tab with the pipeline, and in **Sink Dataset**, confirm that **OutputSqlDataset** is selected.
 
   ![Pipeline tab](./media/tutorial-copy-data-portal/pipeline-tab-2.png)
 
-You can optionally map the schema of the source to corresponding schema of destination by following [Schema mapping in copy activity
-](copy-activity-schema-and-type-mapping.md)
+You can optionally map the schema of the source to the corresponding schema of the destination by following [Schema mapping in copy activity](copy-activity-schema-and-type-mapping.md).
 
 ## Validate the pipeline
 To validate the pipeline, select **Validate** from the tool bar.
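The sink steps above select a **[dbo].[emp]** table, which the earlier "Create a blob and a SQL table" section (not shown in this diff) has you create. If you're scripting that prerequisite, here's a sketch using pyodbc with SQL authentication; the schema is an assumption matching the two-column emp.txt plus an identity key:

```python
import pyodbc

# Placeholders: your logical server, database, and SQL authentication credentials.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<your-server>.database.windows.net;DATABASE=<your-database>;"
    "UID=<user>;PWD=<password>"
)
cursor = conn.cursor()
# Assumed schema: two varchar columns matching emp.txt, plus an identity column.
cursor.execute(
    "CREATE TABLE dbo.emp ("
    " ID int IDENTITY(1,1) NOT NULL,"
    " FirstName varchar(50),"
    " LastName varchar(50))"
)
conn.commit()
```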
@@ -181,20 +183,20 @@ You can debug a pipeline before you publish artifacts (linked services, datasets
 
 1. To debug the pipeline, select **Debug** on the toolbar. You see the status of the pipeline run in the **Output** tab at the bottom of the window.
 
-1. Once the pipeline can run successfully, in the top toolbar, select **Publish All**. This action publishes entities (datasets, and pipelines) you created to Data Factory.
+1. Once the pipeline runs successfully, in the top toolbar, select **Publish all**. This action publishes entities (datasets and pipelines) you created to Data Factory.
 
 1. Wait until you see the **Successfully published** message. To see notification messages, select **Show Notifications** on the top-right (bell button).
 
 ## Trigger the pipeline manually
 In this step, you manually trigger the pipeline you published in the previous step.
 
-1. Select **Add Trigger** on the toolbar, and then select **Trigger Now**. On the **Pipeline Run** page, select **Finish**.
+1. Select **Trigger** on the toolbar, and then select **Trigger Now**. On the **Pipeline Run** page, select **OK**.
 
-1. Go to the **Monitor** tab on the left. You see a pipeline run that is triggered by a manual trigger. You can use links in the **Actions** column to view activity details and to rerun the pipeline.
+1. Go to the **Monitor** tab on the left. You see a pipeline run that is triggered by a manual trigger. You can use links under the **PIPELINE NAME** column to view activity details and to rerun the pipeline.
 
   ![Monitor pipeline runs](./media/tutorial-copy-data-portal/monitor-pipeline.png)
 
-1. To see activity runs associated with the pipeline run, select the **View Activity Runs** link in the **Actions** column. In this example, there's only one activity, so you see only one entry in the list. For details about the copy operation, select the **Details** link (eyeglasses icon) in the **Actions** column. Select **Pipeline Runs** at the top to go back to the Pipeline Runs view. To refresh the view, select **Refresh**.
+1. To see activity runs associated with the pipeline run, select the **CopyPipeline** link under the **PIPELINE NAME** column. In this example, there's only one activity, so you see only one entry in the list. For details about the copy operation, select the **Details** link (eyeglasses icon) under the **ACTIVITY NAME** column. Select **All pipeline runs** at the top to go back to the Pipeline Runs view. To refresh the view, select **Refresh**.
 
   ![Monitor activity runs](./media/tutorial-copy-data-portal/view-activity-runs.png)
 
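The manual trigger and the Monitor tab have programmatic equivalents in azure-mgmt-datafactory, sketched below (not part of the commit; the pipeline name CopyPipeline comes from the monitoring step above, and the other identifiers are placeholders):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Equivalent of Trigger > Trigger Now: start one pipeline run.
run = client.pipelines.create_run(
    "<resource-group>", "ADFTutorialDataFactory", "CopyPipeline", parameters={}
)

# Equivalent of the Monitor tab: poll the run's status by its run ID.
status = client.pipeline_runs.get(
    "<resource-group>", "ADFTutorialDataFactory", run.run_id
)
print(status.status)  # e.g., InProgress, Succeeded, Failed
```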
@@ -205,9 +207,9 @@ In this schedule, you create a schedule trigger for the pipeline. The trigger ru
 
 1. Go to the **Author** tab on the left, above the **Monitor** tab.
 
-1. Go to your pipeline, click **Add Trigger** on the tool bar, and select **New/Edit**.
+1. Go to your pipeline, click **Trigger** on the tool bar, and select **New/Edit**.
 
-1. In the **Add Triggers** dialog box, select **+ New** for **Choose trigger** area.
+1. In the **Add triggers** dialog box, select **+ New** in the **Choose trigger** area.
 
 1. In the **New Trigger** window, take the following steps:
@@ -221,25 +223,24 @@ In this schedule, you create a schedule trigger for the pipeline. The trigger ru
 
    e. Update the **End Time** part to be a few minutes past the current datetime. The trigger is activated only after you publish the changes. If you set it to only a couple of minutes apart, and you don't publish it by then, you don't see a trigger run.
 
-   f. Select **Apply**.
+   f. Select **OK**.
 
    g. For the **Activated** option, select **Yes**.
 
-   h. Select **Next**.
-
-   ![Activated button](./media/tutorial-copy-data-portal/trigger-activiated-next.png)
+   h. Select **OK**.
 
    > [!IMPORTANT]
    > A cost is associated with each pipeline run, so set the end date appropriately.
-1. On the **Trigger Run Parameters** page, review the warning, and then select **Finish**. The pipeline in this example doesn't take any parameters.
 
-1. Click **Publish All** to publish the change.
+1. On the **Edit trigger** page, review the warning, and then select **Save**. The pipeline in this example doesn't take any parameters.
+
+1. Click **Publish all** to publish the change.
 
 1. Go to the **Monitor** tab on the left to see the triggered pipeline runs.
 
   ![Triggered pipeline runs](./media/tutorial-copy-data-portal/triggered-pipeline-runs.png)
 
-1. To switch from the **Pipeline Runs** view to the **Trigger Runs** view, select **Trigger Runs** on the top of the window.
+1. To switch from the **Pipeline Runs** view to the **Trigger Runs** view, select **Trigger Runs** on the left side of the window.
 
 1. You see the trigger runs in a list.
 
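The schedule trigger configured in this hunk can also be defined from code. A rough sketch with azure-mgmt-datafactory (not part of the commit; names and times are placeholders, and the begin_start call assumes a recent track-2 version of the SDK):

```python
from datetime import datetime, timedelta

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference,
    ScheduleTrigger,
    ScheduleTriggerRecurrence,
    TriggerPipelineReference,
    TriggerResource,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Run every minute until a few minutes from now; a bounded end time matters
# because, as the IMPORTANT note above says, each pipeline run has a cost.
recurrence = ScheduleTriggerRecurrence(
    frequency="Minute",
    interval=1,
    start_time=datetime.utcnow(),
    end_time=datetime.utcnow() + timedelta(minutes=10),
    time_zone="UTC",
)
trigger = ScheduleTrigger(
    recurrence=recurrence,
    pipelines=[
        TriggerPipelineReference(
            pipeline_reference=PipelineReference(reference_name="CopyPipeline")
        )
    ],
)
client.triggers.create_or_update(
    "<resource-group>", "ADFTutorialDataFactory", "TutorialScheduleTrigger",
    TriggerResource(properties=trigger),
)

# Corresponds to setting Activated to Yes and selecting Publish all in the UI.
client.triggers.begin_start(
    "<resource-group>", "ADFTutorialDataFactory", "TutorialScheduleTrigger"
).result()
```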
