articles/data-factory/tutorial-copy-data-portal.md (35 additions, 34 deletions)

---
title: Use the Azure portal to create a data factory pipeline
description: This tutorial provides step-by-step instructions for using the Azure portal to create a data factory with a pipeline. The pipeline uses the copy activity to copy data from Azure Blob storage to an Azure SQL database.
services: data-factory
documentationcenter: ''
author: linda33wj
ms.service: data-factory
ms.workload: data-services
ms.topic: tutorial
ms.custom: seo-lt-2019
ms.date: 04/13/2020
ms.author: jingwang
---
# Copy data from Azure Blob storage to a SQL database by using Azure Data Factory
In this tutorial, you create a data factory by using the Azure Data Factory user interface (UI). The pipeline in this data factory copies data from Azure Blob storage to an Azure SQL database. The configuration pattern in this tutorial applies to copying from a file-based data store to a relational data store. For a list of data stores supported as sources and sinks, see the [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats) table.
> [!NOTE]
> - If you're new to Data Factory, see [Introduction to Azure Data Factory](introduction.md).
## Prerequisites
* **Azure subscription**. If you don't have an Azure subscription, create a [free Azure account](https://azure.microsoft.com/free/) before you begin.
* **Azure storage account**. You use Blob storage as a *source* data store. If you don't have a storage account, see [Create an Azure storage account](../storage/common/storage-account-create.md) for steps to create one.
* **Azure SQL Database**. You use the database as a *sink* data store. If you don't have an Azure SQL database, see [Create a SQL database](../sql-database/sql-database-get-started-portal.md) for steps to create one.
### Create a blob and a SQL table
Now, prepare your Blob storage and SQL database for the tutorial by performing the following steps:
1. Launch Notepad. Copy the following text, and save it as an **emp.txt** file on your disk:
    ```
    FirstName,LastName
    John,Doe
    Jane,Doe
    ```
1. Create a container named **adftutorial** in your Blob storage. Create a folder named **input** in this container. Then, upload the **emp.txt** file to the **input** folder. Use the Azure portal or tools such as [Azure Storage Explorer](https://storageexplorer.com/) to do these tasks.
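
If you'd rather script this step than use the portal or Storage Explorer, the following is a minimal sketch using the `azure-storage-blob` Python package. It assumes the storage account's connection string is available in an environment variable; the variable name here is only illustrative.

```python
# pip install azure-storage-blob
import os

from azure.storage.blob import BlobServiceClient

# Assumption: the storage account connection string is exported as an environment variable.
conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]

service = BlobServiceClient.from_connection_string(conn_str)

# Create the adftutorial container; this raises ResourceExistsError if it already exists.
container = service.create_container("adftutorial")

# Upload emp.txt into the input folder (a virtual directory in Blob storage).
with open("emp.txt", "rb") as data:
    container.upload_blob(name="input/emp.txt", data=data)
```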
In this step, you create a data factory and start the Data Factory UI to create a pipeline in the data factory.
1. Open **Microsoft Edge** or **Google Chrome**. Currently, the Data Factory UI is supported only in the Microsoft Edge and Google Chrome web browsers.
2. On the left menu, select **Create a resource** > **Analytics** > **Data Factory**.
3. On the **New data factory** page, under **Name**, enter **ADFTutorialDataFactory**.
The name of the Azure data factory must be *globally unique*. If you receive an error message about the name value, enter a different name for the data factory (for example, yournameADFTutorialDataFactory). For naming rules for Data Factory artifacts, see [Data Factory naming rules](naming-rules.md).
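
The portal is the focus of this tutorial, but if you want the equivalent step in code, here is a rough sketch using the `azure-mgmt-datafactory` Python SDK. It assumes a recent SDK release that accepts `azure-identity` credentials; the subscription, resource group, and region values are placeholders.

```python
# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

subscription_id = "<your-subscription-id>"        # placeholder
resource_group = "<your-resource-group>"          # placeholder; must already exist
factory_name = "yournameADFTutorialDataFactory"   # must be globally unique

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Create (or update) the data factory in a region where Data Factory is available.
factory = adf_client.factories.create_or_update(
    resource_group, factory_name, Factory(location="eastus")
)
print(factory.provisioning_state)
```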
### Configure source
> [!TIP]
> In this tutorial, you use *Account key* as the authentication type for your source data store, but you can choose other supported authentication methods: *SAS URI*, *Service Principal*, and *Managed Identity* if needed. Refer to the corresponding sections in [this article](https://docs.microsoft.com/azure/data-factory/connector-azure-blob-storage#linked-service-properties) for details.
> To store secrets for data stores securely, it's also recommended to use Azure Key Vault. Refer to [this article](https://docs.microsoft.com/azure/data-factory/store-credentials-in-key-vault) for detailed illustrations.
1. Go to the **Source** tab. Select **+ New** to create a source dataset.
1. In the **New Dataset** dialog box, select **Azure Blob Storage**, and then select **Continue**. The source data is in Blob storage, so you select **Azure Blob Storage** for the source dataset.
1. In the **Select Format** dialog box, choose the format type of your data, and then select **Continue**.
1. In the **Set Properties** dialog box, enter **SourceBlobDataset** for Name. Select the checkbox for **First row as header**. Under the **Linked service** text box, select **+ New**.
1. In the **New Linked Service (Azure Blob Storage)** dialog box, enter **AzureStorageLinkedService** as the name, and select your storage account from the **Storage account name** list. Test the connection, and then select **Create** to deploy the linked service.
1. After the linked service is created, you're navigated back to the **Set properties** page. Next to **File path**, select **Browse**.
1. Navigate to the **adftutorial/input** folder, select the **emp.txt** file, and then select **OK**.
1. Select **OK**. It automatically navigates to the pipeline page. On the **Source** tab, confirm that **SourceBlobDataset** is selected. To preview data on this page, select **Preview data**.
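
For reference, the UI steps above roughly correspond to creating a linked service and a delimited-text dataset through the SDK. The sketch below reuses `adf_client`, `resource_group`, and `factory_name` from the earlier snippet and assumes `conn_str` holds the storage account connection string; model names can vary slightly across `azure-mgmt-datafactory` versions.

```python
from azure.mgmt.datafactory.models import (
    AzureBlobStorageLinkedService, AzureBlobStorageLocation, DatasetResource,
    DelimitedTextDataset, LinkedServiceReference, LinkedServiceResource,
)

# Linked service that uses the account key (via the connection string), as in the tutorial.
blob_ls = LinkedServiceResource(
    properties=AzureBlobStorageLinkedService(connection_string=conn_str)
)
adf_client.linked_services.create_or_update(
    resource_group, factory_name, "AzureStorageLinkedService", blob_ls
)

# Delimited-text dataset pointing at adftutorial/input/emp.txt, with the first row as header.
source_ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="AzureStorageLinkedService"
        ),
        location=AzureBlobStorageLocation(
            container="adftutorial", folder_path="input", file_name="emp.txt"
        ),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
adf_client.datasets.create_or_update(
    resource_group, factory_name, "SourceBlobDataset", source_ds
)
```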
### Configure sink

> [!TIP]
> In this tutorial, you use *SQL authentication* as the authentication type for your sink data store, but you can choose other supported authentication methods: *Service Principal* and *Managed Identity* if needed. Refer to the corresponding sections in [this article](https://docs.microsoft.com/azure/data-factory/connector-azure-sql-database#linked-service-properties) for details.
> To store secrets for data stores securely, it's also recommended to use Azure Key Vault. Refer to [this article](https://docs.microsoft.com/azure/data-factory/store-credentials-in-key-vault) for detailed illustrations.
1. Go to the **Sink** tab, and select **+ New** to create a sink dataset.
1. In the **New Dataset** dialog box, enter "SQL" in the search box to filter the connectors, select **Azure SQL Database**, and then select **Continue**. In this tutorial, you copy data to a SQL database.
1. In the **Set Properties** dialog box, enter **OutputSqlDataset** for Name. From the **Linked service** dropdown list, select **+ New**. A dataset must be associated with a linked service. The linked service has the connection string that Data Factory uses to connect to the SQL database at runtime. The dataset specifies the table to which the data is copied.
1. In the **New Linked Service (Azure SQL Database)** dialog box, take the following steps:
f. Select **Test connection** to test the connection.
g. Select **Create** to deploy the linked service.

1. It automatically navigates to the **Set Properties** dialog box. In **Table**, select **[dbo].[emp]**. Then select **OK**.
1. Go to the tab with the pipeline, and in **Sink Dataset**, confirm that **OutputSqlDataset** is selected.
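
Similarly, a hedged sketch of the sink side with the same SDK might look like the following; it assumes `sql_conn_str` holds an ADO.NET-style connection string for your Azure SQL database and reuses the names from the previous snippets.

```python
from azure.mgmt.datafactory.models import (
    AzureSqlDatabaseLinkedService, AzureSqlTableDataset, DatasetResource,
    LinkedServiceReference, LinkedServiceResource,
)

# Linked service that uses SQL authentication via the connection string.
sql_ls = LinkedServiceResource(
    properties=AzureSqlDatabaseLinkedService(connection_string=sql_conn_str)
)
adf_client.linked_services.create_or_update(
    resource_group, factory_name, "AzureSqlDatabaseLinkedService", sql_ls
)

# Dataset that points at the dbo.emp table used in this tutorial.
sink_ds = DatasetResource(
    properties=AzureSqlTableDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="AzureSqlDatabaseLinkedService"
        ),
        table_name="[dbo].[emp]",
    )
)
adf_client.datasets.create_or_update(
    resource_group, factory_name, "OutputSqlDataset", sink_ds
)
```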
You can optionally map the schema of the source to the corresponding schema of the destination by following [Schema mapping in copy activity](copy-activity-schema-and-type-mapping.md).
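
To show how the pieces fit together, here is a sketch of the pipeline itself: one copy activity that reads from the source dataset and writes to the sink dataset. It again reuses the client and names from the earlier snippets; it's an SDK-based equivalent, not the tutorial's own portal flow.

```python
from azure.mgmt.datafactory.models import (
    AzureSqlSink, CopyActivity, DatasetReference, DelimitedTextSource, PipelineResource,
)

copy_activity = CopyActivity(
    name="CopyFromBlobToSql",
    inputs=[DatasetReference(type="DatasetReference", reference_name="SourceBlobDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="OutputSqlDataset")],
    source=DelimitedTextSource(),
    sink=AzureSqlSink(),
    # For explicit schema mapping, a TabularTranslator can be passed via the translator argument.
)

pipeline = PipelineResource(activities=[copy_activity])
adf_client.pipelines.create_or_update(resource_group, factory_name, "CopyPipeline", pipeline)
```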
## Validate the pipeline
To validate the pipeline, select **Validate** from the toolbar.
1. To debug the pipeline, select **Debug** on the toolbar. You see the status of the pipeline run in the **Output** tab at the bottom of the window.
1. Once the pipeline can run successfully, in the top toolbar, select **Publish all**. This action publishes the entities (datasets and pipelines) you created to Data Factory.
1. Wait until you see the **Successfully published** message. To see notification messages, select **Show Notifications** (the bell button) on the top right.
## Trigger the pipeline manually
In this step, you manually trigger the pipeline you published in the previous step.
1. Select **Trigger** on the toolbar, and then select **Trigger Now**. On the **Pipeline Run** page, select **OK**.
1. Go to the **Monitor** tab on the left. You see a pipeline run that is triggered by a manual trigger. You can use links under the **PIPELINE NAME** column to view activity details and to rerun the pipeline.
1. To see activity runs associated with the pipeline run, select the **CopyPipeline** link under the **PIPELINE NAME** column. In this example, there's only one activity, so you see only one entry in the list. For details about the copy operation, select the **Details** link (eyeglasses icon) under the **ACTIVITY NAME** column. Select **All pipeline runs** at the top to go back to the Pipeline Runs view. To refresh the view, select **Refresh**.
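
If you want to do the same thing programmatically, a rough sketch with the same SDK starts a run and then polls it by run ID, reusing `adf_client`, `resource_group`, and `factory_name` from the earlier snippets.

```python
from datetime import datetime, timedelta

from azure.mgmt.datafactory.models import RunFilterParameters

# Start a one-off run of the pipeline and keep the run ID for monitoring.
run_response = adf_client.pipelines.create_run(
    resource_group, factory_name, "CopyPipeline", parameters={}
)

# Check the overall pipeline run status.
pipeline_run = adf_client.pipeline_runs.get(resource_group, factory_name, run_response.run_id)
print(pipeline_run.status)

# List the activity runs (here, just the single copy activity) within a time window.
filters = RunFilterParameters(
    last_updated_after=datetime.utcnow() - timedelta(days=1),
    last_updated_before=datetime.utcnow() + timedelta(days=1),
)
activity_runs = adf_client.activity_runs.query_by_pipeline_run(
    resource_group, factory_name, run_response.run_id, filters
)
for activity_run in activity_runs.value:
    print(activity_run.activity_name, activity_run.status)
```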
## Trigger the pipeline on a schedule

In this schedule, you create a schedule trigger for the pipeline. The trigger runs the pipeline on the specified schedule.
1. Go to the **Author** tab on the left, above the **Monitor** tab.
1. Go to your pipeline, select **Trigger** on the toolbar, and then select **New/Edit**.
1. In the **Add triggers** dialog box, select **+ New** in the **Choose trigger** area.
1. In the **New Trigger** window, take the following steps:
e. Update the **End Time** to a few minutes past the current datetime. The trigger is activated only after you publish the changes. If you set the end time only a couple of minutes out and don't publish the changes by then, you won't see a trigger run.
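
For completeness, a schedule trigger similar to the one configured here can also be defined through the SDK. This is only a sketch under the same assumptions as the earlier snippets, and the exact model and operation names (for example, `begin_start` versus `start`) depend on your `azure-mgmt-datafactory` version.

```python
from datetime import datetime, timedelta

from azure.mgmt.datafactory.models import (
    PipelineReference, ScheduleTrigger, ScheduleTriggerRecurrence,
    TriggerPipelineReference, TriggerResource,
)

# Run the pipeline every minute until a few minutes from now.
recurrence = ScheduleTriggerRecurrence(
    frequency="Minute",
    interval=1,
    start_time=datetime.utcnow(),
    end_time=datetime.utcnow() + timedelta(minutes=15),
    time_zone="UTC",
)

trigger = TriggerResource(
    properties=ScheduleTrigger(
        recurrence=recurrence,
        pipelines=[
            TriggerPipelineReference(
                pipeline_reference=PipelineReference(
                    type="PipelineReference", reference_name="CopyPipeline"
                ),
                parameters={},
            )
        ],
    )
)
adf_client.triggers.create_or_update(resource_group, factory_name, "RunEveryMinute", trigger)

# The trigger fires only after it's started; recent SDKs expose this as a long-running operation.
adf_client.triggers.begin_start(resource_group, factory_name, "RunEveryMinute").result()
```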