
Commit beb8840

Merge pull request #298886 from whhender/patch-832205
Editorial updates to tutorial-copy-data-portal.md
2 parents 6a7d6e2 + 351e2ec commit beb8840


articles/data-factory/tutorial-copy-data-portal.md

Lines changed: 62 additions & 35 deletions
@@ -1,14 +1,16 @@
 ---
-title: Use the Azure portal to create a data factory pipeline
-description: This tutorial provides step-by-step instructions for using the Azure portal to create a data factory with a pipeline. The pipeline uses the copy activity to copy data from Azure Blob storage to Azure SQL Database.
+title: 'Use the Azure portal to create a data factory pipeline'
+description: This tutorial provides instructions to create a data factory with a pipeline that uses a copy activity to copy data from Azure Blob storage to Azure SQL Database.
 author: jianleishen
 ms.topic: tutorial
-ms.date: 10/03/2024
+ms.date: 04/25/2025
 ms.subservice: data-movement
 ms.author: jianleishen
+
+#customer intent: As a new Azure Data Factory user, I want to create a data factory and quickly create my first pipeline to move data between resources, so I can apply it to my own needs.
 ---

-# Copy data from Azure Blob storage to a database in Azure SQL Database by using Azure Data Factory
+# Tutorial: Copy data from Azure Blob storage to a database in Azure SQL Database by using Azure Data Factory

 [!INCLUDE[appliesto-adf-asa-md](includes/appliesto-adf-asa-md.md)]

@@ -20,14 +22,16 @@ In this tutorial, you create a data factory by using the Azure Data Factory user
 In this tutorial, you perform the following steps:

 > [!div class="checklist"]
-> * Create a data factory.
-> * Create a pipeline with a copy activity.
+> * [Create a data factory.](#create-a-data-factory)
+> * [Create a pipeline with a copy activity.](#create-a-pipeline)
 > * Test run the pipeline.
-> * Trigger the pipeline manually.
-> * Trigger the pipeline on a schedule.
+> * [Trigger the pipeline manually.](#trigger-the-pipeline-manually)
+> * [Trigger the pipeline on a schedule.](#trigger-the-pipeline-on-a-schedule)
 > * Monitor the pipeline and activity runs.
+> * [Disable or delete your scheduled trigger.](#disable-trigger)

 ## Prerequisites
+
 * **Azure subscription**. If you don't have an Azure subscription, create a [free Azure account](https://azure.microsoft.com/free/) before you begin.
 * **Azure storage account**. You use Blob storage as a *source* data store. If you don't have a storage account, see [Create an Azure storage account](../storage/common/storage-account-create.md) for steps to create one.
 * **Azure SQL Database**. You use the database as a *sink* data store. If you don't have a database in Azure SQL Database, see [Create a database in Azure SQL Database](/azure/azure-sql/database/single-database-create-quickstart) for steps to create one.
@@ -38,15 +42,16 @@ Now, prepare your Blob storage and SQL database for the tutorial by performing t

 #### Create a source blob

-1. Launch Notepad. Copy the following text, and save it as an **emp.txt** file on your disk:
+1. Launch Notepad. Copy the following text, and save it as an **emp.txt** file:

    ```
    FirstName,LastName
    John,Doe
    Jane,Doe
    ```

-1. Create a container named **adftutorial** in your Blob storage. Create a folder named **input** in this container. Then, upload the **emp.txt** file to the **input** folder. Use the Azure portal or tools such as [Azure Storage Explorer](https://storageexplorer.com/) to do these tasks.
+1. Move that file into a folder called **input**.
+1. Create a container named **adftutorial** in your Blob storage. Upload your **input** folder with the **emp.txt** file to this container. You can use the Azure portal or tools such as [Azure Storage Explorer](https://storageexplorer.com/) to do these tasks.

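If you prefer to script this upload rather than use the portal or Storage Explorer, a minimal sketch with the azure-storage-blob Python package might look like the following. The connection-string environment variable is an assumption, and the `input/` prefix stands in for the folder, since Blob storage has no true folders:

```python
# Sketch: create the adftutorial container and upload emp.txt under an "input" prefix.
# Assumes the AZURE_STORAGE_CONNECTION_STRING environment variable is set.
import os

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"]
)

try:
    service.create_container("adftutorial")
except ResourceExistsError:
    pass  # The container already exists; that's fine for this tutorial.

container = service.get_container_client("adftutorial")
data = "FirstName,LastName\nJohn,Doe\nJane,Doe\n"
# Blob storage has no real folders; the "input/" prefix acts as the input folder.
container.upload_blob(name="input/emp.txt", data=data, overwrite=True)
```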

 #### Create a sink SQL table

@@ -64,13 +69,14 @@
    CREATE CLUSTERED INDEX IX_emp_ID ON dbo.emp (ID);
    ```

-1. Allow Azure services to access SQL Server. Ensure that **Allow access to Azure services** is turned **ON** for your SQL Server so that Data Factory can write data to your SQL Server. To verify and turn on this setting, go to logical SQL server > Overview > Set server firewall > set the **Allow access to Azure services** option to **ON**.
+1. Allow Azure services to access SQL Server. Ensure that **Allow access to Azure services** is turned **ON** for your SQL Server so that Data Factory can write data to your SQL Server. To verify and turn on this setting, go to your SQL Server in the Azure portal, select **Security** > **Networking**, enable **Selected networks**, and then check **Allow Azure services and resources to access this server** under **Exceptions**.

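The full CREATE TABLE script for **dbo.emp** falls outside this hunk. As a hedged sketch, you could run it with pyodbc as below; the server, database, credentials, and the two-column schema (inferred from emp.txt plus an identity key) are all assumptions:

```python
# Sketch: create the dbo.emp sink table with pyodbc.
# Server, database, and credentials are placeholders; the column list is assumed
# from the emp.txt sample (FirstName, LastName) plus an identity key.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<yourserver>.database.windows.net;"
    "DATABASE=<yourdatabase>;UID=<youruser>;PWD=<yourpassword>"
)
cursor = conn.cursor()
cursor.execute(
    """
    CREATE TABLE dbo.emp
    (
        ID int IDENTITY(1,1) NOT NULL,
        FirstName varchar(50),
        LastName varchar(50)
    );
    """
)
cursor.execute("CREATE CLUSTERED INDEX IX_emp_ID ON dbo.emp (ID);")
conn.commit()
conn.close()
```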
 ## Create a data factory
+
 In this step, you create a data factory and start the Data Factory UI to create a pipeline in the data factory.

 1. Open **Microsoft Edge** or **Google Chrome**. Currently, Data Factory UI is supported only in Microsoft Edge and Google Chrome web browsers.
-2. On the left menu, select **Create a resource** > **Integration** > **Data Factory**.
+2. On the left menu, select **Create a resource** > **Analytics** > **Data Factory**.
 3. On the **Create Data Factory** page, under the **Basics** tab, select the Azure **Subscription** in which you want to create the data factory.
 4. For **Resource Group**, take one of the following steps:

@@ -79,28 +85,20 @@ In this step, you create a data factory and start the Data Factory UI to create
    b. Select **Create new**, and enter the name of a new resource group.

    To learn about resource groups, see [Use resource groups to manage your Azure resources](../azure-resource-manager/management/overview.md).
-5. Under **Region**, select a location for the data factory. Only locations that are supported are displayed in the drop-down list. The data stores (for example, Azure Storage and SQL Database) and computes (for example, Azure HDInsight) used by the data factory can be in other regions.
-6. Under **Name**, enter **ADFTutorialDataFactory**.
-
-   The name of the Azure data factory must be *globally unique*. If you receive an error message about the name value, enter a different name for the data factory (for example, yournameADFTutorialDataFactory). For naming rules for Data Factory artifacts, see [Data Factory naming rules](naming-rules.md).
+5. Under **Region**, select a location for the data factory. Your data stores can be in a different region than your data factory, if they need to be.
+6. Under **Name**, the name of the Azure data factory must be *globally unique*. If you receive an error message about the name value, enter a different name for the data factory (for example, yournameADFDemo). For naming rules for Data Factory artifacts, see [Data Factory naming rules](naming-rules.md).

    :::image type="content" source="./media/doc-common-process/name-not-available-error.png" alt-text="New data factory error message for duplicate name.":::

 7. Under **Version**, select **V2**.
 8. Select the **Git configuration** tab on the top, and select the **Configure Git later** check box.
 9. Select **Review + create**, and select **Create** after the validation is passed.
 10. After the creation is finished, you see the notice in the Notifications center. Select **Go to resource** to navigate to the Data Factory page.
-11. Select **Open** on the **Open Azure Data Factory Studio** tile to launch the Azure Data Factory UI in a separate tab.
-
+11. Select **Launch Studio** on the **Azure Data Factory Studio** tile.

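The portal steps above have a programmatic equivalent. A rough sketch with the azure-mgmt-datafactory package (subscription ID, resource group, region, and factory name are all placeholders) might be:

```python
# Sketch: create a data factory with the azure-mgmt-datafactory management SDK.
# Subscription ID, resource group, region, and factory name are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

factory = client.factories.create_or_update(
    "<resource-group>",
    "yournameADFDemo",           # must be globally unique
    Factory(location="eastus"),  # pick any supported region
)
print(factory.provisioning_state)
```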
 ## Create a pipeline
-In this step, you create a pipeline with a copy activity in the data factory. The copy activity copies data from Blob storage to SQL Database. In the [Quickstart tutorial](quickstart-create-data-factory-portal.md), you created a pipeline by following these steps:
-
-1. Create the linked service.
-1. Create input and output datasets.
-1. Create a pipeline.

-In this tutorial, you start with creating the pipeline. Then you create linked services and datasets when you need them to configure the pipeline.
+In this step, you create a pipeline with a copy activity in the data factory. The copy activity copies data from Blob storage to SQL Database.

 1. On the home page, select **Orchestrate**.

@@ -115,14 +113,14 @@
 ### Configure source

 >[!TIP]
->In this tutorial, you use *Account key* as the authentication type for your source data store, but you can choose other supported authentication methods: *SAS URI*,*Service Principal* and *Managed Identity* if needed. Refer to corresponding sections in [this article](./connector-azure-blob-storage.md#linked-service-properties) for details.
+>In this tutorial, you use *Account key* as the authentication type for your source data store, but you can choose other supported authentication methods: *SAS URI*, *Service Principal*, and *Managed Identity* if needed. Refer to corresponding sections in [this article](./connector-azure-blob-storage.md#linked-service-properties) for details.
 >To store secrets for data stores securely, it's also recommended to use an Azure Key Vault. Refer to [this article](./store-credentials-in-key-vault.md) for detailed illustrations.

 1. Go to the **Source** tab. Select **+ New** to create a source dataset.

 1. In the **New Dataset** dialog box, select **Azure Blob Storage**, and then select **Continue**. The source data is in Blob storage, so you select **Azure Blob Storage** for the source dataset.

-1. In the **Select Format** dialog box, choose the format type of your data, and then select **Continue**.
+1. In the **Select Format** dialog box, choose **Delimited Text**, and then select **Continue**.

 1. In the **Set Properties** dialog box, enter **SourceBlobDataset** for Name. Select the checkbox for **First row as header**. Under the **Linked service** text box, select **+ New**.

@@ -137,15 +135,16 @@
    :::image type="content" source="./media/tutorial-copy-data-portal/source-dataset-selected.png" alt-text="Source dataset":::

 ### Configure sink
+
 >[!TIP]
 >In this tutorial, you use *SQL authentication* as the authentication type for your sink data store, but you can choose other supported authentication methods: *Service Principal* and *Managed Identity* if needed. Refer to corresponding sections in [this article](./connector-azure-sql-database.md#linked-service-properties) for details.
 >To store secrets for data stores securely, it's also recommended to use an Azure Key Vault. Refer to [this article](./store-credentials-in-key-vault.md) for detailed illustrations.

 1. Go to the **Sink** tab, and select **+ New** to create a sink dataset.

-1. In the **New Dataset** dialog box, input "SQL" in the search box to filter the connectors, select **Azure SQL Database**, and then select **Continue**. In this tutorial, you copy data to a SQL database.
+1. In the **New Dataset** dialog box, input "SQL" in the search box to filter the connectors, select **Azure SQL Database**, and then select **Continue**.

-1. In the **Set Properties** dialog box, enter **OutputSqlDataset** for Name. From the **Linked service** dropdown list, select **+ New**. A dataset must be associated with a linked service. The linked service has the connection string that Data Factory uses to connect to SQL Database at runtime. The dataset specifies the container, folder, and the file (optional) to which the data is copied.
+1. In the **Set Properties** dialog box, enter **OutputSqlDataset** for Name. From the **Linked service** dropdown list, select **+ New**. A dataset must be associated with a linked service. The linked service has the connection string that Data Factory uses to connect to SQL Database at runtime, and specifies where the data will be copied to.

 1. In the **New Linked Service (Azure SQL Database)** dialog box, take the following steps:

@@ -165,7 +164,7 @@ In this tutorial, you start with creating the pipeline. Then you create linked s

    :::image type="content" source="./media/tutorial-copy-data-portal/new-azure-sql-linked-service-window.png" alt-text="Save new linked service":::

-1. It automatically navigates to the **Set Properties** dialog box. In **Table**, select **[dbo].[emp]**. Then select **OK**.
+1. It automatically navigates to the **Set Properties** dialog box. In **Table**, select **Enter manually**, and enter **[dbo].[emp]**. Then select **OK**.

 1. Go to the tab with the pipeline, and in **Sink Dataset**, confirm that **OutputSqlDataset** is selected.

@@ -174,42 +173,49 @@ In this tutorial, you start with creating the pipeline. Then you create linked s
 You can optionally map the schema of the source to corresponding schema of destination by following [Schema mapping in copy activity](copy-activity-schema-and-type-mapping.md).

 ## Validate the pipeline
+
 To validate the pipeline, select **Validate** from the tool bar.

 You can see the JSON code associated with the pipeline by clicking **Code** on the upper right.

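For a sense of what that definition contains, the same copy pipeline can be sketched with the management SDK. The model names used here (DelimitedTextSource, AzureSqlSink) are assumptions that can vary across SDK versions, and `client` is the management client from the earlier sketch:

```python
# Sketch: define CopyPipeline programmatically. The datasets and linked services
# are assumed to exist already under the names used in this tutorial.
from azure.mgmt.datafactory.models import (
    AzureSqlSink,
    CopyActivity,
    DatasetReference,
    DelimitedTextSource,
    PipelineResource,
)

copy_activity = CopyActivity(
    name="CopyFromBlobToSql",
    inputs=[DatasetReference(reference_name="SourceBlobDataset")],
    outputs=[DatasetReference(reference_name="OutputSqlDataset")],
    source=DelimitedTextSource(),  # reads the delimited text source dataset
    sink=AzureSqlSink(),           # writes rows into the SQL sink dataset
)

client.pipelines.create_or_update(
    "<resource-group>", "yournameADFDemo", "CopyPipeline",
    PipelineResource(activities=[copy_activity]),
)
```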
 ## Debug and publish the pipeline
+
 You can debug a pipeline before you publish artifacts (linked services, datasets, and pipeline) to Data Factory or your own Azure Repos Git repository.

 1. To debug the pipeline, select **Debug** on the toolbar. You see the status of the pipeline run in the **Output** tab at the bottom of the window.

 1. Once the pipeline can run successfully, in the top toolbar, select **Publish all**. This action publishes entities (datasets, and pipelines) you created to Data Factory.

-1. Wait until you see the **Successfully published** message. To see notification messages, click the **Show Notifications** on the top-right (bell button).
+1. Wait until you see the **Successfully published** notification message. To see notification messages, select **Show Notifications** on the top-right (bell button).

 ## Trigger the pipeline manually
+
 In this step, you manually trigger the pipeline you published in the previous step.

-1. Select **Trigger** on the toolbar, and then select **Trigger Now**. On the **Pipeline Run** page, select **OK**.
+1. Select **Add trigger** on the toolbar, and then select **Trigger Now**.
+
+1. On the **Pipeline Run** page, select **OK**.

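The same on-demand run can be started from code; a brief sketch, reusing the `client`, resource group, and factory name from the earlier sketches:

```python
# Sketch: trigger the published pipeline once, on demand.
run_response = client.pipelines.create_run(
    "<resource-group>", "yournameADFDemo", "CopyPipeline"
)
print(run_response.run_id)  # keep the run ID to look the run up later
```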
 1. Go to the **Monitor** tab on the left. You see a pipeline run that is triggered by a manual trigger. You can use links under the **PIPELINE NAME** column to view activity details and to rerun the pipeline.

    :::image type="content" source="./media/tutorial-copy-data-portal/monitor-pipeline-inline-and-expended.png" alt-text="Monitor pipeline runs" lightbox="./media/tutorial-copy-data-portal/monitor-pipeline-inline-and-expended.png":::

-1. To see activity runs associated with the pipeline run, select the **CopyPipeline** link under the **PIPELINE NAME** column. In this example, there's only one activity, so you see only one entry in the list. For details about the copy operation, select the **Details** link (eyeglasses icon) under the **ACTIVITY NAME** column. Select **All pipeline runs** at the top to go back to the Pipeline Runs view. To refresh the view, select **Refresh**.
+1. To see activity runs associated with the pipeline run, select the **CopyPipeline** link under the **PIPELINE NAME** column. In this example, there's only one activity, so you see only one entry in the list. For details about the copy operation, hover over the activity and select the **Details** link (eyeglasses icon) under the **ACTIVITY NAME** column. Select **All pipeline runs** at the top to go back to the Pipeline Runs view. To refresh the view, select **Refresh**.

    :::image type="content" source="./media/tutorial-copy-data-portal/view-activity-runs-inline-and-expended.png#lightbox" alt-text="Monitor activity runs" lightbox="./media/tutorial-copy-data-portal/view-activity-runs-inline-and-expended.png":::

 1. Verify that two more rows are added to the **emp** table in the database.

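Pipeline and activity runs can also be inspected programmatically. A hedged sketch, with `run_response` carried over from the manual-run sketch above:

```python
# Sketch: check the pipeline run and its activity runs from code.
from datetime import datetime, timedelta

from azure.mgmt.datafactory.models import RunFilterParameters

run = client.pipeline_runs.get(
    "<resource-group>", "yournameADFDemo", run_response.run_id
)
print(run.status)  # e.g. InProgress or Succeeded

# Query the activity runs inside this pipeline run within a time window.
filters = RunFilterParameters(
    last_updated_after=datetime.utcnow() - timedelta(hours=1),
    last_updated_before=datetime.utcnow() + timedelta(hours=1),
)
activity_runs = client.activity_runs.query_by_pipeline_run(
    "<resource-group>", "yournameADFDemo", run_response.run_id, filters
)
for activity in activity_runs.value:
    print(activity.activity_name, activity.status)
```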
 ## Trigger the pipeline on a schedule
+
 In this section, you create a schedule trigger for the pipeline. The trigger runs the pipeline on the specified schedule, such as hourly or daily. Here you set the trigger to run every minute until the specified end datetime.

 1. Go to the **Author** tab on the left above the monitor tab.

-1. Go to your pipeline, click **Trigger** on the tool bar, and select **New/Edit**.
+1. Go to your pipeline, select **Trigger** on the tool bar, and select **New/Edit**.

-1. In the **Add triggers** dialog box, select **+ New** for **Choose trigger** area.
+1. In the **Add triggers** dialog box, select **Choose trigger** and select **+ New**.

 1. In the **New Trigger** window, take the following steps:

@@ -232,7 +238,7 @@ In this schedule, you create a schedule trigger for the pipeline. The trigger ru

 1. On the **Edit trigger** page, review the warning, and then select **Save**. The pipeline in this example doesn't take any parameters.

-1. Click **Publish all** to publish the change.
+1. Select **Publish all** to publish the change.

 1. Go to the **Monitor** tab on the left to see the triggered pipeline runs.

@@ -244,7 +250,22 @@ In this schedule, you create a schedule trigger for the pipeline. The trigger ru

 1. Verify that two rows per minute (for each pipeline run) are inserted into the **emp** table until the specified end time.

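A schedule trigger equivalent to RunEveryMinute can be sketched with the SDK as well; the model names and the begin_start method are assumptions that differ across azure-mgmt-datafactory versions (older releases expose start/stop instead):

```python
# Sketch: an every-minute schedule trigger like RunEveryMinute, via the SDK.
from datetime import datetime, timedelta

from azure.mgmt.datafactory.models import (
    PipelineReference,
    ScheduleTrigger,
    ScheduleTriggerRecurrence,
    TriggerPipelineReference,
    TriggerResource,
)

recurrence = ScheduleTriggerRecurrence(
    frequency="Minute",
    interval=1,
    start_time=datetime.utcnow(),
    end_time=datetime.utcnow() + timedelta(minutes=15),  # assumed end window
)
trigger = ScheduleTrigger(
    recurrence=recurrence,
    pipelines=[
        TriggerPipelineReference(
            pipeline_reference=PipelineReference(reference_name="CopyPipeline")
        )
    ],
)
client.triggers.create_or_update(
    "<resource-group>", "yournameADFDemo", "RunEveryMinute",
    TriggerResource(properties=trigger),
)
# Triggers are created in a stopped state; start it explicitly.
client.triggers.begin_start(
    "<resource-group>", "yournameADFDemo", "RunEveryMinute"
).result()
```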
+## Disable trigger
+
+To disable the every-minute trigger that you created, follow these steps:
+
+1. Select the **Manage** pane on the left side.
+
+1. Under **Author**, select **Triggers**.
+
+1. Hover over the **RunEveryMinute** trigger you created.
+1. Select the **Stop** button to disable the trigger from running.
+1. Select the **Delete** button to disable and delete the trigger.
+
+1. Select **Publish all** to save your changes.
+
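The stop and delete steps map to two SDK calls; again a sketch, with the same version caveat as above:

```python
# Sketch: stop the trigger, and optionally delete it. Method names assume a
# recent azure-mgmt-datafactory; older releases use client.triggers.stop.
client.triggers.begin_stop(
    "<resource-group>", "yournameADFDemo", "RunEveryMinute"
).result()
client.triggers.delete("<resource-group>", "yournameADFDemo", "RunEveryMinute")
```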
 ## Related content
+
 The pipeline in this sample copies data from one location to another location in Blob storage. You learned how to:

 > [!div class="checklist"]
@@ -254,9 +275,15 @@ The pipeline in this sample copies data from one location to another location in
 > * Trigger the pipeline manually.
 > * Trigger the pipeline on a schedule.
 > * Monitor the pipeline and activity runs.
+> * Disable or delete your scheduled trigger.


 Advance to the following tutorial to learn how to copy data from on-premises to the cloud:

 > [!div class="nextstepaction"]
 >[Copy data from on-premises to the cloud](tutorial-hybrid-copy-portal.md)
+
+For more information on copying data to or from Azure Blob Storage and Azure SQL Database, see these connector guides:
+
+- [Copy and transform data in Azure Blob Storage](connector-azure-blob-storage.md)
+- [Copy and transform data in Azure SQL Database](connector-azure-sql-database.md)
