Commit 1f38cfa

Merge pull request #111340 from mamccrea/databricks-synapse-rebrand
Databricks: update branding
2 parents a6b3852 + e715fe7

File tree

1 file changed (+20, -20 lines)


articles/azure-databricks/databricks-extract-load-sql-data-warehouse.md

Lines changed: 20 additions & 20 deletions
@@ -1,6 +1,6 @@
 ---
 title: 'Tutorial - Perform ETL operations using Azure Databricks'
-description: In this tutorial, learn how to extract data from Data Lake Storage Gen2 into Azure Databricks, transform the data, and then load the data into Azure SQL Data Warehouse.
+description: In this tutorial, learn how to extract data from Data Lake Storage Gen2 into Azure Databricks, transform the data, and then load the data into Azure Synapse Analytics.
 author: mamccrea
 ms.author: mamccrea
 ms.reviewer: jasonh
@@ -11,13 +11,13 @@ ms.date: 01/29/2020
 ---
 # Tutorial: Extract, transform, and load data by using Azure Databricks
 
-In this tutorial, you perform an ETL (extract, transform, and load data) operation by using Azure Databricks. You extract data from Azure Data Lake Storage Gen2 into Azure Databricks, run transformations on the data in Azure Databricks, and load the transformed data into Azure SQL Data Warehouse.
+In this tutorial, you perform an ETL (extract, transform, and load data) operation by using Azure Databricks. You extract data from Azure Data Lake Storage Gen2 into Azure Databricks, run transformations on the data in Azure Databricks, and load the transformed data into Azure Synapse Analytics.
 
-The steps in this tutorial use the SQL Data Warehouse connector for Azure Databricks to transfer data to Azure Databricks. This connector, in turn, uses Azure Blob Storage as temporary storage for the data being transferred between an Azure Databricks cluster and Azure SQL Data Warehouse.
+The steps in this tutorial use the Azure Synapse connector for Azure Databricks to transfer data to Azure Synapse. This connector, in turn, uses Azure Blob Storage as temporary storage for the data being transferred between an Azure Databricks cluster and Azure Synapse.
 
 The following illustration shows the application flow:
 
-![Azure Databricks with Data Lake Store and SQL Data Warehouse](./media/databricks-extract-load-sql-data-warehouse/databricks-extract-transform-load-sql-datawarehouse.png "Azure Databricks with Data Lake Store and SQL Data Warehouse")
+![Azure Databricks with Data Lake Store and Azure Synapse](./media/databricks-extract-load-sql-data-warehouse/databricks-extract-transform-load-sql-datawarehouse.png "Azure Databricks with Data Lake Store and Azure Synapse")
 
 This tutorial covers the following tasks:
 
@@ -29,9 +29,9 @@ This tutorial covers the following tasks:
 > * Create a service principal.
 > * Extract data from the Azure Data Lake Storage Gen2 account.
 > * Transform data in Azure Databricks.
-> * Load data into Azure SQL Data Warehouse.
+> * Load data into Azure Synapse.
 
-If you don’t have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
+If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
 
 > [!Note]
 > This tutorial cannot be carried out using **Azure Free Trial Subscription**.
@@ -41,9 +41,9 @@ If you don’t have an Azure subscription, create a [free account](https://azure
 
 Complete these tasks before you begin this tutorial:
 
-* Create an Azure SQL data warehouse, create a server-level firewall rule, and connect to the server as a server admin. See [Quickstart: Create and query an Azure SQL data warehouse in the Azure portal](../synapse-analytics/sql-data-warehouse/create-data-warehouse-portal.md).
+* Create a Synapse SQL pool, create a server-level firewall rule, and connect to the server as a server admin. See [Quickstart: Create and query a Synapse SQL pool using the Azure portal](../synapse-analytics/sql-data-warehouse/create-data-warehouse-portal.md).
 
-* Create a master key for the Azure SQL data warehouse. See [Create a database master key](https://docs.microsoft.com/sql/relational-databases/security/encryption/create-a-database-master-key).
+* Create a master key for the Synapse SQL pool. See [Create a database master key](https://docs.microsoft.com/sql/relational-databases/security/encryption/create-a-database-master-key).
 
 * Create an Azure Blob storage account, and a container within it. Also, retrieve the access key to access the storage account. See [Quickstart: Upload, download, and list blobs with the Azure portal](../storage/blobs/storage-quickstart-blobs-portal.md).
 
@@ -57,7 +57,7 @@ Complete these tasks before you begin this tutorial:
 
 If you'd prefer to use an access control list (ACL) to associate the service principal with a specific file or directory, reference [Access control in Azure Data Lake Storage Gen2](../storage/blobs/data-lake-storage-access-control.md).
 
-* When performing the steps in the [Get values for signing in](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#get-values-for-signing-in) section of the article, paste the tenant ID, app ID, and secret values into a text file. You'll need those soon.
+* When performing the steps in the [Get values for signing in](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#get-values-for-signing-in) section of the article, paste the tenant ID, app ID, and secret values into a text file.
 
 * Sign in to the [Azure portal](https://portal.azure.com/).
 
@@ -67,7 +67,7 @@ Make sure that you complete the prerequisites of this tutorial.
 
 Before you begin, you should have these items of information:
 
-:heavy_check_mark: The database name, database server name, user name, and password of your Azure SQL Data warehouse.
+:heavy_check_mark: The database name, database server name, user name, and password of your Synapse SQL pool.
 
 :heavy_check_mark: The access key of your blob storage account.
 
@@ -310,11 +310,11 @@ The raw sample data **small_radio_json.json** file captures the audience for a r
 +---------+----------+------+--------------------+-----------------+
 ```
 
-## Load data into Azure SQL Data Warehouse
+## Load data into Azure Synapse
 
-In this section, you upload the transformed data into Azure SQL Data Warehouse. You use the Azure SQL Data Warehouse connector for Azure Databricks to directly upload a dataframe as a table in a SQL data warehouse.
+In this section, you upload the transformed data into Azure Synapse. You use the Azure Synapse connector for Azure Databricks to directly upload a dataframe as a table in a Synapse SQL pool.
 
-As mentioned earlier, the SQL Data Warehouse connector uses Azure Blob storage as temporary storage to upload data between Azure Databricks and Azure SQL Data Warehouse. So, you start by providing the configuration to connect to the storage account. You must already have already created the account as part of the prerequisites for this article.
+As mentioned earlier, the Azure Synapse connector uses Azure Blob storage as temporary storage to upload data between Azure Databricks and Azure Synapse. So, you start by providing the configuration to connect to the storage account. You must have already created the account as part of the prerequisites for this article.
 
 1. Provide the configuration to access the Azure Storage account from Azure Databricks.
 
@@ -324,7 +324,7 @@ As mentioned earlier, the SQL Data Warehouse connector uses Azure Blob storage a
 val blobAccessKey = "<access-key>"
 ```
 
-2. Specify a temporary folder to use while moving data between Azure Databricks and Azure SQL Data Warehouse.
+2. Specify a temporary folder to use while moving data between Azure Databricks and Azure Synapse.
 
 ```scala
 val tempDir = "wasbs://" + blobContainer + "@" + blobStorage +"/tempDirs"
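How the storage values in the snippets above fit together can be sketched as follows. This is a sketch only: the account and container names are hypothetical placeholders, and the `fs.azure.account.key.` prefix for **acntInfo** is assumed from the standard `wasbs` Hadoop configuration convention, not taken from this diff.

```scala
// Hypothetical placeholder values -- substitute your own storage account details.
val blobStorage   = "mystorageacct.blob.core.windows.net"
val blobContainer = "tempcontainer"
val blobAccessKey = "<access-key>"

// Temporary folder used by the connector, built as in the snippet above.
val tempDir = "wasbs://" + blobContainer + "@" + blobStorage + "/tempDirs"

// Hadoop configuration key naming the storage account (assumed wasbs convention).
val acntInfo = "fs.azure.account.key." + blobStorage
```

The `acntInfo` key is what the later `sc.hadoopConfiguration.set(acntInfo, blobAccessKey)` call uses to hand the access key to the cluster's Hadoop layer.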
@@ -337,10 +337,10 @@ As mentioned earlier, the SQL Data Warehouse connector uses Azure Blob storage a
 sc.hadoopConfiguration.set(acntInfo, blobAccessKey)
 ```
 
-4. Provide the values to connect to the Azure SQL Data Warehouse instance. You must have created a SQL data warehouse as a prerequisite. Use the fully qualified server name for **dwServer**. For example, `<servername>.database.windows.net`.
+4. Provide the values to connect to the Azure Synapse instance. You must have created an Azure Synapse Analytics service as a prerequisite. Use the fully qualified server name for **dwServer**. For example, `<servername>.database.windows.net`.
 
 ```scala
-//SQL Data Warehouse related settings
+//Azure Synapse related settings
 val dwDatabase = "<database-name>"
 val dwServer = "<database-server-name>"
 val dwUser = "<user-name>"
@@ -351,7 +351,7 @@ As mentioned earlier, the SQL Data Warehouse connector uses Azure Blob storage a
 val sqlDwUrlSmall = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser+";password=" + dwPass
 ```
 
-5. Run the following snippet to load the transformed dataframe, **renamedColumnsDF**, as a table in a SQL data warehouse. This snippet creates a table called **SampleTable** in the SQL database.
+5. Run the following snippet to load the transformed dataframe, **renamedColumnsDF**, as a table in Azure Synapse. This snippet creates a table called **SampleTable** in the SQL database.
 
 ```scala
 spark.conf.set(
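With hypothetical placeholder values, the JDBC URL concatenation shown in the snippet above resolves as below; every value here is an assumption for illustration only.

```scala
// Hypothetical placeholder values -- use your own server, database, and credentials.
val dwDatabase = "mydb"
val dwServer   = "myserver.database.windows.net" // fully qualified server name
val dwUser     = "sqladmin"
val dwPass     = "<password>"
val dwJdbcPort = "1433"

// Same concatenation as in the snippet above.
val sqlDwUrlSmall = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort +
  ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass
```

With these placeholders, `sqlDwUrlSmall` evaluates to `jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb;user=sqladmin;password=<password>`.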
@@ -362,9 +362,9 @@ As mentioned earlier, the SQL Data Warehouse connector uses Azure Blob storage a
 ```
 
 > [!NOTE]
-> This sample uses the `forward_spark_azure_storage_credentials` flag, which causes SQL Data Warehouse to access data from blob storage using an Access Key. This is the only supported method of authentication.
+> This sample uses the `forward_spark_azure_storage_credentials` flag, which causes Azure Synapse to access data from blob storage using an Access Key. This is the only supported method of authentication.
 >
-> If your Azure Blob Storage is restricted to select virtual networks, SQL Data Warehouse requires [Managed Service Identity instead of Access Keys](../sql-database/sql-database-vnet-service-endpoint-rule-overview.md#impact-of-using-vnet-service-endpoints-with-azure-storage). This will cause the error "This request is not authorized to perform this operation."
+> If your Azure Blob Storage is restricted to select virtual networks, Azure Synapse requires [Managed Service Identity instead of Access Keys](../sql-database/sql-database-vnet-service-endpoint-rule-overview.md#impact-of-using-vnet-service-endpoints-with-azure-storage). This will cause the error "This request is not authorized to perform this operation."
 
 6. Connect to the SQL database and verify that you see a table named **SampleTable**.

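The diff truncates the snippet that begins with `spark.conf.set(`; the write call it leads into can be sketched as below. This is a sketch based on the commonly documented usage of the `com.databricks.spark.sqldw` connector, not the exact code from this file; the option names and the dataframe name are assumptions, and running it requires an Azure Databricks cluster with the connector available.

```scala
// Sketch only: load the transformed dataframe renamedColumnsDF into a table
// named SampleTable, assuming sqlDwUrlSmall and tempDir are defined as in the
// earlier snippets. The connector stages the data in tempDir, then issues a
// bulk load into the target database.
renamedColumnsDF.write
  .format("com.databricks.spark.sqldw")
  .option("url", sqlDwUrlSmall)
  .option("dbTable", "SampleTable")
  .option("forward_spark_azure_storage_credentials", "True")
  .option("tempDir", tempDir)
  .mode("overwrite")
  .save()
```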
@@ -392,7 +392,7 @@ In this tutorial, you learned how to:
 > * Create a notebook in Azure Databricks
 > * Extract data from a Data Lake Storage Gen2 account
 > * Transform data in Azure Databricks
-> * Load data into Azure SQL Data Warehouse
+> * Load data into Azure Synapse
 
 Advance to the next tutorial to learn about streaming real-time data into Azure Databricks using Azure Event Hubs.
 