Commit fe8c592

20241227 edit pass
1 parent c9cca68 commit fe8c592

2 files changed

+131
-136
lines changed

Lines changed: 70 additions & 76 deletions
@@ -1,52 +1,52 @@
11
---
2-
title: 'Tutorial: Loading external data using a managed identity'
2+
title: "Tutorial: Loading External Data Using a Managed Identity"
33
description: This tutorial shows how to connect to external data for queries or ingestion using a managed identity.
44
author: periclesrocha
5+
ms.author: procha
6+
ms.reviewer: WilliamDAssafMSFT
7+
ms.date: 01/04/2025
58
ms.service: azure-synapse-analytics
6-
ms.topic: tutorial
79
ms.subservice: sql
8-
ms.date: 01/04/2025
9-
ms.custom:
10-
ms.author: procha
11-
ms.reviewer: WilliamDAssafMSFT
10+
ms.topic: tutorial
1211
---
1312

1413
# Tutorial: Loading external data using a managed identity
1514

1615
This article explains how to create external tables or ingest data from Azure Data Lake Storage (ADLS) Gen2 accounts using a managed identity.
1716

18-
## Prerequisites:
17+
## Prerequisites
1918

2019
The following resources are required to complete this tutorial:
2120

22-
* An Azure Data Lake Storage Gen2 (ADLS Gen2) account
23-
* An Azure Synapse Analytics workspace and a dedicated SQL Pool
21+
- An Azure Data Lake Storage (ADLS) Gen2 account
22+
- An Azure Synapse Analytics workspace and a dedicated SQL pool
2423

2524
## Give the workspace identity access to the storage account
2625

27-
Each Azure Synapse Analytics workspace automatically creates a managed identity that helps you configure secure access to external data from your workspace. To learn more about managed identities for Azure Synapse Analytics, visit [Managed service identity for Azure Synapse Analytics - Azure Synapse | Microsoft Learn](https://learn.microsoft.com/azure/synapse-analytics/synapse-service-identity).
26+
Each Azure Synapse Analytics workspace automatically creates a managed identity that helps you configure secure access to external data from your workspace. To learn more about managed identities for Azure Synapse Analytics, visit [Managed service identity for Azure Synapse Analytics](../synapse-service-identity.md).
2827

2928
To enable your managed identity to access data on ADLS Gen2 accounts, you need to give your identity access to the source account. To grant proper permissions, follow these steps:
3029

3130
1. In the Azure portal, find your storage account.
32-
2. Select **Data storage -> Containers**, and navigate to the folder where the source data the external table needs access to is.
33-
3. Select **Access control (IAM)**.
34-
4. Select **Add -> Add role assignment**.
35-
5. In the list of job function roles, select **Storage Blob Data Contributor** and select **Next**.
36-
6. In the Add role assignment page, select **+ Select members**. The Select members pane opens in the right-hand corner.
37-
7. Type the name of your workspace identity. The workspace identity is the same as your workspace name. When displayed, pick your workspace identity and chose **Select**.
38-
8. In the **Add role assignment** page, make sure the list of Members include your desired Entra ID account. Once verified, select **Review + assign**.
39-
9. In the confirmation page, review the changes and select **Review + assign**.
31+
1. Select **Data storage -> Containers**, and navigate to the folder that contains the source data the external table needs to access.
32+
1. Select **Access control (IAM)**.
33+
1. Select **Add -> Add role assignment**.
34+
1. In the list of job function roles, select **Storage Blob Data Contributor** and select **Next**.
35+
1. In the **Add role assignment** page, select **+ Select members**. The **Select members** pane opens.
36+
1. Type the name of your workspace identity. The workspace identity is the same as your workspace name. When it is displayed, pick your workspace identity, then choose **Select**.
37+
1. In the **Add role assignment** page, make sure the list of Members includes your desired Microsoft Entra ID account. Once verified, select **Review + assign**.
38+
1. In the confirmation page, review the changes and select **Review + assign**.
4039

4140
Your workspace identity is now a member of the Storage Blob Data Contributor role and has access to the source folder.
4241

43-
Note: these steps also apply to secure ADLS Gen2 accounts that are configured to restrict public access. To learn more about securing your ADLS Gen2 account, visit [Configure Azure Storage firewalls and virtual networks | Microsoft Learn](https://learn.microsoft.com/azure/storage/common/storage-network-security?tabs=azure-portal).
42+
> [!NOTE]
43+
> These steps also apply to secure ADLS Gen2 accounts that are configured to restrict public access. To learn more about securing your ADLS Gen2 account, see [Configure Azure Storage firewalls and virtual networks](/azure/storage/common/storage-network-security).
4444
4545
## Ingest data using COPY INTO
4646

47-
The COPY INTO statement provides flexible, high-throughput data ingestion into your tables, and is the primary strategy to ingest data into your dedicated SQL Pool tables. It allows users to ingest data from external locations without having to create any of the extra database objects that are required for external tables.
47+
The T-SQL `COPY INTO` statement provides flexible, high-throughput data ingestion into your tables, and is the primary strategy to ingest data into your dedicated SQL pool tables. `COPY INTO` allows users to ingest data from external locations without having to create any of the extra database objects that are required for external tables.
4848

49-
To run the COPY INTO statement using a workspace managed identity for authentication, use the following command:
49+
To run the `COPY INTO` statement using a workspace managed identity for authentication, use the following T-SQL command:
5050

5151
```sql
5252
COPY INTO <TableName>
@@ -55,100 +55,92 @@ WITH
5555
(
5656
CREDENTIAL = (IDENTITY = 'Managed Identity'),
5757
[<CopyIntoOptions>]
58-
)
58+
);
5959
```
6060

6161
Where:
6262

63-
* \<TableName> is the name of the table you'll ingest data into
64-
* \<AccountName> is your ADLS Gen2 account name
65-
* \<Container> is the name of the container within your storage account where the source data is stored
66-
* \<Folder> is the folder (or path with subfolders) where the source data is stored within your container. You can also provide a file name if pointing directly to a single file.
67-
* \<CopyIntoOptions> is the list of any other options you wish to provide to the COPY INTO statement.
63+
- `<TableName>` is the name of the table to ingest data into
64+
- `<AccountName>` is your ADLS Gen2 account name
65+
- `<Container>` is the name of the container within your storage account where the source data is stored
66+
- `<Folder>` is the folder (or path with subfolders) where the source data is stored within your container. You can also provide a file name if pointing directly to a single file.
67+
- `<CopyIntoOptions>` is the list of any other options you wish to provide to the COPY INTO statement.
6868

69-
To learn more and explore the full syntax of COPY INTO, refer to <https://learn.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest>.
69+
To learn more and explore the full syntax of COPY INTO, see [COPY INTO (Transact-SQL)](/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest).
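
As an illustration, the following is a minimal sketch of a filled-in `COPY INTO` statement. The table name `dbo.Trips`, the storage account `contosolake`, the container `raw`, and the folder `trips` are all hypothetical, and the options assume comma-delimited CSV files with a single header row. Adjust them to match your own data.

```sql
-- Hypothetical names: dbo.Trips, contosolake, raw, trips.
-- Assumes CSV source files with one header row.
COPY INTO dbo.Trips
FROM 'https://contosolake.dfs.core.windows.net/raw/trips/'
WITH
(
    FILE_TYPE = 'CSV',
    CREDENTIAL = (IDENTITY = 'Managed Identity'),
    FIELDTERMINATOR = ',',
    FIRSTROW = 2  -- skip the header row
);
```

The target table must already exist in the dedicated SQL pool, with columns that match the layout of the source files.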
7070

7171
## Query data on ADLS Gen2 using external tables
7272

73-
External tables allow users to query data from ADLS Gen2 accounts without the need to ingest data first. Users can create an external table which points to files on an ADLS Gen2 container, and query it just like a regular user table.
73+
External tables allow users to query data from Azure Data Lake Storage (ADLS) Gen2 accounts without the need to ingest data first. Users can create an external table which points to files on an ADLS Gen2 container, and query it just like a regular user table.
7474

7575
The following steps describe the process to create a new external table pointing to data on ADLS Gen2, using a managed identity for authentication.
7676

7777
### Create the required database objects
7878

7979
External tables require the following objects to be created:
8080

81-
1. A database master key that encrypts the database scoped credentials secret
82-
2. A database scoped credential that uses your workspace identity.
83-
3. An external data source that points to the source folder.
84-
4. An external file format that defines the format of the source files.
85-
5. An external table definition that is used for queries.
81+
1. A database master key that encrypts the database scoped credential's secret
82+
1. A database scoped credential that uses your workspace identity
83+
1. An external data source that points to the source folder
84+
1. An external file format that defines the format of the source files
85+
1. An external table definition that is used for queries
8686

87-
To follow these steps, you'll need to use the SQL editor in the Azure Synapse Workspace, or your preferred SQL client connected to your dedicated SQL Pool. Lets look at these steps in detail.
87+
To follow these steps, use the SQL editor in the Azure Synapse workspace, or your preferred SQL client connected to your dedicated SQL pool. Let's look at these steps in detail.
8888

8989
#### Create the database master key
9090

91-
The database master key is a symmetric key used to protect the private keys of certificates and asymmetric keys that are present in the database and secrets in database scoped credentials. If there's already a master key in the database, you don't need to create a new one.
91+
The database master key is a symmetric key used to protect the private keys of certificates and asymmetric keys that are present in the database and secrets in database scoped credentials. If there's already a master key in the database, you don't need to create a new one. Replace `<Secure Password>` with a secure password. This password is used to encrypt the master key in the database.
9292

93-
To create a master key, use the following command:
93+
To create a master key, use the following T-SQL command:
9494

9595
```sql
96-
-- Replace <Secure Password Here> with a secure password
97-
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Secure Password Here>'
96+
-- Replace <Secure Password> with a secure password
97+
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Secure Password>';
9898
```
9999

100-
Where:
101-
102-
* \<Secure Password Here> should be replaced with a strong password. This password is used to encrypt the master key in the database
103-
104-
To learn more about the database master key, refer to <https://learn.microsoft.com/sql/t-sql/statements/create-master-key-transact-sql?view=azure-sqldw-latest>.
100+
To learn more about the database master key, see [CREATE MASTER KEY (Transact-SQL)](/sql/t-sql/statements/create-master-key-transact-sql?view=azure-sqldw-latest).
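
Because a database can hold only one master key, it can help to check for an existing key before creating one. The following sketch assumes that the `sys.symmetric_keys` catalog view is available in your pool and that the master key uses its standard name, `##MS_DatabaseMasterKey##`. Treat it as a starting point rather than a definitive check.

```sql
-- Create the master key only if the database doesn't have one yet.
-- Replace <Secure Password> with a secure password.
IF NOT EXISTS
(
    SELECT 1
    FROM sys.symmetric_keys
    WHERE name = '##MS_DatabaseMasterKey##'
)
BEGIN
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Secure Password>';
END;
```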
105101

106102
#### Create the database scoped credential
107103

108104
A database scoped credential uses your workspace identity and is needed to access the external location any time the external table requires access to the source data.
109105

110-
To create the database scoped credential, use the following command:
106+
To create the database scoped credential, use the following command. Replace `<CredentialName>` with the name you would like to use for your database scoped credential.
111107

112108
```sql
113-
CREATE DATABASE SCOPED CREDENTIAL <CredentialName> WITH IDENTITY = 'Managed Service Identity'
109+
CREATE DATABASE SCOPED CREDENTIAL <CredentialName> WITH IDENTITY = 'Managed Service Identity';
114110
```
115111

116-
Where:
117-
118-
* \<CredentialName> should be replaced with the name you would like to use for your database scoped credential
119-
120-
To learn more about database scoped credentials, refer to <https://learn.microsoft.com/sql/t-sql/statements/create-database-scoped-credential-transact-sql?view=azure-sqldw-latest>.
112+
To learn more about database scoped credentials, see [CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)](/sql/t-sql/statements/create-database-scoped-credential-transact-sql?view=azure-sqldw-latest).
121113

122114
#### Create the external data source
123115

124116
The next step is to create an external data source that specifies where the source data used by the external table resides.
125117

126-
To create the external data source, use the following command:
118+
To create the external data source, use the following T-SQL command:
127119

128120
```sql
129121
CREATE EXTERNAL DATA SOURCE <ExternalDataSourceName>
130122
WITH (
131-
TYPE = hadoop,
123+
TYPE = HADOOP,
132124
LOCATION = 'abfss://<Container>@<AccountName>.dfs.core.windows.net/<Folder>/',
133125
CREDENTIAL = <CredentialName>
134-
)
126+
);
135127
```
136128
137129
Where:
138130
139-
* \<ExternalDataSourceName> is the name you want to use for your external data source
140-
* \<AccountName> is your ADLS Gen2 account name
141-
* \<Container> is the name of the container within your storage account where the source data is stored
142-
* \<Folder> is the folder (or path with subfolders) where the source data is stored within your container. You can also provide a file name if pointing directly to a single file.
143-
* \<Credential> is the name of the database scoped credential you created in step b)
131+
- `<ExternalDataSourceName>` is the name you want to use for your external data source.
132+
- `<AccountName>` is your ADLS Gen2 account name.
133+
- `<Container>` is the name of the container within your storage account where the source data is stored.
134+
- `<Folder>` is the folder (or path with subfolders) where the source data is stored within your container. You can also provide a file name if pointing directly to a single file.
135+
- `<CredentialName>` is the name of [the database scoped credential you created earlier](#create-the-database-scoped-credential).
144136
145-
To learn more about external data sources, refer to <https://learn.microsoft.com/sql/t-sql/statements/create-external-data-source-transact-sql?view=azure-sqldw-latest&tabs=dedicated>.
137+
To learn more about external data sources, see [CREATE EXTERNAL DATA SOURCE (Transact-SQL)](/sql/t-sql/statements/create-external-data-source-transact-sql?view=azure-sqldw-latest&tabs=dedicated).
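
For example, a filled-in statement might look like the following sketch. The data source name `ContosoLakeSource`, the container `raw`, the storage account `contosolake`, the folder `trips`, and the credential name `WorkspaceIdentityCredential` are hypothetical. The credential is assumed to already exist as a database scoped credential.

```sql
-- Hypothetical names throughout; substitute your own.
CREATE EXTERNAL DATA SOURCE ContosoLakeSource
WITH (
    TYPE = HADOOP,
    LOCATION = 'abfss://raw@contosolake.dfs.core.windows.net/trips/',
    CREDENTIAL = WorkspaceIdentityCredential
);
```

In the `abfss` URI, the container name comes before the `@` sign and the storage account name comes after it.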
146138
147139
#### Create the external file format
148140
149141
The next step is to create the external file format. It specifies the actual layout of the data referenced by the external table.
150142
151-
To create the external file format, use the following command:
143+
To create the external file format, use the following T-SQL command. Replace `<FileFormatName>` with the name you want to use for your external file format.
152144
153145
```sql
154146
CREATE EXTERNAL FILE FORMAT <FileFormatName>
@@ -160,18 +152,14 @@ WITH (
160152
FIRST_ROW = 2,
161153
USE_TYPE_DEFAULT = True
162154
)
163-
)
155+
);
164156
```
165157
166-
Where:
167-
168-
* \<FileFormatName> is the name you want to use for your external file format
169-
170-
In this example, adjust parameters such as FIELD_TERMINATOR, STRING_DELIMITER, FIRST_ROW and others as needed in accordance with your source data. For more formatting options and to learn more about EXTERNAL FILE FORMAT, visit <https://learn.microsoft.com/sql/t-sql/statements/create-external-file-format-transact-sql?view=azure-sqldw-latest&tabs=delimited>.
158+
In this example, adjust parameters such as `FIELD_TERMINATOR`, `STRING_DELIMITER`, `FIRST_ROW`, and others as needed in accordance with your source data. For more formatting options and to learn more about `EXTERNAL FILE FORMAT`, see [CREATE EXTERNAL FILE FORMAT](/sql/t-sql/statements/create-external-file-format-transact-sql?view=azure-sqldw-latest).
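
As a point of comparison, here is a minimal sketch of a complete file format definition for comma-delimited text files with double-quoted strings and one header row. The name `CsvWithHeader` is hypothetical, and the option values are assumptions about the source files rather than required settings.

```sql
-- Hypothetical format name; option values assume quoted CSV with a header row.
CREATE EXTERNAL FILE FORMAT CsvWithHeader
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2,           -- skip the header row
        USE_TYPE_DEFAULT = TRUE  -- replace missing values with column type defaults
    )
);
```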
171159
172160
#### Create the external table
173161
174-
Now that all the necessary objects that hold the metadata to securely access external data are created, it's time to create the external table. To create the external table, use the following command:
162+
Now that all the necessary objects that hold the metadata to securely access external data are created, it's time to create the external table. To create the external table, use the following T-SQL command:
175163
176164
```sql
177165
-- Adjust the table name and columns to your desired name and external table schema
@@ -185,24 +173,30 @@ WITH
185173
LOCATION = '<Path>',
186174
DATA_SOURCE = <ExternalDataSourceName>,
187175
FILE_FORMAT = <FileFormatName>
188-
)
176+
);
189177
```
190178
191179
Where:
192180
193-
* \<ExternalTableName> is the name you want to use for your external table
194-
* \<Path> is the relative path of the source data from the location specified in the external data source on step c)
195-
* \<ExternalDataSourceName> is the name of the external data source you created previously c)
196-
* \<FileFormatName> is the name of the external file format you created in step d)
181+
- `<ExternalTableName>` is the name you want to use for your external table.
182+
- `<Path>` is the path of the source data, relative to the [location specified in the external data source](#create-the-external-data-source).
183+
- `<ExternalDataSourceName>` is the name of [the external data source you created previously](#create-the-external-data-source).
184+
- `<FileFormatName>` is the name of [the external file format you created previously](#create-the-external-file-format).
197185
198186
Make sure to adjust the table name and schema to the desired name and the schema of the data in your source files.
199187
200-
At this point, all the metadata required to access the external table are created. To test your external table, use a query such as the following one to validate your work:
188+
At this point, all the metadata required to access the external table are created. To test your external table, use a query such as the following T-SQL sample to validate your work:
201189
202190
```sql
203-
SELECT TOP 10 Col1, Col2 FROM <ExternalTableName>
191+
SELECT TOP 10 Col1, Col2 FROM <ExternalTableName>;
204192
```
205193
206194
If everything was configured properly, the query returns rows from your source files.
207195
208-
To learn more and explore the full syntax of EXTERNAL TABLE, refer to <https://learn.microsoft.com/sql/t-sql/statements/create-external-table-transact-sql?view=azure-sqldw-latest&tabs=dedicated>.
196+
To learn more and explore the full syntax of `CREATE EXTERNAL TABLE`, see [CREATE EXTERNAL TABLE (Transact-SQL)](/sql/t-sql/statements/create-external-table-transact-sql?view=azure-sqldw-latest&tabs=dedicated).
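
To tie the pieces together, the following sketch shows what a filled-in external table definition and test query could look like. It assumes that an external data source named `ContosoLakeSource` and an external file format named `CsvWithHeader` already exist. Those names, the table name `dbo.TripsExternal`, the column list, and the `/2024/` subfolder are all hypothetical.

```sql
-- Hypothetical table, columns, and path; match them to your source files.
CREATE EXTERNAL TABLE dbo.TripsExternal
(
    TripId INT,
    PickupTime DATETIME2,
    FareAmount DECIMAL(10, 2)
)
WITH
(
    LOCATION = '/2024/',              -- relative to the data source location
    DATA_SOURCE = ContosoLakeSource,
    FILE_FORMAT = CsvWithHeader
);

-- Validate the table by reading a few rows.
SELECT TOP 10 TripId, PickupTime, FareAmount
FROM dbo.TripsExternal;
```

The column order and data types must line up with the fields in the source files. Otherwise, rows can be rejected or the query can fail at read time.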
197+
198+
## Related content
199+
200+
- [Tutorial: Loading external data using a managed identity](tutorial-external-tables-using-managed-identity.md)
201+
- [Load Contoso retail data into dedicated SQL pools in Azure Synapse Analytics](../sql-data-warehouse/sql-data-warehouse-load-from-azure-blob-storage-with-polybase.md)
202+
- [Managed identities for Azure Synapse Analytics](../synapse-service-identity.md)
