Commit 9e7aea1

Merge pull request #114450 from kevinvngo/patch-159
Bulk loading quick start - examples
2 parents 0a97b82 + 5b78422 commit 9e7aea1

2 files changed: +154 -0 lines changed
Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
---
title: Authentication mechanisms with the COPY statement
description: Outlines the authentication mechanisms to bulk load data
services: synapse-analytics
author: kevinvngo
ms.service: synapse-analytics
ms.topic: overview
ms.subservice:
ms.date: 05/06/2020
ms.author: kevin
ms.reviewer: jrasnick
---

# Securely load data using Synapse SQL

This article describes the secure authentication mechanisms available for the [COPY statement](https://docs.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest) and provides an example of each. The COPY statement is the most flexible and secure way to bulk load data in Synapse SQL.

## Supported authentication mechanisms

The following matrix lists the supported authentication methods for each file type and storage account type. The same methods apply to both the source storage location and the error file location.

|                      |                CSV                |              Parquet              |                ORC                |
| :------------------: | :-------------------------------: | :-------------------------------: | :-------------------------------: |
|  Azure blob storage  | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD |              SAS/KEY              |              SAS/KEY              |
| Azure Data Lake Gen2 | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD |
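
Because the matrix covers the error file location as well as the source, a load can authenticate to both in one statement. The following is a minimal sketch, not a definitive recipe: it loads CSV files from Azure blob storage with a SAS token and writes rejected rows to an error directory through the COPY statement's `ERRORFILE`, `ERRORFILE_CREDENTIAL`, and `MAXERRORS` options. The account, container, folder, and table names are placeholders, and the sketch assumes the SAS token supplied for the error location grants write access to it.

```sql
-- Placeholder names: dbo.target_table, myaccount, myblobcontainer, folder1, errors.
COPY INTO dbo.target_table
FROM 'https://myaccount.blob.core.windows.net/myblobcontainer/folder1/'
WITH (
    FILE_TYPE = 'CSV'
    -- Credential used to read the source files
    ,CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<Your_SAS_Token>')
    -- Directory where rejected rows are written (assumed to be in the same container)
    ,ERRORFILE = 'https://myaccount.blob.core.windows.net/myblobcontainer/errors/'
    -- Credential used to write the error files (assumed to allow writes)
    ,ERRORFILE_CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<Your_SAS_Token>')
    -- Fail the load once more than 10 rows have been rejected
    ,MAXERRORS = 10
)
```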

## A. Storage account key with LF as the row terminator

```sql
--Note: when you specify the column list, input field numbers start from 1
COPY INTO target_table (Col_one default 'myStringDefault' 1, Col_two default 1 3)
FROM 'https://adlsgen2account.dfs.core.windows.net/myblobcontainer/folder1/'
WITH (
    FILE_TYPE = 'CSV'
    ,CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET='<Your_Account_Key>')
    --CREDENTIAL should look something like this:
    --,CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET='x6RWv4It5F2msnjelv3H4DA80n0QW0daPdw43jM0nyetx4c6CpDkdj3986DX5AHFMIf/YN4y6kkCnU8lb+Wx0Pj+6MDw==')
    ,ROWTERMINATOR='0x0A' --0x0A specifies the line feed character (Unix-based systems)
)
```

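
The column list in the statement above follows the pattern `column_name [DEFAULT default_value] [field_number]`. As a minimal sketch of how that mapping reads, the following assumes a hypothetical two-column target table and source files with at least three fields per row; the table definition and column types are illustrative assumptions, not part of the original example.

```sql
-- Hypothetical target table; column names and types are assumptions for illustration.
CREATE TABLE dbo.target_table
(
    Col_one varchar(50),
    Col_two int
)
WITH (DISTRIBUTION = ROUND_ROBIN);

COPY INTO dbo.target_table
(
    Col_one DEFAULT 'myStringDefault' 1,  -- input field 1; NULL input values become 'myStringDefault'
    Col_two DEFAULT 1 3                   -- input field 3; NULL input values become 1 (field 2 is not loaded)
)
FROM 'https://adlsgen2account.dfs.core.windows.net/myblobcontainer/folder1/'
WITH (
    FILE_TYPE = 'CSV'
    ,CREDENTIAL = (IDENTITY = 'Storage Account Key', SECRET = '<Your_Account_Key>')
    ,ROWTERMINATOR = '0x0A'
)
```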
## B. Shared Access Signatures (SAS) with CRLF as the row terminator

```sql
COPY INTO target_table
FROM 'https://adlsgen2account.dfs.core.windows.net/myblobcontainer/folder1/'
WITH (
    FILE_TYPE = 'CSV'
    ,CREDENTIAL=(IDENTITY= 'Shared Access Signature', SECRET='<Your_SAS_Token>')
    --CREDENTIAL should look something like this:
    --,CREDENTIAL=(IDENTITY= 'Shared Access Signature', SECRET='?sv=2018-03-28&ss=bfqt&srt=sco&sp=rl&st=2016-10-17T20%3A14%3A55Z&se=2021-10-18T20%3A19%3A00Z&sig=IEoOdmeYnE9%2FKiJDSFSYsz4AkNa%2F%2BTx61FuQ%2FfKHefqoBE%3D')
    --The COPY command automatically prefixes the \r character when \n (newline) is specified.
    --This results in carriage return + newline (\r\n), as used on Windows-based systems.
    ,ROWTERMINATOR='\n'
)
```

> [!IMPORTANT]
>
> - Specifying the ROWTERMINATOR as '\r\n' is interpreted as '\r\r\n', which results in parsing issues.

## C. Managed Identity

Managed Identity authentication is required when your storage account is attached to a VNet.

### Prerequisites

1. Install Azure PowerShell using this [guide](/powershell/azure/install-az-ps?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).
2. If you have a general-purpose v1 or blob storage account, you must first upgrade to general-purpose v2 using this [guide](../../storage/common/storage-account-upgrade.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).
3. You must have **Allow trusted Microsoft services to access this storage account** turned on under the Azure Storage account **Firewalls and Virtual networks** settings menu. Refer to this [guide](../../storage/common/storage-network-security.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json#exceptions) for more information.

#### Steps

1. In PowerShell, **register your SQL server** with Azure Active Directory (AAD):

    ```powershell
    Connect-AzAccount
    Select-AzSubscription -SubscriptionId your-subscriptionId
    Set-AzSqlServer -ResourceGroupName your-database-server-resourceGroup -ServerName your-database-servername -AssignIdentity
    ```

2. Create a **general-purpose v2 Storage Account** using this [guide](../../storage/common/storage-account-create.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).

    > [!NOTE]
    > If you have a general-purpose v1 or blob storage account, you must **first upgrade to v2** using this [guide](../../storage/common/storage-account-upgrade.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).

3. Under your storage account, navigate to **Access Control (IAM)**, and select **Add role assignment**. Assign the **Storage Blob Data Owner, Contributor, or Reader** RBAC role to your SQL server.

    > [!NOTE]
    > Only members with Owner privilege can perform this step. For the built-in roles for Azure resources, refer to this [guide](../../role-based-access-control/built-in-roles.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).

4. You can now run the COPY statement specifying "Managed Identity":

    ```sql
    COPY INTO dbo.target_table
    FROM 'https://myaccount.blob.core.windows.net/myblobcontainer/folder1/*.txt'
    WITH (
        FILE_TYPE = 'CSV',
        CREDENTIAL = (IDENTITY = 'Managed Identity')
    )
    ```

> [!IMPORTANT]
>
> - Specify the **Storage Blob Data** Owner, Contributor, or Reader RBAC role. These roles are different from the Azure built-in roles of Owner, Contributor, and Reader.
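
According to the matrix of supported mechanisms, Managed Identity also applies to Parquet and ORC files when the source is Azure Data Lake Gen2, though not when it is Azure blob storage. The following is a minimal sketch under that assumption; the account name `adlsgen2account`, the container, and the folder are placeholders, and the sketch presumes the role assignment from the steps above was made on that account.

```sql
-- Placeholder names: dbo.target_table, adlsgen2account, myblobcontainer, folder1.
COPY INTO dbo.target_table
FROM 'https://adlsgen2account.dfs.core.windows.net/myblobcontainer/folder1/*.parquet'
WITH (
    FILE_TYPE = 'PARQUET'
    -- No SECRET is supplied; the identity registered for the SQL server is used
    ,CREDENTIAL = (IDENTITY = 'Managed Identity')
)
```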

## D. Azure Active Directory Authentication (AAD)

#### Steps

1. Under your storage account, navigate to **Access Control (IAM)**, and select **Add role assignment**. Assign the **Storage Blob Data Owner, Contributor, or Reader** RBAC role to your AAD user.

2. Configure Azure AD authentication by following this [documentation](https://docs.microsoft.com/azure/sql-database/sql-database-aad-authentication-configure?tabs=azure-powershell#create-an-azure-ad-administrator-for-azure-sql-server).

3. Connect to your SQL pool using Active Directory authentication. You can now run the COPY statement without specifying any credentials:

    ```sql
    COPY INTO dbo.target_table
    FROM 'https://myaccount.blob.core.windows.net/myblobcontainer/folder1/*.txt'
    WITH (
        FILE_TYPE = 'CSV'
    )
    ```

> [!IMPORTANT]
>
> - Specify the **Storage Blob Data** Owner, Contributor, or Reader RBAC role. These roles are different from the Azure built-in roles of Owner, Contributor, and Reader.
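
The same credential-free form should carry over to other file types wherever the matrix lists AAD, for example ORC files in Azure Data Lake Gen2. The following is a minimal sketch under that assumption; the account, container, and folder names are placeholders, and the connected AAD user is assumed to hold one of the Storage Blob Data roles on the account.

```sql
-- Placeholder names: dbo.target_table, adlsgen2account, myblobcontainer, folder1.
-- No CREDENTIAL clause: the identity of the connected AAD user is used.
COPY INTO dbo.target_table
FROM 'https://adlsgen2account.dfs.core.windows.net/myblobcontainer/folder1/*.orc'
WITH (
    FILE_TYPE = 'ORC'
)
```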

## E. Service Principal Authentication

#### Steps

1. [Create an Azure Active Directory (AAD) application](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#create-an-azure-active-directory-application).
2. [Get the application ID](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#get-values-for-signing-in).
3. [Get the authentication key](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#create-a-new-application-secret).
4. [Get the V1 OAuth 2.0 token endpoint](https://docs.microsoft.com/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json#step-4-get-the-oauth-20-token-endpoint-only-for-java-based-applications).
5. [Assign read, write, and execution permissions to your AAD application](https://docs.microsoft.com/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json#step-3-assign-the-azure-ad-application-to-the-azure-data-lake-storage-gen1-account-file-or-folder) on your storage account.
6. You can now run the COPY statement:

    ```sql
    COPY INTO dbo.target_table
    FROM 'https://myaccount.blob.core.windows.net/myblobcontainer/folder0/*.txt'
    WITH (
        FILE_TYPE = 'CSV'
        ,CREDENTIAL=(IDENTITY= '<application_ID>@<OAuth_2.0_Token_EndPoint>' , SECRET= '<authentication_key>')
        --CREDENTIAL should look something like this:
        --,CREDENTIAL=(IDENTITY= '92761aac-12a9-4ec3-89b8-7149aef4c35b@https://login.microsoftonline.com/72f714bf-86f1-41af-91ab-2d7cd011db47/oauth2/token', SECRET='juXi12sZ6gse]woKQNgqwSywYv]7A.M')
    )
    ```

> [!IMPORTANT]
>
> - Use the **V1** version of the OAuth 2.0 token endpoint.

## Next steps

- Check the [COPY statement](https://docs.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest#syntax) article for the detailed syntax.
- Check the [data loading overview](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/design-elt-data-loading#what-is-elt) article for loading best practices.

articles/synapse-analytics/sql-data-warehouse/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -37,6 +37,8 @@
  items:
  - name: COPY statement
    href: quickstart-bulk-load-copy-tsql.md
+ - name: COPY statement examples
+   href: quickstart-bulk-load-copy-tsql-examples.md
  - name: Scale
    items:
    - name: Portal
