Commit a7edc7d

COPY statement examples for quickstart
1 parent 452d254 commit a7edc7d

1 file changed

+156
-0
lines changed

@@ -0,0 +1,156 @@
---
title: Authentication mechanisms with the COPY statement
description: Outlines the authentication mechanisms to bulk load data
services: synapse-analytics
author: kevinvngo
ms.service: synapse-analytics
ms.topic: overview
ms.subservice:
ms.date: 05/06/2020
ms.author: kevin
ms.reviewer: jrasnick
---

# Securely load data using Synapse SQL

This article describes the secure authentication mechanisms for the [COPY statement](https://docs.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest) and provides an example of each. The COPY statement is the most flexible and secure way of bulk loading data in Synapse SQL.

## Supported authentication mechanisms

The following matrix lists the supported authentication methods for each file type and storage account type. It applies to both the source storage location and the error file location.

|                      |                CSV                |              Parquet              |                ORC                |
| :------------------: | :-------------------------------: | :-------------------------------: | :-------------------------------: |
|  Azure Blob Storage  | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD |              SAS/KEY              |              SAS/KEY              |
| Azure Data Lake Gen2 | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD | SAS/MSI/SERVICE PRINCIPAL/KEY/AAD |
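
The matrix can also be encoded as a small lookup for pre-flight checks in a load script. The following is a minimal sketch in Python; the names `SUPPORTED_AUTH` and `is_supported` are hypothetical helpers for illustration and are not part of Synapse SQL or any Azure SDK.

```python
# Supported authentication methods, transcribed from the matrix above.
ALL_METHODS = {"SAS", "MSI", "SERVICE PRINCIPAL", "KEY", "AAD"}
SAS_OR_KEY = {"SAS", "KEY"}

# Keys are (storage account type, file type); names are illustrative only.
SUPPORTED_AUTH = {
    ("Azure Blob Storage", "CSV"): ALL_METHODS,
    ("Azure Blob Storage", "Parquet"): SAS_OR_KEY,
    ("Azure Blob Storage", "ORC"): SAS_OR_KEY,
    ("Azure Data Lake Gen2", "CSV"): ALL_METHODS,
    ("Azure Data Lake Gen2", "Parquet"): ALL_METHODS,
    ("Azure Data Lake Gen2", "ORC"): ALL_METHODS,
}


def is_supported(storage: str, file_type: str, method: str) -> bool:
    """Return True if the matrix lists `method` for this storage/file type pair."""
    return method.upper() in SUPPORTED_AUTH[(storage, file_type)]
```

For example, `is_supported("Azure Blob Storage", "Parquet", "MSI")` is `False`, because the matrix lists only SAS and KEY for Parquet files on Azure Blob Storage.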

## A. Storage account key with LF as the row terminator

```sql
--Note: when specifying the column list, input field numbers start from 1
COPY INTO target_table (Col_one default 'myStringDefault' 1, Col_two default 1 3)
FROM 'https://adlsgen2account.dfs.core.windows.net/myblobcontainer/folder1/'
WITH (
    FILE_TYPE = 'CSV'
    ,CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET='<Your_Account_Key>')
    --CREDENTIAL should look something like this:
    --,CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET='x6RWv4It5F2msnjelv3H4DA80n0QW0daPdw43jM0nyetx4c6CpDkdj3986DX5AHFMIf/YN4y6kkCnU8lb+Wx0Pj+6MDw==')
    ,ROWTERMINATOR='0x0A' --0x0A specifies the line feed (LF) character used by Unix-based systems
)
```
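
To make the column list in the example above concrete, here is a rough Python model of how a target column, a default value, and a 1-based input field number fit together. This is only a sketch under one stated assumption (an empty or absent field counts as missing); how COPY itself decides that a value is missing is defined in the COPY statement reference, not here. `COLUMN_LIST` and `map_row` are hypothetical names.

```python
# (target column, default value, 1-based input field number), mirroring
# the column list in the COPY example above.
COLUMN_LIST = [("Col_one", "myStringDefault", 1), ("Col_two", 1, 3)]


def map_row(fields, column_list=COLUMN_LIST):
    """Pick each column's value by field number, falling back to its default.

    Assumption for this sketch: an empty or absent field counts as missing.
    """
    row = {}
    for name, default, field_number in column_list:
        index = field_number - 1  # input field numbers start from 1
        value = fields[index] if index < len(fields) else ""
        row[name] = value if value != "" else default
    return row
```

With this model, the input row `["a", "b", "7"]` maps `Col_one` to field 1 and `Col_two` to field 3, and field 2 is ignored.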

## B. Shared access signature (SAS) with CRLF as the row terminator

```sql
COPY INTO target_table
FROM 'https://adlsgen2account.dfs.core.windows.net/myblobcontainer/folder1/'
WITH (
    FILE_TYPE = 'CSV'
    ,CREDENTIAL=(IDENTITY= 'Shared Access Signature', SECRET='<Your_SAS_Token>')
    --CREDENTIAL should look something like this:
    --,CREDENTIAL=(IDENTITY= 'Shared Access Signature', SECRET='?sv=2018-03-28&ss=bfqt&srt=sco&sp=rl&st=2016-10-17T20%3A14%3A55Z&se=2021-10-18T20%3A19%3A00Z&sig=IEoOdmeYnE9%2FKiJDSFSYsz4AkNa%2F%2BTx61FuQ%2FfKHefqoBE%3D')
    ,ROWTERMINATOR='\n' --COPY automatically prefixes \r when \n (newline) is specified, which yields carriage return + newline (\r\n), the row terminator on Windows-based systems
)
```

> [!IMPORTANT]
>
> - Specifying the ROWTERMINATOR as '\r\n' results in '\r\r\n', which causes parsing issues.
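
The prefixing behavior described above can be modeled in a few lines of Python. This is a sketch of the documented behavior only, not COPY's actual implementation; `effective_row_terminator` is a hypothetical name, and hexadecimal terminators such as `0x0A` are outside this model.

```python
def effective_row_terminator(specified: str) -> str:
    # Model of the behavior described above: a carriage return is
    # prefixed to every newline character in the specified terminator.
    return specified.replace("\n", "\r\n")


# A file with Windows-style (CRLF) row endings.
data = "a,b\r\nc,d\r\n"

# Specifying '\n' yields the CRLF terminator, so rows split correctly.
good_rows = data.split(effective_row_terminator("\n"))

# Specifying '\r\n' yields '\r\r\n', which never occurs in the file,
# so the whole file is treated as a single row.
bad_rows = data.split(effective_row_terminator("\r\n"))
```

Here `good_rows` contains the two rows (plus a trailing empty string from the final terminator), while `bad_rows` contains the entire file as one element, which illustrates the parsing issue.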

## C. Managed Identity

Managed Identity authentication is required when your storage account is attached to a VNet.

### Prerequisites

1. Install Azure PowerShell using this [guide](/powershell/azure/install-az-ps?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).
2. If you have a general-purpose v1 or blob storage account, you must first upgrade to general-purpose v2 using this [guide](../../storage/common/storage-account-upgrade.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).
3. You must turn on **Allow trusted Microsoft services to access this storage account** in the **Firewalls and Virtual networks** settings menu of your Azure Storage account. Refer to this [guide](../../storage/common/storage-network-security.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json#exceptions) for more information.

### Steps

1. In PowerShell, **register your SQL server** with Azure Active Directory (AAD):

   ```powershell
   Connect-AzAccount
   Select-AzSubscription -SubscriptionId your-subscriptionId
   Set-AzSqlServer -ResourceGroupName your-database-server-resourceGroup -ServerName your-database-servername -AssignIdentity
   ```

2. Create a **general-purpose v2 Storage Account** using this [guide](../../storage/common/storage-account-create.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).

   > [!NOTE]
   > If you have a general-purpose v1 or blob storage account, you must **first upgrade to v2** using this [guide](../../storage/common/storage-account-upgrade.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).

3. Under your storage account, navigate to **Access Control (IAM)**, and select **Add role assignment**. Assign the **Storage Blob Data Owner, Contributor, or Reader** RBAC role to your SQL server.

   > [!NOTE]
   > Only members with Owner privilege can perform this step. For the various built-in roles for Azure resources, refer to this [guide](../../role-based-access-control/built-in-roles.md?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json).

4. You can now run the COPY statement specifying "Managed Identity":

   ```sql
   COPY INTO dbo.target_table
   FROM 'https://myaccount.blob.core.windows.net/myblobcontainer/folder1/*.txt'
   WITH (
       FILE_TYPE = 'CSV',
       CREDENTIAL = (IDENTITY = 'Managed Identity')
   )
   ```

> [!IMPORTANT]
>
> - Specify the **Storage Blob Data** Owner, Contributor, or Reader RBAC role. These roles are different from the Azure built-in roles of Owner, Contributor, and Reader.
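
Because the two sets of role names are easy to confuse, a load script can guard against assigning the wrong one. The following Python sketch only compares role names as strings and does not call any Azure API; `check_role` is a hypothetical helper.

```python
# Data-plane roles named in the note above.
STORAGE_BLOB_DATA_ROLES = {
    "Storage Blob Data Owner",
    "Storage Blob Data Contributor",
    "Storage Blob Data Reader",
}

# Generic built-in roles that are NOT sufficient for the COPY statement.
GENERIC_ROLES = {"Owner", "Contributor", "Reader"}


def check_role(role_name: str) -> str:
    """Classify a role name against the two sets described in the note."""
    if role_name in STORAGE_BLOB_DATA_ROLES:
        return "ok"
    if role_name in GENERIC_ROLES:
        return f"use 'Storage Blob Data {role_name}' instead"
    return "unrecognized role"
```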

## D. Azure Active Directory (AAD) authentication

### Steps

1. Under your storage account, navigate to **Access Control (IAM)**, and select **Add role assignment**. Assign the **Storage Blob Data Owner, Contributor, or Reader** RBAC role to your AAD user.

2. Configure Azure AD authentication by following this [documentation](https://docs.microsoft.com/azure/sql-database/sql-database-aad-authentication-configure?tabs=azure-powershell#create-an-azure-ad-administrator-for-azure-sql-server).

Connect to your SQL pool using Azure Active Directory authentication. You can now run the COPY statement without specifying any credentials:

```sql
COPY INTO dbo.target_table
FROM 'https://myaccount.blob.core.windows.net/myblobcontainer/folder1/*.txt'
WITH (
    FILE_TYPE = 'CSV'
)
```

> [!IMPORTANT]
>
> - Specify the **Storage Blob Data** Owner, Contributor, or Reader RBAC role. These roles are different from the Azure built-in roles of Owner, Contributor, and Reader.

## E. Service Principal Authentication

### Steps

1. [Create an Azure Active Directory (AAD) application](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#create-an-azure-active-directory-application)
2. [Get the application ID](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#get-values-for-signing-in)
3. [Get the authentication key](https://docs.microsoft.com/azure/active-directory/develop/howto-create-service-principal-portal#create-a-new-application-secret)
4. [Get the V1 OAuth 2.0 token endpoint](https://docs.microsoft.com/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json#step-4-get-the-oauth-20-token-endpoint-only-for-java-based-applications)
5. [Assign read, write, and execution permissions to your AAD application](https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json#step-3-assign-the-azure-ad-application-to-the-azure-data-lake-storage-gen1-account-file-or-folder) on your storage account
6. You can now run the COPY statement:

   ```sql
   COPY INTO dbo.target_table
   FROM 'https://myaccount.blob.core.windows.net/myblobcontainer/folder0/*.txt'
   WITH (
       FILE_TYPE = 'CSV'
       ,CREDENTIAL=(IDENTITY= '<application_ID>@<OAuth_2.0_Token_EndPoint>' , SECRET= '<authentication_key>')
       --CREDENTIAL should look something like this:
       --,CREDENTIAL=(IDENTITY= '92761aac-12a9-4ec3-89b8-7749aef4c35b@https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47/oauth2/token', SECRET='juXi1OVZ6gf5]woKQNgqwSywYv]7A.M')
   )
   ```

> [!IMPORTANT]
>
> - Use the **V1** version of the OAuth 2.0 token endpoint.
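
The IDENTITY value is the application ID and the token endpoint joined by `@`. The Python sketch below assembles that string and rejects a V2 endpoint. It assumes that V2 endpoints can be recognized by the `/oauth2/v2.0/` path segment, whereas V1 endpoints end in `/oauth2/token` as in the commented example above; `build_identity` is a hypothetical helper, and the values in the usage lines are placeholders.

```python
def build_identity(application_id: str, token_endpoint: str) -> str:
    """Return '<application_ID>@<OAuth_2.0_Token_EndPoint>' for the CREDENTIAL clause.

    Assumption: a V2 endpoint contains '/oauth2/v2.0/' in its path.
    """
    if "/oauth2/v2.0/" in token_endpoint:
        raise ValueError("expected the V1 OAuth 2.0 token endpoint")
    return f"{application_id}@{token_endpoint}"


# Placeholder values for illustration only.
identity = build_identity(
    "<application_ID>",
    "https://login.microsoftonline.com/<tenant_ID>/oauth2/token",
)
```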

## Next steps

- Check the [COPY statement](https://docs.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest#syntax) article for the detailed syntax
- Check the [data loading overview](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/design-elt-data-loading#what-is-elt) article for loading best practices
