Commit c2f0944

20250629 1557 COPY INTO update for Fabric
1 parent ed99e67 commit c2f0944

File tree

1 file changed (+32, -32 lines)


docs/t-sql/statements/copy-into-transact-sql.md

Lines changed: 32 additions & 32 deletions
@@ -1,11 +1,11 @@
 ---
-title: COPY INTO (Transact-SQL)
+title: "COPY INTO (Transact-SQL)"
 titleSuffix: Azure Synapse Analytics and Microsoft Fabric
 description: Use the COPY statement in Azure Synapse Analytics and Warehouse in Microsoft Fabric for loading from external storage accounts.
 author: WilliamDAssafMSFT
 ms.author: wiassaf
 ms.reviewer: procha, mikeray, fresantos
-ms.date: 02/26/2025
+ms.date: 06/29/2025
 ms.service: sql
 ms.subservice: t-sql
 ms.topic: reference
@@ -16,7 +16,7 @@ f1_keywords:
 - "LOAD"
 dev_langs:
 - "TSQL"
-monikerRange: "=azure-sqldw-latest||=fabric"
+monikerRange: "=azure-sqldw-latest || =fabric"
 ---
 # COPY INTO (Transact-SQL)

@@ -141,7 +141,7 @@ Multiple file locations can only be specified from the same storage account and
 
 *FILE_TYPE* specifies the format of the external data.
 
-- CSV: Specifies a comma-separated values file compliant to the [RFC 4180](https://tools.ietf.org/html/rfc4180) standard.
+- CSV: Specifies a comma-separated values file compliant to the [RFC 4180](https://datatracker.ietf.org/doc/html/rfc4180) standard.
 - PARQUET: Specifies a Parquet format.
 - ORC: Specifies an Optimized Row Columnar (ORC) format.
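For orientation outside the diff, a minimal sketch of how *FILE_TYPE* is passed to the statement; the table, account, and container names here are invented placeholders:

```sql
-- Hypothetical names; FILE_TYPE must match the actual format of the source files.
COPY INTO dbo.StagingSales
FROM 'https://mystorageaccount.blob.core.windows.net/mycontainer/sales/*.csv'
WITH (
    FILE_TYPE = 'CSV'  -- or 'PARQUET' / 'ORC' for those formats
);
```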

@@ -174,28 +174,28 @@ Multiple file locations can only be specified from the same storage account and
 
 - *IDENTITY: A constant with a value of 'Shared Access Signature'*
 - *SECRET: The* [*shared access signature*](/azure/storage/common/storage-sas-overview) *provides delegated access to resources in your storage account.*
-- Minimum permissions required: READ and LIST
+- Minimum permissions required: READ and LIST
 
 - Authenticating with [*Service Principals*](/azure/sql-data-warehouse/sql-data-warehouse-load-from-azure-data-lake-store#create-a-credential)
 
 - *IDENTITY: \<ClientID\>@<OAuth_2.0_Token_EndPoint>*
 - *SECRET: Microsoft Entra application service principal key*
-- Minimum RBAC roles required: Storage blob data contributor, Storage blob data owner, or Storage blob data reader
+- Minimum RBAC roles required: Storage blob data contributor, Storage blob data owner, or Storage blob data reader
 
 - Authenticating with Storage account key
 
 - *IDENTITY: A constant with a value of 'Storage Account Key'*
-- *SECRET: Storage account key*
+- *SECRET: Storage account key*
 
 - Authenticating with [Managed Identity](/azure/sql-data-warehouse/load-data-from-azure-blob-storage-using-polybase#authenticate-using-managed-identities-to-load-optional) (VNet Service Endpoints)
 
 - *IDENTITY: A constant with a value of 'Managed Identity'*
-- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra registered [logical server in Azure](/azure/azure-sql/database/logical-servers). When using a dedicated SQL pool (formerly SQL DW) that is not associated with a Synapse Workspace, this RBAC role is not required, but the managed identity requires Access Control List (ACL) permissions on the target objects to enable read access to the source files.
+- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra registered [logical server in Azure](/azure/azure-sql/database/logical-servers). When using a dedicated SQL pool (formerly SQL DW) that is not associated with a Synapse Workspace, this RBAC role is not required, but the managed identity requires Access Control List (ACL) permissions on the target objects to enable read access to the source files.
 
 - Authenticating with a Microsoft Entra user
 
 - *CREDENTIAL isn't required*
-- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra user
+- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra user
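As a concrete companion to the list above, a hedged sketch of the *CREDENTIAL* clause using a SAS secret; all names are placeholders, and the SAS token is supplied without the leading `?`:

```sql
-- Hypothetical storage URL; the SAS token value is elided.
COPY INTO dbo.StagingSales
FROM 'https://mystorageaccount.blob.core.windows.net/mycontainer/sales/'
WITH (
    FILE_TYPE = 'CSV',
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>')
);
```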

 #### *ERRORFILE = Directory Location*
 
@@ -218,23 +218,23 @@ If ERRORFILE has the full path of the storage account defined, then the ERRORFIL
 - Authenticating with Shared Access Signatures (SAS)
 - *IDENTITY: A constant with a value of 'Shared Access Signature'*
 - *SECRET: The* [*shared access signature*](/azure/storage/common/storage-sas-overview) *provides delegated access to resources in your storage account.*
-- Minimum permissions required: READ, LIST, WRITE, CREATE, DELETE
+- Minimum permissions required: READ, LIST, WRITE, CREATE, DELETE
 
 - Authenticating with [*Service Principals*](/azure/sql-data-warehouse/sql-data-warehouse-load-from-azure-data-lake-store#create-a-credential)
 - *IDENTITY: \<ClientID\>@<OAuth_2.0_Token_EndPoint>*
 - *SECRET: Microsoft Entra application service principal key*
-- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner
+- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner
 
 > [!NOTE]
 > Use the OAuth 2.0 token endpoint **V1**
 
 - Authenticating with [Managed Identity](/azure/sql-data-warehouse/load-data-from-azure-blob-storage-using-polybase#authenticate-using-managed-identities-to-load-optional) (VNet Service Endpoints)
 - *IDENTITY: A constant with a value of 'Managed Identity'*
-- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra registered SQL Database server
+- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra registered SQL Database server
 
 - Authenticating with a Microsoft Entra user
 - *CREDENTIAL isn't required*
-- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra user
+- Minimum RBAC roles required: Storage blob data contributor or Storage blob data owner for the Microsoft Entra user
 
 Using a storage account key with ERRORFILE_CREDENTIAL is not supported.
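Putting *ERRORFILE* and *ERRORFILE_CREDENTIAL* together, a hedged sketch with placeholder names; rejected rows land under the ERRORFILE directory:

```sql
-- Hypothetical names; MAXERRORS lets the load tolerate a few rejected rows,
-- which are written beneath the ERRORFILE location.
COPY INTO dbo.StagingSales
FROM 'https://mystorageaccount.blob.core.windows.net/mycontainer/sales/'
WITH (
    FILE_TYPE = 'CSV',
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>'),
    ERRORFILE = '/errorsfolder',
    ERRORFILE_CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>'),
    MAXERRORS = 10
);
```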

@@ -345,7 +345,7 @@ GRANT ALTER on SCHEMA::HR to [[email protected]];
 
 ## Remarks
 
-The COPY statement accepts only UTF-8 and UTF-16 valid characters for row data and command parameters. Source files or parameters (such as ROW TERMINATOR or FIELD TERMINATOR) that use invalid characters may be interpreted incorrectly by the COPY statement and cause unexpected results such as data corruption, or other failures. Make sure your source files and parameters are UTF-8 or UTF-16 compliant before you invoke the COPY statement.
+The COPY statement accepts only UTF-8 and UTF-16 valid characters for row data and command parameters. Source files or parameters (such as ROW TERMINATOR or FIELD TERMINATOR) that use invalid characters might be interpreted incorrectly by the COPY statement and cause unexpected results, such as data corruption or other failures. Make sure your source files and parameters are UTF-8 or UTF-16 compliant before you invoke the COPY statement.
 
 ## Examples

@@ -488,11 +488,11 @@ The COPY command has better performance depending on your workload.
 
 - Compressed files can't be split automatically. For best loading performance, consider splitting your input into multiple files when loading compressed CSVs.
 
-- Large uncompressed CSV files can be split and loaded in parallel automatically, so there's no need to manually split uncompressed CSV files in most cases. In certain cases where auto file splitting isn't feasible due to data characteristics, manually splitting large CSVs may still benefit performance.
+- Large uncompressed CSV files can be split and loaded in parallel automatically, so there's no need to manually split uncompressed CSV files in most cases. In certain cases where auto file splitting isn't feasible due to data characteristics, manually splitting large CSVs might still benefit performance.
 
 ### What is the file splitting guidance for the COPY command loading compressed CSV files?
 
-Guidance on the number of files is outlined in the following table. Once the recommended number of files are reached, you have better performance the larger the files. The number of files is determined by number of compute nodes multiplied by 60. For example, at 6000DWU we have 12 compute nodes and 12*60 = 720 partitions. For a simple file splitting experience, refer to [How to maximize COPY load throughput with file splits](https://techcommunity.microsoft.com/t5/azure-synapse-analytics/how-to-maximize-copy-load-throughput-with-file-splits/ba-p/1314474).
+Guidance on the number of files is outlined in the following table. Once the recommended number of files is reached, you get better performance with larger files. The number of files is determined by the number of compute nodes multiplied by 60. For example, at 6000DWU we have 12 compute nodes and 12*60 = 720 partitions. For a simple file splitting experience, refer to [How to maximize COPY load throughput with file splits](https://techcommunity.microsoft.com/blog/azuresynapseanalyticsblog/how-to-maximize-copy-load-throughput-with-file-splits/1314474).
 
 | DWU | #Files |
 | :---: | :---: |
@@ -523,7 +523,7 @@ There are no limitations on the number or size of files; however, for best perfo
 
 ### Are there any known issues with the COPY statement?
 
-If you have an Azure Synapse workspace that was created prior to December 7, 2020, you may run into a similar error message when authenticating using Managed Identity: `com.microsoft.sqlserver.jdbc.SQLServerException: Managed Service Identity has not been enabled on this server. Please enable Managed Service Identity and try again.`
+If you have an Azure Synapse workspace that was created prior to December 7, 2020, you might run into a similar error message when authenticating using Managed Identity: `com.microsoft.sqlserver.jdbc.SQLServerException: Managed Service Identity has not been enabled on this server. Please enable Managed Service Identity and try again.`
 
 Follow these steps to work around this issue by re-registering the workspace's managed identity:
 
@@ -534,7 +534,7 @@ Follow these steps to work around this issue by re-registering the workspace's m
 Select-AzSubscription -SubscriptionId <subscriptionId>
 Set-AzSqlServer -ResourceGroupName your-database-server-resourceGroup -ServerName your-SQL-servername -AssignIdentity
 ```
-
+
 ## Related content
 
 - [Loading overview with [!INCLUDE[ssazuresynapse-md](../../includes/ssazuresynapse-md.md)]](/azure/sql-data-warehouse/design-elt-data-loading)
@@ -670,7 +670,7 @@ To access files on Azure Data Lake Storage (ADLS) Gen2 and Azure Blob Storage lo
 
 *FILE_TYPE* specifies the format of the external data.
 
-- CSV: Specifies a comma-separated values file compliant to the [RFC 4180](https://tools.ietf.org/html/rfc4180) standard.
+- CSV: Specifies a comma-separated values file compliant to the [RFC 4180](https://datatracker.ietf.org/doc/html/rfc4180) standard.
 - PARQUET: Specifies a Parquet format.
 
 #### *CREDENTIAL (IDENTITY = '', SECRET = '')*
@@ -683,12 +683,12 @@ To access files on Azure Data Lake Storage (ADLS) Gen2 and Azure Blob Storage lo
 
 - *IDENTITY: A constant with a value of 'Shared Access Signature'*
 - *SECRET: The* [*shared access signature*](/azure/storage/common/storage-sas-overview) *provides delegated access to resources in your storage account.*
-- Minimum permissions required: READ and LIST
+- Minimum permissions required: READ and LIST
 
 - Authenticating with Storage Account Key
 
 - *IDENTITY: A constant with a value of 'Storage Account Key'*
-- *SECRET: Storage account key*
+- *SECRET: Storage account key*
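For the Fabric Warehouse variant, a hedged sketch authenticating with a storage account key; all names are invented and the key value is a placeholder:

```sql
-- Hypothetical names; SAS and storage account key are the CREDENTIAL
-- identities described in the list above.
COPY INTO dbo.StagingSales
FROM 'https://mystorageaccount.blob.core.windows.net/mycontainer/sales/*.parquet'
WITH (
    FILE_TYPE = 'PARQUET',
    CREDENTIAL = (IDENTITY = 'Storage Account Key', SECRET = '<storage-account-key>')
);
```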

 #### *ERRORFILE = Directory Location*

@@ -705,12 +705,12 @@ When using a firewall protected Azure Storage Account, the error file will be cr
 
 #### *ERRORFILE_CREDENTIAL = (IDENTITY= '', SECRET = '')*
 
-*ERRORFILE_CREDENTIAL* only applies to CSV files. On [!INCLUDE [fabricdw](../../includes/fabric-dw.md)] in [!INCLUDE [fabric](../../includes/fabric.md)], the only supported authentication mechanism is Shared Access Signature (SAS).
+*ERRORFILE_CREDENTIAL* only applies to CSV files. On [!INCLUDE [fabricdw](../../includes/fabric-dw.md)] in [!INCLUDE [fabric](../../includes/fabric.md)], the only supported authentication mechanism is Shared Access Signature (SAS).
 
 - Authenticating with Shared Access Signatures (SAS)
 - *IDENTITY: A constant with a value of 'Shared Access Signature'*
 - *SECRET: The* [*shared access signature*](/azure/storage/common/storage-sas-overview) *provides delegated access to resources in your storage account.*
-- Minimum permissions required: READ, LIST, WRITE, CREATE, DELETE
+- Minimum permissions required: READ, LIST, WRITE, CREATE, DELETE
 
 > [!NOTE]
 > If you are using the same storage account for your ERRORFILE and specifying the ERRORFILE path relative to the root of the container, you do not need to specify the ERROR_CREDENTIAL.
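To illustrate that note, a hedged sketch where ERRORFILE is given relative to the root of the same source container, so a separate error credential can be omitted; all names are placeholders:

```sql
-- Hypothetical names; the ERRORFILE path is relative to the source container,
-- so the COPY credential covers the error output as well.
COPY INTO dbo.StagingSales
FROM 'https://mystorageaccount.blob.core.windows.net/mycontainer/sales/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>'),
    ERRORFILE = '/rejectedrows'
);
```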
@@ -794,20 +794,21 @@ Parser version 1.0 is available for backward compatibility only, and should be u
 - If a value is not provided to a non-nullable column, the COPY command fails.
 - If *MATCH_COLUMN_COUNT* is `ON`:
 - The COPY command checks if the column count on each row in each file from the source matches the column count of the destination table.
-- If there is a column count mismatch, the COPY command fails.
-
+- If there is a column count mismatch, the COPY command fails.
+
 > [!NOTE]
 > *MATCH_COLUMN_COUNT* works independently from *MAXERRORS*. A column count mismatch causes `COPY INTO` to fail regardless of *MAXERRORS*.
 
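A hedged sketch of how this option might appear in a statement; the exact option spelling and value form should be verified against the full parameter list, and all object names are invented:

```sql
-- Assumed syntax, shown for illustration; parser version 2.0 is the
-- context in which MATCH_COLUMN_COUNT is described above.
COPY INTO dbo.StagingSales
FROM 'https://mystorageaccount.blob.core.windows.net/mycontainer/sales/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    PARSER_VERSION = '2.0',
    MATCH_COLUMN_COUNT = 'ON'  -- any column-count mismatch fails the load, regardless of MAXERRORS
);
```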
-## Using COPY INTO with OneLake (Public Preview)
-You can now use `COPY INTO` to load data directly from files stored in **OneLake**, specifically from the **Files folder** of a Lakehouse. This eliminates the need for external staging accounts (such as ADLS Gen2 or Blob Storage) and enables workspace-governed, SaaS-native ingestion using Fabric permissions.
+## Use COPY INTO with OneLake
+
+You can now use `COPY INTO` to load data directly from files stored in the Fabric OneLake, specifically from the **Files folder** of a Fabric Lakehouse. This eliminates the need for external staging accounts (such as ADLS Gen2 or Blob Storage) and enables workspace-governed, SaaS-native ingestion using Fabric permissions. This functionality supports:
 
-This functionality supports:
 - Reading from `Files` folders in Lakehouses
-- Workspace-to-Warehouse loads within the same tenant
+- Workspace-to-warehouse loads within the same tenant
 - Native identity enforcement using Microsoft Entra ID
+
 > [!NOTE]
-> This feature is currently in Public Preview.
+> This feature is currently in [preview](/fabric/fundamentals/preview).
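To make the OneLake source shape concrete, a hedged sketch; the workspace, lakehouse, and table names are invented, and the exact OneLake URL form should be confirmed against the Fabric ingestion docs:

```sql
-- Hypothetical workspace and lakehouse; no CREDENTIAL clause, because access
-- is enforced natively through Microsoft Entra ID and Fabric permissions.
COPY INTO dbo.StagingSales
FROM 'https://onelake.dfs.fabric.microsoft.com/MyWorkspace/MyLakehouse.Lakehouse/Files/sales/*.csv'
WITH (
    FILE_TYPE = 'CSV'
);
```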
 
 ## Permissions

@@ -837,7 +838,7 @@ GO
 
 ## Remarks
 
-The COPY statement accepts only UTF-8 and UTF-16 valid characters for row data and command parameters. Source files or parameters (such as ROW TERMINATOR or FIELD TERMINATOR) that use invalid characters may be interpreted incorrectly by the COPY statement and cause unexpected results such as data corruption, or other failures. Make sure your source files and parameters are UTF-8 or UTF-16 compliant before you invoke the COPY statement.
+The COPY statement accepts only UTF-8 and UTF-16 valid characters for row data and command parameters. Source files or parameters (such as `ROW TERMINATOR` or `FIELD TERMINATOR`) that use invalid characters might be interpreted incorrectly by the COPY statement and cause unexpected results, such as data corruption or other failures. Make sure your source files and parameters are UTF-8 or UTF-16 compliant before you invoke the COPY statement.
 
 ## Limitations for OneLake as source (Public Preview)

@@ -849,7 +850,6 @@ The COPY statement accepts only UTF-8 and UTF-16 valid characters for row data a
 
 - **Contributor permissions are required on both workspaces.** The executing user must have at least Contributor role on the source Lakehouse workspace and the target Warehouse workspace.
 
-
 ## Examples
 
 For more information on using COPY INTO on your [!INCLUDE [fabricdw](../../includes/fabric-dw.md)] in [!INCLUDE [fabric](../../includes/fabric.md)], see [Ingest data into your [!INCLUDE [fabricdw](../../includes/fabric-dw.md)] using the COPY statement](/fabric/data-warehouse/ingest-data-copy).
