docs/t-sql/statements/copy-into-transact-sql.md (15 additions, 13 deletions)
@@ -5,7 +5,7 @@ description: Use the COPY statement in Azure Synapse Analytics and Warehouse in
 author: WilliamDAssafMSFT
 ms.author: wiassaf
 ms.reviewer: procha, mikeray, fresantos
-ms.date: 06/29/2025
+ms.date: 07/28/2025
 ms.service: sql
 ms.subservice: t-sql
 ms.topic: reference
@@ -85,7 +85,7 @@ WITH
 
 #### *schema_name*
 
-Optional if the default schema for the user performing the operation is the schema of the specified table. If *schema* isn't specified, and the default schema of the user performing the COPY operation is different from the schema of the specified table, COPY is canceled, and an error message is returned.
+Optional if the default schema for the user performing the operation is the schema of the specified table. If *schema* isn't specified, and the default schema of the user performing the COPY operation is different from the schema of the specified table, then the COPY operation is canceled and an error message is returned.
 
 #### *table_name*
 
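A minimal sketch of why the two-part name in this hunk matters; the schema, table, and storage URL below are hypothetical:

```sql
-- Explicit schema-qualified target: succeeds regardless of the caller's
-- default schema, so the cancellation described above cannot occur.
COPY INTO [sales].[Orders]
FROM 'https://myaccount.blob.core.windows.net/mycontainer/orders/*.csv'
WITH (
    FILE_TYPE = 'CSV'
);
```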
@@ -116,7 +116,7 @@ Is where the files containing the data is staged. Currently Azure Data Lake Stor
 - *External location* for ADLS Gen2: `https://<account\>.dfs.core.windows.net/<container\>/<path\>`
 
 > [!NOTE]
-> The .blob endpoint is available for ADLS Gen2 as well and currently yields the best performance. Use the .blob endpoint when .dfs is not required for your authentication method.
+> The `.blob` endpoint is available for ADLS Gen2 as well and currently yields the best performance. Use the `.blob` endpoint when `.dfs` is not required for your authentication method.
 
 - *Account* - The storage account name
 
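To illustrate the note in this hunk, a sketch of addressing an ADLS Gen2 account through the `.blob` endpoint; the account name, container, and the use of a managed identity credential are assumptions:

```sql
-- Same ADLS Gen2 account, addressed through the .blob endpoint for better
-- throughput; switch to .dfs only when the authentication method requires it.
COPY INTO dbo.Trips
FROM 'https://myadlsacct.blob.core.windows.net/landing/trips/2025/07/'
WITH (
    FILE_TYPE = 'PARQUET',
    CREDENTIAL = (IDENTITY = 'Managed Identity')
);
```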
@@ -128,7 +128,7 @@ Wildcards cards can be included in the path where
 
 - Wildcard path name matching is case-sensitive
 - Wildcard can be escaped using the backslash character (`\`)
-- Wildcard expansion is applied recursively. For instance, all CSV files under Customer1 (including subdirectories of Customer1) are loaded in the following example: `Account/Container/Customer1/*.csv`
+- Wildcard expansion is applied recursively. For instance, all CSV files under `Customer1` (including subdirectories of `Customer1`) are loaded in the following example: `Account/Container/Customer1/*.csv`
 
 > [!NOTE]
 > For best performance, avoid specifying wildcards that would expand over a larger number of files. If possible, list multiple file locations instead of specifying wildcards.
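A short sketch of both options from this hunk, using a hypothetical account, container, and target table:

```sql
-- Recursive wildcard: loads every CSV under Customer1, including subdirectories.
COPY INTO dbo.CustomerData
FROM 'https://myaccount.blob.core.windows.net/container/Customer1/*.csv'
WITH (FILE_TYPE = 'CSV');

-- Often faster over many files: enumerate locations instead of a wildcard.
COPY INTO dbo.CustomerData
FROM 'https://myaccount.blob.core.windows.net/container/Customer1/2025-06.csv',
     'https://myaccount.blob.core.windows.net/container/Customer1/2025-07.csv'
WITH (FILE_TYPE = 'CSV');
```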
@@ -260,9 +260,9 @@ When *FILE_TYPE* is 'PARQUET', exceptions that are caused by data type conversio
 
 The COPY command autodetects the compression type based on the file extension when this parameter isn't specified:
 
-- .gz - **GZIP**
-- .snappy - **Snappy**
-- .deflate - **DefaultCodec** (Parquet and ORC only)
+- `.gz` - **GZIP**
+- `.snappy` - **Snappy**
+- `.deflate` - **DefaultCodec** (Parquet and ORC only)
 
 The COPY command requires that gzip files do not contain any trailing garbage to operate normally. The gzip format strictly requires that files be composed of valid members without any additional information before, between, or after them. Any deviation from this format, such as the presence of trailing non-gzip data, will result in the failure of the COPY command. Make sure to verify there's no trailing garbage at the end of gzip files to ensure COPY can successfully process these files.
 
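A sketch contrasting autodetection with an explicit codec, per the list in this hunk; the URL and table are hypothetical:

```sql
-- The .gz extension would be autodetected as GZIP, but the codec can be
-- pinned explicitly when file extensions are missing or unreliable.
COPY INTO dbo.WebLogs
FROM 'https://myaccount.blob.core.windows.net/logs/2025/*.gz'
WITH (
    FILE_TYPE = 'CSV',
    COMPRESSION = 'GZIP'
);
```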
@@ -299,7 +299,7 @@ DATEFORMAT only applies to CSV and specifies the date format of the date mapping
 
 #### *IDENTITY_INSERT = 'ON' | 'OFF'*
 
-IDENTITY_INSERT specifies whether the identity value or values in the imported data file are to be used for the identity column. If IDENTITY_INSERT is OFF (default), the identity values for this column are verified, but not imported. Azure Synapse Analytics automatically assigns unique values based on the seed and increment values specified during table creation. Note the following behavior with the COPY command:
+IDENTITY_INSERT specifies whether the identity value or values in the imported data file are to be used for the identity column. If IDENTITY_INSERT is OFF (default), the identity values for this column are verified, but not imported. Note the following behavior with the COPY command:
 
 - If IDENTITY_INSERT is OFF, and table has an identity column
   - A column list must be specified which doesn't map an input field to the identity column.
@@ -308,6 +308,8 @@ IDENTITY_INSERT specifies whether the identity value or values in the imported d
 - Default value isn't supported for the IDENTITY COLUMN in the column list.
 - IDENTITY_INSERT can only be set for one table at a time.
 
+Azure Synapse Analytics automatically assigns unique values based on the seed and increment values specified during table creation.
+
 #### *AUTO_CREATE_TABLE = { 'ON' | 'OFF' }*
 
 *AUTO_CREATE_TABLE* specifies if the table could be automatically created by working alongside with automatic schema discovery. It's available only for Parquet files.
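Before the AUTO_CREATE_TABLE hunks, a sketch of the IDENTITY_INSERT rules from the two hunks above; `dbo.Orders` with identity column `OrderID`, the field numbers, and the file URLs are assumptions:

```sql
-- OFF (default): the column list omits the identity column OrderID;
-- the engine assigns values from the column's seed and increment.
COPY INTO dbo.Orders (OrderDate 1, Amount 2)
FROM 'https://myaccount.blob.core.windows.net/data/orders.csv'
WITH (FILE_TYPE = 'CSV');

-- ON: the column list must map an input field to the identity column,
-- so the file's OrderID values are loaded as-is.
COPY INTO dbo.Orders (OrderID 1, OrderDate 2, Amount 3)
FROM 'https://myaccount.blob.core.windows.net/data/orders_with_ids.csv'
WITH (FILE_TYPE = 'CSV', IDENTITY_INSERT = 'ON');
```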
@@ -318,9 +320,9 @@ IDENTITY_INSERT specifies whether the identity value or values in the imported d
 > [!NOTE]
 > The automatic table creation works alongside with automatic schema discovery. The automatic table creation is NOT enabled by default.
 
-Don't load into hash distributed tables from Parquet files using COPY INTO with AUTO_CREATE_TABLE = 'ON'.
+Don't load into hash distributed tables from Parquet files using `COPY INTO` with `AUTO_CREATE_TABLE = 'ON'`.
 
-If Parquet files are to be loaded into hash distributed tables using COPY INTO, load it into a round robin staging table followed by INSERT ... SELECT from that table to the target hash distributed table.
+If Parquet files are to be loaded into hash distributed tables using `COPY INTO`, load it into a round robin staging table followed by `INSERT ... SELECT` from that table to the target hash distributed table.
 
 ### Permissions
 
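A minimal sketch of the staging pattern the new line describes; `stage.Sales_RR`, the hash distributed target `dbo.Sales`, and the storage URL are hypothetical:

```sql
-- Stage Parquet into a round robin table shaped like the target.
CREATE TABLE stage.Sales_RR
WITH (DISTRIBUTION = ROUND_ROBIN)
AS SELECT * FROM dbo.Sales WHERE 1 = 0;  -- empty copy of the target's columns

COPY INTO stage.Sales_RR
FROM 'https://myaccount.blob.core.windows.net/curated/sales/*.parquet'
WITH (FILE_TYPE = 'PARQUET');

-- Then move the rows into the hash distributed target.
INSERT INTO dbo.Sales
SELECT * FROM stage.Sales_RR;
```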
@@ -637,7 +639,7 @@ Specifies where the files containing the data is staged. Currently Azure Data La
 Azure Data Lake Storage (ADLS) Gen2 offers better performance than Azure Blob Storage (legacy). Consider using an ADLS Gen2 account whenever possible.
 
 > [!NOTE]
-> The .blob endpoint is available for ADLS Gen2 as well and currently yields the best performance. Use the `blob` endpoint when `dfs` is not required for your authentication method.
+> The `.blob` endpoint is available for ADLS Gen2 as well and currently yields the best performance. Use the `blob` endpoint when `dfs` is not required for your authentication method.
 
 - *Account* - The storage account name
 
@@ -659,7 +661,7 @@ Multiple file locations can only be specified from the same storage account and
 
 **External locations behind firewall**
 
-To access files on Azure Data Lake Storage (ADLS) Gen2 and Azure Blob Storage locations that are behind a firewall, the following prerequisites are apply:
+To access files on Azure Data Lake Storage (ADLS) Gen2 and Azure Blob Storage locations that are behind a firewall, the following prerequisites apply:
 
 - A **workspace identity** for the workspace hosting your warehouse must be provisioned. For more information on how to set up a workspace identity, see [Workspace identity](/fabric/security/workspace-identity).
 - Your Entra ID account must be able to use the workspace identity.
@@ -730,7 +732,7 @@ In Microsoft Fabric, *MAXERRORS* cannot be used when *FILE_TYPE* is 'PARQUET'.
 
 The COPY command autodetects the compression type based on the file extension when this parameter isn't specified:
 
-- .gz - **GZIP**
+- `.gz` - **GZIP**
 
 Loading compressed files is currently only supported with *PARSER_VERSION* 1.0.
 
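A sketch of the Fabric constraint in this hunk, assuming *PARSER_VERSION* is set in the WITH list like the other options; the account, container, and table names are hypothetical:

```sql
-- Compressed CSV in Microsoft Fabric: pin the parser to 1.0, since the
-- text above states compressed files are only supported with that version.
COPY INTO dbo.ClickStream
FROM 'https://myaccount.blob.core.windows.net/raw/clicks/*.gz'
WITH (
    FILE_TYPE = 'CSV',
    PARSER_VERSION = '1.0'
);
```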