Commit c525390

Clare Zheng (Shanghai Wicresoft Co Ltd) authored and committed
Add Iceberg format doc
1 parent d87c04a commit c525390

3 files changed, +95 -1 lines changed

articles/data-factory/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -492,6 +492,8 @@ items:
492492
displayName: timeout
493493
- name: HubSpot
494494
href: connector-hubspot.md
495+
- name: Iceberg format
496+
href: format-iceberg.md
495497
- name: Impala
496498
href: connector-impala.md
497499
- name: Informix

articles/data-factory/connector-azure-data-lake-storage.md

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,7 @@ author: jianleishen
77
ms.subservice: data-movement
88
ms.topic: conceptual
99
ms.custom: synapse
10-
ms.date: 01/05/2024
10+
ms.date: 09/12/2024
1111
---
1212

1313
# Copy and transform data in Azure Data Lake Storage Gen2 using Azure Data Factory or Azure Synapse Analytics
@@ -428,6 +428,7 @@ For a full list of sections and properties available for defining activities, se
428428
### Azure Data Lake Storage Gen2 as a source type
429429

430430
[!INCLUDE [data-factory-v2-file-formats](includes/data-factory-v2-file-formats.md)]
431+
- [Iceberg format](format-iceberg.md)
431432

432433
You have several options to copy data from ADLS Gen2:
433434

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
1+
---
2+
title: Iceberg format in Azure Data Factory
3+
titleSuffix: Azure Data Factory & Azure Synapse
4+
description: This article describes how to write data in Iceberg format in Azure Data Factory and Azure Synapse Analytics.
5+
author: jianleishen
6+
ms.subservice: data-movement
7+
ms.custom: synapse
8+
ms.topic: conceptual
9+
ms.date: 09/12/2024
10+
ms.author: jianleishen
11+
---
12+
13+
# Iceberg format in Azure Data Factory and Azure Synapse Analytics
14+
15+
[!INCLUDE[appliesto-adf-asa-md](includes/appliesto-adf-asa-md.md)]
16+
17+
Follow this article when you want to **write data in Iceberg format**.
18+
19+
Iceberg format is supported for the following connectors:
20+
21+
- [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md)
22+
23+
## Dataset properties
24+
25+
For a full list of sections and properties available for defining datasets, see the [Datasets](concepts-datasets-linked-services.md) article. This section provides a list of properties supported by the Iceberg format dataset.
26+
27+
| Property | Description | Required |
28+
| ---------------- | ------------------------------------------------------------ | -------- |
29+
| type | The type property of the dataset must be set to **Iceberg**. | Yes |
30+
| location | Location settings of the file(s). Each file-based connector has its own location type and supported properties under `location`. | Yes |
31+
32+
Below is an example of an Iceberg dataset on Azure Data Lake Storage Gen2:
33+
34+
```json
35+
{
36+
"name": "IcebergDataset",
37+
"properties": {
38+
"type": "Iceberg",
39+
"linkedServiceName": {
40+
"referenceName": "<Azure Data Lake Storage Gen2 linked service name>",
41+
"type": "LinkedServiceReference"
42+
},
43+
"schema": [ < physical schema, optional, retrievable during authoring >
44+
],
45+
"typeProperties": {
46+
"location": {
47+
"type": "AzureBlobFSLocation",
48+
"fileSystem": "filesystemname",
49+
"folderPath": "folder/subfolder"
50+
}
51+
}
52+
}
53+
}
54+
55+
```
56+
57+
## Copy activity properties
58+
59+
For a full list of sections and properties available for defining activities, see the [Pipelines](concepts-pipelines-activities.md) article. This section provides a list of properties supported by the Iceberg sink.
60+
61+
### Iceberg as sink
62+
63+
The following properties are supported in the copy activity **sink** section.
64+
65+
| Property | Description | Required |
66+
| -------------- | ------------------------------------------------------------ | -------- |
67+
| type | The type property of the copy activity sink must be set to **IcebergSink**. | Yes |
68+
| formatSettings | A group of properties. Refer to the **Iceberg write settings** table below. | No |
69+
| storeSettings | A group of properties on how to write data to a data store. Each file-based connector has its own supported write settings under `storeSettings`. | No |
70+
71+
Supported **Iceberg write settings** under `formatSettings`:
72+
73+
| Property | Description | Required |
74+
| ------------- | ------------------------------------------------------------ | ----------------------------------------------------- |
75+
| type | The type of formatSettings must be set to **IcebergWriteSettings**. | Yes |
76+
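The sink properties in the tables above might combine as in the following copy activity. This is a minimal sketch, not a definition taken from the commit: the activity name `CopyToIceberg` and the angle-bracket placeholders are hypothetical, `IcebergDataset` refers to the dataset example earlier in this article, and `AzureBlobFSWriteSettings` is assumed to be the `storeSettings` type because the dataset is located on Azure Data Lake Storage Gen2.

```json
{
    "name": "CopyToIceberg",
    "type": "Copy",
    "inputs": [
        {
            "referenceName": "<source dataset name>",
            "type": "DatasetReference"
        }
    ],
    "outputs": [
        {
            "referenceName": "IcebergDataset",
            "type": "DatasetReference"
        }
    ],
    "typeProperties": {
        "source": {
            "type": "<source type>"
        },
        "sink": {
            "type": "IcebergSink",
            "formatSettings": {
                "type": "IcebergWriteSettings"
            },
            "storeSettings": {
                "type": "AzureBlobFSWriteSettings"
            }
        }
    }
}
```

Only `type` is required under `sink`. `formatSettings` and `storeSettings` are optional and are shown here to illustrate where each group of properties sits.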
77+
## Related connectors and formats
78+
79+
Here are some common connectors and formats related to the Iceberg format:
80+
81+
- [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md)
82+
- [Binary format](format-binary.md)
83+
- [Delta format](format-delta.md)
84+
- [Excel format](format-excel.md)
85+
- [JSON format](format-json.md)
86+
- [Parquet format](format-parquet.md)
87+
88+
## Related content
89+
90+
- [Data type mapping in dataset schemas](copy-activity-schema-and-type-mapping.md#data-type-mapping)
91+
- [Copy activity overview](copy-activity-overview.md)
