
Commit c4ec7a8

Merge pull request #263142 from jonburchel/2024-01-12-merge-public-prs
2024 01 12 merge public prs
2 parents 3820b35 + 7c8ac18 commit c4ec7a8

5 files changed: +81 -75 lines


articles/data-factory/connector-microsoft-fabric-lakehouse.md

Lines changed: 47 additions & 44 deletions
@@ -24,10 +24,13 @@ This article outlines how to use Copy activity to copy data from and to Microsof

This Microsoft Fabric Lakehouse connector is supported for the following capabilities:

-| Supported capabilities|IR |
-|---------| --------|
-|[Copy activity](copy-activity-overview.md) (source/sink)|① ②|
-|[Mapping data flow](concepts-data-flow-overview.md) (source/sink)|① |
+| Supported capabilities|IR | Managed private endpoint|
+|---------| --------| --------|
+|[Copy activity](copy-activity-overview.md) (source/sink)|① ②||
+|[Mapping data flow](concepts-data-flow-overview.md) (source/sink)|① |- |
+|[Lookup activity](control-flow-lookup-activity.md)|① ②||
+|[GetMetadata activity](control-flow-get-metadata-activity.md)|① ②||
+|[Delete activity](delete-activity.md)|① ②||

*① Azure integration runtime ② Self-hosted integration runtime*

@@ -39,7 +42,7 @@ This Microsoft Fabric Lakehouse connector is supported for the following capabil

Use the following steps to create a Microsoft Fabric Lakehouse linked service in the Azure portal UI.

-1. Browse to the Manage tab in your Azure Data Factory or Synapse workspace and select Linked Services, then select New:
+1. Browse to the **Manage** tab in your Azure Data Factory or Synapse workspace and select Linked Services, then select New:

# [Azure Data Factory](#tab/data-factory)
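
Not part of this commit, but for orientation: a minimal sketch of the linked service these UI steps create, expressed as JSON. The type name `Lakehouse` and the `workspaceId`, `artifactId`, `servicePrincipalId`, and `servicePrincipalKey` property names are assumptions based on the linked service section of this connector article, which this diff doesn't touch.

```json
{
    "name": "MicrosoftFabricLakehouseLinkedService",
    "properties": {
        "type": "Lakehouse",
        "typeProperties": {
            "workspaceId": "<Microsoft Fabric workspace ID>",
            "artifactId": "<Microsoft Fabric Lakehouse object ID>",
            "tenant": "<tenant name or ID>",
            "servicePrincipalId": "<service principal ID>",
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": "<service principal key>"
            }
        }
    }
}
```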

@@ -203,18 +206,18 @@ The following properties are supported for Microsoft Fabric Lakehouse Table data

```json
{
-    "name":"LakehouseTableDataset",
-    "properties":{
-        "type":"LakehouseTable",
-        "linkedServiceName":{
-            "referenceName":"<Microsoft Fabric Lakehouse linked service name>",
-            "type":"LinkedServiceReference"
-        },
-        "typeProperties":{
+    "name": "LakehouseTableDataset",
+    "properties": {
+        "type": "LakehouseTable",
+        "linkedServiceName": {
+            "referenceName": "<Microsoft Fabric Lakehouse linked service name>",
+            "type": "LinkedServiceReference"
+        },
+        "typeProperties": {
            "table": "<table_name>"
-        },
-        "schema":[< physical schema, optional, retrievable during authoring >]
-    }
+        },
+        "schema": [< physical schema, optional, retrievable during authoring >]
+    }
}
```
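
Outside this commit, a minimal sketch of how a pipeline Copy activity might write to a dataset like the one above. The `LakehouseTableSink` type and `tableActionOption` property are assumptions based on the copy activity sink section of the same connector article, not on this diff; the source type is left as a placeholder.

```json
{
    "name": "CopyToLakehouseTable",
    "type": "Copy",
    "inputs": [
        { "referenceName": "<source dataset name>", "type": "DatasetReference" }
    ],
    "outputs": [
        { "referenceName": "LakehouseTableDataset", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": { "type": "<source type>" },
        "sink": {
            "type": "LakehouseTableSink",
            "tableActionOption": "Append"
        }
    }
}
```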

@@ -541,39 +544,39 @@ The following properties are supported in the Mapping Data Flows **sink** sectio
| Update method | When you select "Allow insert" alone or when you write to a new delta table, the target receives all incoming rows regardless of the Row policies set. If your data contains rows of other Row policies, they need to be excluded using a preceding Filter transform. <br><br> When all Update methods are selected a Merge is performed, where rows are inserted/deleted/upserted/updated as per the Row Policies set using a preceding Alter Row transform. | yes | `true` or `false` | insertable <br> deletable <br> upsertable <br> updateable |
| Optimized Write | Achieve higher throughput for write operation via optimizing internal shuffle in Spark executors. As a result, you might notice fewer partitions and files that are of a larger size | no | `true` or `false` | optimizedWrite: true |
| Auto Compact | After any write operation has completed, Spark will automatically execute the ```OPTIMIZE``` command to reorganize the data, resulting in more partitions if necessary, for better reading performance in the future | no | `true` or `false` | autoCompact: true |
-| Merge Schema | Merge schemaoption allows schema evolution, that is, any columns that are present in the current incoming stream but not in the target Delta table is automatically added to its schema. This option is supported across all update methods. | no | `true` or `false` | mergeSchema: true |
+| Merge Schema | Merge schema option allows schema evolution, that is, any columns that are present in the current incoming stream but not in the target Delta table is automatically added to its schema. This option is supported across all update methods. | no | `true` or `false` | mergeSchema: true |

**Example: Microsoft Fabric Lakehouse Table sink**

```
sink(allowSchemaDrift: true,
-    validateSchema: false,
-    input(
-        CustomerID as string,
-        NameStyle as string,
-        Title as string,
-        FirstName as string,
-        MiddleName as string,
-        LastName as string,
-        Suffix as string,
-        CompanyName as string,
-        SalesPerson as string,
-        EmailAddress as string,
-        Phone as string,
-        PasswordHash as string,
-        PasswordSalt as string,
-        rowguid as string,
-        ModifiedDate as string
-    ),
-    deletable:false,
-    insertable:true,
-    updateable:false,
-    upsertable:false,
-    optimizedWrite: true,
-    mergeSchema: true,
-    autoCompact: true,
-    skipDuplicateMapInputs: true,
-    skipDuplicateMapOutputs: true) ~> CustomerTable
+    validateSchema: false,
+    input(
+        CustomerID as string,
+        NameStyle as string,
+        Title as string,
+        FirstName as string,
+        MiddleName as string,
+        LastName as string,
+        Suffix as string,
+        CompanyName as string,
+        SalesPerson as string,
+        EmailAddress as string,
+        Phone as string,
+        PasswordHash as string,
+        PasswordSalt as string,
+        rowguid as string,
+        ModifiedDate as string
+    ),
+    deletable:false,
+    insertable:true,
+    updateable:false,
+    upsertable:false,
+    optimizedWrite: true,
+    mergeSchema: true,
+    autoCompact: true,
+    skipDuplicateMapInputs: true,
+    skipDuplicateMapOutputs: true) ~> CustomerTable

```

articles/data-factory/control-flow-get-metadata-activity.md

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@ The Get Metadata activity takes a dataset as an input and returns metadata infor
| [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md) | √/√ | √/√ || x/x | √/√ || x ||| √/√ |
| [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md) | √/√ | √/√ || x/x | √/√ ||||| √/√ |
| [Azure Files](connector-azure-file-storage.md) | √/√ | √/√ || √/√ | √/√ || x ||| √/√ |
+| [Microsoft Fabric Lakehouse](connector-microsoft-fabric-lakehouse.md) | √/√ | √/√ || x/x | √/√ ||||| √/√ |
| [File system](connector-file-system.md) | √/√ | √/√ || √/√ | √/√ || x ||| √/√ |
| [SFTP](connector-sftp.md) | √/√ | √/√ || x/x | √/√ || x ||| √/√ |
| [FTP](connector-ftp.md) | √/√ | √/√ || x/x | x/x || x ||| √/√ |
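
Outside this commit, a minimal sketch of a Get Metadata activity targeting the newly listed Microsoft Fabric Lakehouse connector. The `fieldList` values shown are common metadata options for this activity and are assumptions here, not taken from the diff.

```json
{
    "name": "GetLakehouseFileMetadata",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "<Microsoft Fabric Lakehouse Files dataset name>",
            "type": "DatasetReference"
        },
        "fieldList": [ "childItems", "lastModified", "exists" ]
    }
}
```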

articles/data-factory/delete-activity.md

Lines changed: 28 additions & 27 deletions
@@ -15,7 +15,7 @@ ms.date: 07/17/2023

[!INCLUDE[appliesto-adf-asa-md](includes/appliesto-adf-asa-md.md)]

-You can use the Delete Activity in Azure Data Factory to delete files or folders from on-premises storage stores or cloud storage stores. Use this activity to clean up or archive files when they are no longer needed.
+You can use the Delete Activity in Azure Data Factory to delete files or folders from on-premises storage stores or cloud storage stores. Use this activity to clean up or archive files when they're no longer needed.

> [!WARNING]
> Deleted files or folders cannot be restored (unless the storage has soft-delete enabled). Be cautious when using the Delete activity to delete files or folders.
@@ -28,31 +28,32 @@ Here are some recommendations for using the Delete activity:

- Make sure that the service has write permissions to delete folders or files from the storage store.

-- Make sure you are not deleting files that are being written at the same time.
+- Make sure you aren't deleting files that are being written at the same time.

-- If you want to delete files or folder from an on-premises system, make sure you are using a self-hosted integration runtime with a version greater than 3.14.
+- If you want to delete files or folder from an on-premises system, make sure you're using a self-hosted integration runtime with a version greater than 3.14.

## Supported data stores

-- [Azure Blob storage](connector-azure-blob-storage.md)
-- [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md)
-- [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md)
-- [Azure Files](connector-azure-file-storage.md)
-- [File System](connector-file-system.md)
-- [FTP](connector-ftp.md)
-- [SFTP](connector-sftp.md)
-- [Amazon S3](connector-amazon-simple-storage-service.md)
-- [Amazon S3 Compatible Storage](connector-amazon-s3-compatible-storage.md)
-- [Google Cloud Storage](connector-google-cloud-storage.md)
-- [Oracle Cloud Storage](connector-oracle-cloud-storage.md)
-- [HDFS](connector-hdfs.md)
+- [Azure Blob storage](connector-azure-blob-storage.md)
+- [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md)
+- [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md)
+- [Azure Files](connector-azure-file-storage.md)
+- [File System](connector-file-system.md)
+- [FTP](connector-ftp.md)
+- [SFTP](connector-sftp.md)
+- [Microsoft Fabric Lakehouse](connector-microsoft-fabric-lakehouse.md)
+- [Amazon S3](connector-amazon-simple-storage-service.md)
+- [Amazon S3 Compatible Storage](connector-amazon-s3-compatible-storage.md)
+- [Google Cloud Storage](connector-google-cloud-storage.md)
+- [Oracle Cloud Storage](connector-oracle-cloud-storage.md)
+- [HDFS](connector-hdfs.md)

## Create a Delete activity with UI

To use a Delete activity in a pipeline, complete the following steps:

1. Search for _Delete_ in the pipeline Activities pane, and drag a Delete activity to the pipeline canvas.
-1. Select the new Delete activity on the canvas if it is not already selected, and its **Source** tab, to edit its details.
+1. Select the new Delete activity on the canvas if it isn't already selected, and its **Source** tab, to edit its details.

:::image type="content" source="media/delete-activity/delete-activity.png" alt-text="Shows the UI for a Delete activity.":::

@@ -97,10 +98,10 @@ To use a Delete activity in a pipeline, complete the following steps:
| dataset | Provides the dataset reference to determine which files or folder to be deleted | Yes |
| recursive | Indicates whether the files are deleted recursively from the subfolders or only from the specified folder. | No. The default is `false`. |
| maxConcurrentConnections | The number of the connections to connect to storage store concurrently for deleting folder or files. | No. The default is `1`. |
-| enablelogging | Indicates whether you need to record the folder or file names that have been deleted. If true, you need to further provide a storage account to save the log file, so that you can track the behaviors of the Delete activity by reading the log file. | No |
-| logStorageSettings | Only applicable when enablelogging = true.<br/><br/>A group of storage properties that can be specified where you want to save the log file containing the folder or file names that have been deleted by the Delete activity. | No |
-| linkedServiceName | Only applicable when enablelogging = true.<br/><br/>The linked service of [Azure Storage](connector-azure-blob-storage.md#linked-service-properties), [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md#linked-service-properties), or [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md#linked-service-properties) to store the log file that contains the folder or file names that have been deleted by the Delete activity. Be aware it must be configured with the same type of Integration Runtime from the one used by delete activity to delete files. | No |
-| path | Only applicable when enablelogging = true.<br/><br/>The path to save the log file in your storage account. If you do not provide a path, the service creates a container for you. | No |
+| enable logging | Indicates whether you need to record the deleted folder or file names. If true, you need to further provide a storage account to save the log file, so that you can track the behaviors of the Delete activity by reading the log file. | No |
+| logStorageSettings | Only applicable when enablelogging = true.<br/><br/>A group of storage properties that can be specified where you want to save the log file containing the folder or file names deleted by the Delete activity. | No |
+| linkedServiceName | Only applicable when enablelogging = true.<br/><br/>The linked service of [Azure Storage](connector-azure-blob-storage.md#linked-service-properties), [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md#linked-service-properties), or [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md#linked-service-properties) to store the log file that contains the folder or file names deleted by the Delete activity. Be aware it must be configured with the same type of Integration Runtime from the one used by delete activity to delete files. | No |
+| path | Only applicable when enablelogging = true.<br/><br/>The path to save the log file in your storage account. If you don't provide a path, the service creates a container for you. | No |

## Monitoring
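
Not part of this commit: a minimal sketch of how the logging-related properties in the table above might appear in a Delete activity definition, assuming the `enableLogging` / `logStorageSettings` JSON shape described elsewhere in this article.

```json
{
    "name": "DeleteOldFiles",
    "type": "Delete",
    "typeProperties": {
        "dataset": {
            "referenceName": "<dataset that points to the files or folder to delete>",
            "type": "DatasetReference"
        },
        "recursive": true,
        "maxConcurrentConnections": 1,
        "enableLogging": true,
        "logStorageSettings": {
            "linkedServiceName": {
                "referenceName": "<Azure Storage linked service for the log file>",
                "type": "LinkedServiceReference"
            },
            "path": "<container/folder to save the log file>"
        }
    }
}
```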

@@ -143,7 +144,7 @@ The store has the following folder structure:

Root/<br/>&nbsp;&nbsp;&nbsp;&nbsp;Folder_A_1/<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.txt<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2.txt<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;3.csv<br/>&nbsp;&nbsp;&nbsp;&nbsp;Folder_A_2/<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.txt<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.csv<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Folder_B_1/<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.txt<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;7.csv<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Folder_B_2/<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;8.txt

-Now you are using the Delete activity to delete folder or files by the combination of different property value from the dataset and the Delete activity:
+Now you're using the Delete activity to delete folder or files by the combination of different property value from the dataset and the Delete activity:

| folderPath | fileName | recursive | Output |
|:--- |:--- |:--- |:--- |
@@ -154,7 +155,7 @@ Now you are using the Delete activity to delete folder or files by the combinati

### Periodically clean up the time-partitioned folder or files

-You can create a pipeline to periodically clean up the time partitioned folder or files. For example, the folder structure is similar as: `/mycontainer/2018/12/14/*.csv`. You can leverage the service system variable from schedule trigger to identify which folder or files should be deleted in each pipeline run.
+You can create a pipeline to periodically clean up the time partitioned folder or files. For example, the folder structure is similar as: `/mycontainer/2018/12/14/*.csv`. You can use the service system variable from schedule trigger to identify which folder or files should be deleted in each pipeline run.

#### Sample pipeline

@@ -294,7 +295,7 @@ You can create a pipeline to periodically clean up the time partitioned folder o

### Clean up the expired files that were last modified before 2018.1.1

-You can create a pipeline to clean up the old or expired files by leveraging file attribute filter: LastModified in dataset.
+You can create a pipeline to clean up the old or expired files by using file attribute filter: "LastModified" in dataset.

#### Sample pipeline

@@ -375,7 +376,7 @@ You can create a pipeline to clean up the old or expired files by leveraging fil

### Move files by chaining the Copy activity and the Delete activity

-You can move a file by using a Copy activity to copy a file and then a Delete activity to delete a file in a pipeline. When you want to move multiple files, you can use the GetMetadata activity + Filter activity + Foreach activity + Copy activity + Delete activity as in the following sample.
+You can move a file by using a Copy activity to copy a file and then a Delete activity to delete a file in a pipeline. When you want to move multiple files, you can use the GetMetadata activity + Filter activity + Foreach activity + Copy activity + Delete activity as in the following sample.

> [!NOTE]
> If you want to move the entire folder by defining a dataset containing a folder path only, and then using a Copy activity and a Delete activity to reference to the same dataset representing a folder, you need to be very careful. You must ensure that there **will not** be any new files arriving into the folder between the copy operation and the delete operation. If new files arrive in the folder at the moment when your copy activity just completed the copy job but the Delete activity has not been started, then the Delete activity might delete the newly arriving file which has NOT been copied to the destination yet by deleting the entire folder.
@@ -771,12 +772,12 @@ You can also get the template to move files from [here](solution-template-move-f

## Known limitation

-- Delete activity does not support deleting list of folders described by wildcard.
+- Delete activity doesn't support deleting list of folders described by wildcard.

-- When using file attribute filter in delete activity: modifiedDatetimeStart and modifiedDatetimeEnd to select files to be deleted, make sure to set "wildcardFileName": "*" in delete activity as well.
+- When using file attribute filter in delete activity: modifiedDatetimeStart and modifiedDatetimeEnd to select files to be deleted, make sure to set "wildcardFileName": "*" in delete activity as well.

## Related content

Learn more about moving files in Azure Data Factory and Synapse pipelines.

-- [Copy Data tool](copy-data-tool.md)
+- [Copy Data tool](copy-data-tool.md)
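
Outside this commit, a minimal sketch of the file attribute filter mentioned in the known limitation above, assuming a blob-store `storeSettings` block. The `AzureBlobStorageReadSettings` type name and placeholder values are assumptions; `wildcardFileName`, `modifiedDatetimeStart`, and `modifiedDatetimeEnd` come from the limitation text.

```json
{
    "name": "DeleteExpiredFiles",
    "type": "Delete",
    "typeProperties": {
        "dataset": {
            "referenceName": "<file-based dataset name>",
            "type": "DatasetReference"
        },
        "storeSettings": {
            "type": "AzureBlobStorageReadSettings",
            "recursive": true,
            "wildcardFileName": "*",
            "modifiedDatetimeStart": "<start datetime, optional>",
            "modifiedDatetimeEnd": "2018-01-01T00:00:00Z"
        }
    }
}
```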
