
Commit 5222905

Merge pull request #303707 from Lucky-Wang16/0801-Update_Azure_PostgreSQL_script_upsert
Update Azure PostgreSQL Script activity and Upsert
2 parents 4ce6608 + 9e0bf6b commit 5222905

File tree: 2 files changed, +73 −17 lines

articles/data-factory/connector-azure-database-for-postgresql.md

Lines changed: 69 additions & 13 deletions
````diff
@@ -25,11 +25,12 @@ This connector is specialized for the [Azure Database for PostgreSQL service](/a
 
 This Azure Database for PostgreSQL connector is supported for the following capabilities:
 
-| Supported capabilities|IR | Managed private endpoint|
-|---------| --------| --------|
-|[Copy activity](copy-activity-overview.md) (source/sink)|① ②||
-|[Mapping data flow](concepts-data-flow-overview.md) (source/sink)|① ||
-|[Lookup activity](control-flow-lookup-activity.md)|① ②||
+| Supported capabilities | IR | Managed private endpoint | Connector supported versions |
+|---------| --------| --------| --------|
+|[Copy activity](copy-activity-overview.md) (source/sink)|① ②||1.0 & 2.0 |
+|[Mapping data flow](concepts-data-flow-overview.md) (source/sink)|① ||1.0 & 2.0 |
+|[Lookup activity](control-flow-lookup-activity.md)|① ②||1.0 & 2.0 |
+|[Script activity](transform-data-using-script.md)|① ②||2.0 |
 
 *① Azure integration runtime ② Self-hosted integration runtime*
````

````diff
@@ -474,17 +475,20 @@ To copy data from Azure Database for PostgreSQL, set the source type in the copy
 
 ### Azure Database for PostgreSQL as sink
 
-To copy data to Azure Database for PostgreSQL, the following properties are supported in the copy activity **sink** section:
+To copy data to Azure Database for PostgreSQL, set the sink type in the copy activity to **AzurePostgreSQLSink**. The following properties are supported in the copy activity **sink** section:
 
-| Property | Description | Required |
-|:--- |:--- |:--- |
-| type | The type property of the copy activity sink must be set to **AzurePostgreSqlSink**. | Yes |
-| preCopyScript | Specify a SQL query for the copy activity to execute before you write data into Azure Database for PostgreSQL in each run. You can use this property to clean up the preloaded data. | No |
-| writeMethod | The method used to write data into Azure Database for PostgreSQL.<br>Allowed values are: **CopyCommand** (default, which is more performant), **BulkInsert**. | No |
-| writeBatchSize | The number of rows loaded into Azure Database for PostgreSQL per batch.<br>Allowed value is an integer that represents the number of rows. | No (default is 1,000,000) |
-| writeBatchTimeout | Wait time for the batch insert operation to complete before it times out.<br>Allowed values are Timespan strings. An example is 00:30:00 (30 minutes). | No (default is 00:30:00) |
+| Property | Description | Required | Connector support version |
+|:--- |:--- |:--- |:--- |
+| type | The type property of the copy activity sink must be set to **AzurePostgreSQLSink**. | Yes | Version 1.0 & Version 2.0 |
+| preCopyScript | Specify a SQL query for the copy activity to execute before you write data into Azure Database for PostgreSQL in each run. You can use this property to clean up the preloaded data. | No | Version 1.0 & Version 2.0 |
+| writeMethod | The method used to write data into Azure Database for PostgreSQL.<br>Allowed values are: **CopyCommand** (default, which is more performant), **BulkInsert**, and **Upsert** (Version 2.0 only). | No | Version 1.0 & Version 2.0 |
+| upsertSettings | Specify the group of settings for write behavior.<br/>Applies when the `writeMethod` option is `Upsert`. | No | Version 2.0 |
+| ***Under `upsertSettings`:*** | | | |
+| keys | Specify the column names for unique row identification. Either a single key or a series of keys can be used. Keys must be a primary key or unique column. If not specified, the primary key is used. | No | Version 2.0 |
+| writeBatchSize | The number of rows loaded into Azure Database for PostgreSQL per batch.<br>Allowed value is an integer that represents the number of rows. | No (default is 1,000,000) | Version 1.0 & Version 2.0 |
+| writeBatchTimeout | Wait time for the batch insert operation to complete before it times out.<br>Allowed values are Timespan strings. An example is 00:30:00 (30 minutes). | No (default is 00:30:00) | Version 1.0 & Version 2.0 |
 
-**Example**:
+**Example 1: Copy Command**
 
 ```json
 "activities":[
````
````diff
@@ -518,6 +522,47 @@ To copy data to Azure Database for PostgreSQL, the following properties are supp
 ]
 ```
 
+**Example 2: Upsert data**
+
+```json
+"activities":[
+    {
+        "name": "CopyToAzureDatabaseForPostgreSQL",
+        "type": "Copy",
+        "inputs": [
+            {
+                "referenceName": "<input dataset name>",
+                "type": "DatasetReference"
+            }
+        ],
+        "outputs": [
+            {
+                "referenceName": "<Azure PostgreSQL output dataset name>",
+                "type": "DatasetReference"
+            }
+        ],
+        "typeProperties": {
+            "source": {
+                "type": "<source type>"
+            },
+            "sink": {
+                "type": "AzurePostgreSQLSink",
+                "writeMethod": "Upsert",
+                "upsertSettings": {
+                    "keys": [
+                        "<column name>"
+                    ]
+                }
+            }
+        }
+    }
+]
+```
+
+### Upsert data
+
+Copy activity natively supports upsert operations. To perform an upsert, provide key columns that are either primary keys or unique columns. If you don't provide key columns, the primary key columns of the sink table are used. Copy activity updates the non-key columns of sink table rows whose key column values match those in the source table; otherwise, it inserts the source data as new rows.
+
 ## Parallel copy from Azure Database for PostgreSQL
 
 The Azure Database for PostgreSQL connector in copy activity provides built-in data partitioning to copy data in parallel. You can find data partitioning options on the **Source** tab of the copy activity.
````
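Editorial aside, not part of the commit: the upsert rule added in the "### Upsert data" section (key match → update non-key columns, no match → insert) can be illustrated with a minimal Python sketch. The function name and the in-memory row representation are hypothetical; the connector performs this server-side.

```python
def upsert(sink_rows, source_rows, keys):
    """Mimic the copy activity's upsert rule: when a source row's key
    columns match an existing sink row, update the non-key columns;
    otherwise insert the source row as a new row."""
    index = {tuple(row[k] for k in keys): row for row in sink_rows}
    for src in source_rows:
        key = tuple(src[k] for k in keys)
        if key in index:
            # Key match: update only the non-key columns in place.
            index[key].update({c: v for c, v in src.items() if c not in keys})
        else:
            # No match: insert the source row.
            sink_rows.append(dict(src))
            index[key] = sink_rows[-1]
    return sink_rows

sink = [{"id": 1, "name": "old"}]
result = upsert(sink, [{"id": 1, "name": "new"}, {"id": 2, "name": "x"}], keys=["id"])
# result == [{"id": 1, "name": "new"}, {"id": 2, "name": "x"}]
```

The `keys` argument plays the role of `upsertSettings.keys`; when omitted in the real connector, the sink table's primary key columns are used instead.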
````diff
@@ -641,6 +686,17 @@ IncomingStream sink(allowSchemaDrift: true,
     skipDuplicateMapOutputs: true) ~> AzurePostgreSqlSink
 ```
 
+## Script activity
+
+> [!IMPORTANT]
+> Script activity is only supported in the version 2.0 connector.
+>
+> Multi-query statements using output parameters are not supported. It is recommended that you split any output queries into separate script blocks within the same or a different Script activity.
+>
+> Multi-query statements using positional parameters are not supported. It is recommended that you split any positional queries into separate script blocks within the same or a different Script activity.
+
+For more information about the Script activity, see [Script activity](transform-data-using-script.md).
+
 ## Lookup activity properties
 
 For more information about the properties, see [Lookup activity](control-flow-lookup-activity.md).
````
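Editorial aside, not part of the commit: the IMPORTANT note above recommends splitting multi-query statements into separate script blocks. A naive Python sketch of such a splitter follows; it handles only semicolon-separated statements and single-quoted string literals (no dollar-quoting, comments, or escaped quotes), and is purely illustrative, not part of Data Factory.

```python
def split_statements(script):
    """Naively split a SQL script into single-statement blocks on ';',
    skipping semicolons that appear inside single-quoted strings."""
    blocks, current, in_string = [], [], False
    for ch in script:
        if ch == "'":
            in_string = not in_string
        if ch == ";" and not in_string:
            stmt = "".join(current).strip()
            if stmt:
                blocks.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        blocks.append(tail)
    return blocks

# Each statement becomes its own block; the ';' inside the literal survives.
print(split_statements("SELECT 1; UPDATE t SET c = 'a;b'; DELETE FROM t"))
```

Each returned block could then be placed in its own script block within the same or a different Script activity, as the note advises.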

articles/data-factory/transform-data-using-script.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -6,7 +6,7 @@ ms.topic: conceptual
 author: nabhishek
 ms.author: abnarain
 ms.custom: synapse
-ms.date: 10/03/2024
+ms.date: 08/01/2025
 ms.subservice: orchestration
 ---
````

````diff
@@ -20,12 +20,12 @@ Using the script activity, you can execute common operations with Data Manipulat
 
 You can use the Script activity to invoke a SQL script in one of the following data stores in your enterprise or on an Azure virtual machine (VM):
 
+- Azure Database for PostgreSQL (Version 2.0)
 - Azure SQL Database
-- Azure Synapse Analytics
-- SQL Server Database. If you're using SQL Server, install Self-hosted integration runtime on the same machine that hosts the database or on a separate machine that has access to the database. Self-Hosted integration runtime is a component that connects data sources on-premises/on Azure VM with cloud services in a secure and managed way. See the [Self-hosted integration runtime](create-self-hosted-integration-runtime.md) article for details.
+- Azure Synapse Analytics
+- SQL Server Database. If you're using SQL Server, install Self-hosted integration runtime on the same machine that hosts the database or on a separate machine that has access to the database. Self-Hosted integration runtime is a component that connects data sources on-premises/on Azure VM with cloud services in a secure and managed way. See the [Self-hosted integration runtime](create-self-hosted-integration-runtime.md) article for details.
 - Oracle
 - Snowflake
-- Azure Database for PostgreSQL
 
 The script can contain either a single SQL statement or multiple SQL statements that run sequentially. You can use the Script task for the following purposes:
````