
Commit 45297e7

Update parallel copy supported scenarios
1 parent 576bf40 commit 45297e7

File tree

5 files changed: +18 −16 lines


articles/data-factory/connector-oracle.md

Lines changed: 4 additions & 5 deletions
````diff
@@ -192,11 +192,10 @@ To copy data from and to Oracle, set the type property of the dataset to `Oracle
 
 This section provides a list of properties supported by the Oracle source and sink. For a full list of sections and properties available for defining activities, see [Pipelines](concepts-pipelines-activities.md).
 
-### Oracle as a source type
+### Oracle as source
 
-> [!TIP]
->
-> To load data from Oracle efficiently by using data partitioning, see [Parallel copy from Oracle](#parallel-copy-from-oracle).
+>[!TIP]
+>To load data from Oracle efficiently by using data partitioning, see [Parallel copy from Oracle](#parallel-copy-from-oracle).
 
 To copy data from Oracle, set the source type in the copy activity to `OracleSource`. The following properties are supported in the copy activity **source** section.
 
````
````diff
@@ -243,7 +242,7 @@ To copy data from Oracle, set the source type in the copy activity to `OracleSou
 ]
 ```
 
-### Oracle as a sink type
+### Oracle as sink
 
 To copy data to Oracle, set the sink type in the copy activity to `OracleSink`. The following properties are supported in the copy activity **sink** section.
 
````
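The tip in the hunk above points at the connector's new parallel-copy capability. As an illustration only, a copy activity source that enables Oracle data partitioning might be shaped like the following sketch; the `partitionOption` value and `partitionSettings` nesting are assumptions based on the connector's parallel-copy section, not text from this commit:

```json
"source": {
    "type": "OracleSource",
    "partitionOption": "PhysicalPartitionsOfTable",
    "partitionSettings": {
        "partitionNames": [
            "<physical partition name>"
        ]
    }
}
```

With partitioning enabled, the `parallelCopies` setting on the copy activity controls how many partitions are read concurrently.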
articles/data-factory/connector-sap-business-warehouse-open-hub.md

Lines changed: 5 additions & 2 deletions
````diff
@@ -12,7 +12,7 @@ ms.workload: data-services
 ms.tgt_pltfrm: na
 
 ms.topic: conceptual
-ms.date: 03/08/2019
+ms.date: 08/12/2019
 ms.author: jingwang
 
 ---
````
````diff
@@ -173,6 +173,8 @@ For a full list of sections and properties available for defining activities, se
 
 To copy data from SAP BW Open Hub, set the source type in the copy activity to **SapOpenHubSource**. There are no additional type-specific properties needed in the copy activity **source** section.
 
+To speed up the data loading, you can set [`parallelCopies`](copy-activity-performance.md#parallel-copy) on the copy activity to load data from SAP BW Open Hub in parallel. For example, if you set `parallelCopies` to four, Data Factory concurrently executes four RFC calls, and each RFC call retrieves a portion of data from your SAP BW Open Hub table, partitioned by the DTP request ID and package ID. This applies when the number of unique combinations of DTP request ID and package ID is greater than the value of `parallelCopies`.
+
 **Example:**
 
 ```json
````
````diff
@@ -198,7 +200,8 @@ To copy data from SAP BW Open Hub, set the source type in the copy activity to *
             },
             "sink": {
                 "type": "<sink type>"
-            }
+            },
+            "parallelCopies": 4
         }
     }
 ]
````
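Assembled from the two hunks above, the relevant part of the copy activity reads as follows (a sketch; the sink type is a placeholder, and dataset references are elided):

```json
"typeProperties": {
    "source": {
        "type": "SapOpenHubSource"
    },
    "sink": {
        "type": "<sink type>"
    },
    "parallelCopies": 4
}
```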

articles/data-factory/connector-sap-table.md

Lines changed: 5 additions & 4 deletions
````diff
@@ -12,7 +12,7 @@ ms.workload: data-services
 ms.tgt_pltfrm: na
 
 ms.topic: conceptual
-ms.date: 08/01/2018
+ms.date: 08/12/2019
 ms.author: jingwang
 
 ---
````
````diff
@@ -198,7 +198,7 @@ To copy data from and to the SAP BW Open Hub linked service, the following prope
 
 For a full list of the sections and properties for defining activities, see [Pipelines](concepts-pipelines-activities.md). The following section provides a list of the properties supported by the SAP table source.
 
-### SAP table as a source
+### SAP table as source
 
 To copy data from an SAP table, the following properties are supported:
 
````
````diff
@@ -220,7 +220,7 @@ To copy data from an SAP table, the following properties are supported:
 <br/>
 >Taking `partitionOption` as `partitionOnInt` as an example, the number of rows in each partition is calculated with this formula: (total rows falling between `partitionUpperBound` and `partitionLowerBound`)/`maxPartitionsNumber`.<br/>
 <br/>
->To run partitions in parallel to speed up copying, we strongly recommend making `maxPartitionsNumber` a multiple of the value of the `parallelCopies` property. For more information, see [Parallel copy](copy-activity-performance.md#parallel-copy).
+>When you load data partitions in parallel to speed up the copy, the degree of parallelism is controlled by the [`parallelCopies`](copy-activity-performance.md#parallel-copy) setting on the copy activity. For example, if you set `parallelCopies` to four, Data Factory concurrently generates and runs four queries based on your specified partition option and settings, and each query retrieves a portion of data from your SAP table. We strongly recommend making `maxPartitionsNumber` a multiple of the value of the `parallelCopies` property.
 
 In `rfcTableOptions`, you can use the following common SAP query operators to filter the rows:
 
````
````diff
@@ -266,7 +266,8 @@ In `rfcTableOptions`, you can use the following common SAP query operators to fi
             },
             "sink": {
                 "type": "<sink type>"
-            }
+            },
+            "parallelCopies": 4
         }
     }
 ]
````
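Following the recommendation in the tip above, a source configuration that pairs integer-range partitioning with parallel copy could be sketched like this, with `maxPartitionsNumber` (8) a multiple of `parallelCopies` (4). The property nesting, column placeholder, and bound values are illustrative assumptions:

```json
"typeProperties": {
    "source": {
        "type": "SapTableSource",
        "partitionOption": "partitionOnInt",
        "partitionColumnName": "<column with integer values>",
        "partitionLowerBound": "1",
        "partitionUpperBound": "80000",
        "maxPartitionsNumber": 8
    },
    "sink": {
        "type": "<sink type>"
    },
    "parallelCopies": 4
}
```

Here each of the 8 partitions covers roughly (80000 − 1)/8 ≈ 10,000 rows, and Data Factory reads 4 partitions at a time.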

articles/data-factory/connector-teradata.md

Lines changed: 3 additions & 4 deletions
````diff
@@ -183,11 +183,10 @@ To copy data from Teradata, the following properties are supported:
 
 This section provides a list of properties supported by Teradata source. For a full list of sections and properties available for defining activities, see [Pipelines](concepts-pipelines-activities.md).
 
-### Teradata as a source type
+### Teradata as source
 
-> [!TIP]
->
-> To load data from Teradata efficiently by using data partitioning, see the [Parallel copy from Teradata](#parallel-copy-from-teradata) section.
+>[!TIP]
+>To load data from Teradata efficiently by using data partitioning, see the [Parallel copy from Teradata](#parallel-copy-from-teradata) section.
 
 To copy data from Teradata, the following properties are supported in the copy activity **source** section:
 
````
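For context, the parallel-copy section that the tip references combines a partition option on the source with the `parallelCopies` setting. A hash-partitioned Teradata source might be sketched as follows; the `partitionOption` value and `partitionSettings` shape are assumptions, so confirm them against the connector article:

```json
"source": {
    "type": "TeradataSource",
    "partitionOption": "Hash",
    "partitionSettings": {
        "partitionColumnName": "<column for hash partitioning>"
    }
}
```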
articles/data-factory/copy-activity-performance.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -160,7 +160,7 @@ To control the load on machines that host your data stores, or to tune copy perf
 **Points to note:**
 
 * When you copy data between file-based stores, **parallelCopies** determines the parallelism at the file level. The chunking within a single file happens underneath automatically and transparently. It's designed to use the best suitable chunk size for a given source data store type to load data in parallel and orthogonal to **parallelCopies**. The actual number of parallel copies the data movement service uses for the copy operation at run time is no more than the number of files you have. If the copy behavior is **mergeFile**, the copy activity can't take advantage of file-level parallelism.
-* When you copy data from stores that aren't file-based (except Oracle database as source with data partitioning enabled) to stores that are file-based, the data movement service ignores the **parallelCopies** property. Even if parallelism is specified, it's not applied in this case.
+* When you copy data from stores that are not file-based (except the [Oracle](connector-oracle.md#oracle-as-source), [Teradata](connector-teradata.md#teradata-as-source), [SAP Table](connector-sap-table.md#sap-table-as-source), and [SAP BW Open Hub](connector-sap-business-warehouse-open-hub.md#sap-bw-open-hub-as-source) connectors as sources with data partitioning enabled) to stores that are file-based, the data movement service ignores the **parallelCopies** property. Even if parallelism is specified, it's not applied in this case.
 * The **parallelCopies** property is orthogonal to **dataIntegrationUnits**. The former is counted across all the Data Integration Units.
 * When you specify a value for the **parallelCopies** property, consider the load increase on your source and sink data stores. Also consider the load increase to the self-hosted integration runtime if the copy activity is empowered by it, for example, for hybrid copy. This load increase happens especially when you have multiple activities or concurrent runs of the same activities that run against the same data store. If you notice that either the data store or the self-hosted integration runtime is overwhelmed with the load, decrease the **parallelCopies** value to relieve the load.
````
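The bullets above distinguish `parallelCopies` from `dataIntegrationUnits`. Since both are set on the copy activity, a combined configuration can be sketched as follows (the values and placeholder types are illustrative):

```json
"typeProperties": {
    "source": { "type": "<source type>" },
    "sink": { "type": "<sink type>" },
    "parallelCopies": 8,
    "dataIntegrationUnits": 16
}
```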
