Commit 4b8d4fb
Merge pull request #229551 from dearandyxu/master
update SAP template
2 parents dec6c51 + a6136e5

File tree

1 file changed (+62 −13 lines)

articles/data-factory/solution-template-replicate-multiple-objects-sap-cdc.md

@@ -17,29 +17,78 @@ This article describes a solution template that you can use to replicate multipl
## About this solution template

This template reads an external control file in JSON format on your storage store. The control file contains your SAP ODP contexts, SAP ODP objects, and key columns from the SAP source system, as well as your containers, folders, and partitions from the Azure Data Lake Storage Gen2 destination store. The template then copies each SAP ODP object from the SAP system to Azure Data Lake Storage Gen2 in Delta format.

The template contains three activities:
- **Lookup** retrieves the SAP ODP objects list to be loaded and the destination store path from an external control file on your Azure Data Lake Storage Gen2 store.
- **ForEach** gets the SAP ODP objects list from the Lookup activity and iterates each object to the mapping dataflow activity.
- **Mapping dataflow** replicates each SAP ODP object from the SAP system to Azure Data Lake Storage Gen2 in Delta format. It does an initial full load on the first run and then automatically does incremental loads on subsequent runs, merging the changes into Azure Data Lake Storage Gen2 in Delta format.
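The merge behavior of the mapping dataflow can be sketched as a key-based upsert: change rows are matched against existing rows on the key columns, updating matches and inserting new rows. This is a minimal illustration of that semantics, not the dataflow implementation itself; the row dictionaries and column names are hypothetical.

```python
def merge_changes(existing, changes, key_columns):
    """Upsert change rows into existing rows, matching on key_columns."""
    # Index existing rows by their key tuple.
    merged = {tuple(row[k] for k in key_columns): row for row in existing}
    for row in changes:
        # A matching key updates the row; a new key inserts it.
        merged[tuple(row[k] for k in key_columns)] = row
    return list(merged.values())

# Hypothetical rows keyed on TABKEY, mirroring the sapKeyColumns setting.
existing = [{"TABKEY": "A", "VALUE": 1}, {"TABKEY": "B", "VALUE": 2}]
changes = [{"TABKEY": "B", "VALUE": 20}, {"TABKEY": "C", "VALUE": 3}]
rows = merge_changes(existing, changes, ["TABKEY"])
```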

An external control file in JSON format is required in this template. The schema for the control file is as follows:

- *checkPointKey* is your custom key to manage the checkpoint of your change data capture in ADF. You can get more details [here](concepts-change-data-capture.md#checkpoint).
- *sapContext* is your SAP ODP context from the source SAP system. You can get more details [here](sap-change-data-capture-prepare-linked-service-source-dataset.md#set-up-the-source-dataset).
- *sapObjectName* is your SAP ODP object name to be loaded from the SAP system. You can get more details [here](sap-change-data-capture-prepare-linked-service-source-dataset.md#set-up-the-source-dataset).
- *sapRunMode* determines how you want to load the SAP object. It can be fullLoad, incrementalLoad, or fullAndIncrementalLoad.
- *sapKeyColumns* are your key column names from SAP ODP objects, used to do the dedupe in the mapping dataflow.
- *sapPartitions* are lists of partition conditions leading to separate extraction processes in the connected SAP system.
- *deltaContainer* is your container name in Azure Data Lake Storage Gen2 as the destination store.
- *deltaFolder* is your folder name in Azure Data Lake Storage Gen2 as the destination store.
- *deltaKeyColumns* are your columns used to determine if a row from the source matches a row from the sink when you want to update or delete a row.
- *deltaPartition* is your column used to create partitions for each unique value in that column, writing data in Delta format to Azure Data Lake Storage Gen2 via the Spark cluster used by the mapping dataflow. You can get more details [here](concepts-data-flow-performance.md#key).

A sample control file is as follows:
```json
[
    {
        "checkPointKey": "cba2acf0-d5e2-4d84-a552-e0a059b6d320",
        "sapContext": "ABAP_CDS",
        "sapObjectName": "ZPERFCDPOS$F",
        "sapRunMode": "fullAndIncrementalLoad",
        "sapKeyColumns": [
            "TABKEY"
        ],
        "sapPartitions": [
            [
                {
                    "fieldName": "TEXTCASE",
                    "sign": "I",
                    "option": "EQ",
                    "low": "1"
                },
                {
                    "fieldName": "TEXTCASE",
                    "sign": "I",
                    "option": "EQ",
                    "low": "X"
                }
            ]
        ],
        "deltaContainer": "delta",
        "deltaFolder": "ZPERFCDPOS",
        "deltaKeyColumns": ["TABKEY"],
        "deltaPartition": "TEXTCASE",
        "stagingStorageFolder": "stagingcontainer/stagingfolder"
    },
    {
        "checkPointKey": "fgaeca7f-d3d4-406f-bb48-a17faa83f76c",
        "sapContext": "SAPI",
        "sapObjectName": "Z0131",
        "sapRunMode": "incrementalLoad",
        "sapKeyColumns": [
            "ID"
        ],
        "sapPartitions": [],
        "deltaContainer": "delta",
        "deltaFolder": "Z0131",
        "deltaKeyColumns": ["ID"],
        "deltaPartition": "COMPANY",
        "stagingStorageFolder": "stagingcontainer/stagingfolder"
    }
]
```
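Before uploading a control file like the one above, it can help to sanity-check it against the schema the template expects. This is a hypothetical validator sketch, not part of the template; the field list and function name are assumptions based on the schema described in this article.

```python
import json

# Fields the control file schema above describes for each entry.
REQUIRED_FIELDS = {
    "checkPointKey", "sapContext", "sapObjectName", "sapRunMode",
    "sapKeyColumns", "sapPartitions", "deltaContainer", "deltaFolder",
    "deltaKeyColumns", "deltaPartition",
}
RUN_MODES = {"fullLoad", "incrementalLoad", "fullAndIncrementalLoad"}

def validate_control_file(text):
    """Parse the control file and return its entries, raising on bad input."""
    entries = json.loads(text)
    for entry in entries:
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            raise ValueError(f"{entry.get('sapObjectName')}: missing {missing}")
        if entry["sapRunMode"] not in RUN_MODES:
            raise ValueError(f"unsupported sapRunMode: {entry['sapRunMode']}")
    return entries

# A trimmed-down entry matching the sample control file above.
sample = json.dumps([{
    "checkPointKey": "cba2acf0-d5e2-4d84-a552-e0a059b6d320",
    "sapContext": "ABAP_CDS", "sapObjectName": "ZPERFCDPOS$F",
    "sapRunMode": "fullAndIncrementalLoad", "sapKeyColumns": ["TABKEY"],
    "sapPartitions": [], "deltaContainer": "delta",
    "deltaFolder": "ZPERFCDPOS", "deltaKeyColumns": ["TABKEY"],
    "deltaPartition": "TEXTCASE",
}])
entries = validate_control_file(sample)
```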
## How to use this solution template

1. Create and upload a control file in JSON format to your Azure Data Lake Storage Gen2 destination store. The default container to store the control file is **demo**, and the default control file name is **SapToDeltaParameters.json**.

2. Go to the **Replicate multiple tables from SAP ODP to Azure Data Lake Storage Gen2 in Delta format** template and select it.
