Commit cd20af1

committed
edits
1 parent f12638f commit cd20af1

5 files changed

+43
-39
lines changed

articles/data-factory/sap-change-data-capture-debug-shir-logs.md

Lines changed: 1 addition & 1 deletion
@@ -34,4 +34,4 @@ After you've uploaded and sent your self-hosted integration runtime logs, contac
 
 ## Next steps
 
-[Auto-generate a pipeline by using the SAP ODP data partitioning template](sap-change-data-capture-data-partitioning-template.md)
+[Auto-generate a pipeline by using the SAP data partitioning template](sap-change-data-capture-data-partitioning-template.md)
Lines changed: 21 additions & 17 deletions
@@ -1,5 +1,5 @@
 ---
-title: Overview and architecture of the SAP CDC solution
+title: Overview and architecture of the SAP CDC solution (preview)
 titleSuffix: Azure Data Factory
 description: Learn about the SAP change data capture (CDC) solution (preview) in Azure Data Factory and understand its architecture.
 author: ukchrist
@@ -14,50 +14,54 @@ ms.author: ulrichchrist
 
 [!INCLUDE[appliesto-adf-asa-md](includes/appliesto-adf-asa-md.md)]
 
-This article introduces and describes the architecture of the SAP change data capture (CDC) solution (preview) in Azure Data Factory.
+Learn about the SAP change data capture (CDC) solution (preview) in Azure Data Factory and understand its architecture.
 
-Azure Data Factory is a data integration (ETL and ELT) platform as a service (PaaS). For SAP data integration, Data Factory currently offers six connectors:
+Azure Data Factory is an ETL and ELT data integration platform as a service (PaaS). For SAP data integration, Data Factory currently offers six general availability connectors:
 
 :::image type="content" source="media/sap-change-data-capture-solution/sap-supported-cdc-connectors.png" alt-text="Screenshot of the six general availability connectors for SAP systems in Data Factory.":::
 
+## Data extraction needs
+
 The SAP connectors in Data Factory extract SAP source data only in batches. Each batch processes existing and new data the same. In data extraction in batch mode, changes between existing and new datasets aren't identified. This type of extraction mode isn’t optimal when you have large datasets like tables that have millions or billions of records that change often.
 
-You can keep your copy of SAP data fresh and up-to-date by frequently extracting the full dataset, but this approach is expensive and inefficient. You also can use a manual, limited workaround to extract mostly new or updated records. In a process called *watermarking*, extraction requires using a timestamp column, monotonously increasing values, and continuously tracking the highest value since the last extraction. Some tables don't have a column that you can use for watermarking. This process also doesn't identify a deleted record as a change in the dataset.
+You can keep your copy of SAP data fresh and up-to-date by frequently extracting the full dataset, but this approach is expensive and inefficient. You also can use a manual, limited workaround to extract mostly new or updated records. In a process called *watermarking*, extraction requires using a timestamp column, monotonously increasing values, and continuously tracking the highest value since the last extraction. But some tables don't have a column that you can use for watermarking. This process also doesn't identify a deleted record as a change in the dataset.
+
+## The SAP CDC solution
 
-Microsoft customers indicate that they need a connector that can extract only the delta between two sets of data. In data, a *delta* is any change in a dataset that's the result of an update, insert, or deletion in the data. A delta extraction connector uses the [SAP change data capture (CDC) feature](https://help.sap.com/docs/SAP_DATA_SERVICES/ec06fadc50b64b6184f835e4f0e1f52f/1752bddf523c45f18ce305ac3bcd7e08.html?q=change%20data%20capture) that exists in most SAP systems to determine the delta in a dataset. The SAP CDC solution in Data Factory uses the SAP Operational Data Provisioning (ODP) framework to replicate the delta in an SAP source dataset.
+Microsoft customers indicate that they need a connector that can extract only the delta between two sets of data. In data, a *delta* is any change in a dataset that's the result of an update, insert, or deletion in the dataset. A delta extraction connector uses the [SAP change data capture (CDC) feature](https://help.sap.com/docs/SAP_DATA_SERVICES/ec06fadc50b64b6184f835e4f0e1f52f/1752bddf523c45f18ce305ac3bcd7e08.html?q=change%20data%20capture) that exists in most SAP systems to determine the delta in a dataset. The SAP CDC solution in Data Factory uses the SAP Operational Data Provisioning (ODP) framework to replicate the delta in an SAP source dataset.
 
-This article provides a high-level architecture of the SAP CDC solution in Azure Data Factory. For more information about the SAP CDC solution, see:
+This article provides a high-level architecture of the SAP CDC solution in Azure Data Factory. Get more information about the SAP CDC solution:
 
 - [Prerequisites and setup](sap-change-data-capture-prerequisites-configuration.md)
 - [Set up a self-hosted integration runtime](sap-change-data-capture-shir-preparation.md)
 - [Set up a linked service and source dataset](sap-change-data-capture-prepare-linked-service-source-dataset.md)
-- [Use the SAP ODP data extraction template](sap-change-data-capture-data-replication-template.md)
-- [Use the SAP ODP data partition template](sap-change-data-capture-data-partitioning-template.md)
-- [Manage your SAP CDC solution](sap-change-data-capture-management.md)
+- [Use the SAP data extraction template](sap-change-data-capture-data-replication-template.md)
+- [Use the SAP data partition template](sap-change-data-capture-data-partitioning-template.md)
+- [Manage the solution](sap-change-data-capture-management.md)
 
 ## How to use the SAP CDC solution
 
-The SAP CDC solution consists of a connector that you access through the SAP ODP (preview) linked service, SAP source dataset, and the SAP ODP data replication template or SAP ODP data partitioning template. Choose the template to use when you set up a new pipeline in Azure Data Factory Studio. To access preview templates, you must [enable the preview experience in Azure Data Factory Studio](how-to-manage-studio-preview-exp.md#how-to-enabledisable-preview-experience).
+The SAP CDC solution is a connector that you access through an SAP ODP (preview) linked service, an SAP ODP source dataset, and the SAP data replication template or the SAP data partitioning template. Choose your template when you set up a new pipeline in Azure Data Factory Studio. To access preview templates, you must [enable the preview experience in Azure Data Factory Studio](how-to-manage-studio-preview-exp.md#how-to-enabledisable-preview-experience).
 
-The SAP ODP connector connects to all SAP systems that support ODP, including SAP R/3, SAP ECC, SAP S/4HANA, SAP BW, and SAP BW/4HANA. The connector works either directly at the application layer or indirectly via an SAP Landscape Transformation Replication Server (SLT) as a proxy. Without relying on watermarking, it can extract SAP data either fully or incrementally. The data the connector extracts includes not only physical tables, but also logical objects that are created by using the tables. An example of a table-based object is an SAP Advanced Business Application Programming (ABAP) Core Data Services (CDS) view.
+The SAP CDC solution connects to all SAP systems that support ODP, including SAP R/3, SAP ECC, SAP S/4HANA, SAP BW, and SAP BW/4HANA. The solution works either directly at the application layer or indirectly via an SAP Landscape Transformation Replication Server (SLT) as a proxy. Without relying on watermarking, it can extract SAP data either fully or incrementally. The data the SAP CDC solution extracts includes not only physical tables but also logical objects that are created by using the tables. An example of a table-based object is an SAP Advanced Business Application Programming (ABAP) Core Data Services (CDS) view.
 
-Use the SAP CDC preview solution with Data Factory features like copy and data flow activities, pipeline templates, and tumbling window triggers for a low-latency SAP CDC replication solution in a self-managed pipeline.
+Use the SAP CDC solution with Data Factory features like copy activities and data flow activities, pipeline templates, and tumbling window triggers for a low-latency SAP CDC replication solution in a self-managed pipeline.
 
-## SAP CDC solution architecture
+## The SAP CDC solution architecture
 
 The SAP CDC solution in Azure Data Factory is a connector between SAP and Azure. The SAP side includes the SAP ODP connector that invokes the ODP API over standard Remote Function Call (RFC) modules to extract full and delta raw SAP data.
 
 The Azure side includes the Data Factory copy activity that loads the raw SAP data into a storage destination like Azure Blob Storage or Azure Data Lake Storage Gen2. The data is saved in CSV or Parquet format, essentially archiving or preserving all historical changes.
 
 The Azure side also might include a Data Factory data flow activity that transforms the raw SAP data, merges all changes, and loads the results in a destination like Azure SQL Database or Azure Synapse Analytics, essentially replicating the SAP data. The Data Factory data flow activity also can load the results in Data Lake Storage Gen2 in delta format. You can use time travel capabilities to produce snapshots of SAP data at any specific period in the past.
 
-In Azure Data Factory Studio, the SAP ODP template that you use to auto-generate a Data Factory pipeline connects SAP with Azure. You can run the pipeline frequently by using a Data Factory tumbling window trigger to replicate SAP data in Azure with low latency and without using watermarking.
+In Azure Data Factory Studio, the SAP template that you use to auto-generate a Data Factory pipeline connects SAP with Azure. You can run the pipeline frequently by using a Data Factory tumbling window trigger to replicate SAP data in Azure with low latency and without using watermarking.
 
 :::image type="content" source="media/sap-change-data-capture-solution/sap-cdc-architecture-diagram.png" border="false" alt-text="Diagram of the architecture of the SAP CDC solution.":::
 
-To get started, create a Data Factory copy activity by using an SAP ODP linked service, SAP ODP source dataset, and an SAP ODP data replication template or SAP ODP data partitioning template. The copy activity runs on a self-hosted integration runtime that you install on an on-premises computer or on a virtual machine (VM). An on-premises computer has a line of sight to your SAP source systems and to the SLT replication server. The Data Factory data flow activity runs on a serverless Azure Databricks or Apache Spark cluster, or on an Azure integration runtime.
+To get started, create a Data Factory copy activity by using an SAP ODP linked service, an SAP ODP source dataset, and an SAP data replication template or SAP data partitioning template. The copy activity runs on a self-hosted integration runtime that you install on an on-premises computer or on a virtual machine (VM). An on-premises computer has a line of sight to your SAP source systems and to the SLT. The Data Factory data flow activity runs on a serverless Azure Databricks or Apache Spark cluster, or on an Azure integration runtime.
 
-The SAP ODP connector uses ODP to extract various data source types, including:
+The SAP CDC solution uses ODP to extract various data source types, including:
 
 - SAP extractors, originally built to extract data from ECC and load it into BW
 - ABAP CDS views, the new data extraction standard for S/4HANA
@@ -72,4 +76,4 @@ Because ODP completely decouples providers from subscribers, any SAP documentati
 
 ## Next steps
 
-[Prerequisites and setup for the solution](sap-change-data-capture-prerequisites-configuration.md)
+[Prerequisites and setup for the SAP CDC solution](sap-change-data-capture-prerequisites-configuration.md)
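
The "Data extraction needs" paragraph in the diff above says that watermarking tracks the highest value of a timestamp column and cannot see deleted records. A minimal sketch of that limitation follows. It is not Data Factory or SAP code: the `Row` type, the `modified_at` column, and the `extract_since` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    key: int
    value: str
    modified_at: int  # hypothetical, monotonically increasing timestamp column


def extract_since(rows, watermark):
    """Return rows changed after the watermark, plus the new watermark."""
    changed = [r for r in rows if r.modified_at > watermark]
    new_watermark = max((r.modified_at for r in changed), default=watermark)
    return changed, new_watermark


# Initial state of a hypothetical source table; the first run extracts everything.
source = [Row(1, "a", 10), Row(2, "b", 20), Row(3, "c", 30)]
first_batch, watermark = extract_since(source, watermark=0)

# Between runs: row 2 is updated, row 3 is deleted, row 4 is inserted.
source = [Row(1, "a", 10), Row(2, "b2", 40), Row(4, "d", 50)]
second_batch, watermark = extract_since(source, watermark)

changed_keys = [r.key for r in second_batch]
print(changed_keys)  # keys 2 and 4; the deletion of key 3 never shows up
```

The second run returns the updated and inserted rows but carries no signal that key 3 is gone, which is the gap a delta-based extraction is meant to close.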
