
Commit 56586d9

Merge pull request #232830 from n0elleli/cdcupdate
Cdcupdate
2 parents 0eca239 + 0f8b270 commit 56586d9

File tree

2 files changed: +9 -4 lines changed


articles/data-factory/concepts-change-data-capture.md

Lines changed: 5 additions & 4 deletions
@@ -22,15 +22,15 @@ To learn more, see [Azure Data Factory overview](introduction.md) or [Azure Syna

 ## Overview

-When you perform data integration and ETL processes in the cloud, your jobs can perform much better and be more effective when you only read the source data that has changed since the last time the pipeline ran, rather than always querying an entire dataset on each run. ADF provides multiple different ways for you to easily get delta data only from the last run.
+When you perform data integration and ETL processes in the cloud, your jobs can perform better and be more effective when you read only the source data that has changed since the last time the pipeline ran, rather than querying the entire dataset on each run. ADF provides multiple ways for you to easily get only the delta data since the last run.

 ### Change Data Capture factory resource

-The easiest and quickest way to get started in data factory with CDC is through the factory level Change Data Capture resource. From the main pipeline designer, click on New under Factory Resources to create a new Change Data Capture. The CDC factory resource will provide a configuration walk-through experience where you will point to your sources and destinations, apply optional transformations, and then click start to begin your data capture. With the CDC resource, you will not need to design pipelines or data flow activities and the only billing will be 4 cores of General Purpose data flows while your data in being processed. You set a latency which ADF will use to wake-up and look for changed data. That is the only time you will be billed. The top-level CDC resource is also the ADF method of running your processes continuously. Pipelines in ADF are batch only. But the CDC resource can run continuously.
+The easiest and quickest way to get started in data factory with CDC is through the factory-level Change Data Capture resource. From the main pipeline designer, click **New** under Factory Resources to create a new Change Data Capture. The CDC factory resource provides a configuration walk-through experience where you select your sources and destinations, apply optional transformations, and then click **Start** to begin your data capture. With the CDC resource, you do not need to design pipelines or data flow activities. You are billed only for four cores of General Purpose data flows while your data is being processed. You can set a preferred latency, which ADF uses to wake up and look for changed data. That is the only time you are billed. The top-level CDC resource is also the ADF method of running your processes continuously. Pipelines in ADF are batch only, but the CDC resource can run continuously.

 ### Native change data capture in mapping data flow

-The changed data including inserted, updated and deleted rows can be automatically detected and extracted by ADF mapping data flow from the source databases. No timestamp or ID columns are required to identify the changes since it uses the native change data capture technology in the databases. By simply chaining a source transform and a sink transform reference to a database dataset in a mapping data flow, you will see the changes happened on the source database to be automatically applied to the target database, so that you can easily synchronize data between two tables. You can also add any transformations in between for any business logic to process the delta data. When defining your sink data destination, you can set insert, update, upsert, and delete operations in your sink without the need of an Alter Row transformation because ADF is able to automatically detect the row makers.
+Changed data, including inserted, updated, and deleted rows, can be automatically detected and extracted by ADF mapping data flow from the source databases. No timestamp or ID columns are required to identify the changes, because the feature uses the native change data capture technology in the databases. By simply chaining a source transform and a sink transform that reference a database dataset in a mapping data flow, the changes that happen on the source database are automatically applied to the target database, so you can easily synchronize data between two tables. You can also add any transformations in between for any business logic to process the delta data. When defining your sink data destination, you can set insert, update, upsert, and delete operations in your sink without the need for an Alter Row transformation, because ADF is able to automatically detect the row markers.

 > [!VIDEO https://www.microsoft.com/en-us/videoplayer/embed/RE5bkg2]

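The native change data capture paragraph above relies on the CDC feature built into the source database itself. As a rough illustration of what that native mechanism exposes (and of the row markers ADF reads to decide between insert, update, upsert, and delete), here is a hedged Python sketch that queries SQL Server's CDC functions directly with pyodbc; the server, database, table, and capture-instance names are hypothetical, and CDC is assumed to be already enabled on the table.

```python
# Illustrative only: reading SQL Server's native CDC feed directly with pyodbc.
# Assumes CDC was already enabled, for example:
#   EXEC sys.sp_cdc_enable_db;
#   EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'Orders', @role_name = NULL;
# The connection string and the dbo.Orders / dbo_Orders names are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=mydb;UID=myuser;PWD=mypassword"
)
cursor = conn.cursor()

# Get the full range of log sequence numbers (LSNs) captured so far for this table.
cursor.execute("SELECT sys.fn_cdc_get_min_lsn('dbo_Orders'), sys.fn_cdc_get_max_lsn()")
from_lsn, to_lsn = cursor.fetchone()

# Every returned row carries a __$operation marker (1 = delete, 2 = insert, 3/4 = update),
# the kind of row marker ADF detects so no Alter Row transformation is needed in the sink.
cursor.execute(
    "SELECT __$operation, * FROM cdc.fn_cdc_get_all_changes_dbo_Orders(?, ?, N'all')",
    from_lsn,
    to_lsn,
)
for row in cursor.fetchall():
    operation = row[0]            # 1 = delete, 2 = insert, 3/4 = update
    print(operation, list(row)[1:])
```
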
@@ -40,6 +40,7 @@ The changed data including inserted, updated and deleted rows can be automatical
 - [SQL Server](connector-sql-server.md)
 - [Azure SQL Managed Instance](connector-azure-sql-managed-instance.md)
 - [Azure Cosmos DB (SQL API)](connector-azure-cosmos-db.md)
+- [Azure Cosmos DB analytical store](../cosmos-db/analytical-store-introduction.md)

 ### Auto incremental extraction in mapping data flow

@@ -71,7 +72,7 @@ You can always build your own delta data extraction pipeline for all ADF support

 **Change files capture from file based storages**

-- When you want to load data from Azure Blob Storage, Azure Data Lake Storage Gen2 or Azure Data Lake Storage Gen1, mapping data flow provides you the opportunity to get new or updated files only by simple one click. It is the simplest and recommended way for you to achieve delta load from these file based storages in mapping data flow.
+- When you want to load data from Azure Blob Storage, Azure Data Lake Storage Gen2, or Azure Data Lake Storage Gen1, mapping data flow lets you get only new or updated files with a single click. It is the simplest and recommended way to achieve delta load from these file-based storages in mapping data flow.
 - You can get more [best practices](https://techcommunity.microsoft.com/t5/azure-data-factory-blog/best-practices-of-how-to-use-adf-copy-activity-to-copy-new-files/ba-p/1532484).

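The bullet above describes the mapping data flow option that picks up only new or updated files from file-based storage. Purely as an illustrative sketch of the underlying idea (comparing each blob's last-modified time against a watermark from the previous run), here is a small Python example using the azure-storage-blob SDK; the connection string, container name, path prefix, and watermark value are hypothetical placeholders.

```python
# Illustrative sketch of incremental file pickup from Azure Blob Storage:
# list only blobs modified since the last successful run, which is the idea
# behind the "new or updated files only" option. Names below are hypothetical.
from datetime import datetime, timezone
from azure.storage.blob import ContainerClient

# Watermark from the previous run (would normally be persisted somewhere durable).
last_run = datetime(2023, 3, 1, tzinfo=timezone.utc)

container = ContainerClient.from_connection_string(
    conn_str="DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;",
    container_name="raw-data",
)

# Keep only files created or modified after the watermark.
changed_files = [
    blob.name
    for blob in container.list_blobs(name_starts_with="sales/")
    if blob.last_modified > last_run
]
print(f"{len(changed_files)} new or updated files to load:", changed_files)
```
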

articles/data-factory/connector-azure-cosmos-db.md

Lines changed: 4 additions & 0 deletions
@@ -533,6 +533,10 @@ When you debug the pipeline, this feature works the same. Be aware that the chec

 In the monitoring section, you always have the chance to rerun a pipeline. When you are doing so, the changed data is always captured from the previous checkpoint of your selected pipeline run.

+In addition, Azure Cosmos DB analytical store now supports Change Data Capture (CDC) for Azure Cosmos DB API for NoSQL and Azure Cosmos DB API for MongoDB (public preview). Azure Cosmos DB analytical store allows you to efficiently consume a continuous and incremental feed of changed (inserted, updated, and deleted) data from the analytical store.
+
+
+
 ## Next steps

 For a list of data stores that Copy Activity supports as sources and sinks, see [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats).
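The paragraph added above announces CDC on the Azure Cosmos DB analytical store, which is consumed through ADF and Synapse mapping data flows rather than application code. As a loosely related illustration of the change-feed idea only, the following Python sketch reads the transactional change feed with the azure-cosmos SDK; this is a different mechanism from analytical store CDC, and the account endpoint, key, database, and container names are hypothetical.

```python
# Illustration of the change-feed concept using the azure-cosmos SDK's transactional
# change feed. This is NOT the analytical store CDC described above; it only shows
# what an incremental feed of changed items looks like. All names are hypothetical.
from azure.cosmos import CosmosClient

client = CosmosClient(
    url="https://myaccount.documents.azure.com:443/",
    credential="<account-key>",
)
container = client.get_database_client("mydb").get_container_client("orders")

# Iterate every change recorded in the container's change feed from the beginning.
for item in container.query_items_change_feed(is_start_from_beginning=True):
    print(item["id"])
```
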
