---
title: Copy and transform data in Azure Cosmos DB analytical store
titleSuffix: Azure Data Factory & Azure Synapse
description: Learn how to transform data in Azure Cosmos DB analytical store using Azure Data Factory and Azure Synapse Analytics.
ms.author: noelleli
author: n0elleli
ms.service: data-factory
ms.subservice: data-movement
ms.topic: conceptual
ms.custom:
ms.date: 03/31/2023
---

# Copy and transform data in Azure Cosmos DB analytical store by using Azure Data Factory

> [!div class="op_single_selector" title1="Select the version of Data Factory service you are using:"]
> * [Current version](connector-azure-cosmos-analytical-store.md)

[!INCLUDE[appliesto-adf-asa-md](includes/appliesto-adf-asa-md.md)]

This article outlines how to use Data Flow to transform data in Azure Cosmos DB analytical store. To learn more, read the introductory articles for [Azure Data Factory](introduction.md) and [Azure Synapse Analytics](../synapse-analytics/overview-what-is.md).

>[!NOTE]
>The Azure Cosmos DB analytical store connector supports [change data capture](concepts-change-data-capture.md) for Azure Cosmos DB API for NoSQL and Azure Cosmos DB API for MongoDB, currently in public preview.

## Supported capabilities

This Azure Cosmos DB for NoSQL connector is supported for the following capabilities:

| Supported capabilities|IR | Managed private endpoint|
|---------| --------| --------|
|[Mapping data flow](concepts-data-flow-overview.md) (source/sink)|&#9312; ||

<small>*&#9312; Azure integration runtime &#9313; Self-hosted integration runtime*</small>
## Mapping data flow properties

When transforming data in mapping data flows, you can read from and write to collections in Azure Cosmos DB. For more information, see the [source transformation](data-flow-source.md) and [sink transformation](data-flow-sink.md) in mapping data flows.

> [!NOTE]
> The Azure Cosmos DB analytical store is found under the [Azure Cosmos DB for NoSQL](connector-azure-cosmos-db.md) dataset type.
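
For reference, a dataset that points at the analytical store uses the Azure Cosmos DB for NoSQL dataset type. A sketch of the dataset JSON follows; the linked service reference and collection name are placeholders, and your authoring experience may generate additional properties:

```json
{
    "name": "CosmosDbAnalyticalDataset",
    "properties": {
        "type": "CosmosDbSqlApiCollection",
        "linkedServiceName": {
            "referenceName": "<Azure Cosmos DB linked service name>",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "collectionName": "<collection name>"
        }
    }
}
```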
### Source transformation

Settings specific to Azure Cosmos DB are available on the **Source Options** tab of the source transformation.

**Include system columns:** If true, ```id```, ```_ts```, and other system columns are included in your data flow metadata from Azure Cosmos DB. When updating collections, include these columns so that you can grab the existing row ID.

**Page size:** The number of documents per page of the query result. The default is "-1", which uses the service's dynamic page size, up to 1,000.

**Throughput:** Set an optional value for the number of RUs you'd like to apply to your Azure Cosmos DB collection for each execution of this data flow during the read operation. The minimum is 400.

**Preferred regions:** Choose the preferred read regions for this process.

**Change feed:** If true, you get data from the [Azure Cosmos DB change feed](../cosmos-db/change-feed.md), a persistent record of changes to a container in the order they occur, picked up automatically from the last run. When you set this option to true, don't set both **Infer drifted column types** and **Allow schema drift** to true at the same time. For more details, see [Azure Cosmos DB change feed](#azure-cosmos-db-change-feed).

**Start from beginning:** If true, the first run loads a full snapshot of the data, and subsequent runs capture changed data. If false, the first run skips the initial load, and subsequent runs capture changed data. This setting is aligned with the setting of the same name in the [Azure Cosmos DB reference](https://github.com/Azure/azure-cosmosdb-spark/wiki/Configuration-references#reading-cosmosdb-collection-change-feed). For more details, see [Azure Cosmos DB change feed](#azure-cosmos-db-change-feed).
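
The source options above surface in the data flow script behind the designer. The following is a sketch only: the property names and values shown are illustrative assumptions, and the script your data flow designer generates is the authoritative form:

```
source(allowSchemaDrift: true,
    validateSchema: false,
    format: 'document',
    systemColumns: true,
    throughput: 400) ~> CosmosDbSource
```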
### Sink transformation

Settings specific to Azure Cosmos DB are available on the **Settings** tab of the sink transformation.

**Update method:** Determines which operations are allowed on your database destination. The default is to allow only inserts. To update, upsert, or delete rows, an alter-row transformation is required to tag rows for those actions. For updates, upserts, and deletes, a key column or columns must be set to determine which row to alter.

**Collection action:** Determines whether to recreate the destination collection prior to writing.
* None: No action is taken on the collection.
* Recreate: The collection is dropped and recreated.

**Batch size**: An integer that represents how many objects are written to the Azure Cosmos DB collection in each batch. Usually, starting with the default batch size is sufficient. To further tune this value, note:

- Azure Cosmos DB limits a single request's size to 2 MB. The formula is "Request Size = Single Document Size * Batch Size". If you hit an error saying "Request size is too large", reduce the batch size value.
- The larger the batch size, the better the throughput the service can achieve, but make sure you allocate enough RUs to power your workload.
**Partition key:** Enter a string that represents the partition key for your collection. Example: ```/movies/title```

**Throughput:** Set an optional value for the number of RUs you'd like to apply to your Azure Cosmos DB collection for each execution of this data flow. The minimum is 400.

**Write throughput budget:** An integer that represents the RUs you want to allocate for this Data Flow write operation, out of the total throughput allocated to the collection.

## Next steps

Get started with [change data capture in Azure Cosmos DB analytical store](../cosmos-db/get-started-change-data-capture.md).
