
Commit 0e46e7b

Merge pull request #98655 from linda33wj/cosmos-db-tips
Update Cosmos DB connector content
2 parents 8ad340a + 62ac05a

2 files changed: 28 additions, 22 deletions

articles/data-factory/connector-azure-cosmos-db-mongodb-api.md

Lines changed: 10 additions & 7 deletions
@@ -169,6 +169,9 @@ The following properties are supported in the Copy Activity **sink** section:
 | writeBatchSize | The **writeBatchSize** property controls the size of documents to write in each batch. You can try increasing the value for **writeBatchSize** to improve performance, and decreasing the value if your document size is large. | No<br />(the default is **10,000**) |
 | writeBatchTimeout | The wait time for the batch insert operation to finish before it times out. The allowed value is timespan. | No<br/>(the default is **00:30:00** - 30 minutes) |
 
+>[!TIP]
+>To import JSON documents as-is, refer to the [Import and export JSON documents](#import-and-export-json-documents) section; to copy from tabular-shaped data, refer to [Schema mapping](#schema-mapping).
+
 **Example**
 
 ```json
@@ -201,18 +204,18 @@ The following properties are supported in the Copy Activity **sink** section:
 ]
 ```
 
->[!TIP]
->To import JSON documents as-is, refer to [Import or export JSON documents](#import-or-export-json-documents) section; to copy from tabular-shaped data, refer to [Schema mapping](#schema-mapping).
-
-## Import or export JSON documents
+## Import and export JSON documents
 
 You can use this Azure Cosmos DB connector to easily:
 
-* Import JSON documents from various sources to Azure Cosmos DB, including from Azure Blob storage, Azure Data Lake Store, and other file-based stores that Azure Data Factory supports.
-* Export JSON documents from an Azure Cosmos DB collection to various file-based stores.
 * Copy documents between two Azure Cosmos DB collections as-is.
+* Import JSON documents from various sources to Azure Cosmos DB, including from MongoDB, Azure Blob storage, Azure Data Lake Store, and other file-based stores that Azure Data Factory supports.
+* Export JSON documents from an Azure Cosmos DB collection to various file-based stores.
+
+To achieve schema-agnostic copy:
 
-To achieve such schema-agnostic copy, skip the "structure" (also called *schema*) section in dataset and schema mapping in copy activity.
+* When you use the Copy Data tool, select the **Export as-is to JSON files or Cosmos DB collection** option.
+* When you use activity authoring, choose JSON format with the corresponding file store for source or sink.
 
 ## Schema mapping
 
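For orientation, here is a minimal sketch of a copy activity that uses the sink properties this diff documents; the activity name, source type, and property values are illustrative, not taken from the commit:

```json
{
    "name": "CopyToCosmosDbMongoDbApi",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": {
            "type": "CosmosDbMongoDbApiSink",
            "writeBehavior": "upsert",
            "writeBatchSize": 10000,
            "writeBatchTimeout": "00:30:00"
        }
    }
}
```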
articles/data-factory/connector-azure-cosmos-db.md

Lines changed: 18 additions & 15 deletions
@@ -10,7 +10,7 @@ ms.service: multiple
 ms.workload: data-services
 ms.topic: conceptual
 ms.custom: seo-lt-2019
-ms.date: 11/13/2019
+ms.date: 12/11/2019
 ---
 
 # Copy and transform data in Azure Cosmos DB (SQL API) by using Azure Data Factory
@@ -36,7 +36,7 @@ For Copy activity, this Azure Cosmos DB (SQL API) connector supports:
 
 - Copy data from and to the Azure Cosmos DB [SQL API](https://docs.microsoft.com/azure/cosmos-db/documentdb-introduction).
 - Write to Azure Cosmos DB as **insert** or **upsert**.
-- Import and export JSON documents as-is, or copy data from or to a tabular dataset. Examples include a SQL database and a CSV file. To copy documents as-is to or from JSON files or to or from another Azure Cosmos DB collection, see Import or export JSON documents.
+- Import and export JSON documents as-is, or copy data from or to a tabular dataset. Examples include a SQL database and a CSV file. To copy documents as-is to or from JSON files or to or from another Azure Cosmos DB collection, see [Import and export JSON documents](#import-and-export-json-documents).
 
 Data Factory integrates with the [Azure Cosmos DB bulk executor library](https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started) to provide the best performance when you write to Azure Cosmos DB.
 
@@ -141,19 +141,9 @@ If you use "DocumentDbCollection" type dataset, it is still supported as-is for
 }
 ```
 
-### Schema by Data Factory
-
-For schema-free data stores like Azure Cosmos DB, Copy Activity infers the schema in one of the ways described in the following list. Unless you want to [import or export JSON documents as-is](#import-or-export-json-documents), the best practice is to specify the structure of data in the **structure** section.
-
-Data Factory honors the mapping you specified on the activity. If a row doesn't contain a value for a column, a null value is provided for the column value.
-
-If you don't specify a mapping, the Data Factory service infers the schema by using the first row in the data. If the first row doesn't contain the full schema, some columns will be missing in the result of the activity operation.
-
 ## Copy Activity properties
 
-This section provides a list of properties that the Azure Cosmos DB (SQL API) source and sink support.
-
-For a full list of sections and properties that are available for defining activities, see [Pipelines](concepts-pipelines-activities.md).
+This section provides a list of properties that the Azure Cosmos DB (SQL API) source and sink support. For a full list of sections and properties that are available for defining activities, see [Pipelines](concepts-pipelines-activities.md).
 
 ### Azure Cosmos DB (SQL API) as source
 
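As context for the source section this hunk touches, a minimal sketch of a source of this type in a copy activity; the query is illustrative, not part of the commit:

```json
{
    "source": {
        "type": "DocumentDbCollectionSource",
        "query": "SELECT c.id, c.name, c.address FROM c"
    }
}
```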
@@ -205,6 +195,8 @@ If you use "DocumentDbCollectionSource" type source, it is still supported as-is
 ]
 ```
 
+When copying data from Cosmos DB, unless you want to [export JSON documents as-is](#import-and-export-json-documents), the best practice is to specify the mapping in the copy activity. Data Factory honors the mapping you specified on the activity - if a row doesn't contain a value for a column, a null value is provided for the column value. If you don't specify a mapping, Data Factory infers the schema by using the first row in the data. If the first row doesn't contain the full schema, some columns will be missing in the result of the activity operation.
+
 ### Azure Cosmos DB (SQL API) as sink
 
 To copy data to Azure Cosmos DB (SQL API), set the **sink** type in Copy Activity to **DocumentDbCollectionSink**.
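To illustrate what "specify the mapping" can look like in practice, a hedged sketch of an explicit mapping on a copy activity; it follows the ADF TabularTranslator pattern, and the paths and column names are invented for the example:

```json
{
    "translator": {
        "type": "TabularTranslator",
        "mappings": [
            { "source": { "path": "$['id']" }, "sink": { "name": "Id" } },
            { "source": { "path": "$['name']" }, "sink": { "name": "Name" } }
        ]
    }
}
```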
@@ -218,6 +210,9 @@ The following properties are supported in the Copy Activity **source** section:
 | writeBatchSize | Data Factory uses the [Azure Cosmos DB bulk executor library](https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started) to write data to Azure Cosmos DB. The **writeBatchSize** property controls the size of documents that ADF provides to the library. You can try increasing the value for **writeBatchSize** to improve performance, and decreasing the value if your document size is large - see the tips below. | No<br />(the default is **10,000**) |
 | disableMetricsCollection | Data Factory collects metrics such as Cosmos DB RUs for copy performance optimization and recommendations. If you are concerned with this behavior, specify `true` to turn it off. | No (default is `false`) |
 
+>[!TIP]
+>To import JSON documents as-is, refer to the [Import and export JSON documents](#import-and-export-json-documents) section; to copy from tabular-shaped data, refer to [Migrate from relational database to Cosmos DB](#migrate-from-relational-database-to-cosmos-db).
+
 >[!TIP]
 >Cosmos DB limits a single request's size to 2 MB. The formula is Request Size = Single Document Size * Write Batch Size. If you hit an error saying **"Request size is too large."**, **reduce the `writeBatchSize` value** in the copy sink configuration.
 
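As a worked instance of that formula (numbers illustrative): documents of roughly 1 KB at the default **writeBatchSize** of 10,000 imply a request of about 10 MB, well over the 2 MB limit, so a batch size below 2,000 would be needed. A minimal sink sketch with a reduced batch size:

```json
{
    "sink": {
        "type": "DocumentDbCollectionSink",
        "writeBehavior": "upsert",
        "writeBatchSize": 1000
    }
}
```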
@@ -255,6 +250,10 @@ If you use "DocumentDbCollectionSink" type source, it is still supported as-is f
 ]
 ```
 
+### Schema mapping
+
+To copy data from Azure Cosmos DB to a tabular sink or the reverse, refer to [schema mapping](copy-activity-schema-and-type-mapping.md#schema-mapping).
+
 ## Mapping data flow properties
 
 Learn details from [source transformation](data-flow-source.md) and [sink transformation](data-flow-sink.md) in mapping data flow.
@@ -263,19 +262,23 @@ Learn details from [source transformation](data-flow-source.md) and [sink transf
 
 To learn details about the properties, check [Lookup activity](control-flow-lookup-activity.md).
 
-## Import or export JSON documents
+## Import and export JSON documents
 
 You can use this Azure Cosmos DB (SQL API) connector to easily:
 
+* Copy documents between two Azure Cosmos DB collections as-is.
 * Import JSON documents from various sources to Azure Cosmos DB, including from Azure Blob storage, Azure Data Lake Store, and other file-based stores that Azure Data Factory supports.
 * Export JSON documents from an Azure Cosmos DB collection to various file-based stores.
-* Copy documents between two Azure Cosmos DB collections as-is.
 
 To achieve schema-agnostic copy:
 
 * When you use the Copy Data tool, select the **Export as-is to JSON files or Cosmos DB collection** option.
 * When you use activity authoring, choose JSON format with the corresponding file store for source or sink.
 
+## Migrate from relational database to Cosmos DB
+
+When migrating from a relational database, e.g. SQL Server, to Azure Cosmos DB, copy activity can easily map tabular data from the source to flattened JSON documents in Cosmos DB. In some cases, you may want to redesign the data model to optimize it for NoSQL use cases according to [Data modeling in Azure Cosmos DB](../cosmos-db/modeling-data.md), for example, to denormalize the data by embedding all of the related sub-items within one JSON document. For such cases, refer to [this blog post](https://medium.com/@ArsenVlad/denormalizing-via-embedding-when-copying-data-from-sql-to-cosmos-db-649a649ae0fb) with a walkthrough on how to achieve it using Azure Data Factory copy activity.
+
 ## Next steps
 
 For a list of data stores that Copy Activity supports as sources and sinks in Azure Data Factory, see [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats).
