`articles/data-factory/connector-azure-cosmos-db-mongodb-api.md` (5 additions, 5 deletions)
@@ -169,6 +169,9 @@ The following properties are supported in the Copy Activity **sink** section:
| writeBatchSize | The **writeBatchSize** property controls the size of documents to write in each batch. You can try increasing the value for **writeBatchSize** to improve performance, or decreasing it if your documents are large. |No<br />(the default is **10,000**) |
| writeBatchTimeout | The wait time for the batch insert operation to finish before it times out. The allowed value is a timespan. | No<br/>(the default is **00:30:00** - 30 minutes) |

>[!TIP]
>To import JSON documents as-is, refer to the [Import and export JSON documents](#import-and-export-json-documents) section; to copy from tabular-shaped data, refer to [Schema mapping](#schema-mapping).
**Example**
```json
@@ -201,16 +204,13 @@ The following properties are supported in the Copy Activity **sink** section:
]
```

## Import and export JSON documents
You can use this Azure Cosmos DB connector to easily:
* Copy documents between two Azure Cosmos DB collections as-is.
* Import JSON documents from various sources to Azure Cosmos DB, including from Azure Blob storage, Azure Data Lake Store, and other file-based stores that Azure Data Factory supports.
* Export JSON documents from an Azure Cosmos DB collection to various file-based stores.
To achieve such schema-agnostic copy, skip the "structure" (also called *schema*) section in the dataset, and skip schema mapping in the copy activity.
`articles/data-factory/connector-azure-cosmos-db.md` (16 additions, 11 deletions)
@@ -10,7 +10,7 @@ ms.service: multiple
ms.workload: data-services
ms.topic: conceptual
ms.custom: seo-lt-2019
ms.date: 12/11/2019
---
# Copy and transform data in Azure Cosmos DB (SQL API) by using Azure Data Factory
@@ -141,14 +141,6 @@ If you use "DocumentDbCollection" type dataset, it is still supported as-is for
}
```
## Copy Activity properties
This section provides a list of properties that the Azure Cosmos DB (SQL API) source and sink support.
@@ -205,6 +197,8 @@ If you use "DocumentDbCollectionSource" type source, it is still supported as-is
]
```
When copying data from Cosmos DB, unless you want to [export JSON documents as-is](#import-and-export-json-documents), the best practice is to specify the mapping in the copy activity. Data Factory honors the mapping you specify on the activity; if a row doesn't contain a value for a column, a null value is provided for the column value. If you don't specify a mapping, Data Factory infers the schema by using the first row of the data. If the first row doesn't contain the full schema, some columns will be missing from the result of the activity operation.
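The first-row inference pitfall can be illustrated with a small sketch. The helper name is hypothetical and this is not Data Factory code, only a model of the behavior described above:

```python
def infer_and_project(rows):
    """Sketch of first-row schema inference: the column set comes from the
    first row only; later rows are projected onto it, missing values become
    None, and columns that appear only in later rows are dropped."""
    if not rows:
        return []
    columns = list(rows[0])  # schema taken from the first row
    return [{c: row.get(c) for c in columns} for row in rows]
```

A document that only the second row carries (`extra` below) is silently lost, which is why specifying an explicit mapping is the safer choice.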
### Azure Cosmos DB (SQL API) as sink
To copy data to Azure Cosmos DB (SQL API), set the **sink** type in Copy Activity to **DocumentDbCollectionSink**.
@@ -218,6 +212,9 @@ The following properties are supported in the Copy Activity **sink** section:
| writeBatchSize | Data Factory uses the [Azure Cosmos DB bulk executor library](https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started) to write data to Azure Cosmos DB. The **writeBatchSize** property controls the size of documents that ADF provides to the library. You can try increasing the value for **writeBatchSize** to improve performance, or decreasing it if your documents are large; see the tips below. |No<br />(the default is **10,000**) |
| disableMetricsCollection | Data Factory collects metrics such as Cosmos DB RUs for copy performance optimization and recommendations. If you are concerned with this behavior, specify `true` to turn it off. | No (default is `false`) |
>[!TIP]
>To import JSON documents as-is, refer to the [Import and export JSON documents](#import-and-export-json-documents) section; to copy from tabular-shaped data, refer to [Migrate from relational database to Cosmos DB](#migrate-from-relational-database-to-cosmos-db).
>[!TIP]
>Cosmos DB limits a single request's size to 2 MB. The formula is Request Size = Single Document Size * Write Batch Size. If you hit an error saying **"Request size is too large."**, **reduce the `writeBatchSize` value** in the copy sink configuration.
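The tip's formula can be turned into a quick back-of-the-envelope check. The helper name is hypothetical, and 2 MB is taken as 2 * 1024 * 1024 bytes (an assumption; the service may count the limit differently):

```python
def max_write_batch_size(avg_document_bytes,
                         request_limit_bytes=2 * 1024 * 1024):
    """Largest writeBatchSize that keeps
    request size = single document size * write batch size
    under the request-size limit (illustrative arithmetic only)."""
    return max(1, request_limit_bytes // avg_document_bytes)
```

For example, with 4 KB documents this suggests a batch size of at most 512, well below the 10,000 default.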
@@ -255,6 +252,10 @@ If you use "DocumentDbCollectionSink" type source, it is still supported as-is f
]
```
### Schema mapping
To copy data from Azure Cosmos DB to a tabular sink, or the reverse, refer to [schema mapping](copy-activity-schema-and-type-mapping.md#schema-mapping).
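Conceptually, mapping a nested JSON document onto tabular columns amounts to flattening it into dotted paths. A minimal sketch, with a hypothetical helper name and not the schema-mapping implementation:

```python
def flatten(document, prefix="", sep="."):
    """Flatten a nested JSON-like dict into one row whose column names are
    dotted paths, e.g. {"name": {"first": "Ada"}} -> {"name.first": "Ada"}."""
    row = {}
    for key, value in document.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, path, sep))  # recurse into sub-objects
        else:
            row[path] = value
    return row
```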
## Mapping data flow properties
Learn details from [source transformation](data-flow-source.md) and [sink transformation](data-flow-sink.md) in mapping data flow.
@@ -263,19 +264,23 @@ Learn details from [source transformation](data-flow-source.md) and [sink transf
To learn details about the properties, check [Lookup activity](control-flow-lookup-activity.md).

## Import and export JSON documents
You can use this Azure Cosmos DB (SQL API) connector to easily:
* Copy documents between two Azure Cosmos DB collections as-is.
* Import JSON documents from various sources to Azure Cosmos DB, including from Azure Blob storage, Azure Data Lake Store, and other file-based stores that Azure Data Factory supports.
* Export JSON documents from an Azure Cosmos DB collection to various file-based stores.
To achieve schema-agnostic copy:
* When you use the Copy Data tool, select the **Export as-is to JSON files or Cosmos DB collection** option.
* When you use activity authoring, choose JSON format with the corresponding file store for source or sink.
## Migrate from relational database to Cosmos DB
When migrating from a relational database such as SQL Server to Azure Cosmos DB, copy activity can easily map tabular data from the source to flattened JSON documents in Cosmos DB. In some cases, you may want to redesign the data model to optimize it for NoSQL use cases according to [Data modeling in Azure Cosmos DB](../cosmos-db/modeling-data.md), for example, to denormalize the data by embedding all of the related sub-items within one JSON document. In that case, refer to [this blog post](https://medium.com/@ArsenVlad/denormalizing-via-embedding-when-copying-data-from-sql-to-cosmos-db-649a649ae0fb) for a walkthrough on how to achieve it with the Azure Data Factory copy activity.
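The embedding idea can be sketched in a few lines. The table names, column names, and helper below are hypothetical, invented purely to illustrate denormalization, not taken from the linked walkthrough:

```python
from collections import defaultdict

def embed_order_lines(orders, order_lines):
    """Denormalize two relational row sets by embedding each order's line
    items inside one JSON-style document per order (illustrative only)."""
    lines_by_order = defaultdict(list)
    for line in order_lines:
        # group child rows under their parent key, dropping the foreign key
        lines_by_order[line["order_id"]].append(
            {k: v for k, v in line.items() if k != "order_id"})
    return [{**order, "lines": lines_by_order.get(order["id"], [])}
            for order in orders]
```

Each resulting document then lands in Cosmos DB as a single self-contained item instead of rows spread across two tables.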
## Next steps
For a list of data stores that Copy Activity supports as sources and sinks in Azure Data Factory, see [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats).