**articles/data-factory/connector-azure-cosmos-db-mongodb-api.md** (10 additions, 7 deletions)
The following properties are supported in the Copy Activity **sink** section:

| Property | Description | Required |
|:--- |:--- |:--- |
| writeBatchSize | The **writeBatchSize** property controls the number of documents to write in each batch. You can try increasing the value for **writeBatchSize** to improve performance, or decreasing it if your documents are large. | No<br/>(the default is **10,000**) |
| writeBatchTimeout | The wait time for the batch insert operation to finish before it times out. The allowed value is a timespan. | No<br/>(the default is **00:30:00**, that is, 30 minutes) |

>[!TIP]
>To import JSON documents as-is, refer to the [Import and export JSON documents](#import-and-export-json-documents) section; to copy from tabular-shaped data, refer to [Schema mapping](#schema-mapping).

**Example**

```json
...
]
```
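As a sketch of how the sink properties above fit together in a copy activity (the activity name, `upsert` write behavior, and property values here are illustrative assumptions, not taken from the article's own example):

```json
{
    "name": "CopyToCosmosDbMongoDb",
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "CosmosDbMongoDbApiSource"
        },
        "sink": {
            "type": "CosmosDbMongoDbApiSink",
            "writeBehavior": "upsert",
            "writeBatchSize": 5000,
            "writeBatchTimeout": "00:15:00"
        }
    }
}
```

Here the batch size is halved from the default and the timeout shortened, the kind of tuning the table suggests when documents are large.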
## Import and export JSON documents

You can use this Azure Cosmos DB connector to easily:

* Copy documents between two Azure Cosmos DB collections as-is.
* Import JSON documents from various sources to Azure Cosmos DB, including from MongoDB, Azure Blob storage, Azure Data Lake Store, and other file-based stores that Azure Data Factory supports.
* Export JSON documents from an Azure Cosmos DB collection to various file-based stores.

To achieve schema-agnostic copy:

* When you use the Copy Data tool, select the **Export as-is to JSON files or Cosmos DB collection** option.
* When you use activity authoring, choose JSON format with the corresponding file store for source or sink.
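In activity JSON terms, a schema-agnostic copy between two collections amounts to omitting any dataset schema and any `translator` (mapping) section, as in this hypothetical sketch (dataset names are placeholders):

```json
{
    "name": "CopyCollectionAsIs",
    "type": "Copy",
    "inputs": [ { "referenceName": "SourceCollection", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "SinkCollection", "type": "DatasetReference" } ],
    "typeProperties": {
        "source": { "type": "CosmosDbMongoDbApiSource" },
        "sink": { "type": "CosmosDbMongoDbApiSink" }
    }
}
```

Because no mapping is specified, each document flows through unchanged, whatever its shape.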
**articles/data-factory/connector-azure-cosmos-db.md** (18 additions, 15 deletions)
ms.service: multiple
ms.workload: data-services
ms.topic: conceptual
ms.custom: seo-lt-2019
ms.date: 12/11/2019
---

# Copy and transform data in Azure Cosmos DB (SQL API) by using Azure Data Factory
For Copy activity, this Azure Cosmos DB (SQL API) connector supports:

- Copy data from and to the Azure Cosmos DB [SQL API](https://docs.microsoft.com/azure/cosmos-db/documentdb-introduction).
- Write to Azure Cosmos DB as **insert** or **upsert**.
- Import and export JSON documents as-is, or copy data from or to a tabular dataset. Examples include a SQL database and a CSV file. To copy documents as-is to or from JSON files or to or from another Azure Cosmos DB collection, see [Import and export JSON documents](#import-and-export-json-documents).

Data Factory integrates with the [Azure Cosmos DB bulk executor library](https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started) to provide the best performance when you write to Azure Cosmos DB.
```json
...
}
```

## Copy Activity properties

This section provides a list of properties that the Azure Cosmos DB (SQL API) source and sink support. For a full list of sections and properties that are available for defining activities, see [Pipelines](concepts-pipelines-activities.md).

### Azure Cosmos DB (SQL API) as source
```json
...
]
```

When copying data from Cosmos DB, unless you want to [export JSON documents as-is](#import-and-export-json-documents), the best practice is to specify the mapping in the copy activity. Data Factory honors the mapping you specify on the activity: if a row doesn't contain a value for a column, a null value is provided for the column value. If you don't specify a mapping, Data Factory infers the schema by using the first row in the data. If the first row doesn't contain the full schema, some columns will be missing in the result of the activity operation.
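An explicit mapping can be pinned with a `translator` section in the copy activity, for example (a sketch; the document paths and column names are invented for illustration):

```json
"translator": {
    "type": "TabularTranslator",
    "mappings": [
        { "source": { "path": "$.id" }, "sink": { "name": "Id" } },
        { "source": { "path": "$.profile.name" }, "sink": { "name": "CustomerName" } }
    ]
}
```

With such a mapping in place, documents missing `profile.name` simply yield a null `CustomerName` instead of dropping the column.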
### Azure Cosmos DB (SQL API) as sink

To copy data to Azure Cosmos DB (SQL API), set the **sink** type in Copy Activity to **DocumentDbCollectionSink**.
The following properties are supported in the Copy Activity **sink** section:

| Property | Description | Required |
|:--- |:--- |:--- |
| writeBatchSize | Data Factory uses the [Azure Cosmos DB bulk executor library](https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started) to write data to Azure Cosmos DB. The **writeBatchSize** property controls the number of documents that ADF provides to the library in each batch. You can try increasing the value for **writeBatchSize** to improve performance, or decreasing it if your documents are large; see the tips below. | No<br/>(the default is **10,000**) |
| disableMetricsCollection | Data Factory collects metrics such as Cosmos DB RUs for copy performance optimization and recommendations. If you are concerned with this behavior, specify `true` to turn it off. | No<br/>(the default is `false`) |

>[!TIP]
>To import JSON documents as-is, refer to the [Import and export JSON documents](#import-and-export-json-documents) section; to copy from tabular-shaped data, refer to [Migrate from relational database to Cosmos DB](#migrate-from-relational-database-to-cosmos-db).

>[!TIP]
>Cosmos DB limits a single request's size to 2 MB. The formula is: request size = single document size * write batch size. If you hit an error that says **"Request size is too large."**, **reduce the `writeBatchSize` value** in the copy sink configuration.
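To apply the formula: if your documents average about 10 KB, the default batch of 10,000 implies a request of roughly 100 MB, far above the 2-MB limit, while a batch of 100 keeps each request near 1 MB. A sink configured that way might look like the following sketch (the `upsert` write behavior and the batch value are illustrative assumptions):

```json
"sink": {
    "type": "DocumentDbCollectionSink",
    "writeBehavior": "upsert",
    "writeBatchSize": 100
}
```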
```json
...
]
```

### Schema mapping

To copy data from Azure Cosmos DB to a tabular sink, or the reverse, refer to [schema mapping](copy-activity-schema-and-type-mapping.md#schema-mapping).

## Mapping data flow properties

Learn details from [source transformation](data-flow-source.md) and [sink transformation](data-flow-sink.md) in mapping data flow.
To learn details about the properties, check [Lookup activity](control-flow-lookup-activity.md).

## Import and export JSON documents

You can use this Azure Cosmos DB (SQL API) connector to easily:

* Copy documents between two Azure Cosmos DB collections as-is.
* Import JSON documents from various sources to Azure Cosmos DB, including from Azure Blob storage, Azure Data Lake Store, and other file-based stores that Azure Data Factory supports.
* Export JSON documents from an Azure Cosmos DB collection to various file-based stores.

To achieve schema-agnostic copy:

* When you use the Copy Data tool, select the **Export as-is to JSON files or Cosmos DB collection** option.
* When you use activity authoring, choose JSON format with the corresponding file store for source or sink.

## Migrate from relational database to Cosmos DB

When migrating from a relational database, for example SQL Server, to Azure Cosmos DB, the copy activity can easily map tabular data from the source to flattened JSON documents in Cosmos DB. In some cases, you may want to redesign the data model to optimize it for NoSQL use cases, according to [Data modeling in Azure Cosmos DB](../cosmos-db/modeling-data.md); for example, you can denormalize the data by embedding all of the related sub-items within one JSON document. For such cases, refer to [this blog post](https://medium.com/@ArsenVlad/denormalizing-via-embedding-when-copying-data-from-sql-to-cosmos-db-649a649ae0fb) for a walkthrough of how to achieve this by using the Azure Data Factory copy activity.
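The straightforward flattening case can be pictured concretely: a SQL row with columns such as `OrderId`, `CustomerName`, and `Total` (invented names for illustration) becomes one flat JSON document per row in Cosmos DB, with no mapping redesign required:

```json
{
    "OrderId": 1042,
    "CustomerName": "Contoso",
    "Total": 99.5
}
```

The embedded, denormalized redesign described in the blog post above instead reshapes the source query so that related sub-items arrive nested inside each document.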
## Next steps

For a list of data stores that Copy Activity supports as sources and sinks in Azure Data Factory, see [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats).