Commit 25c4c47

Merge pull request #230513 from TimShererWithAquent/us2036619k
Freshness Pass User Story: 2036619 Azure Cosmos DB bulk executor library overview
2 parents f2a30cf + a2683e3 commit 25c4c47

1 file changed

+33
-27
lines changed

articles/cosmos-db/bulk-executor-overview.md

Lines changed: 33 additions & 27 deletions
@@ -5,50 +5,56 @@ author: abinav2307
55
ms.service: cosmos-db
66
ms.custom: ignite-2022
77
ms.topic: overview
8-
ms.date: 05/28/2019
8+
ms.date: 3/30/2023
99
ms.author: abramees
1010
ms.reviewer: mjbrown
1111
---
1212

1313
# Azure Cosmos DB bulk executor library overview
1414
[!INCLUDE[NoSQL](includes/appliesto-nosql.md)]
15-
16-
Azure Cosmos DB is a fast, flexible, and globally distributed database service that is designed to elastically scale out to support:
1715

18-
* Large read and write throughput (millions of operations per second).
19-
* Storing high volumes of (hundreds of terabytes, or even more) transactional and operational data with predictable millisecond latency.
16+
Azure Cosmos DB is a fast, flexible, and globally distributed database service that elastically scales out to support:
2017

21-
The bulk executor library helps you leverage this massive throughput and storage. The bulk executor library allows you to perform bulk operations in Azure Cosmos DB through bulk import and bulk update APIs. You can read more about the features of bulk executor library in the following sections.
18+
* Large read and write throughput, on the order of millions of operations per second.
19+
* Storing high volumes of transactional and operational data, on the order of hundreds of terabytes or even more, with predictable millisecond latency.
2220

23-
> [!NOTE]
24-
> Currently, bulk executor library supports import and update operations and this library is supported by Azure Cosmos DB API for NoSQL and Gremlin accounts only.
21+
The bulk executor library helps you use this massive throughput and storage. It lets you perform bulk operations in Azure Cosmos DB through bulk import and bulk update APIs. The following sections describe the features of the bulk executor library.
22+
23+
> [!NOTE]
24+
> Currently, the bulk executor library supports import and update operations. The library is supported only by Azure Cosmos DB API for NoSQL and API for Gremlin accounts.
2525
2626
> [!IMPORTANT]
27-
> The bulk executor library is not currently supported on [serverless](serverless.md) accounts. On .NET, it is recommended to use the [bulk support](https://devblogs.microsoft.com/cosmosdb/introducing-bulk-support-in-the-net-sdk/) available in the V3 version of the SDK.
28-
27+
> The bulk executor library is not currently supported on [serverless](serverless.md) accounts. On .NET, we recommend that you use the [bulk support](https://devblogs.microsoft.com/cosmosdb/introducing-bulk-support-in-the-net-sdk/) available in the V3 version of the SDK.
28+
2929
## Key features of the bulk executor library
30-
31-
* It significantly reduces the client-side compute resources needed to saturate the throughput allocated to a container. A single threaded application that writes data using the bulk import API achieves 10 times greater write throughput when compared to a multi-threaded application that writes data in parallel while saturating the client machine's CPU.
3230

33-
* It abstracts away the tedious tasks of writing application logic to handle rate limiting of request, request timeouts, and other transient exceptions by efficiently handling them within the library.
31+
* Using the bulk executor library significantly reduces the client-side compute resources needed to saturate the throughput allocated to a container. A single-threaded application that writes data using the bulk import API achieves 10 times greater write throughput than a multi-threaded application that writes data in parallel while it saturates the client machine's CPU.
32+
33+
* The bulk executor library abstracts away the tedious tasks of writing application logic to handle rate limiting of requests, request timeouts, and other transient exceptions. It efficiently handles them within the library.
34+
35+
* It provides a simplified mechanism for applications to perform bulk operations to scale out. A single bulk executor instance that runs on an Azure virtual machine can consume greater than 500 K RU/s. You can achieve a higher throughput rate by adding more instances on individual client virtual machines.
36+
37+
* The bulk executor library can bulk import more than a terabyte of data within an hour by using a scale-out architecture.
38+
39+
* It can bulk update existing data in Azure Cosmos DB containers as patches.
40+
41+
## How does the bulk executor operate?
42+
43+
When a bulk operation to import or update documents is triggered with a batch of entities, they're initially shuffled into buckets that correspond to their Azure Cosmos DB partition key range. Within each bucket that corresponds to a partition key range, they're broken down into mini-batches.
44+
45+
Each mini-batch acts as a payload that is committed on the server side. The bulk executor library has built-in optimizations for concurrent execution of the mini-batches both within and across partition key ranges.
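
The bucketing and mini-batching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the library's implementation: `range_for_key`, `bucket_into_mini_batches`, the MD5-based hash, and the fixed `mini_batch_size` are all hypothetical stand-ins, and the real service maps partition keys to partition key ranges with its own hashing scheme.

```python
import hashlib
from collections import defaultdict


def range_for_key(partition_key: str, num_ranges: int) -> int:
    """Hypothetical mapping from a partition key to a partition key range id.

    Uses a stable hash so the same key always lands in the same range.
    The real service uses its own hashing and range boundaries.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_ranges


def bucket_into_mini_batches(docs, key_field, num_ranges, mini_batch_size):
    """Group documents by partition key range, then split each bucket
    into mini-batches of at most mini_batch_size documents."""
    buckets = defaultdict(list)
    for doc in docs:
        range_id = range_for_key(str(doc[key_field]), num_ranges)
        buckets[range_id].append(doc)

    return {
        range_id: [
            items[i:i + mini_batch_size]
            for i in range(0, len(items), mini_batch_size)
        ]
        for range_id, items in buckets.items()
    }


if __name__ == "__main__":
    sample = [{"id": i, "pk": f"tenant-{i % 3}"} for i in range(10)]
    for range_id, batches in bucket_into_mini_batches(sample, "pk", 4, 2).items():
        print(range_id, [len(batch) for batch in batches])
```

Each mini-batch in the result could then be sent as one payload, with the mini-batches of different ranges dispatched concurrently.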
3446

35-
* It provides a simplified mechanism for applications performing bulk operations to scale out. A single bulk executor instance running on an Azure VM can consume greater than 500K RU/s and you can achieve a higher throughput rate by adding additional instances on individual client VMs.
36-
37-
* It can bulk import more than a terabyte of data within an hour by using a scale-out architecture.
47+
The following diagram illustrates how bulk executor batches data into different partition keys:
3848

39-
* It can bulk update existing data in Azure Cosmos DB containers as patches.
40-
41-
## How does the bulk executor operate?
49+
:::image type="content" source="./media/bulk-executor-overview/bulk-executor-architecture.png" alt-text="Diagram shows bulk executor architecture.":::
4250

43-
When a bulk operation to import or update documents is triggered with a batch of entities, they are initially shuffled into buckets corresponding to their Azure Cosmos DB partition key range. Within each bucket that corresponds to a partition key range, they are broken down into mini-batches and each mini-batch act as a payload that is committed on the server-side. The bulk executor library has built in optimizations for concurrent execution of these mini-batches both within and across partition key ranges. Following image illustrates how bulk executor batches data into different partition keys:
51+
The bulk executor library makes sure to maximally utilize the throughput allocated to a collection. It uses an [AIMD-style congestion control mechanism](https://tools.ietf.org/html/rfc5681) for each Azure Cosmos DB partition key range to efficiently handle rate limiting and timeouts.
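
As a rough illustration of additive-increase/multiplicative-decrease (AIMD), the following Python sketch adjusts a degree of concurrency for a single partition key range. It is not the library's code: the class name `AimdController`, its parameters, and the default values are assumptions chosen for the example. The idea is that each successful mini-batch raises concurrency by a small constant, while a throttled or timed-out request cuts it by a fixed factor.

```python
class AimdController:
    """Hypothetical per-partition-key-range concurrency controller."""

    def __init__(self, initial=1, minimum=1, maximum=64,
                 increase=1, decrease_factor=0.5):
        self.minimum = minimum
        self.maximum = maximum
        self.increase = increase
        self.decrease_factor = decrease_factor
        self.degree = initial  # current number of concurrent mini-batches

    def on_success(self):
        """Additive increase: probe for spare throughput one step at a time."""
        self.degree = min(self.maximum, self.degree + self.increase)

    def on_throttle(self):
        """Multiplicative decrease: back off quickly after rate limiting
        or a timeout, but never drop below the minimum."""
        self.degree = max(self.minimum, int(self.degree * self.decrease_factor))


if __name__ == "__main__":
    controller = AimdController(initial=4, maximum=8)
    for outcome in ["ok", "ok", "throttled", "ok"]:
        if outcome == "ok":
            controller.on_success()
        else:
            controller.on_throttle()
        print(outcome, controller.degree)
```

Keeping one controller per partition key range, as the text describes, lets a hot range back off without slowing down the others.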
4452

45-
:::image type="content" source="./media/bulk-executor-overview/bulk-executor-architecture.png" alt-text="Bulk executor architecture" :::
53+
For more information about sample applications that consume the bulk executor library, see [Use the bulk executor .NET library to perform bulk operations in Azure Cosmos DB](nosql/bulk-executor-dotnet.md) and [Perform bulk operations on Azure Cosmos DB data](bulk-executor-java.md).
4654

47-
The bulk executor library makes sure to maximally utilize the throughput allocated to a collection. It uses an [AIMD-style congestion control mechanism](https://tools.ietf.org/html/rfc5681) for each Azure Cosmos DB partition key range to efficiently handle rate limiting and timeouts.
55+
For reference information, see [.NET bulk executor library](nosql/sdk-dotnet-bulk-executor-v2.md) and [Java bulk executor library](nosql/sdk-java-bulk-executor-v2.md).
4856

49-
## Next Steps
57+
## Next steps
5058

51-
* Learn more by trying out the sample applications consuming the bulk executor library in [.NET](nosql/bulk-executor-dotnet.md) and [Java](bulk-executor-java.md).
52-
* Check out the bulk executor SDK information and release notes in [.NET](nosql/sdk-dotnet-bulk-executor-v2.md) and [Java](nosql/sdk-java-bulk-executor-v2.md).
53-
* The bulk executor library is integrated into the Azure Cosmos DB Spark connector, to learn more, see [Azure Cosmos DB Spark connector](./nosql/quickstart-spark.md) article.
54-
* The bulk executor library is also integrated into a new version of [Azure Cosmos DB connector](../data-factory/connector-azure-cosmos-db.md) for Azure Data Factory to copy data.
59+
* [Azure Cosmos DB Spark connector](./nosql/quickstart-spark.md)
60+
* [Azure Cosmos DB connector](../data-factory/connector-azure-cosmos-db.md)
