Commit 905ec30

Merge pull request #113764 from anfeldma-ms/perTipsForTeamApproval
Java SDK v4 Perf tips for SDK team approval
2 parents bf00536 + 659ba69 commit 905ec30

4 files changed, +514 -37 lines changed

articles/cosmos-db/TOC.yml

Lines changed: 8 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -972,10 +972,14 @@
972972
href: sql-api-query-metrics.md
973973
- name: Performance tips for .NET SDK
974974
href: performance-tips.md
975-
- name: Performance tips for Java SDK
976-
href: performance-tips-java.md
977-
- name: Performance tips for Async Java SDK
978-
href: performance-tips-async-java.md
975+
- name: Performance tips for Java SDK v4
976+
href: performance-tips-java-sdk-v4-sql.md
977+
- name: Performance tips for older Java SDKs
978+
items:
979+
- name: Async Java SDK v2
980+
href: performance-tips-async-java.md
981+
- name: Sync Java SDK v2
982+
href: performance-tips-java.md
979983
- name: Performance test - sample app
980984
href: performance-testing.md
981985
- name: Cost-effective reads and writes

articles/cosmos-db/performance-tips-async-java.md

Lines changed: 36 additions & 19 deletions
Original file line number | Diff line number | Diff line change
@@ -1,24 +1,31 @@
11
---
2-
title: Azure Cosmos DB performance tips for Async Java
3-
description: Learn client configuration options to improve Azure Cosmos database performance
4-
author: SnehaGunda
2+
title: Performance tips for Azure Cosmos DB Async Java SDK v2
3+
description: Learn client configuration options to improve Azure Cosmos database performance for Async Java SDK v2
4+
author: anfeldma-ms
55
ms.service: cosmos-db
66
ms.devlang: java
77
ms.topic: conceptual
8-
ms.date: 05/23/2019
9-
ms.author: sngun
8+
ms.date: 05/05/2020
9+
ms.author: anfeldma
1010

1111
---
1212

13-
# Performance tips for Azure Cosmos DB and Async Java
13+
# Performance tips for Azure Cosmos DB Async Java SDK v2
1414

1515
> [!div class="op_single_selector"]
16-
> * [Async Java](performance-tips-async-java.md)
17-
> * [Java](performance-tips-java.md)
16+
> * [Java SDK v4](performance-tips-java-sdk-v4-sql.md)
17+
> * [Async Java SDK v2](performance-tips-async-java.md)
18+
> * [Sync Java SDK v2](performance-tips-java.md)
1819
> * [.NET](performance-tips.md)
1920
>
2021
21-
Azure Cosmos DB is a fast and flexible distributed database that scales seamlessly with guaranteed latency and throughput. You do not have to make major architecture changes or write complex code to scale your database with Azure Cosmos DB. Scaling up and down is as easy as making a single API call or SDK method call. However, because Azure Cosmos DB is accessed via network calls there are client-side optimizations you can make to achieve peak performance when using the [SQL Async Java SDK](sql-api-sdk-async-java.md).
22+
> [!IMPORTANT]
23+
> This is *not* the latest Java SDK for Azure Cosmos DB! Consider using Azure Cosmos DB Java SDK v4 for your project. To upgrade, follow the instructions in the [Migrate to Azure Cosmos DB Java SDK v4](https://github.com/Azure-Samples/azure-cosmos-java-sql-api-samples/blob/master/migration-guide.md) guide and the [Reactor vs RxJava](https://github.com/Azure-Samples/azure-cosmos-java-sql-api-samples/blob/master/reactor-rxjava-guide.md) guide.
24+
>
25+
> The performance tips in this article are for Azure Cosmos DB Async Java SDK v2 only. See the Azure Cosmos DB Async Java SDK v2 [Release notes](sql-api-sdk-async-java.md), [Maven repository](https://mvnrepository.com/artifact/com.microsoft.azure/azure-cosmosdb), and Azure Cosmos DB Async Java SDK v2 [troubleshooting guide](troubleshoot-java-async-sdk.md) for more information.
26+
>
27+
28+
Azure Cosmos DB is a fast and flexible distributed database that scales seamlessly with guaranteed latency and throughput. You do not have to make major architecture changes or write complex code to scale your database with Azure Cosmos DB. Scaling up and down is as easy as making a single API call or SDK method call. However, because Azure Cosmos DB is accessed via network calls there are client-side optimizations you can make to achieve peak performance when using the [Azure Cosmos DB Async Java SDK v2](sql-api-sdk-async-java.md).
2229

2330
So if you're asking "How can I improve my database performance?" consider the following options:
2431

@@ -27,15 +34,17 @@ So if you're asking "How can I improve my database performance?" consider the fo
2734
* **Connection mode: Use Direct mode**
2835
<a id="direct-connection"></a>
2936

30-
How a client connects to Azure Cosmos DB has important implications on performance, especially in terms of client-side latency. The *ConnectionMode* is a key configuration setting available for configuring the client *ConnectionPolicy*. For Async Java SDK, the two available ConnectionModes are:
37+
How a client connects to Azure Cosmos DB has important implications on performance, especially in terms of client-side latency. The *ConnectionMode* is a key configuration setting available for configuring the client *ConnectionPolicy*. For Azure Cosmos DB Async Java SDK v2, the two available ConnectionModes are:
3138

3239
* [Gateway (default)](/java/api/com.microsoft.azure.cosmosdb.connectionmode)
3340
* [Direct](/java/api/com.microsoft.azure.cosmosdb.connectionmode)
3441

3542
Gateway mode is supported on all SDK platforms and it is the configured option by default. If your applications run within a corporate network with strict firewall restrictions, Gateway mode is the best choice since it uses the standard HTTPS port and a single endpoint. The performance tradeoff, however, is that Gateway mode involves an additional network hop every time data is read or written to Azure Cosmos DB. Because of this, Direct mode offers better performance due to fewer network hops.
3643

3744
The *ConnectionMode* is configured during the construction of the *DocumentClient* instance with the *ConnectionPolicy* parameter.
38-
45+
46+
### <a id="asyncjava2-connectionpolicy"></a>Async Java SDK V2 (Maven com.microsoft.azure::azure-cosmosdb)
47+
3948
```java
4049
public ConnectionPolicy getConnectionPolicy() {
4150
ConnectionPolicy policy = new ConnectionPolicy();
@@ -58,7 +67,7 @@ So if you're asking "How can I improve my database performance?" consider the fo
5867
## SDK Usage
5968
* **Install the most recent SDK**
6069

61-
The Azure Cosmos DB SDKs are constantly being improved to provide the best performance. See the [Azure Cosmos DB SDK](sql-api-sdk-async-java.md) pages to determine the most recent SDK and review improvements.
70+
The Azure Cosmos DB SDKs are constantly being improved to provide the best performance. See the Azure Cosmos DB Async Java SDK v2 [Release Notes](sql-api-sdk-async-java.md) pages to determine the most recent SDK and review improvements.
6271

6372
* **Use a singleton Azure Cosmos DB client for the lifetime of your application**
6473

@@ -68,9 +77,9 @@ So if you're asking "How can I improve my database performance?" consider the fo
6877

6978
* **Tuning ConnectionPolicy**
7079

71-
By default, Direct mode Cosmos DB requests are made over TCP when using the Async Java SDK. Internally the SDK uses a special Direct mode architecture to dynamically manage network resources and get the best performance.
80+
By default, Direct mode Cosmos DB requests are made over TCP when using the Azure Cosmos DB Async Java SDK v2. Internally the SDK uses a special Direct mode architecture to dynamically manage network resources and get the best performance.
7281

73-
In the Async Java SDK, Direct mode is the best choice to improve database performance with most workloads.
82+
In the Azure Cosmos DB Async Java SDK v2, Direct mode is the best choice to improve database performance with most workloads.
7483

7584
* ***Overview of Direct mode***
7685

@@ -103,22 +112,22 @@ So if you're asking "How can I improve my database performance?" consider the fo
103112

104113
* ***Programming tips for Direct mode***
105114

106-
Review the Azure Cosmos DB [Async Java SDK Troubleshooting](troubleshoot-java-async-sdk.md) article as a baseline for resolving any Async Java SDK issues.
115+
Review the Azure Cosmos DB Async Java SDK v2 [Troubleshooting](troubleshoot-java-async-sdk.md) article as a baseline for resolving any SDK issues.
107116

108117
Some important programming tips when using Direct mode:
109118

110119
+ **Use multithreading in your application for efficient TCP data transfer** - After making a request, your application should subscribe to receive data on another thread. Not doing so forces unintended "half-duplex" operation and the subsequent requests are blocked waiting for the previous request's reply.
111120

112121
+ **Carry out compute-intensive workloads on a dedicated thread** - For similar reasons to the previous tip, operations such as complex data processing are best placed in a separate thread. A request that pulls in data from another data store (for example if the thread utilizes Azure Cosmos DB and Spark data stores simultaneously) may experience increased latency and it is recommended to spawn an additional thread that awaits a response from the other data store.
113122

114-
+ The underlying network IO in the Async Java SDK is managed by Netty; see these [tips for avoiding coding patterns that block Netty IO threads](troubleshoot-java-async-sdk.md#invalid-coding-pattern-blocking-netty-io-thread).
123+
+ The underlying network IO in the Azure Cosmos DB Async Java SDK v2 is managed by Netty; see these [tips for avoiding coding patterns that block Netty IO threads](troubleshoot-java-async-sdk.md#invalid-coding-pattern-blocking-netty-io-thread).
115124

116125
+ **Data modeling** - The Azure Cosmos DB SLA assumes document size to be less than 1KB. Optimizing your data model and programming to favor smaller document size will generally lead to decreased latency. If you are going to need storage and retrieval of docs larger than 1KB, the recommended approach is for documents to link to data in Azure Blob Storage.
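
The threading tips in this list can be illustrated with a minimal sketch. It deliberately uses only JDK classes (`CompletableFuture`, `ExecutorService`) as a stand-in for the SDK's RxJava Observables, so it is an analogy and not the SDK's API. The names `fetchDocument`, `expensiveTransform`, `io-thread`, and `worker-thread` are hypothetical and chosen only for illustration. The idea shown is that the reply arrives on an I/O thread, while the compute-heavy step is handed to a separate pool so the I/O thread stays free for the next request.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffloadSketch {
    // Hypothetical stand-in for an SDK call whose reply is delivered on a shared I/O thread.
    static CompletableFuture<String> fetchDocument(ExecutorService ioPool, String id) {
        return CompletableFuture.supplyAsync(() -> "{\"id\":\"" + id + "\"}", ioPool);
    }

    // Hypothetical CPU-heavy step; it should never run on the I/O thread.
    static int expensiveTransform(String json) {
        int hash = 0;
        for (int round = 0; round < 1000; round++) {
            for (int i = 0; i < json.length(); i++) {
                hash = 31 * hash + json.charAt(i);
            }
        }
        return hash;
    }

    // Returns the fetched document and the name of the thread that processed it.
    static String[] run(String id) {
        ExecutorService ioPool =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "io-thread"));
        ExecutorService workerPool =
            Executors.newFixedThreadPool(2, r -> new Thread(r, "worker-thread"));
        try {
            return fetchDocument(ioPool, id)
                // thenApplyAsync with an explicit executor moves the callback off the I/O thread.
                .thenApplyAsync(json -> {
                    expensiveTransform(json);
                    return new String[] { json, Thread.currentThread().getName() };
                }, workerPool)
                .join();
        } finally {
            ioPool.shutdown();
            workerPool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] result = run("doc-1");
        System.out.println(result[0] + " processed on " + result[1]);
    }
}
```

In the Async Java SDK v2 itself, the corresponding step would be expressed with an RxJava Scheduler rather than an executor passed to `thenApplyAsync`; the sketch only shows the division of labor between the two kinds of thread.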
117126

118127

119128
* **Tuning parallel queries for partitioned collections**
120129

121-
Azure Cosmos DB SQL Async Java SDK supports parallel queries, which enable you to query a partitioned collection in parallel. For more information, see [code samples](https://github.com/Azure/azure-cosmosdb-java/tree/master/examples/src/test/java/com/microsoft/azure/cosmosdb/rx/examples) related to working with the SDKs. Parallel queries are designed to improve query latency and throughput over their serial counterpart.
130+
Azure Cosmos DB Async Java SDK v2 supports parallel queries, which enable you to query a partitioned collection in parallel. For more information, see [code samples](https://github.com/Azure/azure-cosmosdb-java/tree/master/examples/src/test/java/com/microsoft/azure/cosmosdb/rx/examples) related to working with the SDKs. Parallel queries are designed to improve query latency and throughput over their serial counterpart.
122131

123132
* ***Tuning setMaxDegreeOfParallelism\:***
124133

@@ -156,10 +165,12 @@ So if you're asking "How can I improve my database performance?" consider the fo
156165

157166
* **Use Appropriate Scheduler (Avoid stealing Event loop IO Netty threads)**
158167

159-
The Async Java SDK uses [netty](https://netty.io/) for non-blocking IO. The SDK uses a fixed number of IO netty event loop threads (as many as your machine has CPU cores) for executing IO operations. The Observable returned by the API emits the result on one of the shared IO event loop netty threads. So it is important to not block the shared IO event loop netty threads. Doing CPU-intensive work or blocking operations on the IO event loop netty thread may cause deadlock or significantly reduce SDK throughput.
168+
The Azure Cosmos DB Async Java SDK v2 uses [netty](https://netty.io/) for non-blocking IO. The SDK uses a fixed number of IO netty event loop threads (as many as your machine has CPU cores) for executing IO operations. The Observable returned by the API emits the result on one of the shared IO event loop netty threads. So it is important to not block the shared IO event loop netty threads. Doing CPU-intensive work or blocking operations on the IO event loop netty thread may cause deadlock or significantly reduce SDK throughput.
160169

161170
For example, the following code executes CPU-intensive work on the event loop IO netty thread:
162171

172+
### <a id="asyncjava2-noscheduler"></a>Async Java SDK V2 (Maven com.microsoft.azure::azure-cosmosdb)
173+
163174
```java
164175
Observable<ResourceResponse<Document>> createDocObs = asyncDocumentClient.createDocument(
165176
collectionLink, document, null, true);
@@ -176,6 +187,8 @@ So if you're asking "How can I improve my database performance?" consider the fo
176187

177188
After the result is received, if you want to do CPU-intensive work on it, avoid doing so on the event loop IO netty thread. Instead, provide your own Scheduler so that the work runs on your own thread.
178189

190+
### <a id="asyncjava2-scheduler"></a>Async Java SDK V2 (Maven com.microsoft.azure::azure-cosmosdb)
191+
179192
```java
180193
import rx.schedulers.Schedulers;
181194

@@ -196,7 +209,7 @@ So if you're asking "How can I improve my database performance?" consider the fo
196209
Based on the type of your work, use the appropriate existing RxJava Scheduler. For details, see
197210
[``Schedulers``](http://reactivex.io/RxJava/1.x/javadoc/rx/schedulers/Schedulers.html).
198211

199-
For more information, see the [GitHub page](https://github.com/Azure/azure-cosmosdb-java) for Async Java SDK.
212+
For more information, see the [GitHub page](https://github.com/Azure/azure-cosmosdb-java) for Azure Cosmos DB Async Java SDK v2.
200213

201214
* **Disable netty's logging**
202215
@@ -256,6 +269,8 @@ For other platforms (Red Hat, Windows, Mac, etc.) refer to these instructions ht
256269
257270
Azure Cosmos DB’s indexing policy allows you to specify which document paths to include or exclude from indexing by leveraging Indexing Paths (setIncludedPaths and setExcludedPaths). The use of indexing paths can offer improved write performance and lower index storage for scenarios in which the query patterns are known beforehand, as indexing costs are directly correlated to the number of unique paths indexed. For example, the following code shows how to exclude an entire section of the documents (also known as a subtree) from indexing using the "*" wildcard.
258271
272+
### <a id="asyncjava2-indexing"></a>Async Java SDK V2 (Maven com.microsoft.azure::azure-cosmosdb)
273+
259274
```Java
260275
Index numberIndex = Index.Range(DataType.Number);
261276
numberIndex.set("precision", -1);
@@ -281,6 +296,8 @@ For other platforms (Red Hat, Windows, Mac, etc.) refer to these instructions ht
281296
282297
To measure the overhead of any operation (create, update, or delete), inspect the [x-ms-request-charge](/rest/api/cosmos-db/common-cosmosdb-rest-request-headers) header to measure the number of request units consumed by these operations. You can also look at the equivalent RequestCharge property in ResourceResponse\<T> or FeedResponse\<T>.
283298
299+
### <a id="asyncjava2-requestcharge"></a>Async Java SDK V2 (Maven com.microsoft.azure::azure-cosmosdb)
300+
284301
```Java
285302
ResourceResponse<Document> response = asyncClient.createDocument(collectionLink, documentDefinition, null,
286303
false).toBlocking().single();
