You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Azure Cosmos DB is a fast and flexible distributed database that scales seamlessly with guaranteed latency and throughput. You do not have to make major architecture changes or write complex code to scale your database with Azure Cosmos DB. Scaling up and down is as easy as making a single API call or SDK method call. However, because Azure Cosmos DB is accessed via network calls there are client-side optimizations you can make to achieve peak performance when using the [SQL Async Java SDK](sql-api-sdk-async-java.md).
22
+
> [!IMPORTANT]
23
+
> This is *not* the latest Java SDK for Azure Cosmos DB! Consider using Azure Cosmos DB Java SDK v4 for your project. To upgrade, follow the instructions in the [Migrate to Azure Cosmos DB Java SDK v4](https://github.com/Azure-Samples/azure-cosmos-java-sql-api-samples/blob/master/migration-guide.md) guide and the [Reactor vs RxJava](https://github.com/Azure-Samples/azure-cosmos-java-sql-api-samples/blob/master/reactor-rxjava-guide.md) guide.
24
+
>
25
+
> The performance tips in this article are for Azure Cosmos DB Async Java SDK v2 only. See the Azure Cosmos DB Async Java SDK v2 [Release notes](sql-api-sdk-async-java.md), [Maven repository](https://mvnrepository.com/artifact/com.microsoft.azure/azure-cosmosdb), and Azure Cosmos DB Async Java SDK v2 [troubleshooting guide](troubleshoot-java-async-sdk.md) for more information.
26
+
>
27
+
28
+
Azure Cosmos DB is a fast and flexible distributed database that scales seamlessly with guaranteed latency and throughput. You do not have to make major architecture changes or write complex code to scale your database with Azure Cosmos DB. Scaling up and down is as easy as making a single API call or SDK method call. However, because Azure Cosmos DB is accessed via network calls there are client-side optimizations you can make to achieve peak performance when using the [Azure Cosmos DB Async Java SDK v2](sql-api-sdk-async-java.md).
22
29
23
30
So if you're asking "How can I improve my database performance?" consider the following options:
24
31
@@ -27,15 +34,17 @@ So if you're asking "How can I improve my database performance?" consider the fo
27
34
***Connection mode: Use Direct mode**
28
35
<aid="direct-connection"></a>
29
36
30
-
How a client connects to Azure Cosmos DB has important implications on performance, especially in terms of client-side latency. The *ConnectionMode* is a key configuration setting available for configuring the client *ConnectionPolicy*. For Async Java SDK, the two available ConnectionModes are:
37
+
How a client connects to Azure Cosmos DB has important implications on performance, especially in terms of client-side latency. The *ConnectionMode* is a key configuration setting available for configuring the client *ConnectionPolicy*. For Azure Cosmos DB Async Java SDK v2, the two available ConnectionModes are:
Gateway mode is supported on all SDK platforms and it is the configured option by default. If your applications run within a corporate network with strict firewall restrictions, Gateway mode is the best choice since it uses the standard HTTPS port and a single endpoint. The performance tradeoff, however, is that Gateway mode involves an additional network hop every time data is read or written to Azure Cosmos DB. Because of this, Direct mode offers better performance due to fewer network hops.
36
43
37
44
The *ConnectionMode* is configured during the construction of the *DocumentClient* instance with the *ConnectionPolicy* parameter.
@@ -58,7 +67,7 @@ So if you're asking "How can I improve my database performance?" consider the fo
58
67
## SDK Usage
59
68
***Install the most recent SDK**
60
69
61
-
The Azure Cosmos DB SDKs are constantly being improved to provide the best performance. See the [Azure Cosmos DB SDK](sql-api-sdk-async-java.md) pages to determine the most recent SDK and review improvements.
70
+
The Azure Cosmos DB SDKs are constantly being improved to provide the best performance. See the Azure Cosmos DB Async Java SDK v2 [Release Notes](sql-api-sdk-async-java.md) pages to determine the most recent SDK and review improvements.
62
71
63
72
***Use a singleton Azure Cosmos DB client for the lifetime of your application**
64
73
@@ -68,9 +77,9 @@ So if you're asking "How can I improve my database performance?" consider the fo
68
77
69
78
***Tuning ConnectionPolicy**
70
79
71
-
By default, Direct mode Cosmos DB requests are made over TCP when using the Async Java SDK. Internally the SDK uses a special Direct mode architecture to dynamically manage network resources and get the best performance.
80
+
By default, Direct mode Cosmos DB requests are made over TCP when using the Azure Cosmos DB Async Java SDK v2. Internally the SDK uses a special Direct mode architecture to dynamically manage network resources and get the best performance.
72
81
73
-
In the Async Java SDK, Direct mode is the best choice to improve database performance with most workloads.
82
+
In the Azure Cosmos DB Async Java SDK v2, Direct mode is the best choice to improve database performance with most workloads.
74
83
75
84
****Overview of Direct mode***
76
85
@@ -103,22 +112,22 @@ So if you're asking "How can I improve my database performance?" consider the fo
103
112
104
113
****Programming tips for Direct mode***
105
114
106
-
Review the Azure Cosmos DB [Async Java SDK Troubleshooting](troubleshoot-java-async-sdk.md) article as a baseline for resolving any Async Java SDK issues.
115
+
Review the Azure Cosmos DB Async Java SDK v2 [Troubleshooting](troubleshoot-java-async-sdk.md) article as a baseline for resolving any SDK issues.
107
116
108
117
Some important programming tips when using Direct mode:
109
118
110
119
+**Use multithreading in your application for efficient TCP data transfer** - After making a request, your application should subscribe to receive data on another thread. Not doing so forces unintended "half-duplex" operation and the subsequent requests are blocked waiting for the previous request's reply.
111
120
112
121
+**Carry out compute-intensive workloads on a dedicated thread** - For similar reasons to the previous tip, operations such as complex data processing are best placed in a separate thread. A request that pulls in data from another data store (for example if the thread utilizes Azure Cosmos DB and Spark data stores simultaneously) may experience increased latency and it is recommended to spawn an additional thread that awaits a response from the other data store.
113
122
114
-
+ The underlying network IO in the Async Java SDK is managed by Netty, see these [tips for avoiding coding patterns that block Netty IO threads](troubleshoot-java-async-sdk.md#invalid-coding-pattern-blocking-netty-io-thread).
123
+
+ The underlying network IO in the Azure Cosmos DB Async Java SDK v2 is managed by Netty, see these [tips for avoiding coding patterns that block Netty IO threads](troubleshoot-java-async-sdk.md#invalid-coding-pattern-blocking-netty-io-thread).
115
124
116
125
+**Data modeling** - The Azure Cosmos DB SLA assumes document size to be less than 1KB. Optimizing your data model and programming to favor smaller document size will generally lead to decreased latency. If you are going to need storage and retrieval of docs larger than 1KB, the recommended approach is for documents to link to data in Azure Blob Storage.
117
126
118
127
119
128
***Tuning parallel queries for partitioned collections**
120
129
121
-
Azure Cosmos DB SQL Async Java SDK supports parallel queries, which enable you to query a partitioned collection in parallel. For more information, see [code samples](https://github.com/Azure/azure-cosmosdb-java/tree/master/examples/src/test/java/com/microsoft/azure/cosmosdb/rx/examples) related to working with the SDKs. Parallel queries are designed to improve query latency and throughput over their serial counterpart.
130
+
Azure Cosmos DB Async Java SDK v2 supports parallel queries, which enable you to query a partitioned collection in parallel. For more information, see [code samples](https://github.com/Azure/azure-cosmosdb-java/tree/master/examples/src/test/java/com/microsoft/azure/cosmosdb/rx/examples) related to working with the SDKs. Parallel queries are designed to improve query latency and throughput over their serial counterpart.
122
131
123
132
****Tuning setMaxDegreeOfParallelism\:***
124
133
@@ -156,10 +165,12 @@ So if you're asking "How can I improve my database performance?" consider the fo
The Async Java SDK uses [netty](https://netty.io/) for non-blocking IO. The SDK uses a fixed number of IO netty event loop threads (as many CPU cores your machine has) for executing IO operations. The Observable returned by API emits the result on one of the shared IO event loop netty threads. So it is important to not block the shared IO event loop netty threads. Doing CPU intensive work or blocking operation on the IO event loop netty thread may cause deadlock or significantly reduce SDK throughput.
168
+
The Azure Cosmos DB Async Java SDK v2 uses [netty](https://netty.io/) for non-blocking IO. The SDK uses a fixed number of IO netty event loop threads (as many CPU cores your machine has) for executing IO operations. The Observable returned by API emits the result on one of the shared IO event loop netty threads. So it is important to not block the shared IO event loop netty threads. Doing CPU intensive work or blocking operation on the IO event loop netty thread may cause deadlock or significantly reduce SDK throughput.
160
169
161
170
For example the following code executes a cpu intensive work on the event loop IO netty thread:
@@ -176,6 +187,8 @@ So if you're asking "How can I improve my database performance?" consider the fo
176
187
177
188
After result is received if you want to doCPU intensive work on the result you should avoid doing so on event loop IO netty thread. You can instead provide your own Scheduler to provide your own thread for running your work.
ForMoreInformation, Please look at the [GitHub page](https://github.com/Azure/azure-cosmosdb-java) for Async Java SDK.
212
+
ForMoreInformation, Please look at the [GitHub page](https://github.com/Azure/azure-cosmosdb-java) for Azure Cosmos DB Async Java SDK v2.
200
213
201
214
***Disable netty's logging**
202
215
@@ -256,6 +269,8 @@ For other platforms (Red Hat, Windows, Mac, etc.) refer to these instructions ht
256
269
257
270
Azure Cosmos DB’s indexing policy allows you to specify which document paths to include or exclude from indexing by leveraging Indexing Paths (setIncludedPaths and setExcludedPaths). The use of indexing paths can offer improved write performance and lower index storage for scenarios in which the query patterns are known beforehand, as indexing costs are directly correlated to the number of unique paths indexed. For example, the following code shows how to exclude an entire section of the documents (also known as a subtree) from indexing using the "*" wildcard.
@@ -281,6 +296,8 @@ For other platforms (Red Hat, Windows, Mac, etc.) refer to these instructions ht
281
296
282
297
To measure the overhead of any operation (create, update, or delete), inspect the [x-ms-request-charge](/rest/api/cosmos-db/common-cosmosdb-rest-request-headers) header to measure the number of request units consumed by these operations. You can also look at the equivalent RequestCharge property in ResourceResponse\<T> or FeedResponse\<T>.
0 commit comments