Commit 89c982a

authored
Merge pull request #89266 from ealsur/users/ealsur/perftipsv3
Cosmos DB - NET Performance tips
2 parents d87c119 + 7374dbf commit 89c982a

File tree

1 file changed

+45
-22
lines changed

articles/cosmos-db/performance-tips.md

Lines changed: 45 additions & 22 deletions
Original file line number | Diff line number | Diff line change
@@ -28,18 +28,17 @@ So if you're asking "How can I improve my database performance?" consider the fo
2828

2929
How a client connects to Azure Cosmos DB has important implications on performance, especially in terms of observed client-side latency. There are two key configuration settings available for configuring client Connection Policy – the connection *mode* and the connection *protocol*. The two available modes are:
3030

31-
* Gateway Mode (default)
31+
* Gateway mode
3232

33-
Gateway Mode is supported on all SDK platforms and is the configured default. If your application runs within a corporate network with strict firewall restrictions, Gateway Mode is the best choice since it uses the standard HTTPS port and a single endpoint. The performance tradeoff, however, is that Gateway Mode involves an additional network hop every time data is read or written to Azure Cosmos DB. Because of this, Direct Mode offers better performance due to fewer network hops. Gateway connection mode is also recommended when you run applications in environments with limited number of socket connections, for example when using Azure Functions or if you are on a consumption plan.
33+
Gateway mode is supported on all SDK platforms and is the configured default for [Microsoft.Azure.DocumentDB SDK](sql-api-sdk-dotnet.md). If your application runs within a corporate network with strict firewall restrictions, gateway mode is the best choice since it uses the standard HTTPS port and a single endpoint. The performance tradeoff, however, is that gateway mode involves an additional network hop every time data is read or written to Azure Cosmos DB. Because of this, Direct Mode offers better performance due to fewer network hops. Gateway connection mode is also recommended when you run applications in environments with a limited number of socket connections.
3434

35-
* Direct Mode
35+
When using the SDK in Azure Functions, particularly in [consumption plan](../azure-functions/functions-scale.md#consumption-plan), be mindful of the current [limits in connections](../azure-functions/manage-connections.md). In that case, gateway mode might be recommended if you are also working with other HTTP-based clients within your Azure Functions application.
3636

37-
Direct mode supports connectivity through TCP and HTTPS protocols. If you are using the latest version of .NET SDK, direct connectivity mode is supported in .NET Standard 2.0 and .NET framework. When using Direct Mode, there are two protocol options available:
37+
* Direct mode
3838

39-
* TCP
40-
* HTTPS
39+
Direct mode supports connectivity through TCP and HTTPS protocols and is the default connectivity mode if you are using [Microsoft.Azure.Cosmos/.NET V3 SDK](sql-api-sdk-dotnet-standard.md).
4140

42-
When using Gateway mode, Cosmos DB uses port 443 and ports 10250, 10255 and 10256 when using Azure Cosmos DB's API for MongoDB. The 10250 port maps to a default MongoDB instance without geo-replication and 10255/10256 ports map to the MongoDB instance with geo-replication functionality. When using TCP in Direct Mode, in addition to the Gateway ports, you need to ensure the port range between 10000 and 20000 is open because Azure Cosmos DB uses dynamic TCP ports. If these ports are not open and you attempt to use TCP, you receive a 503 Service Unavailable error. The following table shows connectivity modes available for different APIs and the service ports user for each API:
41+
When using gateway mode, Cosmos DB uses port 443 and ports 10250, 10255 and 10256 when using Azure Cosmos DB's API for MongoDB. The 10250 port maps to a default MongoDB instance without geo-replication and 10255/10256 ports map to the MongoDB instance with geo-replication functionality. When using TCP in Direct Mode, in addition to the Gateway ports, you need to ensure the port range between 10000 and 20000 is open because Azure Cosmos DB uses dynamic TCP ports. If these ports are not open and you attempt to use TCP, you receive a 503 Service Unavailable error. The following table shows connectivity modes available for different APIs and the service ports used for each API:
4342

4443
|Connection mode |Supported protocol |Supported SDKs |API/Service port |
4544
|---------|---------|---------|---------|
@@ -49,7 +48,20 @@ So if you're asking "How can I improve my database performance?" consider the fo
4948

5049
Azure Cosmos DB offers a simple and open RESTful programming model over HTTPS. Additionally, it offers an efficient TCP protocol, which is also RESTful in its communication model and is available through the .NET client SDK. Both Direct TCP and HTTPS use SSL for initial authentication and encrypting traffic. For best performance, use the TCP protocol when possible.
5150

52-
The Connectivity Mode is configured during the construction of the DocumentClient instance with the ConnectionPolicy parameter. If Direct Mode is used, the Protocol can also be set within the ConnectionPolicy parameter.
51+
For SDK V3, the connectivity mode is configured while creating the CosmosClient instance, as part of the CosmosClientOptions.
52+
53+
```csharp
54+
var serviceEndpoint = "https://contoso.documents.net"; // the V3 CosmosClient constructor takes the endpoint as a string
55+
var authKey = "your authKey from the Azure portal";
56+
CosmosClient client = new CosmosClient(serviceEndpoint, authKey,
57+
new CosmosClientOptions
58+
{
59+
ConnectionMode = ConnectionMode.Direct,
60+
61+
});
62+
```
63+
64+
For the Microsoft.Azure.DocumentDB SDK, the connectivity mode is configured during the construction of the DocumentClient instance with the ConnectionPolicy parameter. If Direct Mode is used, the Protocol can also be set within the ConnectionPolicy parameter.
5365

5466
```csharp
5567
var serviceEndpoint = new Uri("https://contoso.documents.net");
@@ -62,15 +74,19 @@ So if you're asking "How can I improve my database performance?" consider the fo
6274
});
6375
```
6476

65-
Because TCP is only supported in Direct Mode, if Gateway Mode is used, then the HTTPS protocol is always used to communicate with the Gateway and the Protocol value in the ConnectionPolicy is ignored.
77+
Because TCP is only supported in Direct Mode, if gateway mode is used, then the HTTPS protocol is always used to communicate with the Gateway and the Protocol value in the ConnectionPolicy is ignored.
6678

6779
![Illustration of the Azure Cosmos DB connection policy](./media/performance-tips/connection-policy.png)
6880

6981
2. **Call OpenAsync to avoid startup latency on first request**
7082

71-
By default, the first request has a higher latency because it has to fetch the address routing table. To avoid this startup latency on the first request, you should call OpenAsync() once during initialization as follows.
83+
By default, the first request has a higher latency because it has to fetch the address routing table. When using the [SDK V2](sql-api-sdk-dotnet.md), to avoid this startup latency on the first request, you should call OpenAsync() once during initialization as follows.
7284

7385
await client.OpenAsync();
86+
87+
> [!NOTE]
88+
> The OpenAsync method generates requests to obtain the address routing table for all the containers in the account. For accounts that have many containers but whose application accesses only a subset of them, OpenAsync generates an unnecessary amount of traffic that makes initialization slow. In this scenario, calling OpenAsync might not be useful because it slows down application startup.
89+
7490
<a id="same-region"></a>
7591
3. **Collocate clients in same Azure region for performance**
7692

@@ -91,15 +107,22 @@ So if you're asking "How can I improve my database performance?" consider the fo
91107
1. **Install the most recent SDK**
92108

93109
The Azure Cosmos DB SDKs are constantly being improved to provide the best performance. See the [Azure Cosmos DB SDK](sql-api-sdk-dotnet-standard.md) pages to determine the most recent SDK and review improvements.
94-
2. **Use a singleton Azure Cosmos DB client for the lifetime of your application**
95110

96-
Each DocumentClient instance is thread-safe and performs efficient connection management and address caching when operating in Direct Mode. To allow efficient connection management and better performance by DocumentClient, it is recommended to use a single instance of DocumentClient per AppDomain for the lifetime of the application.
111+
2. **Use Stream APIs**
112+
113+
The [.NET SDK V3](sql-api-sdk-dotnet-standard.md) contains stream APIs that can receive and return data without serializing.
114+
115+
Middle-tier applications that don't consume the responses from the SDK directly but relay them to other application tiers can benefit from the stream APIs. See the [Item management](https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos.Samples/Usage/ItemManagement) samples for examples on stream handling.
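As a rough illustration of the relay scenario described above, the following sketch copies the raw payload of a single item to an output stream without deserializing it. It assumes the V3 SDK (`Microsoft.Azure.Cosmos`) and an already-created `Container`; the `StreamRelay` class and `RelayItemAsync` method are hypothetical names introduced only for this example.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical helper for illustration; not part of the SDK.
public static class StreamRelay
{
    // Copies the raw JSON payload of one item to `destination`
    // without deserializing it into a .NET type.
    public static async Task RelayItemAsync(
        Container container, string id, string partitionKeyValue, Stream destination)
    {
        using (ResponseMessage response =
            await container.ReadItemStreamAsync(id, new PartitionKey(partitionKeyValue)))
        {
            // Throws if the service returned a non-success status code.
            response.EnsureSuccessStatusCode();
            await response.Content.CopyToAsync(destination);
        }
    }
}
```

A middle tier could pass its HTTP response body stream as `destination`, so the item is forwarded as-is.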
116+
117+
3. **Use a singleton Azure Cosmos DB client for the lifetime of your application**
118+
119+
Each DocumentClient and CosmosClient instance is thread-safe and performs efficient connection management and address caching when operating in direct mode. To allow efficient connection management and better performance by the SDK client, it is recommended to use a single instance per AppDomain for the lifetime of the application.
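One possible way to keep a single client per AppDomain is a lazily initialized static instance. This is a minimal sketch assuming the V3 SDK; the `CosmosClientHolder` class and the `COSMOS_ENDPOINT` and `COSMOS_KEY` environment variable names are hypothetical and chosen only for illustration.

```csharp
using System;
using Microsoft.Azure.Cosmos;

// Hypothetical holder class; the class name and environment variable
// names are assumptions, not part of the SDK.
public static class CosmosClientHolder
{
    // Lazy<T> is thread-safe by default, so the client is constructed at most once.
    private static readonly Lazy<CosmosClient> LazyClient = new Lazy<CosmosClient>(() =>
        new CosmosClient(
            Environment.GetEnvironmentVariable("COSMOS_ENDPOINT"),
            Environment.GetEnvironmentVariable("COSMOS_KEY")));

    // Every caller shares the same instance for the lifetime of the application.
    public static CosmosClient Instance => LazyClient.Value;
}
```

Applications that use dependency injection could instead register the client with a singleton lifetime; the idea is the same.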
97120

98121
<a id="max-connection"></a>
99-
3. **Increase System.Net MaxConnections per host when using Gateway mode**
122+
4. **Increase System.Net MaxConnections per host when using Gateway mode**
100123

101124
Azure Cosmos DB requests are made over HTTPS/REST when using Gateway mode, and are subjected to the default connection limit per hostname or IP address. You may need to set the MaxConnections to a higher value (100-1000) so that the client library can utilize multiple simultaneous connections to Azure Cosmos DB. In the .NET SDK 1.8.0 and above, the default value for [ServicePointManager.DefaultConnectionLimit](https://msdn.microsoft.com/library/system.net.servicepointmanager.defaultconnectionlimit.aspx) is 50 and to change the value, you can set the [Documents.Client.ConnectionPolicy.MaxConnectionLimit](https://msdn.microsoft.com/library/azure/microsoft.azure.documents.client.connectionpolicy.maxconnectionlimit.aspx) to a higher value.
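The setting described above might be applied as follows when constructing a `DocumentClient` in gateway mode. This is a sketch under stated assumptions: the `GatewayClientFactory` class is a hypothetical name, and the limit of 1000 is only an illustrative value from the range mentioned above, not a recommendation for every workload.

```csharp
using System;
using Microsoft.Azure.Documents.Client;

// Hypothetical factory for illustration; not part of the SDK.
public static class GatewayClientFactory
{
    public static DocumentClient Create(Uri serviceEndpoint, string authKey)
    {
        var policy = new ConnectionPolicy
        {
            ConnectionMode = ConnectionMode.Gateway,
            // Illustrative value; tune it for your own workload.
            MaxConnectionLimit = 1000
        };
        return new DocumentClient(serviceEndpoint, authKey, policy);
    }
}
```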
102-
4. **Tuning parallel queries for partitioned collections**
125+
5. **Tuning parallel queries for partitioned collections**
103126

104127
SQL .NET SDK version 1.9.0 and above support parallel queries, which enable you to query a partitioned collection in parallel. For more information, see [code samples](https://github.com/Azure/azure-documentdb-dotnet/blob/master/samples/code-samples/Queries/Program.cs) related to working with the SDKs. Parallel queries are designed to improve query latency and throughput over their serial counterpart. Parallel queries provide two parameters that users can tune to custom-fit their requirements, (a) MaxDegreeOfParallelism: to control the maximum number of partitions that can be queried in parallel, and (b) MaxBufferedItemCount: to control the number of pre-fetched results.
105128
@@ -112,10 +135,10 @@ So if you're asking "How can I improve my database performance?" consider the fo
112135
Parallel query is designed to pre-fetch results while the current batch of results is being processed by the client. The pre-fetching helps in overall latency improvement of a query. MaxBufferedItemCount is the parameter to limit the number of pre-fetched results. Setting MaxBufferedItemCount to the expected number of results returned (or a higher number) allows the query to receive maximum benefit from pre-fetching.
113136

114137
Pre-fetching works the same way irrespective of the MaxDegreeOfParallelism, and there is a single buffer for the data from all partitions.
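To make the two tuning parameters concrete, here is a hedged sketch of a cross-partition query with the Microsoft.Azure.DocumentDB SDK. The `ParallelQuerySample` class, the query text, and the values 4 and 1000 are assumptions chosen for illustration; suitable values depend on your partition count and expected result size.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

// Hypothetical sample class; values below are illustrative only.
public static class ParallelQuerySample
{
    public static async Task<List<dynamic>> QueryAsync(
        DocumentClient client, string databaseId, string collectionId)
    {
        var options = new FeedOptions
        {
            EnableCrossPartitionQuery = true,
            // Upper bound on the number of partitions queried in parallel.
            MaxDegreeOfParallelism = 4,
            // Upper bound on the number of pre-fetched results held in the buffer.
            MaxBufferedItemCount = 1000
        };

        IDocumentQuery<dynamic> query = client
            .CreateDocumentQuery<dynamic>(
                UriFactory.CreateDocumentCollectionUri(databaseId, collectionId),
                "SELECT * FROM c",
                options)
            .AsDocumentQuery();

        // Drain the query page by page.
        var results = new List<dynamic>();
        while (query.HasMoreResults)
        {
            FeedResponse<dynamic> page = await query.ExecuteNextAsync<dynamic>();
            results.AddRange(page);
        }
        return results;
    }
}
```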
115-
5. **Turn on server-side GC**
138+
6. **Turn on server-side GC**
116139

117140
Reducing the frequency of garbage collection may help in some cases. In .NET, set [gcServer](https://msdn.microsoft.com/library/ms229357.aspx) to true.
118-
6. **Implement backoff at RetryAfter intervals**
141+
7. **Implement backoff at RetryAfter intervals**
119142

120143
During performance testing, you should increase load until a small rate of requests get throttled. If throttled, the client application should back off for the server-specified retry interval. Respecting the backoff ensures that you spend a minimal amount of time waiting between retries. Retry policy support is included in Version 1.8.0 and above of the SQL [.NET](sql-api-sdk-dotnet.md) and [Java](sql-api-sdk-java.md), version 1.9.0 and above of the [Node.js](sql-api-sdk-node.md) and [Python](sql-api-sdk-python.md), and all supported versions of the [.NET Core](sql-api-sdk-dotnet-core.md) SDKs. For more information, see [RetryAfter](https://msdn.microsoft.com/library/microsoft.azure.documents.documentclientexception.retryafter.aspx).
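The SDK versions listed above already retry throttled requests. If an application still needs its own handling once those retries are exhausted, a backoff loop could look roughly like the sketch below. The `ThrottleRetry` class and the `maxAttempts` default are hypothetical; the loop simply waits for the server-specified `RetryAfter` interval on HTTP 429 responses.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;

// Hypothetical helper for illustration; not part of the SDK.
public static class ThrottleRetry
{
    public static async Task<T> ExecuteAsync<T>(Func<Task<T>> operation, int maxAttempts = 5)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            // 429 = request rate too large. The filter stops matching on the
            // last attempt, so the exception then propagates to the caller.
            catch (DocumentClientException ex)
                when ((int?)ex.StatusCode == 429 && attempt < maxAttempts)
            {
                // Wait for the interval the server asked for before retrying.
                await Task.Delay(ex.RetryAfter);
            }
        }
    }
}
```

A call might look like `await ThrottleRetry.ExecuteAsync(() => client.ReadDocumentAsync(documentUri));`, where `client` and `documentUri` are assumed to exist in the calling code.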
121144
@@ -125,15 +148,15 @@ So if you're asking "How can I improve my database performance?" consider the fo
125148
readDocument.RequestDiagnosticsString
126149
```
127150

128-
7. **Scale out your client-workload**
151+
8. **Scale out your client-workload**
129152

130153
If you are testing at high throughput levels (>50,000 RU/s), the client application may become the bottleneck due to the machine capping out on CPU or Network utilization. If you reach this point, you can continue to push the Azure Cosmos DB account further by scaling out your client applications across multiple servers.
131-
8. **Cache document URIs for lower read latency**
154+
9. **Cache document URIs for lower read latency**
132155

133156
Cache document URIs whenever possible for the best read performance. You have to define logic to cache the resourceid when you create the resource. Resourceid based lookups are faster than name based lookups, so caching these values improves the performance.
134157

135158
<a id="tune-page-size"></a>
136-
1. **Tune the page size for queries/read feeds for better performance**
159+
10. **Tune the page size for queries/read feeds for better performance**
137160

138161
When performing a bulk read of documents using read feed functionality (for example, ReadDocumentFeedAsync) or when issuing a SQL query, the results are returned in a segmented fashion if the result set is too large. By default, results are returned in chunks of 100 items or 1 MB, whichever limit is hit first.
139162

@@ -142,19 +165,19 @@ So if you're asking "How can I improve my database performance?" consider the fo
142165
> [!NOTE]
143166
> The maxItemCount property shouldn't be used just for pagination purposes. Its main use is to improve the performance of queries by reducing the maximum number of items returned in a single page.
144167

145-
You can also set the page size using the available Azure Cosmos DB SDKs. The [MaxItemCount](/dotnet/api/microsoft.azure.documents.client.feedoptions.maxitemcount?view=azure-dotnet) property in FeedOptions allows you to set the maximum number of items to be returned in the enmuration operation. When `maxItemCount` is set to -1, the SDK automatically finds the most optimal value depending on the document size. For example:
168+
You can also set the page size using the available Azure Cosmos DB SDKs. The [MaxItemCount](/dotnet/api/microsoft.azure.documents.client.feedoptions.maxitemcount?view=azure-dotnet) property in FeedOptions allows you to set the maximum number of items to be returned in the enumeration operation. When `maxItemCount` is set to -1, the SDK automatically finds the optimal value depending on the document size. For example:
146169

147170
```csharp
148171
IQueryable<dynamic> authorResults = client.CreateDocumentQuery(documentCollection.SelfLink, "SELECT p.Author FROM Pages p WHERE p.Title = 'About Seattle'", new FeedOptions { MaxItemCount = 1000 });
149172
```
150173

151174
When a query is executed, the resulting data is sent within a TCP packet. If you specify too low a value for `maxItemCount`, the number of trips required to send the data within the TCP packet is high, which impacts the performance. So if you are not sure what value to set for the `maxItemCount` property, it's best to set it to -1 and let the SDK choose the default value.
152175

153-
10. **Increase number of threads/tasks**
176+
11. **Increase number of threads/tasks**
154177

155178
See [Increase number of threads/tasks](#increase-threads) in the Networking section.
156179

157-
11. **Use 64-bit host processing**
180+
12. **Use 64-bit host processing**
158181

159182
The SQL SDK works in a 32-bit host process when you are using SQL .NET SDK version 1.11.4 and above. However, if you are using cross partition queries, 64-bit host processing is recommended for improved performance. The following types of applications have 32-bit host process as the default, so in order to change that to 64-bit, follow these steps based on the type of your application:
160183
