Commit 8ec48c3

Merge pull request #247551 from jcodella/jcodella-patch-1-1
Update performance-tips-query-sdk.md
2 parents 062303c + 5586210 commit 8ec48c3

1 file changed

+48
-44
lines changed

articles/cosmos-db/nosql/performance-tips-query-sdk.md

Lines changed: 48 additions & 44 deletions
@@ -15,6 +15,7 @@ zone_pivot_groups: programming-languages-set-cosmos
1515
# Query performance tips for Azure Cosmos DB SDKs
1616
[!INCLUDE[NoSQL](../includes/appliesto-nosql.md)]
1717

18+
1819
Azure Cosmos DB is a fast, flexible distributed database that scales seamlessly with guaranteed latency and throughput levels. You don't have to make major architecture changes or write complex code to scale your database with Azure Cosmos DB. Scaling up and down is as easy as making a single API call. To learn more, see [provision container throughput](how-to-provision-container-throughput.md) or [provision database throughput](how-to-provision-database-throughput.md).
1920

2021
::: zone pivot="programming-language-csharp"
@@ -23,6 +24,53 @@ Azure Cosmos DB is a fast, flexible distributed database that scales seamlessly
2324

2425
To execute a query, a query plan needs to be built. This in general represents a network request to the Azure Cosmos DB Gateway, which adds to the latency of the query operation. There are two ways to remove this request and reduce the latency of the query operation:
2526

27+
### Optimizing single partition queries with Optimistic Direct Execution
28+
29+
Azure Cosmos DB NoSQL has an optimization called Optimistic Direct Execution (ODE), which can improve the efficiency of certain NoSQL queries. Specifically, it applies to queries that don't require distribution: those that can be executed on a single physical partition or whose responses don't require [pagination](query/pagination.md). Such queries can safely skip some processes, such as client-side query plan generation and query rewrite, which reduces query latency and RU cost. If you specify the partition key in the request or in the query itself (or have only one physical partition), and the results of your query don't require pagination, then ODE can improve your queries.
30+
31+
ODE is now available and enabled by default in the .NET SDK (preview) version 3.35.0-preview and later. When you execute a query and specify a partition key in the request or in the query itself, or your database has only one physical partition, your query execution can benefit from ODE. To disable ODE, set `EnableOptimisticDirectExecution` to `false` in the `QueryRequestOptions`.
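
As an illustration of the paragraph above, here is a minimal sketch of a single-partition query that sets the partition key on the request and explicitly disables ODE. It assumes the `Microsoft.Azure.Cosmos` package at a version that exposes `EnableOptimisticDirectExecution`. The class name, method name, and query text are hypothetical and chosen only for the example.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class OdeQuerySample
{
    // Hypothetical helper: runs a query against one logical partition
    // with ODE turned off. Omit the flag to keep the SDK default.
    public static async Task<List<T>> QuerySinglePartitionAsync<T>(
        Container container,
        string partitionKeyValue)
    {
        QueryRequestOptions options = new QueryRequestOptions
        {
            // Scoping the request to one partition key is what makes
            // the query a candidate for ODE in the first place.
            PartitionKey = new PartitionKey(partitionKeyValue),

            // Explicitly opt out of ODE for this query.
            EnableOptimisticDirectExecution = false
        };

        // Example query text only; substitute your own query.
        QueryDefinition query = new QueryDefinition("SELECT * FROM r WHERE r.id > 5");

        List<T> results = new List<T>();
        using FeedIterator<T> iterator =
            container.GetItemQueryIterator<T>(query, requestOptions: options);

        // Drain every page; a single page means no pagination was needed.
        while (iterator.HasMoreResults)
        {
            FeedResponse<T> page = await iterator.ReadNextAsync();
            results.AddRange(page);
        }

        return results;
    }
}
```

Setting the flag per request, rather than globally, lets you opt individual queries in or out while you evaluate the effect on latency and RU cost.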
32+
33+
Single partition queries that feature GROUP BY, ORDER BY, DISTINCT, and aggregation functions (like SUM, AVG, MIN, and MAX) can significantly benefit from using ODE. However, when a query targets multiple partitions or still requires pagination, the query latency and RU cost might be higher with ODE than without it. Therefore, when using ODE, we recommend that you:
34+
- Specify the partition key in the call or query itself.
35+
- Ensure that your data size hasn’t grown and caused the partition to split.
36+
- Ensure that your query results don’t require pagination to get the full benefit of ODE.
37+
38+
Here are a few examples of simple single partition queries that can benefit from ODE:
39+
```
40+
- SELECT * FROM r
41+
- SELECT * FROM r WHERE r.pk = "value"
42+
- SELECT * FROM r WHERE r.id > 5
43+
- SELECT r.id FROM r JOIN id IN r.id
44+
- SELECT TOP 5 r.id FROM r ORDER BY r.id
45+
- SELECT * FROM r WHERE r.id > 5 OFFSET 5 LIMIT 3
46+
```
47+
Single partition queries might still require distribution if the number of data items increases over time and Azure Cosmos DB [splits the partition](../partitioning-overview.md#physical-partitions). Examples of queries where this could occur include:
48+
```
49+
- SELECT Count(r.id) AS count_a FROM r
50+
- SELECT DISTINCT r.id FROM r
51+
- SELECT Max(r.a) as max_a FROM r
52+
- SELECT Avg(r.a) as avg_a FROM r
53+
- SELECT Sum(r.a) as sum_a FROM r WHERE r.a > 0
54+
```
55+
Some complex queries always require distribution, even when they target a single partition. Examples of such queries include:
56+
```
57+
- SELECT Sum(id) as sum_id FROM r JOIN id IN r.id
58+
- SELECT DISTINCT r.id FROM r GROUP BY r.id
59+
- SELECT DISTINCT r.id, Sum(r.id) as sum_a FROM r GROUP BY r.id
60+
- SELECT Count(1) FROM (SELECT DISTINCT r.id FROM root r)
61+
- SELECT Avg(1) AS avg FROM root r
62+
```
63+
64+
ODE might not always retrieve the query plan and, as a result, can't reject unsupported queries or turn itself off for them. For example, after a partition split, such queries are no longer eligible for ODE and therefore won't run, because client-side query plan evaluation blocks them. To ensure compatibility and service continuity, use ODE only with queries that are fully supported without ODE (that is, queries that execute and produce the correct result in the general multi-partition case).
65+
66+
>[!NOTE]
67+
> Using ODE can cause a new type of continuation token to be generated. By design, older SDKs don't recognize such a token, which could result in a Malformed Continuation Token Exception. If you have a scenario where tokens generated by the newer SDKs are used by an older SDK, we recommend a two-step approach to upgrading:
68+
>
69+
>- Upgrade to the new SDK and disable ODE in the same deployment. Wait for all nodes to upgrade.
70+
> - To disable ODE, set `EnableOptimisticDirectExecution` to `false` in the `QueryRequestOptions`.
71+
>- Enable ODE for all nodes as part of a second deployment.
72+
73+
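
One possible way to stage the two-step upgrade described in the note is to drive the ODE flag from configuration, so that the second deployment only flips a setting. The following is a rough sketch under that assumption. The environment variable name `COSMOS_ENABLE_ODE` and the `OdeRollout` helper are hypothetical and not part of the SDK. Only `QueryRequestOptions`, `PartitionKey`, and `EnableOptimisticDirectExecution` come from `Microsoft.Azure.Cosmos`.

```csharp
using System;
using Microsoft.Azure.Cosmos;

public static class OdeRollout
{
    // Hypothetical setting name used only for this sketch.
    private const string OdeSettingName = "COSMOS_ENABLE_ODE";

    // Builds query options with ODE disabled unless the setting is "true".
    // Deployment 1: ship the new SDK with the setting unset (ODE off).
    // Deployment 2: set the value to "true" once every node is upgraded.
    public static QueryRequestOptions BuildQueryOptions(PartitionKey partitionKey)
    {
        bool enableOde = string.Equals(
            Environment.GetEnvironmentVariable(OdeSettingName),
            "true",
            StringComparison.OrdinalIgnoreCase);

        return new QueryRequestOptions
        {
            PartitionKey = partitionKey,
            EnableOptimisticDirectExecution = enableOde
        };
    }
}
```

Defaulting to disabled means a node that is missing the setting never emits the new continuation token type before the rest of the fleet can read it.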
2674
### Use local Query Plan generation
2775

2876
The SQL SDK includes a native ServiceInterop.dll to parse and optimize queries locally. ServiceInterop.dll is supported only on the **Windows x64** platform. The following types of applications use 32-bit host processing by default. To change host processing to 64-bit processing, follow these steps, based on the type of your application:
@@ -229,50 +277,6 @@ IQueryable<dynamic> authorResults = client.CreateDocumentQuery(
229277

230278
Pre-fetching works the same way regardless of the degree of parallelism, and there's a single buffer for the data from all partitions.
231279

232-
## Optimizing single partition queries with Optimistic Direct Execution
233-
234-
Azure Cosmos DB NoSQL has an optimization called Optimistic Direct Execution (ODE), which can improve the efficiency of certain NoSQL queries. Specifically, queries that don’t require distribution include those that can be executed on a single physical partition or that have responses that don't require [pagination](query/pagination.md). Queries that don’t require distribution can confidently skip some processes, such as client-side query plan generation and query rewrite, thereby reducing query latency and RU cost. If you specify the partition key in the request or query itself (or have only one physical partition), and the results of your query don’t require pagination, then ODE can improve your queries.
235-
236-
Single partition queries that feature GROUP BY, ORDER BY, DISTINCT, and aggregation functions (like sum, mean, min, and max) can significantly benefit from using ODE. However, in scenarios where the query is targeting multiple partitions or still requires pagination, the latency of the query response and RU cost might be higher than without using ODE. Therefore, when using ODE, we recommend to:
237-
- Specify the partition key in the call or query itself.
238-
- Ensure that your data size hasn’t grown and caused the partition to split.
239-
- Ensure that your query results don’t require pagination to get the full benefit of ODE.
240-
241-
Here are a few examples of simple single partition queries which can benefit from ODE:
242-
```
243-
- SELECT * FROM r
244-
- SELECT VALUE r.id FROM r
245-
- SELECT * FROM r WHERE r.id > 5
246-
- SELECT r.id FROM r JOIN id IN r.id
247-
- SELECT TOP 5 r.id FROM r ORDER BY r.id
248-
- SELECT * FROM r WHERE r.id > 5 OFFSET 5 LIMIT 3
249-
```
250-
There can be cases where single partition queries may still require distribution if the number of data items increases over time and your Azure Cosmos DB database [splits the partition](../partitioning-overview.md#physical-partitions). Examples of queries where this could occur include:
251-
```
252-
- SELECT Count(r.id) AS count_a FROM r
253-
- SELECT DISTINCT r.id FROM r
254-
- SELECT Max(r.a) as min_a FROM r
255-
- SELECT Avg(r.a) as min_a FROM r
256-
- SELECT Sum(r.a) as sum_a FROM r WHERE r.a > 0
257-
```
258-
Some complex queries can always require distribution, even if targeting a single partition. Examples of such queries include:
259-
```
260-
- SELECT Sum(id) as sum_id FROM r JOIN id IN r.id
261-
- SELECT DISTINCT r.id FROM r GROUP BY r.id
262-
- SELECT DISTINCT r.id, Sum(r.id) as sum_a FROM r GROUP BY r.id
263-
- SELECT Count(1) FROM (SELECT DISTINCT r.id FROM root r)
264-
- SELECT Avg(1) AS avg FROM root r
265-
```
266-
267-
It's important to note that ODE might not always retrieve the query plan and, as a result, is not able to disallow or turn off for unsupported queries. For example, after partition split, such queries are no longer eligible for ODE and, therefore, won't run because client-side query plan evaluation will block those. To ensure compatibility/service continuity, it's critical to ensure that only queries that are fully supported in scenarios without ODE (that is, they execute and produce the correct result in the general multi-partition case) are used with ODE.
268-
269-
### Using ODE via the SDKs
270-
ODE is now available and enabled by default in the .NET Preview SDK for versions 3.35.0 and later. When you execute a query and specify a partition key in the request or query itself, or your database has only one physical partition, your query execution can leverage the benefits of ODE.
271-
272-
To disable ODE, set the flag `EnableOptimisticDirectExecution` to false in your QueryRequestOptions object.
273-
274-
275-
276280
## Next steps
277281

278282
To learn more about performance using the .NET SDK:
