Commit 01f8960

Update performance-tips-query-sdk.md
1 parent 3ab98bf commit 01f8960

1 file changed: +41 -44 lines changed

articles/cosmos-db/nosql/performance-tips-query-sdk.md

Lines changed: 41 additions & 44 deletions
@@ -15,6 +15,7 @@ zone_pivot_groups: programming-languages-set-cosmos
 # Query performance tips for Azure Cosmos DB SDKs
 [!INCLUDE[NoSQL](../includes/appliesto-nosql.md)]
 
+
 Azure Cosmos DB is a fast, flexible distributed database that scales seamlessly with guaranteed latency and throughput levels. You don't have to make major architecture changes or write complex code to scale your database with Azure Cosmos DB. Scaling up and down is as easy as making a single API call. To learn more, see [provision container throughput](how-to-provision-container-throughput.md) or [provision database throughput](how-to-provision-database-throughput.md).
 
 ::: zone pivot="programming-language-csharp"
@@ -23,6 +24,46 @@ Azure Cosmos DB is a fast, flexible distributed database that scales seamlessly
 
 To execute a query, a query plan needs to be built. This in general represents a network request to the Azure Cosmos DB Gateway, which adds to the latency of the query operation. There are two ways to remove this request and reduce the latency of the query operation:
 
+### Optimizing single partition queries with Optimistic Direct Execution
+
+Azure Cosmos DB NoSQL has an optimization called Optimistic Direct Execution (ODE), which can improve the efficiency of certain NoSQL queries. Specifically, queries that don’t require distribution include those that can be executed on a single physical partition or that have responses that don't require [pagination](query/pagination.md). Queries that don’t require distribution can confidently skip some processes, such as client-side query plan generation and query rewrite, thereby reducing query latency and RU cost. If you specify the partition key in the request or query itself (or have only one physical partition), and the results of your query don’t require pagination, then ODE can improve your queries.
+
+ODE is now available and enabled by default in the .NET SDK (preview) version 3.35.0-preview and later. When you execute a query and specify a partition key in the request or query itself, or your database has only one physical partition, your query execution can leverage the benefits of ODE. To disable ODE, set the flag `EnableOptimisticDirectExecution` to `false` in your `QueryRequestOptions` object.
+
+Single partition queries that feature GROUP BY, ORDER BY, DISTINCT, and aggregation functions (such as SUM, AVG, MIN, and MAX) can significantly benefit from using ODE. However, in scenarios where the query is targeting multiple partitions or still requires pagination, the latency of the query response and RU cost might be higher than without using ODE. Therefore, when using ODE, we recommend that you:
+- Specify the partition key in the call or query itself.
+- Ensure that your data size hasn’t grown and caused the partition to split.
+- Ensure that your query results don’t require pagination to get the full benefit of ODE.
+
+Here are a few examples of simple single partition queries that can benefit from ODE:
+```
+- SELECT * FROM r
+- SELECT * FROM r WHERE r.pk = "value"
+- SELECT * FROM r WHERE r.id > 5
+- SELECT r.id FROM r JOIN id IN r.id
+- SELECT TOP 5 r.id FROM r ORDER BY r.id
+- SELECT * FROM r WHERE r.id > 5 OFFSET 5 LIMIT 3
+```
+There can be cases where single partition queries may still require distribution if the number of data items increases over time and your Azure Cosmos DB database [splits the partition](../partitioning-overview.md#physical-partitions). Examples of queries where this could occur include:
+```
+- SELECT Count(r.id) AS count_a FROM r
+- SELECT DISTINCT r.id FROM r
+- SELECT Max(r.a) as max_a FROM r
+- SELECT Avg(r.a) as avg_a FROM r
+- SELECT Sum(r.a) as sum_a FROM r WHERE r.a > 0
+```
+Some complex queries always require distribution, even when they target a single partition. Examples of such queries include:
+```
+- SELECT Sum(id) as sum_id FROM r JOIN id IN r.id
+- SELECT DISTINCT r.id FROM r GROUP BY r.id
+- SELECT DISTINCT r.id, Sum(r.id) as sum_a FROM r GROUP BY r.id
+- SELECT Count(1) FROM (SELECT DISTINCT r.id FROM root r)
+- SELECT Avg(1) AS avg FROM root r
+```
+
+ODE might not always retrieve the query plan, so it can't disallow unsupported queries or turn itself off for them. For example, after a partition split, such queries are no longer eligible for ODE and won't run, because client-side query plan evaluation blocks them. To ensure compatibility and service continuity, use ODE only with queries that are fully supported without ODE (that is, they execute and produce the correct result in the general multi-partition case).
+
+
 ### Use local Query Plan generation
 
 The SQL SDK includes a native ServiceInterop.dll to parse and optimize queries locally. ServiceInterop.dll is supported only on the **Windows x64** platform. The following types of applications use 32-bit host processing by default. To change host processing to 64-bit processing, follow these steps, based on the type of your application:
@@ -229,50 +270,6 @@ IQueryable<dynamic> authorResults = client.CreateDocumentQuery(
 
 Pre-fetching works the same way regardless of the degree of parallelism, and there's a single buffer for the data from all partitions.
 
-## Optimizing single partition queries with Optimistic Direct Execution
-
-Azure Cosmos DB NoSQL has an optimization called Optimistic Direct Execution (ODE), which can improve the efficiency of certain NoSQL queries. Specifically, queries that don’t require distribution include those that can be executed on a single physical partition or that have responses that don't require [pagination](query/pagination.md). Queries that don’t require distribution can confidently skip some processes, such as client-side query plan generation and query rewrite, thereby reducing query latency and RU cost. If you specify the partition key in the request or query itself (or have only one physical partition), and the results of your query don’t require pagination, then ODE can improve your queries.
-
-Single partition queries that feature GROUP BY, ORDER BY, DISTINCT, and aggregation functions (such as SUM, AVG, MIN, and MAX) can significantly benefit from using ODE. However, in scenarios where the query is targeting multiple partitions or still requires pagination, the latency of the query response and RU cost might be higher than without using ODE. Therefore, when using ODE, we recommend that you:
-- Specify the partition key in the call or query itself.
-- Ensure that your data size hasn’t grown and caused the partition to split.
-- Ensure that your query results don’t require pagination to get the full benefit of ODE.
-
-Here are a few examples of simple single partition queries that can benefit from ODE:
-```
-- SELECT * FROM r
-- SELECT VALUE r.id FROM r
-- SELECT * FROM r WHERE r.id > 5
-- SELECT r.id FROM r JOIN id IN r.id
-- SELECT TOP 5 r.id FROM r ORDER BY r.id
-- SELECT * FROM r WHERE r.id > 5 OFFSET 5 LIMIT 3
-```
-There can be cases where single partition queries may still require distribution if the number of data items increases over time and your Azure Cosmos DB database [splits the partition](../partitioning-overview.md#physical-partitions). Examples of queries where this could occur include:
-```
-- SELECT Count(r.id) AS count_a FROM r
-- SELECT DISTINCT r.id FROM r
-- SELECT Max(r.a) as max_a FROM r
-- SELECT Avg(r.a) as avg_a FROM r
-- SELECT Sum(r.a) as sum_a FROM r WHERE r.a > 0
-```
-Some complex queries always require distribution, even when they target a single partition. Examples of such queries include:
-```
-- SELECT Sum(id) as sum_id FROM r JOIN id IN r.id
-- SELECT DISTINCT r.id FROM r GROUP BY r.id
-- SELECT DISTINCT r.id, Sum(r.id) as sum_a FROM r GROUP BY r.id
-- SELECT Count(1) FROM (SELECT DISTINCT r.id FROM root r)
-- SELECT Avg(1) AS avg FROM root r
-```
-
-ODE might not always retrieve the query plan, so it can't disallow unsupported queries or turn itself off for them. For example, after a partition split, such queries are no longer eligible for ODE and won't run, because client-side query plan evaluation blocks them. To ensure compatibility and service continuity, use ODE only with queries that are fully supported without ODE (that is, they execute and produce the correct result in the general multi-partition case).
-
-### Using ODE via the SDKs
-ODE is now available and enabled by default in the .NET Preview SDK for versions 3.35.0 and later. When you execute a query and specify a partition key in the request or query itself, or your database has only one physical partition, your query execution can leverage the benefits of ODE.
-
-To disable ODE, set the flag `EnableOptimisticDirectExecution` to `false` in your `QueryRequestOptions` object.
-
-
-
 ## Next steps
 
 To learn more about performance using the .NET SDK:
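
The relocated ODE section names the `EnableOptimisticDirectExecution` flag on `QueryRequestOptions` but shows no code. The following is a minimal sketch of how that flag might be set from C#, not part of this commit. It assumes a preview build of the `Microsoft.Azure.Cosmos` package (3.35.0-preview or later, per the added text) and an existing `Container` instance; the class name `OdeQuerySample`, the method names, and the query text are illustrative assumptions.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical helper class, named for illustration only.
public static class OdeQuerySample
{
    // Builds options that scope the query to one partition key value
    // and explicitly turn ODE off (it is on by default in the preview SDK).
    public static QueryRequestOptions BuildOptions(string partitionKeyValue)
    {
        return new QueryRequestOptions
        {
            PartitionKey = new PartitionKey(partitionKeyValue),
            EnableOptimisticDirectExecution = false
        };
    }

    // Runs one of the single partition example queries and drains all pages.
    public static async Task<List<dynamic>> RunAsync(Container container, string partitionKeyValue)
    {
        var query = new QueryDefinition("SELECT TOP 5 r.id FROM r ORDER BY r.id");
        var results = new List<dynamic>();

        using FeedIterator<dynamic> iterator = container.GetItemQueryIterator<dynamic>(
            query,
            requestOptions: BuildOptions(partitionKeyValue));

        while (iterator.HasMoreResults)
        {
            FeedResponse<dynamic> page = await iterator.ReadNextAsync();
            results.AddRange(page);
        }

        return results;
    }
}
```

Removing the `EnableOptimisticDirectExecution = false` line leaves ODE at its default, so the same call can be used to compare latency and RU charge with and without the optimization.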
