Commit 68e7754

Merge pull request #261993 from jcocchi/add-bypass-cache
Cosmos DB: Add bypass integrated cache section
2 parents 12438e8 + 7a9898f commit 68e7754

2 files changed: +62 -8 lines changed


articles/cosmos-db/how-to-configure-integrated-cache.md

Lines changed: 49 additions & 1 deletion
@@ -68,7 +68,7 @@ You must ensure the request consistency is session or eventual. If not, the requ
6868
6969
## Adjust MaxIntegratedCacheStaleness
7070

71-
Configure `MaxIntegratedCacheStaleness`, which is the maximum time in which you are willing to tolerate stale cached data. It is recommended to set the `MaxIntegratedCacheStaleness` as high as possible because it will increase the likelihood that repeated point reads and queries can be cache hits. If you set `MaxIntegratedCacheStaleness` to 0, your read request will **never** use the integrated cache, regardless of the consistency level. When not configured, the default `MaxIntegratedCacheStaleness` is 5 minutes.
71+
Configure `MaxIntegratedCacheStaleness`, which is the maximum time in which you're willing to tolerate stale cached data. It's recommended to set the `MaxIntegratedCacheStaleness` as high as possible because it will increase the likelihood that repeated point reads and queries can be cache hits. If you set `MaxIntegratedCacheStaleness` to 0, your read request will **never** use the integrated cache, regardless of the consistency level. When not configured, the default `MaxIntegratedCacheStaleness` is 5 minutes.
7272

7373
>[!NOTE]
7474
> The `MaxIntegratedCacheStaleness` can be set as high as 10 years. In practice, this value is the maximum staleness and the cache may be reset sooner due to node restarts which may occur.
@@ -133,6 +133,54 @@ container.query_items(
133133
---
134134

135135

136+
## Bypass the integrated cache (Preview)
137+
138+
Use the `BypassIntegratedCache` request option to control which requests use the integrated cache. Writes, point reads, and queries that bypass the integrated cache won't use cache storage, saving space for other items. Requests that bypass the cache are still routed through the dedicated gateway. These requests are served from the backend and cost RUs.
139+
140+
Bypassing the cache is supported in these versions of each SDK:
141+
142+
| SDK | Supported versions |
143+
| --- | ------------------ |
144+
| **.NET SDK v3** | *>= 3.35.0-preview* |
145+
| **Java SDK v4** | *>= 4.49.0* |
146+
| **Node.js SDK** | Not supported |
147+
| **Python SDK** | Not supported |
148+
149+
### [.NET](#tab/dotnet)
150+
151+
```csharp
FeedIterator<MyClass> myQuery = container.GetItemQueryIterator<MyClass>(
    new QueryDefinition("SELECT * FROM c"),
    requestOptions: new QueryRequestOptions
    {
        DedicatedGatewayRequestOptions = new DedicatedGatewayRequestOptions
        {
            BypassIntegratedCache = true
        }
    }
);
```
161+
162+
### [Java](#tab/java)
163+
164+
```java
DedicatedGatewayRequestOptions dgOptions = new DedicatedGatewayRequestOptions()
    .setIntegratedCacheBypassed(true);
CosmosQueryRequestOptions queryOptions = new CosmosQueryRequestOptions()
    .setDedicatedGatewayRequestOptions(dgOptions);

CosmosPagedFlux<MyClass> pagedFluxResponse = container.queryItems(
    "SELECT * FROM c", queryOptions, MyClass.class);
```
173+
174+
### [Node.js](#tab/nodejs)
175+
176+
The bypass integrated cache request option isn't available in the Node.js SDK.
177+
178+
### [Python](#tab/python)
179+
180+
The bypass integrated cache request option isn't available in the Python SDK.
181+
182+
---
183+
136184
## Verify cache hits
137185

138186
Finally, you can restart your application and verify integrated cache hits for repeated point reads or queries by seeing if the request charge is 0. Once you’ve modified your `CosmosClient` to use the dedicated gateway endpoint, all requests will be routed through the dedicated gateway.

articles/cosmos-db/integrated-cache.md

Lines changed: 13 additions & 7 deletions
@@ -6,7 +6,7 @@ ms.service: cosmos-db
66
ms.subservice: nosql
77
ms.custom: ignite-2022
88
ms.topic: conceptual
9-
ms.date: 03/15/2023
9+
ms.date: 12/27/2023
1010
ms.author: sidandrews
1111
ms.reviewer: jucocchi
1212
---
@@ -21,7 +21,7 @@ An integrated cache is automatically configured within the dedicated gateway. Th
2121
* An item cache for point reads
2222
* A query cache for queries
2323

24-
The integrated cache is a read-through, write-through cache with a Least Recently Used (LRU) eviction policy. The item cache and query cache share the same capacity within the integrated cache and the LRU eviction policy applies to both. Data is evicted from the cache strictly based on when it was least recently used, regardless of whether it's a point read or query. The cached data within each node depends on the data that was recently [written or read](integrated-cache.md#item-cache) through that specific node. If an item or query is cached on one node, it isn't necessarily cached on the others.
24+
The integrated cache is a read-through, write-through cache with a Least Recently Used (LRU) eviction policy. The item cache and query caches share the same capacity within the integrated cache and the LRU eviction policy applies to both. Data is evicted from the cache strictly based on when it was least recently used, regardless of whether it's a point read or query. The cached data within each node depends on the data that was recently [written or read](integrated-cache.md#item-cache) through that specific node. If an item or query is cached on one node, it isn't necessarily cached on the others.
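The shared LRU behavior described above can be sketched as a toy model. This is not the service's implementation: the `SharedLruCache` class, its entry-count capacity, and the `("item", ...)` / `("query", ...)` key tuples are assumptions made purely for illustration.

```python
from collections import OrderedDict


class SharedLruCache:
    """Toy model: one LRU store shared by item entries and query entries."""

    def __init__(self, capacity):
        # Capacity is counted in entries here, which is an assumption;
        # the real cache is sized by the dedicated gateway SKU.
        self.capacity = capacity
        self._entries = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key not in self._entries:
            return None  # cache miss
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            # Evict strictly by recency, whether the entry is an item or a query.
            self._entries.popitem(last=False)

    def keys(self):
        return list(self._entries)


cache = SharedLruCache(capacity=2)
cache.put(("item", "1"), {"id": "1"})
cache.put(("query", "SELECT * FROM c"), [{"id": "1"}])
cache.get(("item", "1"))               # the item is now most recently used
cache.put(("item", "2"), {"id": "2"})  # the query entry is evicted
```

In this sketch the query result is evicted even though it was stored after the first item, because the item was read more recently.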
2525

2626
> [!NOTE]
2727
> Do you have any feedback about the integrated cache? We want to hear it! Feel free to share feedback directly with the Azure Cosmos DB engineering team:
@@ -68,12 +68,12 @@ Because each node has an independent cache, it's possible items are invalidated
6868

6969
## Query cache
7070

71-
The query cache is used to cache queries. The query cache transforms a query into a key/value lookup where the key is the query text and the value is the query results. The integrated cache doesn't have a query engine, it only stores the key/value lookup for each query. Query results are stored as a set, and the cache doesn't keep track of individual items. A given item can be stored in the query cache multiple times if it appears in the result set of multiple queries. Updates to the underlying items won't be reflected in query results unless the [max integrated cache staleness](#maxintegratedcachestaleness) for the query is reached and the query is served from the backend database.
71+
The query cache is used to cache queries. The query cache transforms a query into a key/value lookup where the key is the query text and the value is the query results. The integrated cache doesn't have a query engine, it only stores the key/value lookup for each query. Query results are stored as a set, and the cache doesn't keep track of individual items. A given item can be stored in the query cache multiple times if it appears in the result set of multiple queries. Updates to the underlying items aren't reflected in query results unless the [max integrated cache staleness](#maxintegratedcachestaleness) for the query is reached and the query is served from the backend database.
7272

7373
### Populating the query cache
7474

7575
- If the cache doesn't have a result for that query (cache miss) on the node it was routed through, the query is sent to the backend. After the query is run, the cache will store the results for that query
76-
- Queries with the same shape but different parameters or request options that affect the results (ex. max item count) will be stored as their own key/value pair
76+
- Queries with the same shape but different parameters or request options that affect the results (ex. max item count) are stored as their own key/value pair
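As a rough illustration of the key/value lookup described above, the following sketch builds a cache key from the query text, its parameters, and one request option. The `query_cache_key` function and the JSON key format are hypothetical; the actual key format used by the integrated cache isn't documented here.

```python
import json


def query_cache_key(query_text, parameters=None, max_item_count=None):
    """Build a hypothetical cache key from everything that affects the results."""
    return json.dumps(
        {
            "query": query_text,
            "parameters": parameters or [],
            "max_item_count": max_item_count,
        },
        sort_keys=True,
    )


query = "SELECT * FROM c WHERE c.city = @city"
seattle = query_cache_key(query, [{"name": "@city", "value": "Seattle"}])
redmond = query_cache_key(query, [{"name": "@city", "value": "Redmond"}])
seattle_paged = query_cache_key(
    query, [{"name": "@city", "value": "Seattle"}], max_item_count=10
)

# Each distinct key would map to its own stored result set.
query_cache = {seattle: ["result set A"], redmond: ["result set B"]}
```

Under these assumptions, the same query shape with a different parameter value or a different max item count produces a different key, so each variant is cached separately.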
7777

7878
### Query cache eviction
7979

@@ -108,7 +108,7 @@ The easiest way to configure either session or eventual consistency for all read
108108

109109
The `MaxIntegratedCacheStaleness` is the maximum acceptable staleness for cached point reads and queries, regardless of the selected consistency. The `MaxIntegratedCacheStaleness` is configurable at the request-level. For example, if you set a `MaxIntegratedCacheStaleness` of 2 hours, your request will only return cached data if the data is less than 2 hours old. To increase the likelihood of repeated reads utilizing the integrated cache, you should set the `MaxIntegratedCacheStaleness` as high as your business requirements allow.
110110

111-
It's important to understand that the `MaxIntegratedCacheStaleness`, when configured on a request that ends up populating the cache, doesn't affect how long that request is cached. `MaxIntegratedCacheStaleness` enforces consistency when you try to use cached data. There's no global TTL or cache retention setting, so data is only evicted from the cache if either the integrated cache is full or a new read is run with a lower `MaxIntegratedCacheStaleness` than the age of the current cached entry.
111+
The `MaxIntegratedCacheStaleness`, when configured on a request that ends up populating the cache, doesn't affect how long that request is cached. `MaxIntegratedCacheStaleness` enforces consistency when you try to read cached data. There's no global TTL or cache retention setting, so data is only evicted from the cache if either the integrated cache is full or a new read is run with a lower `MaxIntegratedCacheStaleness` than the age of the current cached entry.
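To make the read-time check concrete, here's a minimal sketch of a cache that applies the staleness limit only when a request reads an entry. The `StalenessAwareCache` class, the injected `clock`, and the `backend` callable are assumptions for illustration, not the service's implementation.

```python
class StalenessAwareCache:
    """Toy model: staleness is checked per read, not stored with the entry."""

    def __init__(self, backend, clock):
        self._backend = backend  # callable: key -> fresh value
        self._clock = clock      # callable: () -> current time in seconds
        self._entries = {}       # key -> (value, time the value was cached)

    def read(self, key, max_staleness_seconds):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            value, cached_at = entry
            # Serve cached data only if it is younger than this request allows.
            if now - cached_at < max_staleness_seconds:
                return value, "hit"
        # Miss, or the entry is too old for this request: refresh from the backend.
        value = self._backend(key)
        self._entries[key] = (value, now)
        return value, "miss"


now = [0]
backend_calls = []


def backend(key):
    backend_calls.append(key)
    return f"value-of-{key}"


cache = StalenessAwareCache(backend, clock=lambda: now[0])
first = cache.read("a", max_staleness_seconds=7200)   # populates the cache
now[0] = 3600                                         # one hour later
second = cache.read("a", max_staleness_seconds=7200)  # entry is 1 hour old
third = cache.read("a", max_staleness_seconds=1800)   # too old for 30 minutes
fourth = cache.read("a", max_staleness_seconds=0)     # 0 never uses the cache
```

In this model, the 2-hour value on the first request doesn't pin the entry for 2 hours. The later request with a 30-minute limit replaces it because the entry is already an hour old.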
112112

113113
This is an improvement from how most caches work and allows for the following other customizations:
114114

@@ -133,6 +133,12 @@ To better understand the `MaxIntegratedCacheStaleness` parameter, consider the f
133133

134134
[Learn to configure the `MaxIntegratedCacheStaleness`.](how-to-configure-integrated-cache.md#adjust-maxintegratedcachestaleness)
135135

136+
## Bypass the integrated cache (Preview)
137+
138+
The integrated cache has a limited storage capacity determined by the dedicated gateway SKU provisioned. By default, all requests from clients configured with the dedicated gateway connection string go through the integrated cache and take up cache space. You can control which items and queries are cached with the bypass integrated cache request option, currently in preview. This request option is useful for item writes or read requests that aren't expected to be frequently repeated. Bypassing the integrated cache for items with infrequent access saves cache space for items with more repeats, increasing RU saving potential and reducing evictions. Requests that bypass the cache are still routed through the dedicated gateway. These requests are served from the backend and cost RUs.
139+
140+
[Learn to bypass the integrated cache.](how-to-configure-integrated-cache.md#bypass-the-integrated-cache-preview)
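The effect of bypassing can be pictured with a small toy model. The `ToyDedicatedGateway` class and its fixed RU charge are hypothetical and only mirror the behavior described above: bypassed reads are served from the backend, cost RUs, and don't occupy cache space.

```python
class ToyDedicatedGateway:
    """Toy model of point reads routed through a gateway with a cache."""

    POINT_READ_CHARGE = 1.0  # hypothetical RU charge for a backend point read

    def __init__(self, backend_items):
        self._backend = backend_items  # dict: item id -> item
        self._cache = {}               # unbounded here to keep the sketch short

    def read_item(self, item_id, bypass_integrated_cache=False):
        if not bypass_integrated_cache and item_id in self._cache:
            return self._cache[item_id], 0.0  # cache hit: no RU charge
        item = self._backend[item_id]         # served from the backend
        if not bypass_integrated_cache:
            self._cache[item_id] = item       # only non-bypassed reads are cached
        return item, self.POINT_READ_CHARGE

    def cached_ids(self):
        return sorted(self._cache)


gateway = ToyDedicatedGateway({"rare": {"id": "rare"}, "hot": {"id": "hot"}})
rare_read = gateway.read_item("rare", bypass_integrated_cache=True)
hot_first = gateway.read_item("hot")
hot_second = gateway.read_item("hot")
```

Here the infrequently read item never takes up cache space, while the repeated read of the frequently read item is served from the cache at no RU charge.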
141+
136142
## Metrics
137143

138144
It's helpful to monitor some key metrics for the integrated cache. These metrics include:
@@ -161,11 +167,11 @@ The below examples show how to debug some common scenarios:
161167

162168
### I can’t tell if my application is using the dedicated gateway
163169

164-
Check the `DedicatedGatewayRequests`. This metric includes all requests that use the dedicated gateway, regardless of whether they hit the integrated cache. If your application uses the standard gateway or direct mode with your original connection string, you won't see an error message, but the `DedicatedGatewayRequests` will be zero. If your application uses direct mode with your dedicated gateway connection string, you may still see a small number of `DedicatedGatewayRequests`.
170+
Check the `DedicatedGatewayRequests`. This metric includes all requests that use the dedicated gateway, regardless of whether they hit the integrated cache. If your application uses the standard gateway or direct mode with your original connection string, you will not see an error message, but the `DedicatedGatewayRequests` will be zero. If your application uses direct mode with your dedicated gateway connection string, you may still see a few `DedicatedGatewayRequests`.
165171

166172
### I can’t tell if my requests are hitting the integrated cache
167173

168-
Check the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate`. If both of these values are zero, then requests aren't hitting the integrated cache. Check that you're using the dedicated gateway connection string, [connecting with gateway mode](nosql/sdk-connection-modes.md), and [have set session or eventual consistency](consistency-levels.md#configure-the-default-consistency-level).
174+
Check the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate`. If both of these values are zero, then requests aren't hitting the integrated cache. Check that you're using the dedicated gateway connection string, [connecting with gateway mode](nosql/sdk-connection-modes.md), and [are using session or eventual consistency](consistency-levels.md#configure-the-default-consistency-level).
169175

170176
### I want to understand if my dedicated gateway is too small
171177
