articles/cosmos-db/dedicated-gateway.md
Lines changed: 5 additions & 3 deletions
@@ -69,17 +69,20 @@ The dedicated gateway is available in the following sizes. The integrated cache
 |**D8s**|**8**|**32 GB**|
 |**D16s**|**16**|**64 GB**|

-> [!NOTE]
+> [!TIP]
 > Once created, you can add or remove dedicated gateway nodes, but you can't modify the size of the nodes. To change the size of your dedicated gateway nodes you can deprovision the cluster and provision it again in a different size. This will result in a short period of downtime unless you change the connection string in your application to use the standard gateway during reprovisioning.

 There are many different ways to provision a dedicated gateway:

-- [Provision a dedicated gateway using the Azure Portal](how-to-configure-integrated-cache.md#provision-the-dedicated-gateway)
+- [Provision a dedicated gateway using the Azure portal](how-to-configure-integrated-cache.md#provision-the-dedicated-gateway)
 - Note: You cannot deprovision a dedicated gateway using ARM templates

+> [!NOTE]
+> You can provision a dedicated gateway in Azure Cosmos DB accounts with [availability zones](../availability-zones/az-region.md) by request. Reach out to [email protected] for more information.
+
 ## Dedicated gateway in multi-region accounts

 When you provision a dedicated gateway cluster in multi-region accounts, identical dedicated gateway clusters are provisioned in each region. For example, consider an Azure Cosmos DB account in East US and North Europe. If you provision a dedicated gateway cluster with two D8 nodes in this account, you'd have four D8 nodes in total - two in East US and two in North Europe. You don't need to explicitly configure dedicated gateways in each region and your connection string remains the same. There are also no changes to best practices for performing failovers.
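As a rough illustration of the arithmetic in the multi-region paragraph, the sketch below computes the total node count for an account. The function name `total_gateway_nodes` and the region list are hypothetical, chosen only for this example; they aren't part of any Azure SDK or CLI.

```python
def total_gateway_nodes(nodes_per_region: int, regions: list[str]) -> int:
    """Identical clusters are provisioned in every region, so the total is
    the per-region node count multiplied by the number of regions."""
    return nodes_per_region * len(regions)


# The example from the paragraph: two D8 nodes in a two-region account.
regions = ["East US", "North Europe"]
total = total_gateway_nodes(2, regions)
print(total)  # 4
```

Because every region gets the same cluster, adding a region to the account increases the node count without any change to the per-region configuration.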
@@ -91,7 +94,6 @@ Like nodes within a cluster, dedicated gateway nodes across regions are independ
 The dedicated gateway has the following limitations:

 - Dedicated gateways are only supported on API for NoSQL accounts
-- You can't provision a dedicated gateway in Azure Cosmos DB accounts with [availability zones](../availability-zones/az-region.md).
 - You can't use [role-based access control (RBAC)](how-to-setup-rbac.md) to authenticate data plane requests routed through the dedicated gateway
articles/cosmos-db/integrated-cache.md
Lines changed: 26 additions & 20 deletions
@@ -6,22 +6,22 @@ ms.service: cosmos-db
 ms.subservice: nosql
 ms.custom: ignite-2022
 ms.topic: conceptual
-ms.date: 08/29/2022
+ms.date: 03/15/2023
 ms.author: sidandrews
 ms.reviewer: jucocchi
 ---

 # Azure Cosmos DB integrated cache - Overview
 [!INCLUDE[NoSQL](includes/appliesto-nosql.md)]

-The Azure Cosmos DB integrated cache is an in-memory cache that helps you ensure manageable costs and low latency as your request volume grows. The integrated cache is easy to set up and you don’t need to spend time writing custom code for cache invalidation or managing backend infrastructure. Your integrated cache uses a [dedicated gateway](dedicated-gateway.md) within your Azure Cosmos DB account. The integrated cache is the first of many Azure Cosmos DB features that will utilize a dedicated gateway for improved performance. You can choose from three possible dedicated gateway sizes based on the number of cores and memory needed for your workload.
+The Azure Cosmos DB integrated cache is an in-memory cache that helps you ensure manageable costs and low latency as your request volume grows. The integrated cache is easy to set up and you don’t need to spend time writing custom code for cache invalidation or managing backend infrastructure. The integrated cache uses the [dedicated gateway](dedicated-gateway.md) within your Azure Cosmos DB account. When provisioning your dedicated gateway, you can choose the number of nodes and the node size based on the number of cores and memory needed for your workload. Each dedicated gateway node has a separate integrated cache from the others.

 An integrated cache is automatically configured within the dedicated gateway. The integrated cache has two parts:

 * An item cache for point reads
 * A query cache for queries

-The integrated cache is a read-through, write-through cache with a Least Recently Used (LRU) eviction policy. The item cache and query cache share the same capacity within the integrated cache and the LRU eviction policy applies to both. In other words, data is evicted from the cache strictly based on when it was least recently used, regardless of whether it's a point read or query.
+The integrated cache is a read-through, write-through cache with a Least Recently Used (LRU) eviction policy. The item cache and query cache share the same capacity within the integrated cache and the LRU eviction policy applies to both. Data is evicted from the cache strictly based on when it was least recently used, regardless of whether it's a point read or query. The cached data within each node depends on the data that was recently [written or read](integrated-cache.md#item-cache) through that specific node. If an item or query is cached on one node, it isn't necessarily cached on the others.

 > [!NOTE]
 > Do you have any feedback about the integrated cache? We want to hear it! Feel free to share feedback directly with the Azure Cosmos DB engineering team:
 The main goal of the integrated cache is to reduce costs for read-heavy workloads. Low latency, while helpful, isn't the main benefit of the integrated cache because Azure Cosmos DB is already fast without caching.

-Point reads and queries that hit the integrated cache will have an RU charge of 0. Cache hits will have a much lower per-operation cost than reads from the backend database.
+Point reads and queries that hit the integrated cache have an RU charge of 0. Cache hits have a much lower per-operation cost than reads from the backend database.

-Workloads that fit the following characteristics should evaluate if the integrated cache will help lower costs:
+Workloads that fit the following characteristics should evaluate if the integrated cache helps lower costs:

 - Read-heavy workloads
 - Many repeated point reads on large items
 - Many repeated high RU queries
 - Hot partition key for reads

-The biggest factor in expected savings is the degree to which reads repeat themselves. If your workload consistently executes the same point reads or queries within a short period of time, it's a great candidate for the integrated cache. When using the integrated cache for repeated reads, you only use RUs for the first read. Subsequent reads routed through the same dedicated gateway node (within the `MaxIntegratedCacheStaleness` window and if the data hasn't been evicted) won't use throughput.
+The biggest factor in expected savings is the degree to which reads repeat themselves. If your workload consistently executes the same point reads or queries within a short period of time, it's a great candidate for the integrated cache. When using the integrated cache for repeated reads, you only use RUs for the first read. Subsequent reads routed through the same dedicated gateway node (within the `MaxIntegratedCacheStaleness` window and if the data hasn't been evicted) don't use throughput.

 Some workloads shouldn't consider the integrated cache, including:

@@ -49,65 +49,71 @@ Some workloads shouldn't consider the integrated cache, including:

 ## Item cache

-You can use the item cache for point reads (in other words, key/value look ups based on the Item ID and partition key).
+Item cache is used for point reads (key/value look ups based on the Item ID and partition key).

 ### Populating the item cache

-- New writes, updates, and deletes are automatically populated in the item cache
-- If your app tries to read a specific item that wasn’t previously in the cache (cache miss), the item would now be stored in the item cache
+- New writes, updates, and deletes are automatically populated in the item cache of the node that the request is routed through
+- Items from point read requests where the item isn’t already in the cache (cache miss) of the node the request is routed through are added to the item cache
+- Requests that are part of a [transactional batch](./nosql/transactional-batch.md) or written in [bulk mode](./nosql/how-to-migrate-from-bulk-executor-library.md#enable-bulk-support) don't populate the item cache

 ### Item cache invalidation and eviction

+Because each node has an independent cache, it's possible items are invalidated or evicted in the cache of one node and not the others. Items in the cache of a given node are invalidated and evicted based on the below criteria:
+
 - Item update or delete
 - Least recently used (LRU)
 - Cache retention time (in other words, the `MaxIntegratedCacheStaleness`)
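The populate, invalidate, and evict rules for the item cache can be pictured with a small per-node model. This is a minimal sketch under stated assumptions, not the service's implementation: the `NodeItemCache` class, its methods, and the entry-count capacity are all hypothetical (the real cache is sized by node memory), and a delete is assumed to simply drop the cached entry.

```python
from collections import OrderedDict


class NodeItemCache:
    """Toy model of ONE gateway node's item cache (hypothetical)."""

    def __init__(self, capacity: int):
        self.capacity = capacity        # assumption: capacity counted in entries
        self._entries = OrderedDict()   # key -> (item, cached_at); order tracks recency

    def _store(self, key, item, now):
        self._entries[key] = (item, now)
        self._entries.move_to_end(key)             # mark as most recently used
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)      # evict the least recently used entry

    def write(self, key, item, backend, now):
        """Write-through: update the backend, then refresh this node's entry."""
        backend[key] = item
        self._store(key, item, now)

    def delete(self, key, backend):
        """Assumption: a delete drops the cached entry on this node."""
        backend.pop(key, None)
        self._entries.pop(key, None)

    def read(self, key, backend, max_staleness_s, now):
        """Read-through point read. Returns (item, served_from_cache)."""
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < max_staleness_s:
            self._entries.move_to_end(key)
            return entry[0], True
        item = backend[key]             # miss, or entry older than the staleness limit
        self._store(key, item, now)
        return item, False


# Two nodes in front of the same backend keep independent caches.
backend = {("item1", "pk1"): {"v": 1}}
node_a, node_b = NodeItemCache(capacity=2), NodeItemCache(capacity=2)
key = ("item1", "pk1")
print(node_a.read(key, backend, max_staleness_s=300, now=0))   # miss on node A
print(node_a.read(key, backend, max_staleness_s=300, now=10))  # hit on node A
print(node_b.read(key, backend, max_staleness_s=300, now=20))  # still a miss on node B
```

Keying entries by item ID and partition key mirrors the point-read lookup. An `OrderedDict` is used only because it makes least-recently-used ordering easy to express.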

 ## Query cache

-The query cache can be used to cache queries. The query cache transforms a query into a key/value lookup where the key is the query text and the value is query results. The integrated cache doesn't have a query engine, it only stores the key/value lookup for each query.
+The query cache is used to cache queries. The query cache transforms a query into a key/value lookup where the key is the query text and the value is the query results. The integrated cache doesn't have a query engine, it only stores the key/value lookup for each query. Query results are stored as a set, and the cache doesn't keep track of individual items. A given item can be stored in the query cache multiple times if it appears in the result set of multiple queries. Updates to the underlying items won't be reflected in query results unless the [max integrated cache staleness](#maxintegratedcachestaleness) for the query is reached and the query is served from the backend database.

 ### Populating the query cache

-- If the cache doesn't have a result for that query (cache miss), the query is sent to the backend. After the query is run, the cache will store the results for that query
+- If the cache doesn't have a result for that query (cache miss) on the node it was routed through, the query is sent to the backend. After the query is run, the cache will store the results for that query
+- Queries with the same shape but different parameters or request options that affect the results (ex. max item count) will be stored as their own key/value pair

 ### Query cache eviction

+Query cache eviction is based on the node the request was routed through. It's possible queries could be evicted or refreshed on one node and not the others.
+
 - Least recently used (LRU)
 - Cache retention time (in other words, the `MaxIntegratedCacheStaleness`)
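To make the key/value view of the query cache concrete, here is a minimal sketch of how a lookup key might be derived. How the service actually builds its keys isn't described here, so the `query_cache_key` helper and the JSON encoding are assumptions made purely for illustration.

```python
import json


def query_cache_key(query_text: str, parameters: dict, max_item_count=None) -> str:
    """Combine everything that can change the result set into one key
    (hypothetical encoding, not the service's actual key format)."""
    return json.dumps(
        {"q": query_text, "params": parameters, "max_item_count": max_item_count},
        sort_keys=True,
    )


query_cache = {}  # key -> whole result set; individual items aren't tracked

text = "SELECT * FROM c WHERE c.city = @city"
k1 = query_cache_key(text, {"@city": "Seattle"})
k2 = query_cache_key(text, {"@city": "Paris"})                      # same shape, other parameter
k3 = query_cache_key(text, {"@city": "Seattle"}, max_item_count=10)  # other request option

query_cache[k1] = [{"id": "1", "city": "Seattle"}]
print(k1 in query_cache, k2 in query_cache, k3 in query_cache)  # True False False
```

Because the value is the stored result set, a later update to item `"1"` wouldn't change the cached entry for `k1` in this model; the entry is only replaced when the query runs against the backend again.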

 ### Working with the query cache

 You don't need special code when working with the query cache, even if your queries have multiple pages of results. The best practices and code for query pagination are the same whether your query hits the integrated cache or is executed on the backend query engine.

-The query cache will automatically cache query continuation tokens where applicable. If you have a query with multiple pages of results, any pages that are stored in the integrated cache will have an RU charge of 0. If your subsequent pages of query results require backend execution, they'll have a continuation token from the previous page so they can avoid duplicating previous work.
+The query cache automatically caches query continuation tokens where applicable. If you have a query with multiple pages of results, any pages that are stored in the integrated cache have an RU charge of 0. If subsequent pages of query results require backend execution, they'll have a continuation token from the previous page so they can avoid duplicating previous work.

-> [!NOTE]
+> [!IMPORTANT]
 > Integrated cache instances within different dedicated gateway nodes have independent caches from one another. If data is cached within one node, it is not necessarily cached in the others. Multiple pages of the same query are not guaranteed to be routed to the same dedicated gateway node.

 ## Integrated cache consistency

-The integrated cache supports read requests with session and eventual [consistency](consistency-levels.md) only. If a read has consistent prefix, bounded staleness, or strong consistency, it will always bypass the integrated cache and be served from the backend.
+The integrated cache supports read requests with session and eventual [consistency](consistency-levels.md) only. If a read has consistent prefix, bounded staleness, or strong consistency, it bypasses the integrated cache and is served from the backend.

 The easiest way to configure either session or eventual consistency for all reads is to [set it at the account-level](consistency-levels.md#configure-the-default-consistency-level). However, if you would only like some of your reads to have a specific consistency, you can also configure consistency at the [request-level](how-to-manage-consistency.md#override-the-default-consistency-level).
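The eligibility rule in this section reduces to a simple membership check. The sketch below is illustrative only; the function name `is_cache_eligible` and the lowercase string labels for consistency levels are hypothetical and don't correspond to SDK types.

```python
# Only reads with these consistency levels can be served by the integrated cache.
CACHEABLE_CONSISTENCY = {"session", "eventual"}


def is_cache_eligible(read_consistency: str) -> bool:
    """Return True if a read with this consistency may use the integrated cache."""
    return read_consistency.strip().lower() in CACHEABLE_CONSISTENCY


for level in ["strong", "bounded staleness", "session", "consistent prefix", "eventual"]:
    route = "integrated cache" if is_cache_eligible(level) else "backend (bypasses cache)"
    print(f"{level}: {route}")
```

Eligibility is decided per read, which is why overriding consistency on a single request is enough to make that request cacheable.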

 > [!NOTE]
-> Write requests with other consistencies will still populate the cache, but in order to read from the cache the request must have either session or eventual consistency.
+> Write requests with other consistencies still populate the cache, but in order to read from the cache the request must have either session or eventual consistency.

 ### Session consistency

-[Session consistency](consistency-levels.md#session-consistency) is the most widely used consistency level for both single region as well as globally distributed Azure Cosmos DB accounts. When using session consistency, single client sessions can read their own writes. When using the integrated cache, clients outside of the session performing writes will see eventual consistency.
+[Session consistency](consistency-levels.md#session-consistency) is the most widely used consistency level for both single region and globally distributed Azure Cosmos DB accounts. With session consistency, single client sessions can read their own writes. Clients outside of the session performing writes will see eventual consistency when they are using the integrated cache.

 ## MaxIntegratedCacheStaleness

 The `MaxIntegratedCacheStaleness` is the maximum acceptable staleness for cached point reads and queries, regardless of the selected consistency. The `MaxIntegratedCacheStaleness` is configurable at the request-level. For example, if you set a `MaxIntegratedCacheStaleness` of 2 hours, your request will only return cached data if the data is less than 2 hours old. To increase the likelihood of repeated reads utilizing the integrated cache, you should set the `MaxIntegratedCacheStaleness` as high as your business requirements allow.

-It's important to understand that the `MaxIntegratedCacheStaleness`, when configured on a request that ends up populating the cache, doesn't impact how long that request will be cached. `MaxIntegratedCacheStaleness` enforces consistency when you try to use cached data. There's no global TTL or cache retention setting, so data will only be evicted from the cache if either the integrated cache is full or a new read is run with a lower `MaxIntegratedCacheStaleness` than the age of the current cached entry.
+It's important to understand that the `MaxIntegratedCacheStaleness`, when configured on a request that ends up populating the cache, doesn't affect how long that request is cached. `MaxIntegratedCacheStaleness` enforces consistency when you try to use cached data. There's no global TTL or cache retention setting, so data is only evicted from the cache if either the integrated cache is full or a new read is run with a lower `MaxIntegratedCacheStaleness` than the age of the current cached entry.

-This is an improvement from how most caches work and allows the following additional customization:
+This is an improvement from how most caches work and allows for the following other customizations:

 - You can set different staleness requirements for each point read or query
 - Different clients, even if they run the same point read or query, can configure different `MaxIntegratedCacheStaleness` values
-- If you wanted to modify read consistency for cached data, changing `MaxIntegratedCacheStaleness` will have an immediate effect on read consistency
+- If you wanted to modify read consistency for cached data, changing `MaxIntegratedCacheStaleness` has an immediate effect on read consistency

 > [!NOTE]
 > The minimum `MaxIntegratedCacheStaleness` value is 0 and the maximum value is 10 years. When not explicitly configured, the `MaxIntegratedCacheStaleness` defaults to 5 minutes.
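The request-level nature of `MaxIntegratedCacheStaleness` can be sketched as a check performed when a read arrives, using the limit supplied by that read. The helper `can_serve_from_cache` and the use of seconds as the unit are assumptions for illustration; only the 5-minute default comes from the note in this section.

```python
DEFAULT_MAX_STALENESS_S = 5 * 60  # default when not explicitly configured: 5 minutes


def can_serve_from_cache(cached_at: float, now: float,
                         max_staleness_s: float = DEFAULT_MAX_STALENESS_S) -> bool:
    """The limit belongs to the READING request, not to the request that
    populated the cache, so each reader can apply its own value."""
    return (now - cached_at) < max_staleness_s


cached_at = 0.0    # entry populated at t = 0
now = 90 * 60.0    # a read arrives 90 minutes later

print(can_serve_from_cache(cached_at, now, max_staleness_s=2 * 3600))  # True: under 2 hours old
print(can_serve_from_cache(cached_at, now, max_staleness_s=30 * 60))   # False: older than 30 minutes
print(can_serve_from_cache(cached_at, now))                            # False: default of 5 minutes
```

In this model the same cached entry is acceptable to one client and too old for another, which is the per-request customization the list in this section describes. A limit of 0 never matches, so such a read always goes to the backend.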
@@ -142,7 +148,7 @@ It's helpful to monitor some key metrics for the integrated cache. These metrics
 - `IntegratedCacheItemHitRate` – The proportion of point reads that used the integrated cache (out of all point reads routed through the dedicated gateway with session or eventual consistency). This value is an average of integrated cache instances across all dedicated gateway nodes.
 - `IntegratedCacheQueryHitRate` – The proportion of queries that used the integrated cache (out of all queries routed through the dedicated gateway with session or eventual consistency). This value is an average of integrated cache instances across all dedicated gateway nodes.

-All existing metrics are available, by default, from the **Metrics** blade (not Metrics classic):
+All existing metrics are available, by default, from **Metrics** in the Azure portal (not Metrics classic):

 :::image type="content" source="./media/integrated-cache/integrated-cache-metrics.png" alt-text="Screenshot of the Azure portal that shows the location of integrated cache metrics." border="false":::