articles/cosmos-db/how-to-configure-integrated-cache.md
+7 -4 (7 additions, 4 deletions)
@@ -26,19 +26,19 @@ This article describes how to provision a dedicated gateway, configure the integ
26
26
27
27
1. Navigate to an Azure Cosmos DB account in the Azure portal and select the **Dedicated Gateway** tab.
28
28
29
-
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-tab.png" alt-text="Screenshot of the Azure Portal that shows how to navigate to the Azure Cosmos DB dedicated gateway tab." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-tab.png" :::
29
+
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-tab.png" alt-text="Screenshot of the Azure portal that shows how to navigate to the Azure Cosmos DB dedicated gateway tab." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-tab.png" :::
30
30
31
31
2. Fill out the **Dedicated gateway** form with the following details:
32
32
33
33
* **Dedicated Gateway** - Turn on the toggle to **Provisioned**.
34
34
* **SKU** - Select a SKU with the required compute and memory size. The integrated cache will use approximately 50% of the memory, and the remaining memory is used for metadata and routing requests to the backend partitions.
35
35
* **Number of instances** - Number of nodes. For development purposes, we recommend starting with one node of the D4 size. Based on the amount of data you need to cache and to achieve high availability, you can increase the node size after initial testing.
36
36
37
-
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-input.png" alt-text="Screenshot of the Azure Portal dedicated gateway tab that shows sample input settings for creating a dedicated gateway cluster." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-input.png" :::
37
+
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-input.png" alt-text="Screenshot of the Azure portal dedicated gateway tab that shows sample input settings for creating a dedicated gateway cluster." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-input.png" :::
38
38
39
39
3. Select **Save** and wait about 5-10 minutes for the dedicated gateway provisioning to complete. When the provisioning is done, you'll see the following notification:
40
40
41
-
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-notification.png" alt-text="Screenshot of a notification in the Azure Portal that shows how to check if dedicated gateway provisioning is complete." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-notification.png" :::
41
+
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-notification.png" alt-text="Screenshot of a notification in the Azure portal that shows how to check if dedicated gateway provisioning is complete." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-notification.png" :::
42
42
43
43
## Configuring the integrated cache
44
44
@@ -48,7 +48,7 @@ When you create a dedicated gateway, an integrated cache is automatically provis
48
48
49
49
The updated dedicated gateway connection string is in the **Keys** blade:
50
50
51
-
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-connection-string.png" alt-text="Screenshot of the Azure Portal keys tab with the dedicated gateway connection string." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-connection-string.png" :::
51
+
:::image type="content" source="./media/how-to-configure-integrated-cache/dedicated-gateway-connection-string.png" alt-text="Screenshot of the Azure portal keys tab with the dedicated gateway connection string." lightbox="./media/how-to-configure-integrated-cache/dedicated-gateway-connection-string.png" :::
52
52
53
53
All dedicated gateway connection strings follow the same pattern. Remove `documents.azure.com` from your original connection string and replace it with `sqlx.cosmos.azure.com`. A dedicated gateway will always have the same connection string, even if you remove and reprovision it.
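The substitution described above can be sketched as a small helper. This is only an illustration of the string replacement: the function name `to_dedicated_gateway` and the account name `myaccount` are hypothetical, and the connection string shown is a placeholder rather than a real one.

```python
STANDARD_HOST_SUFFIX = "documents.azure.com"
DEDICATED_HOST_SUFFIX = "sqlx.cosmos.azure.com"


def to_dedicated_gateway(connection_string: str) -> str:
    """Return the dedicated gateway form of a standard connection string.

    Hypothetical helper: it only swaps the host suffix, as the article describes.
    """
    if STANDARD_HOST_SUFFIX not in connection_string:
        raise ValueError("expected a connection string containing documents.azure.com")
    # Replace only the first occurrence, which is the account endpoint host.
    return connection_string.replace(STANDARD_HOST_SUFFIX, DEDICATED_HOST_SUFFIX, 1)


# Placeholder values, not a real account or key.
original = "AccountEndpoint=https://myaccount.documents.azure.com:443/;AccountKey=<key>;"
dedicated = to_dedicated_gateway(original)
print(dedicated)
```

In practice, copying the dedicated gateway connection string from the **Keys** blade avoids any string manipulation.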
54
54
@@ -70,6 +70,9 @@ You must ensure the request consistency is session or eventual. If not, the requ
70
70
71
71
Configure `MaxIntegratedCacheStaleness`, which is the maximum staleness you are willing to tolerate for cached data. Set `MaxIntegratedCacheStaleness` as high as your application can tolerate, because a higher value increases the likelihood that repeated point reads and queries are cache hits. If you set `MaxIntegratedCacheStaleness` to 0, your read request will **never** use the integrated cache, regardless of the consistency level. When not configured, the default `MaxIntegratedCacheStaleness` is 5 minutes.
72
72
73
+
>[!NOTE]
74
+
> The `MaxIntegratedCacheStaleness` can be set as high as 10 years. In practice, this value is an upper bound on staleness, and the cache may be reset sooner because of node restarts.
75
+
73
76
Adjusting the `MaxIntegratedCacheStaleness` is supported in these versions of each SDK:
articles/cosmos-db/integrated-cache.md
+14 -14 (14 additions, 14 deletions)
@@ -21,15 +21,15 @@ An integrated cache is automatically configured within the dedicated gateway. Th
21
21
* An item cache for point reads
22
22
* A query cache for queries
23
23
24
-
The integrated cache is a read-through, write-through cache with a Least Recently Used (LRU) eviction policy. The item cache and query cache share the same capacity within the integrated cache and the LRU eviction policy applies to both. In other words, data is evicted from the cache strictly based on when it was least recently used, regardless of whether it is a point read or query.
24
+
The integrated cache is a read-through, write-through cache with a Least Recently Used (LRU) eviction policy. The item cache and query cache share the same capacity within the integrated cache and the LRU eviction policy applies to both. In other words, data is evicted from the cache strictly based on when it was least recently used, regardless of whether it's a point read or query.
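To make the shared LRU behavior concrete, here is a minimal sketch in Python. It is not the service's implementation: the class name `SharedLruCache` is hypothetical, and capacity is counted in entries for simplicity, whereas the real cache is bounded by memory.

```python
from collections import OrderedDict


class SharedLruCache:
    """Toy LRU cache in which item entries and query entries share one capacity."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # assumption: capacity counted in entries
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None  # cache miss
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            # Evict the least recently used entry, whether it is an item or a query.
            self._entries.popitem(last=False)


cache = SharedLruCache(capacity=2)
cache.put(("item", "id-1"), {"id": "id-1"})
cache.put(("query", "SELECT * FROM c"), [{"id": "id-1"}])
cache.get(("item", "id-1"))                  # the item is now most recently used
cache.put(("item", "id-2"), {"id": "id-2"})  # evicts the query result
```

Because both kinds of entries live in one ordered structure, a burst of point reads can push out cached query results, and the reverse.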
25
25
26
26
> [!NOTE]
27
27
> Do you have any feedback about the integrated cache? We want to hear it! Feel free to share feedback directly with the Azure Cosmos DB engineering team:
## Workloads that benefit from the integrated cache
31
31
32
-
The main goal of the integrated cache is to reduce costs for read-heavy workloads. Low latency, while helpful, is not the main benefit of the integrated cache because Azure Cosmos DB is already fast without caching.
32
+
The main goal of the integrated cache is to reduce costs for read-heavy workloads. Low latency, while helpful, isn't the main benefit of the integrated cache because Azure Cosmos DB is already fast without caching.
33
33
34
34
Point reads and queries that hit the integrated cache will have an RU charge of 0. Cache hits will have a much lower per-operation cost than reads from the backend database.
35
35
@@ -40,9 +40,9 @@ Workloads that fit the following characteristics should evaluate if the integrat
40
40
- Many repeated high RU queries
41
41
- Hot partition key for reads
42
42
43
-
The biggest factor in expected savings is the degree to which reads repeat themselves. If your workload consistently executes the same point reads or queries within a short period of time, it is a great candidate for the integrated cache. When using the integrated cache for repeated reads, you only use RU's for the first read. Subsequent reads routed through the same dedicated gateway node (within the `MaxIntegratedCacheStaleness` window and if the data hasn't been evicted) won't use throughput.
43
+
The biggest factor in expected savings is the degree to which reads repeat themselves. If your workload consistently executes the same point reads or queries within a short period of time, it's a great candidate for the integrated cache. When using the integrated cache for repeated reads, you only use RUs for the first read. Subsequent reads routed through the same dedicated gateway node (within the `MaxIntegratedCacheStaleness` window and if the data hasn't been evicted) won't use throughput.
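As a rough illustration of why repetition drives savings, the sketch below counts the RUs consumed when only cache misses reach the backend. The function name and the sample numbers are hypothetical, and the calculation ignores the cost of the dedicated gateway itself.

```python
def backend_ru_consumed(total_reads: int, cache_hits: int, ru_per_read: float) -> float:
    """RUs charged when cache hits cost 0 RUs and every miss costs ru_per_read."""
    if not 0 <= cache_hits <= total_reads:
        raise ValueError("cache_hits must be between 0 and total_reads")
    return (total_reads - cache_hits) * ru_per_read


# Hypothetical workload: the same 1 RU point read repeated 1,000 times
# within the staleness window, with no eviction in between.
without_cache = backend_ru_consumed(total_reads=1000, cache_hits=0, ru_per_read=1.0)
with_cache = backend_ru_consumed(total_reads=1000, cache_hits=999, ru_per_read=1.0)
print(without_cache, with_cache)
```

Only the first read is charged in the cached case, which matches the behavior described above for reads routed through the same dedicated gateway node.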
44
44
45
-
Some workloads should not consider the integrated cache, including:
45
+
Some workloads shouldn't consider the integrated cache, including:
46
46
47
47
- Write-heavy workloads
48
48
- Rarely repeated point reads or queries
@@ -68,7 +68,7 @@ The query cache can be used to cache queries. The query cache transforms a query
68
68
69
69
### Populating the query cache
70
70
71
-
- If the cache does not have a result for that query (cache miss), the query is sent to the backend. After the query is run, the cache will store the results for that query
71
+
- If the cache doesn't have a result for that query (cache miss), the query is sent to the backend. After the query is run, the cache will store the results for that query
72
72
73
73
### Query cache eviction
74
74
@@ -107,10 +107,10 @@ This is an improvement from how most caches work and allows the following additi
107
107
108
108
- You can set different staleness requirements for each point read or query
109
109
- Different clients, even if they run the same point read or query, can configure different `MaxIntegratedCacheStaleness` values
110
-
- If you wanted to modify read consistency when using cached data, changing `MaxIntegratedCacheStaleness` will have an immediate effect on read consistency
110
+
- If you want to modify read consistency for cached data, changing `MaxIntegratedCacheStaleness` will have an immediate effect on read consistency
111
111
112
112
> [!NOTE]
113
-
> When not explicitly configured, the MaxIntegratedCacheStaleness defaults to 5 minutes.
113
+
> The minimum `MaxIntegratedCacheStaleness` value is 0 and the maximum value is 10 years. When not explicitly configured, the `MaxIntegratedCacheStaleness` defaults to 5 minutes.
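The per-request nature of `MaxIntegratedCacheStaleness` can be modeled with a short sketch. This is an assumption-laden illustration, not the service's logic: `CachedEntry` and `read_from_cache` are hypothetical names, and treating an entry whose age exactly equals the limit as still usable is a choice made here, not documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Optional

DEFAULT_STALENESS = timedelta(minutes=5)  # default when not explicitly configured


@dataclass
class CachedEntry:
    value: Any
    cached_at: datetime


def read_from_cache(
    entry: Optional[CachedEntry],
    now: datetime,
    max_staleness: timedelta = DEFAULT_STALENESS,
) -> Optional[Any]:
    """Return the cached value if this request tolerates its age, else None."""
    if entry is None or max_staleness <= timedelta(0):
        return None  # a staleness of 0 never uses the cache
    if now - entry.cached_at > max_staleness:
        return None  # too old for this particular request
    return entry.value


cached_at = datetime(2024, 1, 1, 12, 0, 0)
entry = CachedEntry(value={"id": "id-1"}, cached_at=cached_at)
now = cached_at + timedelta(minutes=2)

strict_client = read_from_cache(entry, now, max_staleness=timedelta(seconds=30))
default_client = read_from_cache(entry, now)
```

Here two clients issue the same read against the same two-minute-old entry: the one tolerating 30 seconds gets a miss, while the one using the 5-minute default gets a hit.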
114
114
115
115
To better understand the `MaxIntegratedCacheStaleness` parameter, consider the following example:
116
116
@@ -128,23 +128,23 @@ To better understand the `MaxIntegratedCacheStaleness` parameter, consider the f
128
128
129
129
## Metrics
130
130
131
-
When using the integrated cache, it is helpful to monitor some key metrics. The integrated cache metrics include:
131
+
It's helpful to monitor some key metrics for the integrated cache. These metrics include:
132
132
133
133
- `DedicatedGatewayCPUUsage` - CPU usage with Avg, Max, or Min Aggregation types for data across all dedicated gateway nodes.
134
134
- `DedicatedGatewayAverageCPUUsage` - (Deprecated) Average CPU usage across all dedicated gateway nodes.
135
135
- `DedicatedGatewayMaximumCPUUsage` - (Deprecated) Maximum CPU usage across all dedicated gateway nodes.
136
136
- `DedicatedGatewayMemoryUsage` - Memory usage with Avg, Max, or Min Aggregation types for data across all dedicated gateway nodes.
137
137
- `DedicatedGatewayAverageMemoryUsage` - (Deprecated) Average memory usage across all dedicated gateway nodes.
138
138
- `DedicatedGatewayRequests` - Total number of dedicated gateway requests across all dedicated gateway nodes.
139
-
- `IntegratedCacheEvictedEntriesSize` – The average amount of data evicted from the integrated cache due to LRU across all dedicated gateway nodes. This value does not include data that expired due to exceeding the `MaxIntegratedCacheStaleness` time.
139
+
- `IntegratedCacheEvictedEntriesSize` – The average amount of data evicted from the integrated cache due to LRU across all dedicated gateway nodes. This value doesn't include data that expired due to exceeding the `MaxIntegratedCacheStaleness` time.
140
140
- `IntegratedCacheItemExpirationCount` - The average number of items that are evicted from the integrated cache due to cached point reads exceeding the `MaxIntegratedCacheStaleness` time across all dedicated gateway nodes.
141
141
- `IntegratedCacheQueryExpirationCount` - The average number of queries that are evicted from the integrated cache due to cached queries exceeding the `MaxIntegratedCacheStaleness` time across all dedicated gateway nodes.
142
142
- `IntegratedCacheItemHitRate` – The proportion of point reads that used the integrated cache (out of all point reads routed through the dedicated gateway with session or eventual consistency). This value is an average of integrated cache instances across all dedicated gateway nodes.
143
143
- `IntegratedCacheQueryHitRate` – The proportion of queries that used the integrated cache (out of all queries routed through the dedicated gateway with session or eventual consistency). This value is an average of integrated cache instances across all dedicated gateway nodes.
144
144
145
145
All existing metrics are available, by default, from the **Metrics** blade (not Metrics classic):
146
146
147
-
:::image type="content" source="./media/integrated-cache/integrated-cache-metrics.png" alt-text="Screenshot of the Azure Portal that shows the location of integrated cache metrics." border="false":::
147
+
:::image type="content" source="./media/integrated-cache/integrated-cache-metrics.png" alt-text="Screenshot of the Azure portal that shows the location of integrated cache metrics." border="false":::
148
148
149
149
Metrics are either an average, maximum, or sum across all dedicated gateway nodes. For example, if you provision a dedicated gateway cluster with five nodes, the metrics reflect the aggregated value across all five nodes. It isn't possible to determine the metric values for each individual node.
150
150
@@ -154,21 +154,21 @@ The below examples show how to debug some common scenarios:
154
154
155
155
### I can’t tell if my application is using the dedicated gateway
156
156
157
-
Check the `DedicatedGatewayRequests`. This metric includes all requests that use the dedicated gateway, regardless of whether they hit the integrated cache. If your application uses the standard gateway or direct mode with your original connection string, you won't see an error message but the `DedicatedGatewayRequests` will be zero.
157
+
Check the `DedicatedGatewayRequests`. This metric includes all requests that use the dedicated gateway, regardless of whether they hit the integrated cache. If your application uses the standard gateway or direct mode with your original connection string, you won't see an error message, but the `DedicatedGatewayRequests` will be zero.
158
158
159
159
### I can’t tell if my requests are hitting the integrated cache
160
160
161
-
Check the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate`. If both of these values are zero, then requests are not hitting the integrated cache. Check that you are using the dedicated gateway connection string, [connecting with gateway mode](nosql/sdk-connection-modes.md), and [have set session or eventual consistency](consistency-levels.md#configure-the-default-consistency-level).
161
+
Check the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate`. If both of these values are zero, then requests aren't hitting the integrated cache. Check that you're using the dedicated gateway connection string, [connecting with gateway mode](nosql/sdk-connection-modes.md), and [have set session or eventual consistency](consistency-levels.md#configure-the-default-consistency-level).
162
162
163
163
### I want to understand if my dedicated gateway is too small
164
164
165
-
Check the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate`. If these values are high (for example, above 0.7-0.8), this is a good sign that the dedicated gateway is large enough.
165
+
Check the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate`. High values (for example, above 0.7-0.8) are a good sign that the dedicated gateway is large enough.
166
166
167
167
If the `IntegratedCacheItemHitRate` or `IntegratedCacheQueryHitRate` is low, look at the `IntegratedCacheEvictedEntriesSize`. If the `IntegratedCacheEvictedEntriesSize` is high, it may mean that a larger dedicated gateway size would be beneficial. You can experiment by increasing the dedicated gateway size and comparing the new `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate`. If a larger dedicated gateway doesn't improve the `IntegratedCacheItemHitRate` or `IntegratedCacheQueryHitRate`, it's possible that reads simply don't repeat themselves enough for the integrated cache to be impactful.
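The check described above could be expressed as a simple rule of thumb. This is a sketch only: the function name is hypothetical, the 0.7 hit rate cutoff borrows the lower end of the range mentioned earlier, and the article gives no number for a "high" eviction size, so the caller must supply one that suits the workload.

```python
def gateway_may_be_too_small(
    item_hit_rate: float,
    query_hit_rate: float,
    evicted_entries_size: float,
    *,
    low_hit_rate: float = 0.7,   # assumption: below this counts as "low"
    high_eviction_size: float,   # assumption: caller-chosen, workload-specific
) -> bool:
    """Flag a possibly undersized gateway: low hit rate plus heavy LRU eviction."""
    low_hits = item_hit_rate < low_hit_rate or query_hit_rate < low_hit_rate
    heavy_eviction = evicted_entries_size >= high_eviction_size
    return low_hits and heavy_eviction


# Hypothetical metric readings taken from the Metrics blade.
undersized = gateway_may_be_too_small(0.35, 0.80, 5_000_000, high_eviction_size=1_000_000)
healthy = gateway_may_be_too_small(0.85, 0.90, 5_000_000, high_eviction_size=1_000_000)
```

A `True` result is only a prompt to experiment with a larger size and compare hit rates, since low hit rates can also mean that reads rarely repeat.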
168
168
169
169
### I want to understand if my dedicated gateway is too large
170
170
171
-
It is more difficult to measure if a dedicated gateway is too large than it is to measure if a dedicated gateway is too small. In general, you should start small and slowly increase the dedicated gateway size until the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate` stop improving. In some cases, only one of the two cache hit metrics will be important, not both. For example, if your workload is primarily queries, rather than point reads, the `IntegratedCacheQueryHitRate` is much more important than the `IntegratedCacheItemHitRate`.
171
+
It's more difficult to measure if a dedicated gateway is too large than it is to measure if a dedicated gateway is too small. In general, you should start small and slowly increase the dedicated gateway size until the `IntegratedCacheItemHitRate` and `IntegratedCacheQueryHitRate` stop improving. In some cases, only one of the two cache hit metrics will be important, not both. For example, if your workload is primarily queries, rather than point reads, the `IntegratedCacheQueryHitRate` is much more important than the `IntegratedCacheItemHitRate`.
172
172
173
173
If most data is evicted from the cache due to exceeding the `MaxIntegratedCacheStaleness`, rather than LRU, your cache might be larger than required. If `IntegratedCacheItemExpirationCount` and `IntegratedCacheQueryExpirationCount` combined are nearly as large as `IntegratedCacheEvictedEntriesSize`, you can experiment with a smaller dedicated gateway size and compare performance.
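The comparison above can also be sketched as a heuristic. The function name and the 0.9 ratio standing in for "nearly as large" are assumptions, and the metrics use different units (counts versus size), so treat the result as a hint rather than a measurement.

```python
def gateway_may_be_too_large(
    item_expiration_count: float,
    query_expiration_count: float,
    evicted_entries_size: float,
    *,
    near_ratio: float = 0.9,  # assumption: what "nearly as large" means here
) -> bool:
    """Flag a possibly oversized gateway: expirations rival LRU evictions."""
    expirations = item_expiration_count + query_expiration_count
    # No expirations at all gives no evidence that the cache is oversized.
    return expirations > 0 and expirations >= near_ratio * evicted_entries_size


# Hypothetical metric readings.
mostly_expired = gateway_may_be_too_large(45, 50, 100)
mostly_lru = gateway_may_be_too_large(5, 5, 100)
```

If the check returns `True`, trying a smaller dedicated gateway size and comparing the hit rate metrics is the experiment the article suggests.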