Commit 28a0806

kathayl, daisyfaithauma, and hyperlint-ai[bot] authored
Update caching.mdx (#18004)
* Update limits.mdx
* Edited footnotes
* Update caching.mdx caching updates
* Update caching.mdx
* Update src/content/docs/ai-gateway/reference/limits.mdx

Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>

* merge conflict
* merge conflict
* merge conflict

---------

Co-authored-by: daisyfaithauma <[email protected]>
Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
1 parent 1ef14ed commit 28a0806

2 files changed: +33 -19 lines changed

src/content/docs/ai-gateway/configuration/caching.mdx

Lines changed: 23 additions & 9 deletions
Original file line number, diff line number, diff line change
@@ -9,7 +9,7 @@ description: Override caching settings on a per-request basis.
99

1010
import { TabItem, Tabs } from "~/components";
1111

12-
Enable and customize your gateway cache to serve requests directly from Cloudflares cache, instead of the original model provider, for faster requests and cost savings.
12+
Enable and customize your gateway cache to serve requests directly from Cloudflare's cache, instead of the original model provider, for faster requests and cost savings.
1313

1414
:::note
1515

@@ -51,19 +51,29 @@ To check whether a response comes from cache or not, **cf-aig-cache-status** wil
5151

5252
In order to override the default cache behavior defined on the settings tab, you can, on a per-request basis, set headers for the following options:
5353

54-
### Skip cache (cf-skip-cache)
54+
:::note
55+
56+
The following headers have been updated to new names, though the old headers will still function. We recommend updating to the new headers to ensure future compatibility:
57+
58+
`cf-cache-ttl` is now `cf-aig-cache-ttl`
59+
60+
`cf-skip-cache` is now `cf-aig-skip-cache`
61+
62+
:::
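
For scripts that still send the legacy names, the rename described in the note above could be applied mechanically. The following is a minimal sketch and not part of the commit: the function name `aig_header_name` is hypothetical, and it matches lowercase header names only.

```bash
# Map a legacy AI Gateway cache header name to its current name.
# Only the two renames listed in the note are handled; any other
# name is passed through unchanged. Matching is case-sensitive.
aig_header_name() {
  case "$1" in
    cf-cache-ttl)  printf '%s\n' 'cf-aig-cache-ttl' ;;
    cf-skip-cache) printf '%s\n' 'cf-aig-skip-cache' ;;
    *)             printf '%s\n' "$1" ;;
  esac
}
```

For instance, `aig_header_name cf-skip-cache` prints `cf-aig-skip-cache`, while a header such as `Content-Type` is returned as given.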
63+
64+
### Skip cache (cf-aig-skip-cache)
5565

5666
Skip cache refers to bypassing the cache and fetching the request directly from the original provider, without utilizing any cached copy.
5767

58-
You can use the header **cf-skip-cache** to bypass the cached version of the request.
68+
You can use the header **cf-aig-skip-cache** to bypass the cached version of the request.
5969

6070
As an example, when submitting a request to OpenAI, include the header in the following manner:
6171

6272
```bash title="Request skipping the cache"
6373
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
6474
--header 'Authorization: Bearer $TOKEN' \
6575
--header 'Content-Type: application/json' \
66-
--header 'cf-skip-cache: true' \
76+
--header 'cf-aig-skip-cache: true' \
6777
--data ' {
6878
"model": "gpt-4o-mini",
6979
"messages": [
@@ -76,19 +86,19 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/
7686
'
7787
```
7888

79-
### Cache TTL (cf-cache-ttl)
89+
### Cache TTL (cf-aig-cache-ttl)
8090

81-
Cache TTL, or Time To Live, is the duration a cached request remains valid before it expires and requires refreshing from the original source. You can use **cf-cache-ttl** to set the desired caching duration in milliseconds.
91+
Cache TTL, or Time To Live, is the duration a cached request remains valid before it expires and is refreshed from the original source. You can use **cf-aig-cache-ttl** to set the desired caching duration in seconds. The minimum TTL is 60 seconds and the maximum TTL is one month.
8292

83-
For example, if you set a TTL of one hour, it means that a request is kept in the cache for an hour. Within that hour, an identical request will be served from the cache instead of the original API. After an hour, the cache expires and the request will go to the original API for a more recent response, and that response will repopulate the cache for the next hour.
93+
For example, if you set a TTL of one hour, it means that a request is kept in the cache for an hour. Within that hour, an identical request will be served from the cache instead of the original API. After an hour, the cache expires and the request will go to the original API for a fresh response, and that response will repopulate the cache for the next hour.
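
The updated text gives a minimum TTL of 60 seconds and a maximum of one month. A caller could keep a requested value inside that range before sending it, as in the sketch below. Several things here are assumptions rather than documented behavior: the function name `aig_clamp_ttl` is hypothetical, "one month" is approximated as 30 days because the source does not give an exact figure, and clamping on the client is a design choice, since the diff does not say how the gateway treats out-of-range values.

```bash
# Bounds in seconds. The minimum comes from the updated docs text.
# ASSUMPTION: "one month" is treated as 30 days (2592000 seconds).
AIG_TTL_MIN=60
AIG_TTL_MAX=$((30 * 24 * 60 * 60))

# Print the requested TTL, clamped to [AIG_TTL_MIN, AIG_TTL_MAX].
# Expects a non-negative integer number of seconds as its argument.
aig_clamp_ttl() {
  aig_ttl=$1
  if [ "$aig_ttl" -lt "$AIG_TTL_MIN" ]; then aig_ttl=$AIG_TTL_MIN; fi
  if [ "$aig_ttl" -gt "$AIG_TTL_MAX" ]; then aig_ttl=$AIG_TTL_MAX; fi
  printf '%s\n' "$aig_ttl"
}
```

The result can then be interpolated into the request, for example `--header "cf-aig-cache-ttl: $(aig_clamp_ttl 3600)"`. Double quotes are needed there so that the shell performs the substitution.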
8494

8595
As an example, when submitting a request to OpenAI, include the header in the following manner:
8696

8797
```bash title="Request to be cached for an hour"
8898
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
8999
--header 'Authorization: Bearer $TOKEN' \
90100
--header 'Content-Type: application/json' \
91-
--header 'cf-cache-ttl: 3600000' \
101+
--header 'cf-aig-cache-ttl: 3600' \
92102
--data ' {
93103
"model": "gpt-4o-mini",
94104
"messages": [
@@ -105,7 +115,7 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/
105115

106116
Custom cache keys let you override the default cache key in order to precisely set the cacheability setting for any resource. To override the default cache key, you can use the header **cf-aig-cache-key**.
107117

108-
When you use the **cf-aig-cache-key** header for the first time, you will receive a response from the provider. Subsequent requests with the same header will return the cached response. If the **cf-cache-ttl** header is used, responses will be cached according to the specified Cache Time To Live. Otherwise, responses will be cached according to the cache settings in the dashboard. If caching is not enabled for the gateway, responses will be cached for 5 minutes by default.
118+
When you use the **cf-aig-cache-key** header for the first time, you will receive a response from the provider. Subsequent requests with the same header will return the cached response. If the **cf-aig-cache-ttl** header is used, responses will be cached according to the specified Cache Time To Live. Otherwise, responses will be cached according to the cache settings in the dashboard. If caching is not enabled for the gateway, responses will be cached for 5 minutes by default.
109119

110120
As an example, when submitting a request to OpenAI, include the header in the following manner:
111121

@@ -125,3 +135,7 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/
125135
}
126136
'
127137
```
138+
139+
:::caution[AI Gateway caching behavior]
140+
Cache in AI Gateway is volatile. If two identical requests are sent simultaneously, the first request may not cache in time for the second request to use it, which may result in the second request retrieving data from the original source.
141+
:::
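
Because of the volatility described in this caution, a caller may want to confirm whether a particular response was actually served from cache. The hunk context earlier in this diff mentions the `cf-aig-cache-status` response header for that purpose. The sketch below extracts its value from a raw header dump; the function name `aig_cache_status` is hypothetical, and the excerpt shown here does not list the values the header can take.

```bash
# Read raw HTTP response headers on stdin and print the value of the
# cf-aig-cache-status header, or nothing if the header is absent.
# Header names are case-insensitive, so the name is compared in lowercase;
# carriage returns from HTTP/1.1 line endings are removed first.
aig_cache_status() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "cf-aig-cache-status" { print $2; exit }'
}
```

One way to use it is to have curl write the response headers to standard output and discard the body: `curl -s -o /dev/null -D - <request options> | aig_cache_status`.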

src/content/docs/ai-gateway/reference/limits.mdx

Lines changed: 10 additions & 10 deletions
@@ -9,16 +9,16 @@ import { Render } from "~/components";
99

1010
The following limits apply to gateway configurations, logs, and related features in Cloudflare's platform.
1111

12-
| Feature | Limit |
13-
| --------------------------------------------------------------- | ----------------------------------- |
14-
| Datasets | 10 per gateway |
15-
| Gateways | 10 per account |
16-
| Logs stored [paid plan](/ai-gateway/reference/pricing/) | 10 million per gateway <sup>1</sup> |
17-
| Logs stored [free plan](/ai-gateway/reference/pricing/) | 100,000 per account <sup>2</sup> |
18-
| Log size stored | 10 MB per log <sup>3</sup> |
19-
| Logpush jobs | 4 per account |
20-
| Logpush size limit | 1MB per log |
21-
| Gateway name length | 64 characters |
12+
| Feature | Limit |
13+
| ------------------------------------------------------- | ----------------------------------- |
14+
| Datasets | 10 per gateway |
15+
| Gateways | 10 per account |
16+
| Logs stored [paid plan](/ai-gateway/reference/pricing/) | 10 million per gateway <sup>1</sup> |
17+
| Logs stored [free plan](/ai-gateway/reference/pricing/) | 100,000 per account <sup>2</sup> |
18+
| Log size stored | 10 MB per log <sup>3</sup> |
19+
| Logpush jobs | 4 per account |
20+
| Logpush size limit | 1MB per log |
21+
| Gateway name length | 64 characters |
2222

2323
<sup>1</sup> If you have reached 10 million logs stored per gateway, new logs
2424
will stop being saved. To continue saving logs, you must delete older logs in
