Merged
Changes from 6 commits
30 changes: 22 additions & 8 deletions src/content/docs/ai-gateway/configuration/caching.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -51,19 +51,29 @@ To check whether a response comes from cache or not, **cf-aig-cache-status** wil

In order to override the default cache behavior defined on the settings tab, you can, on a per-request basis, set headers for the following options:

### Skip cache (cf-skip-cache)
:::note

The following headers have been updated to new names, though the old headers will still function. We recommend updating to the new headers to ensure future compatibility:

`cf-cache-ttl` is now `cf-aig-cache-ttl`

`cf-skip-cache` is now `cf-aig-skip-cache`

:::
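For scripts that still build requests with the old names, the rename in the note above can be sketched as a small lookup. `migrate_cache_header` is a hypothetical helper, not part of any Cloudflare tooling, and it only covers the two renames listed in the note. The same change also documents the TTL unit as seconds where the old text said milliseconds, so check any TTL value when you rename that header.

```bash
# Hypothetical helper: map a legacy caching header name to its current name.
# Only the two renames listed in the note are handled; any other header
# name is passed through unchanged.
migrate_cache_header() {
  case "$1" in
    cf-cache-ttl)  printf '%s\n' 'cf-aig-cache-ttl' ;;
    cf-skip-cache) printf '%s\n' 'cf-aig-skip-cache' ;;
    *)             printf '%s\n' "$1" ;;
  esac
}

migrate_cache_header cf-skip-cache   # prints cf-aig-skip-cache
```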

### Skip cache (cf-aig-skip-cache)

Skip cache refers to bypassing the cache and fetching the request directly from the original provider, without utilizing any cached copy.

You can use the header **cf-skip-cache** to bypass the cached version of the request.
You can use the header **cf-aig-skip-cache** to bypass the cached version of the request.

As an example, when submitting a request to OpenAI, include the header in the following manner:

```bash title="Request skipping the cache"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
--header "Authorization: Bearer $TOKEN" \
--header 'Content-Type: application/json' \
--header 'cf-skip-cache: true' \
--header 'cf-aig-skip-cache: true' \
--data ' {
"model": "gpt-4o-mini",
"messages": [
Expand All @@ -76,19 +86,19 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/
'
```
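
One way to confirm that a request bypassed the cache is to read the **cf-aig-cache-status** response header. The sketch below assumes the response headers were saved with curl's `--dump-header` option and are piped in on stdin. `cache_status` is a hypothetical helper, and the `MISS` value in the sample input is illustrative.

```bash
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# value of cf-aig-cache-status (prints nothing if the header is absent).
# Header names are matched case-insensitively; carriage returns are stripped.
cache_status() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "cf-aig-cache-status" { print $2; exit }'
}

# Sample header dump (illustrative values, not captured from a real response).
printf 'HTTP/2 200\r\ncf-aig-cache-status: MISS\r\n\r\n' | cache_status   # prints MISS
```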

### Cache TTL (cf-cache-ttl)
### Cache TTL (cf-aig-cache-ttl)

Cache TTL, or Time To Live, is the duration a cached request remains valid before it expires and requires refreshing from the original source. You can use **cf-cache-ttl** to set the desired caching duration in milliseconds.
Cache TTL, or Time To Live, is the duration a cached request remains valid before it expires and is refreshed from the original source. You can use **cf-aig-cache-ttl** to set the desired caching duration in seconds. The minimum TTL is 60 seconds and the maximum TTL is one month.

For example, if you set a TTL of one hour, it means that a request is kept in the cache for an hour. Within that hour, an identical request will be served from the cache instead of the original API. After an hour, the cache expires and the request will go to the original API for a more recent response, and that response will repopulate the cache for the next hour.
For example, if you set a TTL of one hour, it means that a request is kept in the cache for an hour. Within that hour, an identical request will be served from the cache instead of the original API. After an hour, the cache expires and the request will go to the original API for a fresh response, and that response will repopulate the cache for the next hour.
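
Because the TTL is given in seconds and bounded, a client may want to clamp a computed value before sending it. This is a minimal sketch: `clamp_ttl` is a hypothetical helper, the 60-second minimum comes from the text above, and the maximum assumes that "one month" means 30 days, which the text does not specify.

```bash
# Assumed bounds: 60 seconds minimum (from the text) and a 30-day month
# (an assumption; the text only says "one month").
MIN_TTL=60
MAX_TTL=$((30 * 24 * 60 * 60))   # 2592000 seconds

# Hypothetical helper: clamp a TTL in seconds into the range above.
clamp_ttl() {
  ttl=$1
  if [ "$ttl" -lt "$MIN_TTL" ]; then ttl=$MIN_TTL; fi
  if [ "$ttl" -gt "$MAX_TTL" ]; then ttl=$MAX_TTL; fi
  printf '%s\n' "$ttl"
}

clamp_ttl $((60 * 60))   # one hour: prints 3600
```

The result can then be passed as the value of the `cf-aig-cache-ttl` header.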

As an example, when submitting a request to OpenAI, include the header in the following manner:

```bash title="Request to be cached for an hour"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
--header "Authorization: Bearer $TOKEN" \
--header 'Content-Type: application/json' \
--header 'cf-cache-ttl: 3600000' \
--header 'cf-aig-cache-ttl: 3600' \
--data ' {
"model": "gpt-4o-mini",
"messages": [
Expand All @@ -105,7 +115,7 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/

Custom cache keys let you override the default cache key in order to precisely set the cacheability setting for any resource. To override the default cache key, you can use the header **cf-aig-cache-key**.

When you use the **cf-aig-cache-key** header for the first time, you will receive a response from the provider. Subsequent requests with the same header will return the cached response. If the **cf-cache-ttl** header is used, responses will be cached according to the specified Cache Time To Live. Otherwise, responses will be cached according to the cache settings in the dashboard. If caching is not enabled for the gateway, responses will be cached for 5 minutes by default.
When you use the **cf-aig-cache-key** header for the first time, you will receive a response from the provider. Subsequent requests with the same header will return the cached response. If the **cf-aig-cache-ttl** header is used, responses will be cached according to the specified Cache Time To Live. Otherwise, responses will be cached according to the cache settings in the dashboard. If caching is not enabled for the gateway, responses will be cached for 5 minutes by default.
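
How you choose the key is up to you. One possible approach, sketched below, is to hash a stable label so that requests meant to share a cached response always send the same key. `make_cache_key` and the label `faq:return-policy:v1` are hypothetical. Whether the gateway restricts key length or characters is not stated here, so treat the 64-character hex digest as an assumption.

```bash
# Hypothetical helper: derive a stable cache key from arbitrary text by
# hashing it with SHA-256 (uses sha256sum, or shasum where that is missing).
make_cache_key() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
  else
    printf '%s' "$1" | shasum -a 256 | cut -d ' ' -f 1
  fi
}

# Build the header line for a request; the label is an illustrative value.
key=$(make_cache_key "faq:return-policy:v1")
printf 'cf-aig-cache-key: %s\n' "$key"
```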

As an example, when submitting a request to OpenAI, include the header in the following manner:

Expand All @@ -125,3 +135,7 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/
}
'
```

:::caution[AI Gateway Caching Behavior]
Cache in AI Gateway is volatile. If two identical requests are sent simultaneously, the first request may not cache in time for the second request to use it, which may result in the second request retrieving data from the original source.
:::
17 changes: 17 additions & 0 deletions src/content/docs/ai-gateway/reference/limits.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,22 @@ The following limits apply to gateway configurations, logs, and related features

| Feature | Limit |
| --------------------------------------------------------------- | ----------------------------------- |
<<<<<<< HEAD
| Maximum gateways | 10 per account |
| Maximum logs stored [paid plan](/ai-gateway/reference/pricing/) | 10 million per gateway <sup>1</sup> |
| Maximum logs stored [free plan](/ai-gateway/reference/pricing/) | 100,000 per account <sup>2</sup> |
| Maximum log size stored | 10 MB per log <sup>3</sup> |
| Maximum Logpush jobs | 4 per account |
| Maximum gateway name length | 64 characters |

<sup>1</sup> If you have reached 10 million logs stored per gateway, new logs
will stop being saved. To continue saving logs, you must delete older logs in
that gateway to free up space or create a new gateway.

<sup>2</sup> If you have reached 100,000 logs stored per account, across all
gateways, new logs will stop being saved. To continue saving logs, you must
delete older logs.
=======
| Datasets | 10 per gateway |
| Gateways | 10 per account |
| Logs stored [paid plan](/ai-gateway/reference/pricing/) | 10 million per gateway <sup>1</sup> |
Expand All @@ -31,6 +47,7 @@ gateways, new logs will stop being saved. To continue saving logs, you must
delete older logs. Refer to [Auto Log
Cleanup](/ai-gateway/observability/logging/#auto-log-cleanup) for more details
on how to automatically delete logs.
>>>>>>> 914c70d6b867c652d10b92bd7fb44a284bafacea

<sup>3</sup> Logs larger than 10 MB will not be stored.
