src/content/docs/ai-gateway/configuration/caching.mdx
Lines changed: 18 additions & 2 deletions
@@ -9,7 +9,14 @@ description: Override caching settings on a per-request basis.
 
 import { TabItem, Tabs } from "~/components";
 
-Enable and customize your gateway cache to serve requests directly from Cloudflare's cache, instead of the original model provider, for faster requests and cost savings.
+AI Gateway can cache responses from your AI model providers, serving them directly from Cloudflare's edge network for identical requests. This can significantly improve response times and reduce costs.
+
+## Benefits of Using Caching
+
+- **Reduced Latency:** Serve responses faster to your users by avoiding a round trip to the origin AI provider for repeated requests.
+- **Cost Savings:** Minimize the number of paid requests made to your AI provider, especially for frequently accessed or non-dynamic content.
+- **Increased Throughput:** Offload repetitive requests from your AI provider, allowing it to handle unique requests more efficiently.
+
 
 :::note
 
@@ -51,7 +58,16 @@ To check whether a response comes from cache or not, **cf-aig-cache-status** wil
 
 ## Per-request caching
 
-In order to override the default cache behavior defined on the settings tab, you can, on a per-request basis, set headers for the following options:
+While your gateway's default cache settings provide a good baseline, you might encounter scenarios where:
+
+- **Freshness is critical:** Some API calls must always fetch the absolute latest data from the origin provider, irrespective of global caching rules.
+- **Content has varying lifespans:** A global Time To Live (TTL) might be too long for frequently updated information or too short for highly static content, leading to either stale data or reduced cache effectiveness.
+- **Responses are dynamic or personalized:** Caching user-specific or highly dynamic responses with a generic cache key could lead to incorrect data being served.
+- **Specific caching strategies are needed:** You might want to define exactly how a particular piece of content is cached, separate from other requests.
+
+To address these needs, AI Gateway allows you to override default cache behaviors on a per-request basis using specific HTTP headers. This gives you the precision to optimize caching for individual API calls, ensuring the right balance of performance, cost-efficiency, and data accuracy.
+
+The following headers allow you to define this per-request cache behavior:
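
As a rough illustration of the per-request overrides described in the added text, the sketch below builds override headers for a single call and inspects the cache status of a response. The header names `cf-aig-skip-cache`, `cf-aig-cache-ttl`, and `cf-aig-cache-key` are not shown in this excerpt and are assumed here; only `cf-aig-cache-status` appears in the diff context, and the `HIT` value is likewise an assumption. The helper names `buildCacheHeaders` and `isCacheHit` are hypothetical.

```typescript
// Hypothetical per-request cache options. The mapping to header names
// below is an assumption, not something confirmed by this diff.
interface CacheOverrides {
  skipCache?: boolean; // always fetch from the origin provider
  ttlSeconds?: number; // cache lifetime for this request only
  cacheKey?: string; // custom key, e.g. to separate per-user responses
}

// Translate the options into HTTP headers to merge into a request.
function buildCacheHeaders(overrides: CacheOverrides): Record<string, string> {
  const headers: Record<string, string> = {};
  if (overrides.skipCache) {
    headers["cf-aig-skip-cache"] = "true"; // assumed header name
  }
  if (overrides.ttlSeconds !== undefined) {
    if (!Number.isInteger(overrides.ttlSeconds) || overrides.ttlSeconds <= 0) {
      throw new RangeError("ttlSeconds must be a positive integer");
    }
    headers["cf-aig-cache-ttl"] = String(overrides.ttlSeconds); // assumed header name
  }
  if (overrides.cacheKey !== undefined) {
    headers["cf-aig-cache-key"] = overrides.cacheKey; // assumed header name
  }
  return headers;
}

// Minimal view of a response's headers, so the check works with fetch's
// Headers object or any stand-in that offers get().
interface HeaderReader {
  get(name: string): string | null;
}

// Report whether a response was served from cache, assuming the
// cf-aig-cache-status header carries the value "HIT" in that case.
function isCacheHit(headers: HeaderReader): boolean {
  return headers.get("cf-aig-cache-status")?.toUpperCase() === "HIT";
}

// Example: a one-hour TTL and a custom key for one mostly static request.
const requestHeaders: Record<string, string> = {
  "Content-Type": "application/json",
  ...buildCacheHeaders({ ttlSeconds: 3600, cacheKey: "faq-answer-v1" }),
};
```

The resulting `requestHeaders` object can be passed as the `headers` option of a `fetch` call to your gateway endpoint, and `isCacheHit(response.headers)` can then be checked on the result. Keeping the overrides in a typed object rather than raw strings makes it harder to send a malformed TTL by accident.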