Commit 77d856e

Update caching.mdx
update caching documentation. trial using new gemini access
1 parent 3d32791 commit 77d856e

1 file changed, +18 -2 lines changed

src/content/docs/ai-gateway/configuration/caching.mdx

Lines changed: 18 additions & 2 deletions
@@ -9,7 +9,14 @@ description: Override caching settings on a per-request basis.
 
 import { TabItem, Tabs } from "~/components";
 
-Enable and customize your gateway cache to serve requests directly from Cloudflare's cache, instead of the original model provider, for faster requests and cost savings.
+AI Gateway can cache responses from your AI model providers, serving them directly from Cloudflare's edge network for identical requests. This can significantly improve response times and reduce costs.
+
+## Benefits of Using Caching
+
+- **Reduced Latency:** Serve responses faster to your users by avoiding a round trip to the origin AI provider for repeated requests.
+- **Cost Savings:** Minimize the number of paid requests made to your AI provider, especially for frequently accessed or non-dynamic content.
+- **Increased Throughput:** Offload repetitive requests from your AI provider, allowing it to handle unique requests more efficiently.
+
 
 :::note
 
@@ -51,7 +58,16 @@ To check whether a response comes from cache or not, **cf-aig-cache-status** wil
 
 ## Per-request caching
 
-In order to override the default cache behavior defined on the settings tab, you can, on a per-request basis, set headers for the following options:
+While your gateway's default cache settings provide a good baseline, you might encounter scenarios where:
+
+- **Freshness is critical:** Some API calls must always fetch the absolute latest data from the origin provider, irrespective of global caching rules.
+- **Content has varying lifespans:** A global Time To Live (TTL) might be too long for frequently updated information or too short for highly static content, leading to either stale data or reduced cache effectiveness.
+- **Responses are dynamic or personalized:** Caching user-specific or highly dynamic responses with a generic cache key could lead to incorrect data being served.
+- **Specific caching strategies are needed:** You might want to define exactly how a particular piece of content is cached, separate from other requests.
+
+To address these needs, AI Gateway allows you to override default cache behaviors on a per-request basis using specific HTTP headers. This gives you the precision to optimize caching for individual API calls, ensuring the right balance of performance, cost-efficiency, and data accuracy.
+
+The following headers allow you to define this per-request cache behavior:
 
 :::note
 
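
The per-request override described in the second hunk could be sketched roughly as follows. This is a minimal illustration, not part of the commit. The diff excerpt names only `cf-aig-cache-status`, so the request header names `cf-aig-skip-cache`, `cf-aig-cache-ttl`, and `cf-aig-cache-key`, and the `HIT` status value, are assumptions to check against the full `caching.mdx` page. The helpers `buildCacheHeaders` and `servedFromCache` are hypothetical names introduced here.

```typescript
// Hypothetical per-request cache options; field names are illustrative only.
type CacheOverrides = {
  skipCache?: boolean; // bypass the cache when freshness is critical
  ttlSeconds?: number; // per-request TTL instead of the gateway default
  cacheKey?: string;   // custom key, e.g. to isolate personalized responses
};

// Build the override headers. The header names are assumed, not taken
// from the diff excerpt above.
function buildCacheHeaders(o: CacheOverrides): Record<string, string> {
  const headers: Record<string, string> = {};
  if (o.skipCache) {
    headers["cf-aig-skip-cache"] = "true";
  }
  if (o.ttlSeconds !== undefined) {
    if (!Number.isInteger(o.ttlSeconds) || o.ttlSeconds < 0) {
      throw new RangeError("ttlSeconds must be a non-negative integer");
    }
    headers["cf-aig-cache-ttl"] = String(o.ttlSeconds);
  }
  if (o.cacheKey) {
    headers["cf-aig-cache-key"] = o.cacheKey;
  }
  return headers;
}

// Report whether a response was served from cache, assuming the
// cf-aig-cache-status response header carries the value "HIT" in that case.
function servedFromCache(headers: { get(name: string): string | null }): boolean {
  const status = headers.get("cf-aig-cache-status");
  return status !== null && status.toUpperCase() === "HIT";
}
```

In use, the result of `buildCacheHeaders` would be spread into the `headers` option of a `fetch` call to the gateway endpoint, alongside the provider's authorization header, and `servedFromCache(response.headers)` would be checked afterwards. Keeping the header construction in one pure function makes the override easy to unit test without issuing network requests.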
