Update caching.mdx

kathayl · web-flow · commit 77d856e64b3e · 2025-05-06T15:28:38.000-07:00
update caching documentation. trial using new gemini access
diff --git a/src/content/docs/ai-gateway/configuration/caching.mdx b/src/content/docs/ai-gateway/configuration/caching.mdx
@@ -9,7 +9,14 @@ description: Override caching settings on a per-request basis.
 
 import { TabItem, Tabs } from "~/components";
 
-Enable and customize your gateway cache to serve requests directly from Cloudflare's cache, instead of the original model provider, for faster requests and cost savings.
+AI Gateway can cache responses from your AI model providers and serve them directly from Cloudflare's edge network for identical requests. This can significantly improve response times and reduce costs.
+
+## Benefits of using caching
+
+- **Reduced latency:** Serve responses to your users faster by avoiding a round trip to the origin AI provider for repeated requests.
+- **Cost savings:** Reduce the number of paid requests made to your AI provider, especially for frequently requested or static content.
+- **Increased throughput:** Offload repeated requests from your AI provider, so that only unique requests reach it.
+
 
 :::note
 
@@ -51,7 +58,16 @@ To check whether a response comes from cache or not, **cf-aig-cache-status** wil
 
 ## Per-request caching
 
-In order to override the default cache behavior defined on the settings tab, you can, on a per-request basis, set headers for the following options:
+While your gateway's default cache settings provide a good baseline, you might encounter scenarios where:
+
+- **Freshness is critical:** Some API calls must always fetch the latest data from the origin provider, regardless of the gateway's default cache settings.
+- **Content has varying lifespans:** A single global time to live (TTL) might be too long for frequently updated information or too short for highly static content, leading to either stale data or reduced cache effectiveness.
+- **Responses are dynamic or personalized:** Caching user-specific or highly dynamic responses under a shared cache key could return a response intended for a different user or context.
+- **Specific caching strategies are needed:** You might want to control exactly how a particular request is cached, independently of other requests.
+
+To address these needs, AI Gateway lets you override the default cache behavior on a per-request basis using HTTP headers, so you can tune caching for individual API calls and balance performance, cost, and data freshness.
+
+The following headers allow you to define this per-request cache behavior:
 
 :::note