Commit d84cd60

bibryam and msfussell authored
Added prompt caching details (#4882)
Signed-off-by: Bilgin Ibryam <[email protected]>
Co-authored-by: Mark Fussell <[email protected]>
1 parent 9c44ff7 commit d84cd60

1 file changed: +1 -1 lines changed

daprdocs/content/en/developing-applications/building-blocks/conversation/conversation-overview.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ The following features are out-of-the-box for [all the supported conversation co
 
 ### Prompt caching
 
-Prompt caching optimizes performance by storing and reusing prompts that are often repeated across multiple API calls. To significantly reduce latency and cost, Dapr stores frequent prompts in a local cache to be reused by your cluster, pod, or other, instead of reprocessing the information for every new request.
+The Conversation API includes a built-in caching mechanism (enabled by the cacheTTL parameter) that optimizes both performance and cost by storing previous model responses for faster delivery to repetitive requests. This is particularly valuable in scenarios where similar prompt patterns occur frequently. When caching is enabled, Dapr creates a deterministic hash of the prompt text and all configuration parameters, checks if a valid cached response exists for this hash within the time period (for example, 10 minutes), and returns the cached response immediately if found. If no match exists, Dapr makes the API call and stores the result. This eliminates external API calls, lowers latency, and avoids provider charges for repeated requests. The cache exists entirely within your runtime environment, with each Dapr sidecar maintaining its own local cache.
 
 ### Personally identifiable information (PII) obfuscation
 