
Commit 326f48a

Add LLM cost reduction
1 parent 0000190 commit 326f48a

File tree

1 file changed: +23 −0 lines changed

content/operate/rc/langcache/_index.md

Lines changed: 23 additions & 0 deletions
@@ -32,6 +32,29 @@ Using LangCache as a semantic caching service in Redis Cloud has the following b
- **Simpler deployments**: Access our managed service via a REST API with automated embedding generation and configurable controls.
- **Advanced cache management**: Manage data access and privacy, eviction protocols, and monitor usage and cache hit rates.

### LLM cost reduction with LangCache

LangCache reduces your LLM costs by caching responses and avoiding repeated API calls. When a response is served from the cache, you don't pay for output tokens; what you would save on input tokens is typically offset by embedding and storage costs, so the savings come from output tokens.
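
To illustrate the mechanic, here is a minimal Python sketch of a response cache wrapped around an LLM call. It is an illustration under stated assumptions, not the LangCache API: `call_llm` is a placeholder for your paid LLM provider, and a plain dictionary stands in for the embedding-based semantic lookup that LangCache performs.

```python
# Illustration only: a plain dict stands in for LangCache's embedding-based
# semantic lookup, and call_llm() is a placeholder for a paid LLM provider call.

cache: dict[str, str] = {}

def call_llm(prompt: str) -> str:
    # Placeholder for the paid call that bills input and output tokens.
    return f"Generated answer for: {prompt}"

def answer(prompt: str) -> str:
    if prompt in cache:
        # Cache hit: the LLM is never called, so no output tokens are billed.
        return cache[prompt]
    # Cache miss: pay for the full LLM call, then store the response.
    response = call_llm(prompt)
    cache[prompt] = response
    return response

print(answer("What is semantic caching?"))  # miss: calls the LLM
print(answer("What is semantic caching?"))  # hit: served from the cache
```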

For every cached response, you'll save the output token cost. To calculate your monthly savings with LangCache, you can use the following formula:

```bash
Estimated monthly savings with LangCache = (Monthly output token costs) × (Cache hit rate)
```

The more requests you serve from LangCache, the more you save, because you're not paying to regenerate the output.

Here's an example:
- Monthly LLM spend: $200
- Percentage of output tokens in your spend: 60%
- Cost of output tokens: $200 × 60% = $120
- Cache hit rate: 50%
- Estimated savings: $120 × 50% = $60/month
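
As a quick sanity check, here is a small Python sketch of the same back-of-the-envelope estimate, using the hypothetical numbers from the example above:

```python
def estimate_monthly_savings(monthly_llm_spend: float,
                             output_token_share: float,
                             cache_hit_rate: float) -> float:
    """Rough estimate of monthly savings from serving responses out of the cache.

    monthly_llm_spend  -- total monthly LLM bill, in dollars
    output_token_share -- fraction of that bill spent on output tokens (0-1)
    cache_hit_rate     -- fraction of requests answered from the cache (0-1)
    """
    monthly_output_token_cost = monthly_llm_spend * output_token_share
    return monthly_output_token_cost * cache_hit_rate

# Example figures from above: $200/month spend, 60% output tokens, 50% hit rate.
print(estimate_monthly_savings(200, 0.60, 0.50))  # 60.0 -> about $60/month saved
```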

{{<note>}}
The formula and numbers above provide a rough estimate of your monthly savings. Actual savings will vary depending on your usage.
{{</note>}}

## LangCache architecture

The following diagram displays how you can integrate LangCache into your GenAI app:
