Commit 7c3628f

better wording in cache
1 parent 52018fe commit 7c3628f

1 file changed: +7 -6 lines changed

src/content/docs/autorag/configuration/cache.mdx

Lines changed: 7 additions & 6 deletions
@@ -20,20 +20,21 @@ To see if a response came from the cache, check the `cf-aig-cache-status` header
 ## What to consider when using similarity cache

 Consider these behaviors when using similarity caching:
+
 - **Volatile Cache**: If two similar requests hit at the same time, the first might not cache in time for the second to use it, resulting in a `MISS`.
 - **30-Day Cache**: Cached responses last 30 days, then expire automatically. No custom durations for now.
 - **Data Dependency**: Cached responses are tied to specific document chunks. If those chunks change or get deleted, the cache clears to keep answers fresh.

 ## How similarity matching works

-Similarity caching in AutoRAG uses **MinHash with Locality-Sensitive Hashing (LSH)** to detect prompts that are lexically similar.
+AutoRAG’s similarity cache uses **MinHash and Locality-Sensitive Hashing (LSH)** to find and reuse responses for prompts that are worded similarly.

-When a new prompt is received:
+Here’s how it works when a new prompt comes in:

-1. The prompt is broken into overlapping token sequences (called _shingles_), typically 2–3 words each.
-2. These shingles are hashed into a compact fingerprint using the MinHash algorithm. Prompts with more overlapping shingles will have more similar fingerprints.
-3. Fingerprints are grouped into LSH buckets, which allow AutoRAG to quickly find past prompts that are likely to be similar without scanning every cached prompt.
-4. If a prompt in the same bucket meets the configured similarity threshold, its cached response is reused.
+1. The prompt is split into small overlapping chunks of words (called shingles), like “what’s the” or “the weather.”
+2. These shingles are turned into a fingerprint using MinHash. The more overlap two prompts have, the more similar their fingerprints will be.
+3. Fingerprints are placed into LSH buckets, which help AutoRAG quickly find similar prompts without comparing every single one.
+4. If a past prompt in the same bucket is similar enough (based on your configured threshold), AutoRAG reuses its cached response.

 ## Choosing a threshold
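The four steps in the "How similarity matching works" text of this hunk can be sketched in Python. This is a minimal illustration under stated assumptions, not AutoRAG's implementation: the class `SimilarityCache`, the helper names, the constants `NUM_HASHES` and `BANDS`, the use of SHA-1 as the hash family, and the interpretation of the threshold as an estimated Jaccard similarity are all hypothetical choices made for the example.

```python
import hashlib
from collections import defaultdict

NUM_HASHES = 32  # assumed fingerprint length (number of MinHash values)
BANDS = 16       # assumed number of LSH bands (2 values per band here)


def shingles(prompt, k=2):
    """Step 1: split a prompt into overlapping k-word chunks."""
    words = prompt.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, text):
    """One member of a seeded hash family (SHA-1 is an illustrative choice)."""
    digest = hashlib.sha1(f"{seed}:{text}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(shingle_set):
    """Step 2: the fingerprint keeps the minimum hash value per seed."""
    return tuple(min(_hash(seed, s) for s in shingle_set)
                 for seed in range(NUM_HASHES))


def estimated_similarity(fp_a, fp_b):
    """Fraction of matching positions, which estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(fp_a, fp_b)) / len(fp_a)


class SimilarityCache:
    """Hypothetical in-memory cache keyed by MinHash fingerprints."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.entries = []                 # list of (fingerprint, response)
        self.buckets = defaultdict(list)  # (band index, band values) -> entry ids

    def _band_keys(self, fp):
        rows = NUM_HASHES // BANDS
        return [(b, fp[b * rows:(b + 1) * rows]) for b in range(BANDS)]

    def put(self, prompt, response):
        fp = minhash(shingles(prompt))
        self.entries.append((fp, response))
        for key in self._band_keys(fp):
            # Step 3: file the entry under each of its LSH buckets.
            self.buckets[key].append(len(self.entries) - 1)

    def get(self, prompt):
        fp = minhash(shingles(prompt))
        # Only entries sharing at least one bucket are compared.
        candidates = {i for key in self._band_keys(fp)
                      for i in self.buckets.get(key, ())}
        best_response, best_score = None, 0.0
        for i in candidates:
            cached_fp, response = self.entries[i]
            score = estimated_similarity(fp, cached_fp)
            # Step 4: reuse a response only if it clears the threshold.
            if score >= self.threshold and score > best_score:
                best_response, best_score = response, score
        return best_response


cache = SimilarityCache(threshold=0.5)
cache.put("what's the weather in London today", "Cloudy, 14 C")
# Same shingles after lowercasing, so this lookup is a hit.
print(cache.get("What's the weather in London today"))
# No shared shingles, so this lookup returns None (barring a hash collision).
print(cache.get("summarize the quarterly sales report"))
```

In this sketch, more bands with fewer values per band make it more likely that loosely similar prompts share a bucket, while the threshold check afterwards filters out candidates that are not similar enough.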
