|
1 | | -# multicache - Stupid Fast Cache |
| 1 | +# multicache - Adaptive Multi-Tier Cache |
2 | 2 |
|
3 | 3 | <img src="media/logo-small.png" alt="multicache logo" width="256"> |
4 | 4 |
|
|
8 | 8 |
|
9 | 9 | <br clear="right"> |
10 | 10 |
|
11 | | -multicache is the fastest in-memory cache for Go. Need multi-tier persistence? We have it. Need thundering herd protection? We've got that too. |
| 11 | +multicache is a high-performance cache for Go that automatically adapts to your workload. It combines **multiple eviction strategies** that switch based on access patterns, with an optional **multi-tier architecture** for persistence. |
12 | 12 |
|
13 | | -Designed for persistently caching API requests in an unreliable environment, this cache has an abundance of production-ready features: |
| 13 | +## Why "multi"? |
| 14 | + |
| 15 | +### Multiple Adaptive Strategies |
| 16 | + |
| 17 | +multicache monitors ghost hit rates (how often evicted keys return) and automatically selects the optimal eviction strategy: |
| 18 | + |
| 19 | +| Mode | Trigger | Strategy | Workload | |
| 20 | +|------|---------|----------|----------| |
| 21 | +| 0 | Ghost rate <1% | Pure recency | Scan-heavy (unique keys) | |
| 22 | +| 1 | Ghost rate 1-6% or 13-22% | Balanced S3-FIFO | Mixed access patterns |
| 23 | +| 2 | Ghost rate 7-12% | Frequency-biased | Repeated hot keys | |
| 24 | +| 3 | Ghost rate ≥23% | Clock-like second-chance | High temporal locality | |
| 25 | + |
| 26 | +No tuning required - the cache learns your workload and adapts. |
| 27 | + |
| 28 | +### Multi-Tier Architecture |
| 29 | + |
| 30 | +Stack fast in-memory caching with durable persistence: |
| 31 | + |
| 32 | +``` |
| 33 | +┌─────────────────────────────────────┐ |
| 34 | +│ Your Application │ |
| 35 | +└─────────────────┬───────────────────┘ |
| 36 | + │ |
| 37 | +┌─────────────────▼───────────────────┐ |
| 38 | +│ Memory Cache (microseconds) │ ← L1: S3-FIFO with adaptive modes |
| 39 | +└─────────────────┬───────────────────┘ |
| 40 | + │ async write / sync read |
| 41 | +┌─────────────────▼───────────────────┐ |
| 42 | +│ Persistence Store (milliseconds) │ ← L2: localfs, Valkey, Datastore |
| 43 | +└─────────────────────────────────────┘ |
| 44 | +``` |
| 45 | + |
| 46 | +Persistence backends: |
| 47 | +- [`pkg/store/localfs`](pkg/store/localfs) - Local files (JSON, zero dependencies) |
| 48 | +- [`pkg/store/valkey`](pkg/store/valkey) - Valkey/Redis |
| 49 | +- [`pkg/store/datastore`](pkg/store/datastore) - Google Cloud Datastore |
| 50 | +- [`pkg/store/cloudrun`](pkg/store/cloudrun) - Auto-selects Datastore or localfs |
| 51 | +- [`pkg/store/null`](pkg/store/null) - No-op for testing |
| 52 | + |
| 53 | +All backends support optional S2 or Zstd compression via [`pkg/store/compress`](pkg/store/compress). |
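
The async-write / sync-read flow in the diagram can be sketched in a few lines of Go. This is an illustrative model only, not multicache's API: the `Store` interface, `tiered` type, and in-memory `mapStore` are hypothetical stand-ins for the real memory cache and persistence backends.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a stand-in for a persistence backend (localfs, Valkey, ...).
// Illustrative only, not multicache's actual interface.
type Store interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

type mapStore struct {
	mu sync.RWMutex
	m  map[string]string
}

func (s *mapStore) Get(k string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[k]
	return v, ok
}

func (s *mapStore) Set(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

// tiered demonstrates the L1/L2 flow from the diagram: reads check memory
// first and fall back to the store synchronously; writes update memory and
// persist asynchronously so the hot path never waits on the slow tier.
type tiered struct {
	mu  sync.Mutex
	mem map[string]string
	l2  Store
	wg  sync.WaitGroup
}

func (t *tiered) Get(key string) (string, bool) {
	t.mu.Lock()
	v, ok := t.mem[key]
	t.mu.Unlock()
	if ok {
		return v, true // L1 hit: microseconds
	}
	v, ok = t.l2.Get(key) // sync read from L2: milliseconds
	if ok {
		t.mu.Lock()
		t.mem[key] = v // promote back into memory
		t.mu.Unlock()
	}
	return v, ok
}

func (t *tiered) Set(key, value string) {
	t.mu.Lock()
	t.mem[key] = value
	t.mu.Unlock()
	t.wg.Add(1)
	go func() { // async write-behind to L2
		defer t.wg.Done()
		t.l2.Set(key, value)
	}()
}

func main() {
	c := &tiered{mem: map[string]string{}, l2: &mapStore{m: map[string]string{}}}
	c.Set("user:1", "alice")
	c.wg.Wait()              // wait for the async persist (demo determinism)
	delete(c.mem, "user:1")  // simulate eviction or a process restart
	v, ok := c.Get("user:1") // served from L2, promoted back to L1
	fmt.Println(v, ok)       // alice true
}
```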
14 | 54 |
|
15 | 55 | ## Features |
16 | 56 |
|
17 | | -- **Faster than a bat out of hell** - Best-in-class latency and throughput |
18 | | -- **S3-FIFO eviction** - Better hit-rates than LRU ([learn more](https://s3fifo.com/)) |
19 | | -- **Multi-tier persistent cache (optional)** - Bring your own database or use built-in backends: |
20 | | - - [`pkg/store/cloudrun`](pkg/store/cloudrun) - Automatically select Google Cloud Datastore in Cloud Run, localfs elsewhere |
21 | | - - [`pkg/store/datastore`](pkg/store/datastore) - Google Cloud Datastore |
22 | | - - [`pkg/store/localfs`](pkg/store/localfs) - Local files (JSON encoding, zero dependencies) |
23 | | - - [`pkg/store/null`](pkg/store/null) - No-op (for testing or TieredCache API compatibility) |
24 | | - - [`pkg/store/valkey`](pkg/store/valkey) - Valkey/Redis |
25 | | -- **Optional compression** - S2 or Zstd for all persistence backends via [`pkg/store/compress`](pkg/store/compress) |
| 57 | +- **Best-in-class performance** - 7ns reads, 100M+ QPS single-threaded |
| 58 | +- **Adaptive S3-FIFO eviction** - Better hit-rates than LRU ([learn more](https://s3fifo.com/)) |
| 59 | +- **Thundering herd prevention** - `GetSet` deduplicates concurrent loads |
26 | 60 | - **Per-item TTL** - Optional expiration |
27 | | -- **Thundering herd prevention** - `GetSet` deduplicates concurrent loads for the same key |
28 | 61 | - **Graceful degradation** - Cache works even if persistence fails |
29 | | -- **Zero allocation updates** - minimal GC thrashing |
| 62 | +- **Zero allocation updates** - Minimal GC pressure |
30 | 63 |
|
31 | 64 | ## Usage |
32 | 65 |
|
@@ -169,34 +202,24 @@ Want even more comprehensive benchmarks? See https://github.com/tstromberg/gocac |
169 | 202 |
|
170 | 203 | ## Implementation Notes |
171 | 204 |
|
172 | | -### Differences from the S3-FIFO paper |
173 | | - |
174 | | -multicache implements the core S3-FIFO algorithm (Small/Main/Ghost queues with frequency-based promotion) with these optimizations: |
175 | | - |
176 | | -1. **Dynamic Sharding** - 1-2048 independent S3-FIFO shards (vs single-threaded) for concurrent workloads |
177 | | -2. **Bloom Filter Ghosts** - Two rotating Bloom filters track evicted keys (vs storing actual keys), reducing memory 10-100x |
178 | | -3. **Lazy Ghost Checks** - Only check ghosts when evicting, saving 5-9% latency when cache isn't full |
179 | | -4. **Intrusive Lists** - Embed pointers in entries (vs separate nodes) for zero-allocation queue ops |
180 | | -5. **Fast-path Hashing** - Specialized for `int`/`string` keys using wyhash and bit mixing |
181 | | - |
182 | | -### Adaptive Mode Detection |
| 205 | +### S3-FIFO Enhancements |
183 | 206 |
|
184 | | -multicache automatically detects workload characteristics and adjusts its eviction strategy using ghost hit rate (how often evicted keys are re-requested): |
| 207 | +multicache implements the S3-FIFO algorithm from SOSP'23 with these optimizations: |
185 | 208 |
|
186 | | -| Mode | Ghost Rate | Strategy | Best For | |
187 | | -|------|------------|----------|----------| |
188 | | -| 0 | <1% | Pure recency, skip ghost tracking | Scan-heavy workloads | |
189 | | -| 1 | 1-6% or 13-22% | Balanced, promote if freq > 0 | Mixed workloads | |
190 | | -| 2 | 7-12% | Frequency-heavy, promote if freq > 1 | Frequency-skewed workloads | |
191 | | -| 3 | ≥23% | Clock-like, all items to main with second-chance | High-recency workloads | |
| 209 | +1. **Dynamic Sharding** - 1-2048 independent shards for concurrent workloads |
| 210 | +2. **Bloom Filter Ghosts** - Two rotating Bloom filters (vs storing keys), 10-100x less memory |
| 211 | +3. **Lazy Ghost Checks** - Only check ghosts at capacity, saving 5-9% latency during warmup |
| 212 | +4. **Intrusive Lists** - Zero-allocation queue operations |
| 213 | +5. **Fast-path Hashing** - Specialized `int`/`string` hashing via wyhash |
192 | 214 |
|
193 | | -Mode 2 uses **hysteresis** to prevent oscillation: entry requires 7-12% ghost rate, but stays active while rate is 5-22%. |
| 215 | +### Adaptive Mode Details |
194 | 216 |
|
195 | | -### Other Optimizations |
| 217 | +Mode switching uses **hysteresis** to prevent oscillation. Mode 2 (frequency-biased) requires 7-12% ghost rate to enter, but stays active while rate is 5-22%. |
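
Assuming mode selection is a pure function of the current mode and the measured ghost rate, the hysteresis rule can be sketched as follows. The thresholds come from the tables in this README; the function itself is illustrative, not multicache's code.

```go
package main

import "fmt"

// nextMode sketches the hysteresis described above: mode 2 needs a 7-12%
// ghost hit rate to enter, but once active it persists across the wider
// 5-22% band, preventing rapid oscillation near the boundaries.
func nextMode(current int, ghostRate float64) int {
	switch {
	case ghostRate < 0.01:
		return 0 // scan-heavy: pure recency
	case ghostRate >= 0.23:
		return 3 // high temporal locality: clock-like second-chance
	case current == 2 && ghostRate >= 0.05 && ghostRate <= 0.22:
		return 2 // hysteresis: stay frequency-biased in the wider band
	case ghostRate >= 0.07 && ghostRate <= 0.12:
		return 2 // enter frequency-biased mode
	default:
		return 1 // balanced S3-FIFO
	}
}

func main() {
	m := nextMode(1, 0.10)                         // 10% from mode 1 enters mode 2
	fmt.Println(m, nextMode(m, 0.20), nextMode(1, 0.20)) // 2 2 1
}
```

Note the asymmetry: a 20% ghost rate keeps mode 2 active but is not enough to enter it from mode 1.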
196 | 218 |
|
197 | | -- **Adaptive Queue Sizing** - Small queue is 20% for caches ≤32K, 15% for ≤128K, 10% for larger (paper recommends 10%) |
198 | | -- **Ghost Frequency Boost** - Items returning from ghost start with freq=1 instead of 0 |
199 | | -- **Higher Frequency Cap** - Max freq=7 (vs 3 in paper) for better hot/warm discrimination |
| 219 | +Additional tuning beyond the paper: |
| 220 | +- **Adaptive queue sizing** - Small queue is 20% for caches ≤32K, 15% for ≤128K, 10% for larger |
| 221 | +- **Ghost frequency boost** - Returning items start with freq=1 instead of 0 |
| 222 | +- **Higher frequency cap** - Max freq=7 (vs 3) for better hot/warm discrimination |
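
The adaptive queue sizing rule above can be written out directly (an illustrative helper using the thresholds listed here, not multicache's code):

```go
package main

import "fmt"

// smallQueueFrac returns the share of capacity given to the small
// (probation) queue: smaller caches dedicate proportionally more,
// larger caches converge on the paper's 10%.
func smallQueueFrac(capacity int) float64 {
	switch {
	case capacity <= 32*1024:
		return 0.20
	case capacity <= 128*1024:
		return 0.15
	default:
		return 0.10
	}
}

func main() {
	for _, n := range []int{10_000, 100_000, 1_000_000} {
		fmt.Printf("capacity %7d -> small queue %6d entries\n",
			n, int(float64(n)*smallQueueFrac(n)))
	}
}
```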
200 | 223 |
|
201 | 224 | ## License |
202 | 225 |
|
|