📝 ADR-016: Two-Tier Cache Architecture #5332
Conversation
```python
# Tier 1: Memory cache (99% hit - FAST!)
if self._memory_cache is not None and key in self._memory_cache:
    return self._memory_cache[key]
```
---
copy from #5304 (comment)
This is only valid for an immutable cache: if a remote cache key stays the same but its value changes, we will serve stale data.
If the goal is to avoid deserialization and large string responses, we can add an etag to the cache value. Either use two cache items (key and key:etag), or use a Redis hash to store both the etag and the content in one item.
The get flow will be:
- get the etag from the remote cache
- look up the content in the local cache by etag
- fall back to fetching the content from the remote cache
- sync the local cache on a miss

The set flow will be:
- calculate the etag of the content
- set both the etag and the content in the local and remote caches
---
But the memory cache has a very short TTL (max 5 seconds); it exists just to avoid Redis traffic and deserialization within a short time window, e.g. within one reconciliation call. For example, SlackAPI.get_users is called 3 times within a few seconds, and a simple memory cache speeds everything up.
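The short-TTL memory tier being described could look something like this (a sketch under assumed names, not the PR's implementation):

```python
import time


class ShortTTLMemoryCache:
    """In-process cache whose entries expire after a few seconds, so repeated
    calls within one reconciliation hit memory instead of Redis."""

    def __init__(self, ttl: float = 5.0):
        self.ttl = ttl
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Expired: drop the entry and force a Redis round trip.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)
```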
---
Then your use case is caching within the lifecycle of a single request / task. That's a standard feature in ORMs / web frameworks to optimize performance: use point-in-time data for the duration of one request. We can definitely add it, just tie it to the execution context instead of using a TTL.
Even with a short TTL, stale data can lead to unexpected results for concurrent requests handled by different servers.
---
What do you mean by "tie to execution context"? I can't control the usage side of the cache. For sure, the memory cache supports the lifecycle of a request/task/process.
---
Yes, the memory cache starts and ends with the request/task. When we inject the cache service, we should make the memory cache instance-level so it is cleaned up automatically once the request/task is done. Same idea as https://docs.python.org/3/library/functools.html#functools.cached_property or an instance-level cache / lru_cache.
---
TBH, I don't know. The current cache backend implementation is a singleton; this also depends heavily on how FastAPI handles requests: asyncio vs processes vs threads vs uvicorn workers ...
---
We can control it in our own code:

```python
import functools

class Cache:
    def __init__(self, backend: CacheBackend):
        self.backend = backend
        self.get = functools.cache(self.backend.get)
        self.get_obj = functools.cache(self._get_obj)

    def _get_obj(self, key: str, cls: type[T]) -> T | None:
        ...
```

Cache is the only class used by the application. It includes all the methods for model load & dump, acting as an ORM layer with cached methods, and it is created as a simple instance variable.
CacheBackend only acts as an adaptor to the different cache backends, handling the raw response, and it can be a singleton.
By using functools.cache, it's thread-safe, and we can also easily export metrics like self.get.cache_info() (https://docs.python.org/3/library/functools.html#functools.lru_cache).