
📝 ADR-016: Two-Tier Cache Architecture#5332

Open
chassing wants to merge 1 commit into app-sre:master from chassing:APPSRE-12896/qontract-api-adr-16

Conversation


@chassing (Member) commented Dec 3, 2025

$TITLE

@chassing self-assigned this Dec 3, 2025
@chassing added the adr label Dec 3, 2025
Comment on lines +178 to +180
# Tier 1: Memory cache (99% hit - FAST!)
if self._memory_cache is not None and key in self._memory_cache:
return self._memory_cache[key]
Contributor

Copied from #5304 (comment):

This is only valid for an immutable cache: if a remote cache key stays the same but its value changes, we will serve stale data.

If the goal is to avoid deserialization and large string responses, we can add an etag to the cache value: either use two cache items (key and key:etag), or use a Redis hash to store both the etag and the content in one item.

The get flow will be:

  • get etag from remote cache
  • get content from local cache by etag
  • fallback to get remote cache content
  • sync local cache if missed

The set flow will be:

  • calculate etag of content
  • set both etag and content to local & remote

Member Author

But the memory cache has a very short TTL (max 5 seconds); it's just there to avoid Redis traffic and deserialization within a short time period, e.g. within one reconciliation call. For example, SlackAPI.get_users is called 3 times within a few seconds, and a simple memory cache speeds everything up.

Contributor

Then your use case is caching within the same lifecycle of a request / task. That's a standard feature in ORMs / web frameworks: using point-in-time data during one request to optimize performance. We can definitely add it, just not with a TTL; tie it to the execution context instead.

Even with a short TTL, stale data can lead to unexpected results for concurrent requests handled by different servers.

Member Author

What do you mean by "but tie to execution context"? I can't control the usage side of the cache. For sure, the memory cache supports the lifecycle of a request/task/process.

Contributor

Yes, the memory cache starts and ends with the request/task. When we inject the cache service, we should make the memory cache instance-level so it is cleaned up automatically after the request/task is done. Same idea as https://docs.python.org/3/library/functools.html#functools.cached_property or an instance-level cache / lru_cache.

Member Author

TBH, I don't know. The current cache backend implementation is a singleton; this also heavily depends on how FastAPI handles requests: asyncio vs. processes vs. threads vs. uvicorn workers ...

Contributor

We can control it in our own code:

import functools
from typing import TypeVar

T = TypeVar("T")

class Cache:
    def __init__(self, backend: CacheBackend):
        self.backend = backend
        # per-instance memoization: discarded together with the Cache instance
        self.get = functools.cache(self.backend.get)
        self.get_obj = functools.cache(self._get_obj)

    def _get_obj(self, key: str, cls: type[T]) -> T | None:
        ...

Cache is the only class used by the application. It includes all methods for model load & dump, acting as an ORM layer with cached methods. It is created as a simple instance variable.

CacheBackend only acts as an adapter to the different cache backends, handling the raw response, and it can be a singleton.

By using functools.cache, it's thread-safe, and we can also easily export metrics via self.get.cache_info() (https://docs.python.org/3/library/functools.html#functools.lru_cache).
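The metrics point can be shown standalone; fetch here is a hypothetical stand-in for a cached backend call, not code from this PR:

```python
import functools

# functools.cache exposes hit/miss counters via cache_info()
@functools.cache
def fetch(key: str) -> str:
    return f"value-for-{key}"

fetch("a")   # miss
fetch("a")   # hit
fetch("b")   # miss
stats = fetch.cache_info()  # CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
```

These counters could be scraped periodically and exported as cache hit-rate metrics.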

