Description
One thing we're struggling with at our organisation is the GitHub API rate limit. We've managed to get GitHub to raise it once or twice, but it's a struggle, and we're occasionally coming uncomfortably close to hitting it again.
I'm thinking about strategies to help with this.
I've had a decent look through Policy Bot's code and didn't find anywhere that unnecessary API requests are being made; it seems we're already sufficiently careful. Caching within an instance also works, as far as I can see.
One relevant detail: since the bot is stateless, it's no problem at all to run multiple replicas. That's a really great property which makes it much more pleasant to operate. Because we use it as a required check in CI, we can't tolerate much downtime, and replication is an easy way to reduce that risk. We currently run two replicas. The downside is that each replica has to build up its own cache; they can't share what they know.
So what I'm thinking about is adding some kind of cache persistence. Each instance (e.g. after a pod eviction) wouldn't start with a blank slate, and parallel instances would have a chance to share state. In the ideal case we'd only need one request per resource; with our two replicas, that would mean roughly 50% fewer requests.
Of course, this adds some complexity to the codebase and the config. As far as I can see, since we're using httpcache, it could likely be done by adding extra backends there, which is fairly self-contained. Still, they'd be in-tree and would need to be exposed in documentation.
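To make the "extra backends" idea concrete, here's a minimal sketch. It assumes the httpcache in question is gregjones/httpcache, whose `Cache` interface is just three methods; the interface is reproduced locally below so the sketch is self-contained, and `sharedCache` is a hypothetical stand-in for a real shared backend (Redis, memcached, etc.), not anything that exists in Policy Bot today.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache mirrors the three-method interface from gregjones/httpcache.
// A persistent/shared backend only needs to implement these methods.
type Cache interface {
	Get(key string) (responseBytes []byte, ok bool)
	Set(key string, responseBytes []byte)
	Delete(key string)
}

// sharedCache is a placeholder for a shared backend. Here it's an
// in-memory map guarded by a mutex; a real implementation would talk
// to Redis or similar so replicas could share responses.
type sharedCache struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newSharedCache() *sharedCache {
	return &sharedCache{data: make(map[string][]byte)}
}

func (c *sharedCache) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	b, ok := c.data[key]
	return b, ok
}

func (c *sharedCache) Set(key string, responseBytes []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = responseBytes
}

func (c *sharedCache) Delete(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.data, key)
}

func main() {
	// Keys in httpcache are request URLs; the URL here is illustrative.
	var cache Cache = newSharedCache()
	cache.Set("https://api.github.com/repos/org/repo", []byte("cached response"))
	if b, ok := cache.Get("https://api.github.com/repos/org/repo"); ok {
		fmt.Printf("hit: %s\n", b)
	}
}
```

Since the interface is this small, a new backend could live in its own package and be selected via config, which is part of why this feels fairly self-contained to me.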
So I thought I'd ask: does this sound like an acceptable direction? I'd love to hear any alternative approaches.