We have a large Bazel monorepo, and some CI jobs require a long initialization phase that takes up to 20-30 minutes before the tests actually run. After that, subsequent jobs picked up by the same runner can reuse the cache and run quite fast. The problem is that the cache is held in memory, so terminating runners and starting new ones discards it, and every fresh runner has to pay the initialization cost again.
A possible solution is to hibernate inactive runners instead of terminating them when scaling down. This would preserve the in-memory cache for an acceptable extra storage cost. When scaling up, a hibernated runner would be woken if one is available; otherwise a new runner would spin up.
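To make the proposal concrete, here is a minimal sketch of the decision logic only, with no cloud calls. The `Runner` and `State` types and the `scale_up` / `scale_down` helpers are hypothetical names for illustration and do not reflect the project's actual code.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RUNNING = "running"
    HIBERNATED = "hibernated"


@dataclass
class Runner:
    # Hypothetical model of a runner; not the project's real data structure.
    name: str
    state: State
    busy: bool = False


def scale_up(runners, needed):
    """Prefer waking hibernated runners; launch new ones only for the remainder.

    Returns (runners_to_wake, number_of_new_runners_to_launch).
    """
    needed = max(0, needed)
    hibernated = [r for r in runners if r.state is State.HIBERNATED]
    to_wake = hibernated[:needed]
    return to_wake, needed - len(to_wake)


def scale_down(runners, idle_target):
    """Pick idle running runners to hibernate (instead of terminate),
    keeping `idle_target` of them awake."""
    idle = [r for r in runners if r.state is State.RUNNING and not r.busy]
    excess = max(0, len(idle) - idle_target)
    return idle[:excess]
```

With one hibernated runner and a demand of three, this sketch would wake the hibernated runner and request two new ones. Which idle runner to hibernate first (oldest, least recently used, and so on) is left open here.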
Another possible approach would be very similar to issue #4033: scale up to the maximum, hibernate `runners_maximum_count - idleCount` runners, and then, on scale-up, wake some of the hibernated runners.
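The arithmetic of this second approach can be sketched as follows. The function names are hypothetical, and clamping `idle_count` to the maximum is an assumption on my side rather than existing behaviour.

```python
def initial_pool_split(runners_maximum_count, idle_count):
    """Split a fully scaled-up pool into (awake_idle, hibernated).

    Assumption: idle_count is clamped to the range [0, runners_maximum_count].
    """
    awake = max(0, min(idle_count, runners_maximum_count))
    hibernated = runners_maximum_count - awake
    return awake, hibernated


def wake_count(hibernated, demand):
    """Number of hibernated runners to wake for a given demand.

    Never more than are hibernated; the pool is already at its maximum,
    so no new runners are launched in this approach.
    """
    return min(hibernated, max(0, demand))
```

For example, with `runners_maximum_count = 10` and `idleCount = 2`, two runners stay awake and eight are hibernated; a demand for three more runners wakes three of those eight.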
Do you see any blockers or risks in implementing such a change?