@Diaphteiros
Contributor

What this PR does / why we need it:

In the ClusterProvider: Gardener, we add watches to clusters at runtime, which essentially works by calling Start(...) on that cluster's cache. This method blocks, so it has to run in a new goroutine, and it can return an error, so we cannot simply call go cache.Start(...), because then we would not notice if the watch failed.
This means we need slightly more complex logic to handle multiple goroutines. The errgroup package's Wait() method doesn't work either: it also blocks and returns an error, and it returns once all goroutines in the group have finished. That doesn't fit our use case, where having zero watches for some time is a valid state.

This PR adds a ThreadManager. It manages multiple goroutines and allows reacting when one of them stops. This enables the Gardener ClusterProvider to start multiple watches and to react to failing ones by logging the error and restarting them.

@Diaphteiros Diaphteiros requested a review from robertgraeff May 7, 2025 08:59
@Diaphteiros Diaphteiros merged commit acdeca4 into main May 7, 2025
7 checks passed
@Diaphteiros Diaphteiros deleted the threads branch May 7, 2025 10:45