Open
Description
This is what the RLS LB policy currently does:
- Create an RLS control channel:
  grpc-go/balancer/rls/balancer.go
  Line 362 in 7472d57
  `ctrlCh, err := newControlChannel(newCfg.lookupService, newCfg.controlChannelServiceConfig, newCfg.lookupServiceTimeout, b.bopts, backToReadyFn)`
- Start a goroutine to monitor the connectivity state of the channel:
  grpc-go/balancer/rls/control_channel.go
  Line 95 in 7472d57
  `go ctrlCh.monitorConnectivityState()`
- The goroutine waits for the channel to become READY:
  grpc-go/balancer/rls/control_channel.go
  Line 188 in 7472d57
  `cc.logger.Infof("Connectivity state is READY")`
- The next time the channel becomes READY again, it resets backoffs:
  grpc-go/balancer/rls/control_channel.go
  Line 202 in 7472d57
  `cc.backToReadyFunc()`
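This relies on a wrong assumption: that if the channel is READY and later becomes READY again, it must have gone through TRANSIENT_FAILURE in between. That is not guaranteed. For example, the channel can go READY → IDLE → CONNECTING → READY without ever entering TRANSIENT_FAILURE.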
What it should actually do:
- When the state transitions to TRANSIENT_FAILURE, record that transition
- The next time it transitions to READY, reset the backoff timeouts in all cache entries. Specifically, this means that it will reset the backoff state and cancel the pending backoff timer.
We should also update this test:
grpc-go/balancer/rls/balancer_test.go
Line 916 in 7472d57
`func (s) TestControlChannelConnectivityStateMonitoring(t *testing.T) {`