
Improper callback timing in leaderelection leads to the dual-leader #7350

@carp-chen

Description


Describe the bug

Under extreme concurrency caused by network latency or etcd performance issues, the operations of getting the lock, updating the locally observed record, and updating the lock are not atomic. This can result in temporary state inconsistencies:

  1. A node proactively exits the process due to a lease renewal timeout. After restarting, it fetches the lock information and considers itself the leader, updating its local observed state (triggering onStartLeading).
  2. However, since the lock has already been acquired by another node (a dual-leader scenario, where both nodes consider themselves the leader), updating the lock fails: the PATCH request returns a 409 Conflict. This exception is caught by the acquire method, which then waits for the next retry.
  3. Only on the next retry does the node discover the leader change (triggering onStopLeading).

The biggest difference between the leader election implementations in fabric8 and client-go lies in the timing of callback execution:

Java client (fabric8): The onStartLeading and onStopLeading callbacks are executed immediately when the local observed state is updated. In other words, as long as the local observed state changes, the callbacks are triggered, regardless of whether the lock is actually updated successfully.

Go client (client-go): The onStartLeading callback is executed only after the lock has been successfully updated and leadership has truly been acquired (i.e., after the acquire method is completed). The onStopLeading callback is triggered only when the renew phase times out or fails, right before the election process exits.

This difference means that the Java client may encounter the issue of "failing to update the lock but still considering itself the leader," whereas the Go client, due to its stricter callback timing, does not have this problem. The two orderings are contrasted in the sketch below.
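A minimal, self-contained Java sketch of the two orderings follows. It is illustrative only; `LockRecord`, `LockClient`, and the method names are hypothetical stand-ins, not the actual fabric8 or client-go implementations.

```java
import java.util.Objects;

// Illustrative sketch only: LockRecord and LockClient are hypothetical
// stand-ins, not fabric8 or client-go types.
class CallbackOrderingSketch {

  record LockRecord(String holderIdentity) {}

  interface LockClient {
    LockRecord get();                   // read the current leader record
    boolean update(LockRecord desired); // false models a 409 Conflict
  }

  private final LockClient lock;
  private final String myIdentity;
  private final Runnable onStartLeading;
  private LockRecord observed;

  CallbackOrderingSketch(LockClient lock, String myIdentity, Runnable onStartLeading) {
    this.lock = lock;
    this.myIdentity = myIdentity;
    this.onStartLeading = onStartLeading;
  }

  // fabric8-style ordering: the callback fires as soon as the locally
  // observed record says "I am the leader", before the lock update, so a
  // 409 on the update still leaves this node believing it is the leader.
  boolean fabric8StyleTryAcquireOrRenew(LockRecord desired) {
    LockRecord current = lock.get();                        // 1. read the lock
    if (current != null && Objects.equals(current.holderIdentity(), myIdentity)) {
      observed = current;                                   // 2. update observed state
      onStartLeading.run();                                 //    -> onStartLeading here
    }
    return lock.update(desired);                            // 3. may fail with 409
  }

  // client-go-style ordering: the callback fires only after the lock
  // update has actually succeeded, so there is no dual-leader window.
  boolean clientGoStyleTryAcquireOrRenew(LockRecord desired) {
    if (!lock.update(desired)) {
      return false;                                         // 409 -> not leader, no callback
    }
    observed = desired;
    onStartLeading.run();                                   // leadership is real at this point
    return true;
  }
}
```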

Fabric8 Kubernetes Client version

other (please specify in additional context)

Steps to reproduce

Election parameters (a configuration sketch using these values follows below):
leaseDuration=30
renewDeadline=20
retryPeriod=5
releaseOnCancel=false (enabling it reduces the probability of the issue above but does not completely prevent it)
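For reference, here is a configuration sketch with these parameters, roughly following fabric8's leader-election example. The namespace, lease name, election name, and identity are hypothetical, the durations are assumed to be seconds, and exact DSL entry points may vary slightly across client versions.

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.extended.leaderelection.LeaderCallbacks;
import io.fabric8.kubernetes.client.extended.leaderelection.LeaderElectionConfigBuilder;
import io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LeaseLock;

import java.time.Duration;
import java.util.UUID;

public class LeaderElectionRepro {
  public static void main(String[] args) {
    String identity = "holder-" + UUID.randomUUID();         // hypothetical identity
    try (KubernetesClient client = new KubernetesClientBuilder().build()) {
      client.leaderElector()
          .withConfig(new LeaderElectionConfigBuilder()
              .withName("example-election")                  // hypothetical name
              .withLeaseDuration(Duration.ofSeconds(30))
              .withRenewDeadline(Duration.ofSeconds(20))
              .withRetryPeriod(Duration.ofSeconds(5))
              .withReleaseOnCancel(false)
              .withLock(new LeaseLock("default", "example-lease", identity))
              .withLeaderCallbacks(new LeaderCallbacks(
                  () -> System.out.println("onStartLeading: " + identity),
                  () -> System.out.println("onStopLeading: " + identity),
                  newLeader -> System.out.println("new leader: " + newLeader)))
              .build())
          .build()
          .run();
    }
  }
}
```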

This issue can only be reproduced under extreme concurrency conditions. I believe the above explanation and timeline have made the situation clear.

Expected behavior

The timing of callback execution should follow the approach used in client-go.
The onStartLeading callback should not be triggered before the lock is actually acquired.
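To make the expected invariant concrete, the diagnostic sketch below (hypothetical names; it assumes a coordination.k8s.io/v1 Lease lock named example-lease in the default namespace) checks inside onStartLeading whether the Lease on the API server actually names this instance as the holder. With client-go-style timing this check should always pass; with the current fabric8 timing it can fail in the scenario described above.

```java
import io.fabric8.kubernetes.api.model.coordination.v1.Lease;
import io.fabric8.kubernetes.client.KubernetesClient;

public final class LeadershipCheck {

  private LeadershipCheck() {}

  // Diagnostic check, not a fix: when onStartLeading runs, the Lease on the
  // API server should already record this instance as the holder.
  public static boolean isRealLeader(KubernetesClient client, String identity) {
    Lease lease = client.leases()
        .inNamespace("default")        // hypothetical namespace
        .withName("example-lease")     // hypothetical lease name
        .get();
    return lease != null
        && lease.getSpec() != null
        && identity.equals(lease.getSpec().getHolderIdentity());
  }
}
```

Logging the result of this check from onStartLeading in the reproduction setup makes the dual-leader window visible.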

Runtime

Kubernetes (vanilla)

Kubernetes API Server version

other (please specify in additional context)

Environment

Linux

Fabric8 Kubernetes Client Logs

Additional context

Fabric8 Kubernetes Client version: 6.12.1
Kubernetes API Server version: 1.21

Labels

bug, Waiting on feedback (issues that require feedback from user/other community members)
