[automatic failover] Introduce fast failover mode - a thread-sync-free approach #4223
- HealthStatus manager with initial listener and registration logic - pluggable health checker strategy introduced; the drafts are NoOpStrategy, EchoStrategy, LagAwareStrategy - fix failing tests impacted by weighted clusters
- add echo to CommandObjects and UnifiedJedis - improve StrategySupplier by accepting JedisClientConfig - adapt EchoStrategy to StrategySupplier; it now handles connection creation by accepting endpoint and JedisClientConfig - make health checks disabled by default - drop NoOpStrategy - add unit & integration tests for health check
- clear redundant catch - replace failover options and drop failoveroptions class - remove forced_unhealthy from healthstatus - fix failback check - add disabled flag to cluster - update/fix related tests
Co-authored-by: Copilot <[email protected]>
- replace failback enabled with failbacksupported in client - fix formatting - set defaults
- fix failing tests
- fix failing tests
- introduce graceperiod - fix issue when CB is forced_open and gracePeriod is completed
… results during construction of provider - add HealthStatus.UNKNOWN as default for Cluster - handle status changes in order of events during initialization - add tests for status tracker and ordering of events - fix impacted unit & integration tests
- fix formatting
- downgrade logback version for slf4j compatibility - increase timeouts for faultInjector
…MultiClusterPooledConnectionProvider - add test for init and post init events - fix failing tests
- fix failing tests due to method name change
- fix broken EchoStrategy due to connection issue - make HealthCheckStrategy closable and close on - adding fast failover mode to config and provider - add local failover tests for total failover duration
- added builders to Connection and ConnectionFactory - introduce InitializationTracker to track list of connections during their construction.
Pull Request Overview
This PR introduces a comprehensive fast failover mechanism for the Jedis Redis client, providing thread-sync-free cluster switching with enhanced health monitoring and automatic failback capabilities.
Key Changes:
- Fast failover implementation - Forcibly disconnects old cluster connections during failover using `TrackingConnectionPool` for immediate traffic redirection
- Enhanced health monitoring system - Comprehensive health check strategies with configurable intervals, grace periods, and automatic status tracking
- Automatic failback mechanism - Periodic checks to return to higher-weighted healthy clusters with configurable intervals and grace periods
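
To make the first bullet concrete, here is a minimal sketch of a pool wrapper that tracks borrowed connections and force-closes them on failover. It is not the PR's `TrackingConnectionPool`: `SketchConnection`, `SketchTrackingPool`, and all of their methods are hypothetical names invented for illustration, and the real Jedis pooling classes are deliberately left out.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a client connection; not the Jedis Connection class. */
class SketchConnection {
    private final AtomicBoolean open = new AtomicBoolean(true);

    boolean isOpen() {
        return open.get();
    }

    /** Idempotent, so a forced disconnect and a normal close can both run. */
    void forceDisconnect() {
        open.set(false);
    }
}

/** Hypothetical pool wrapper that remembers every connection currently borrowed. */
class SketchTrackingPool {
    // Concurrent set, so borrow and giveBack do not take a shared lock.
    private final Set<SketchConnection> borrowed = ConcurrentHashMap.newKeySet();
    private volatile boolean healthy = true;

    SketchConnection borrow() {
        if (!healthy) {
            throw new IllegalStateException("cluster is unhealthy");
        }
        SketchConnection connection = new SketchConnection();
        borrowed.add(connection);
        // Re-check after registering: a failover may have started in between.
        if (!healthy) {
            borrowed.remove(connection);
            connection.forceDisconnect();
            throw new IllegalStateException("cluster is unhealthy");
        }
        return connection;
    }

    void giveBack(SketchConnection connection) {
        borrowed.remove(connection);
    }

    /** Marks the pool unhealthy and closes every tracked connection; returns how many were closed. */
    int forceDisconnectAll() {
        healthy = false;
        int closed = 0;
        for (SketchConnection connection : borrowed) {
            if (borrowed.remove(connection)) {
                connection.forceDisconnect();
                closed++;
            }
        }
        return closed;
    }
}
```

In this sketch a thread blocked on a tracked connection would see its socket closed and fail quickly, which is one way to redirect traffic without coordinating with the caller threads.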
Reviewed Changes
Copilot reviewed 56 out of 58 changed files in this pull request and generated 6 comments.
File | Description |
---|---|
MultiClusterPooledConnectionProvider.java | Core failover logic with health status management, weighted cluster selection, and periodic failback scheduling |
TrackingConnectionPool.java | Connection pool wrapper that tracks active connections and enables forced disconnection during failover |
mcf/*.java | Health check framework including status tracking, event management, and various health check strategies |
Test files | Comprehensive test coverage for failover scenarios, health checks, and integration testing with toxiproxy |
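
The weighted selection and failback behaviour described above could look roughly like the following. This is a sketch under stated assumptions, not the logic in `MultiClusterPooledConnectionProvider`: the names `SketchHealth`, `SketchCluster`, `SketchFailbackSelector`, and `pickFailbackTarget` are hypothetical, and the rule that a cluster stays ineligible until its grace period has elapsed is an assumed reading of "grace period".

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical health states; UNKNOWN is the default until a check reports a result. */
enum SketchHealth { UNKNOWN, HEALTHY, UNHEALTHY }

/** Hypothetical cluster record holding a weight, a health status, and a grace period deadline. */
class SketchCluster {
    final String name;
    final double weight;
    volatile SketchHealth health = SketchHealth.UNKNOWN;
    // Assumption: a cluster is not eligible again before this timestamp.
    volatile long gracePeriodEndsAtMillis = 0L;

    SketchCluster(String name, double weight) {
        this.name = name;
        this.weight = weight;
    }

    boolean isEligible(long nowMillis) {
        return health == SketchHealth.HEALTHY && nowMillis >= gracePeriodEndsAtMillis;
    }
}

class SketchFailbackSelector {
    /**
     * Returns the highest-weighted eligible cluster whose weight exceeds the
     * active cluster's weight, or empty when no failback should happen.
     */
    static Optional<SketchCluster> pickFailbackTarget(
            List<SketchCluster> clusters, SketchCluster active, long nowMillis) {
        return clusters.stream()
                .filter(c -> c != active)
                .filter(c -> c.isEligible(nowMillis))
                .filter(c -> c.weight > active.weight)
                .max(Comparator.comparingDouble(c -> c.weight));
    }
}
```

A periodic task (for example one scheduled on a `ScheduledExecutorService`) could call `pickFailbackTarget` at the configured failback interval and switch the active cluster whenever the result is present.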
- do not throw exception if failover is already happening
Closing this in favor of #4226.
This was an attempt to "failover immediately" while avoiding any thread synchronization operation or overhead, and to introduce the constructs needed to manage waiting/blocked threads. While this is still a valid and viable option, we also started tinkering with a way to make the core components more flexible, and decided to put more effort and courage into that. #4226 presents the way we chose to proceed for a "fast failover".
This PR is SUPERSEDED by #4226.
I decided to keep it open anyway, since I am uncertain whether this is the right moment to change the creational behaviour of central components.
This PR is based on changes in previous #4207.
Changes here should also be reviewed in comparison with #4220.
This is a thread-sync-free approach (compared to #4220) for failing fast with ongoing command executions and connection inits.
Summary of the changes in PR:
- `TrackingConnectionPool`
- `Cluster` health validation for borrowing cluster resource - throws exception when getting connection from unhealthy cluster
- `ConnectionPool` and `HealthCheckStrategy`
- `InitializationTracker` - to track list of connections during their construction phase
- `Connection` and `ConnectionFactory` - helping to set `InitializationTracker` for connections

Commits essential to this one are:
- introduce clusterSwitchEvent and drop clusterFailover post processor
- introduce fastfailover using objectMaker injection into connectionF…
- polish
- cleanup
- introduce TrackingConnectionPool with FailFastConnectionFactory
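
For reviewers, the idea of tracking connections during their construction phase, as listed in the summary, can be sketched as below. This is an illustrative guess at the shape of such a tracker, not the code in this PR: `SketchInitTracker`, `SketchPendingConnection`, `SketchConnectionBuilder`, and every method on them are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical connection that is still initializing (connect, handshake, auth). */
class SketchPendingConnection {
    private final AtomicBoolean aborted = new AtomicBoolean(false);

    /** In a real client this would close the underlying socket. */
    void abort() {
        aborted.set(true);
    }

    boolean isAborted() {
        return aborted.get();
    }
}

/** Hypothetical tracker for connections that have started but not finished construction. */
class SketchInitTracker {
    private final Set<SketchPendingConnection> pending = ConcurrentHashMap.newKeySet();

    void started(SketchPendingConnection connection) {
        pending.add(connection);
    }

    void finished(SketchPendingConnection connection) {
        pending.remove(connection);
    }

    /** Aborts every connection still under construction; returns how many were aborted. */
    int abortPending() {
        int count = 0;
        for (SketchPendingConnection connection : pending) {
            if (pending.remove(connection)) {
                connection.abort();
                count++;
            }
        }
        return count;
    }
}

/** Hypothetical builder that wires an optional tracker into each new connection. */
class SketchConnectionBuilder {
    private SketchInitTracker tracker;

    SketchConnectionBuilder tracker(SketchInitTracker tracker) {
        this.tracker = tracker;
        return this;
    }

    SketchPendingConnection build() {
        SketchPendingConnection connection = new SketchPendingConnection();
        if (tracker != null) {
            // Register before any blocking initialization would begin.
            tracker.started(connection);
        }
        return connection;
    }
}
```

In this sketch a failover handler would call `abortPending()` so that threads stuck in connection setup against the old cluster fail immediately instead of waiting for a timeout.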