|
1 | 1 | # Hitless Upgrades |
2 | 2 |
|
3 | | -Seamless Redis connection handoffs during topology changes without interrupting operations. |
| 3 | +Seamless Redis connection handoffs during cluster changes without dropping connections. |
4 | 4 |
|
5 | 5 | ## Quick Start |
6 | 6 |
|
7 | 7 | ```go |
8 | | -import "github.com/redis/go-redis/v9/hitless" |
9 | | - |
10 | | -opt := &redis.Options{ |
| 8 | +client := redis.NewClient(&redis.Options{ |
11 | 9 | Addr: "localhost:6379", |
12 | 10 | Protocol: 3, // RESP3 required |
13 | | - HitlessUpgrades: &redis.HitlessUpgradeConfig{ |
14 | | - Mode: hitless.MaintNotificationsEnabled, // or MaintNotificationsAuto |
| 11 | + HitlessUpgrades: &hitless.Config{ |
| 12 | + Mode: hitless.MaintNotificationsEnabled, |
15 | 13 | }, |
16 | | -} |
17 | | -client := redis.NewClient(opt) |
| 14 | +}) |
18 | 15 | ``` |
19 | 16 |
|
20 | 17 | ## Modes |
21 | 18 |
|
22 | | -- **`MaintNotificationsDisabled`**: Hitless upgrades are completely disabled |
23 | | -- **`MaintNotificationsEnabled`**: Hitless upgrades are forcefully enabled (fails if server doesn't support it) |
24 | | -- **`MaintNotificationsAuto`**: Hitless upgrades are enabled if server supports it (default) |
| 19 | +- **`MaintNotificationsDisabled`** - Hitless upgrades disabled |
| 20 | +- **`MaintNotificationsEnabled`** - Forcefully enabled (fails if the server doesn't support it) |
| 21 | +- **`MaintNotificationsAuto`** - Auto-detect server support (default) |
25 | 22 |
|
26 | 23 | ## Configuration |
27 | 24 |
|
28 | 25 | ```go |
29 | | -import ( |
30 | | - "github.com/redis/go-redis/v9/hitless" |
31 | | - "github.com/redis/go-redis/v9/logging" |
32 | | -) |
33 | | - |
34 | | -Config: &hitless.Config{ |
35 | | - Mode: hitless.MaintNotificationsAuto, // Notification mode |
36 | | - MaxHandoffRetries: 3, // Retry failed handoffs |
37 | | - HandoffTimeout: 15 * time.Second, // Handoff operation timeout |
38 | | - RelaxedTimeout: 10 * time.Second, // Extended timeout during migrations |
39 | | - PostHandoffRelaxedDuration: 20 * time.Second, // Keep relaxed timeout after handoff |
40 | | - LogLevel: logging.LogLevelWarn, // LogLevelError, LogLevelWarn, LogLevelInfo, LogLevelDebug |
41 | | - MaxWorkers: 15, // Concurrent handoff workers |
42 | | - HandoffQueueSize: 300, // Handoff request queue size |
| 26 | +&hitless.Config{ |
| 27 | + Mode: hitless.MaintNotificationsAuto, |
| 28 | + EndpointType: hitless.EndpointTypeAuto, |
| 29 | + RelaxedTimeout: 10 * time.Second, |
| 30 | + HandoffTimeout: 15 * time.Second, |
| 31 | + MaxHandoffRetries: 3, |
| 32 | + MaxWorkers: 0, // Auto-calculated |
| 33 | + HandoffQueueSize: 0, // Auto-calculated |
| 34 | + PostHandoffRelaxedDuration: 0, // 2 * RelaxedTimeout |
| 35 | + LogLevel: logging.LogLevelError, |
43 | 36 | } |
44 | 37 | ``` |
45 | 38 |
|
46 | | -### Worker Scaling |
47 | | -- **Auto-calculated**: `min(PoolSize/2, max(10, PoolSize/3))` - balanced scaling approach |
48 | | -- **Explicit values**: `max(PoolSize/2, set_value)` - enforces minimum PoolSize/2 workers |
49 | | -- **On-demand**: Workers created when needed, cleaned up when idle |
| 39 | +### Endpoint Types |
50 | 40 |
|
51 | | -### Queue Sizing |
52 | | -- **Auto-calculated**: `max(20 × MaxWorkers, PoolSize)` - hybrid scaling |
53 | | - - Worker-based: 20 handoffs per worker for burst processing |
54 | | - - Pool-based: Scales directly with pool size |
55 | | - - Takes the larger of the two for optimal performance |
56 | | -- **Explicit values**: `max(200, set_value)` - enforces minimum 200 when set |
57 | | -- **Capping**: Queue size capped by `MaxActiveConns+1` (if set) or `5 × PoolSize` for memory efficiency |
| 41 | +- **`EndpointTypeAuto`** - Auto-detect based on connection (default) |
| 42 | +- **`EndpointTypeInternalIP`** - Internal IP address |
| 43 | +- **`EndpointTypeInternalFQDN`** - Internal FQDN |
| 44 | +- **`EndpointTypeExternalIP`** - External IP address |
| 45 | +- **`EndpointTypeExternalFQDN`** - External FQDN |
| 46 | +- **`EndpointTypeNone`** - No endpoint (reconnect with current config) |
58 | 47 |
|
59 | | -**Examples (without MaxActiveConns):** |
60 | | -- Pool 10: Workers 5, Queue 100 (max(20×5, 10) = 100, capped at 5×10 = 50) |
61 | | -- Pool 100: Workers 33, Queue 660 (max(20×33, 100) = 660, capped at 5×100 = 500) |
62 | | -- Pool 200: Workers 66, Queue 1320 (max(20×66, 200) = 1320, capped at 5×200 = 1000) |
| 48 | +### Auto-Scaling |
63 | 49 |
|
64 | | -**Examples (with MaxActiveConns=150):** |
65 | | -- Pool 100: Workers 33, Queue 151 (max(20×33, 100) = 660, capped at MaxActiveConns+1 = 151) |
66 | | -- Pool 200: Workers 66, Queue 151 (max(20×66, 200) = 1320, capped at MaxActiveConns+1 = 151) |
| 50 | +- **Workers**: `min(PoolSize/2, max(10, PoolSize/3))` when auto-calculated |
| 51 | +- **Queue**: `max(20×Workers, PoolSize)`, capped by `MaxActiveConns+1` (if set) or otherwise `5×PoolSize` |
67 | 52 |
|
68 | | -## Notification Hooks |
| 53 | +**Examples:** |
| 54 | +- Pool 100: 33 workers; queue 660, capped at 500 (`5×PoolSize`) |
| 55 | +- Pool 100 + MaxActiveConns 150: 33 workers; queue 660, capped at 151 (`MaxActiveConns+1`) |
69 | 56 |
|
70 | | -Notification hooks allow you to monitor and customize hitless upgrade operations. The `NotificationHook` interface provides pre and post processing hooks: |
| 57 | +## How It Works |
71 | 58 |
|
72 | | -```go |
73 | | -type NotificationHook interface { |
74 | | - PreHook(ctx context.Context, notificationCtx push.NotificationHandlerContext, notificationType string, notification []interface{}) ([]interface{}, bool) |
75 | | - PostHook(ctx context.Context, notificationCtx push.NotificationHandlerContext, notificationType string, notification []interface{}, result error) |
76 | | -} |
77 | | -``` |
| 59 | +1. Redis sends push notifications about cluster changes |
| 60 | +2. Client creates new connections to updated endpoints |
| 61 | +3. Active operations transfer to new connections |
| 62 | +4. Old connections close gracefully |
78 | 63 |
|
79 | | -### Example: Metrics Collection Hook |
| 64 | +## Supported Notifications |
80 | 65 |
|
81 | | -A metrics collection hook is available in `example_hooks.go`: |
| 66 | +- `MOVING` - Slot moving to new node |
| 67 | +- `MIGRATING` - Slot in migration state |
| 68 | +- `MIGRATED` - Migration completed |
| 69 | +- `FAILING_OVER` - Node failing over |
| 70 | +- `FAILED_OVER` - Failover completed |
82 | 71 |
|
83 | | -```go |
84 | | -import "github.com/redis/go-redis/v9/hitless" |
| 72 | +## Hooks (Optional) |
85 | 73 |
|
86 | | -metricsHook := hitless.NewMetricsHook() |
87 | | -manager.AddNotificationHook(metricsHook) |
88 | | - |
89 | | -// Access metrics |
90 | | -metrics := metricsHook.GetMetrics() |
91 | | -``` |
92 | | - |
93 | | -### Example: Custom Logging Hook |
| 74 | +Monitor and customize hitless operations: |
94 | 75 |
|
95 | 76 | ```go |
96 | | -type CustomHook struct{} |
97 | | - |
98 | | -func (h *CustomHook) PreHook(ctx context.Context, notificationCtx push.NotificationHandlerContext, notificationType string, notification []interface{}) ([]interface{}, bool) { |
99 | | - // Log notification with connection details |
100 | | - if conn, ok := notificationCtx.Conn.(*pool.Conn); ok { |
101 | | - log.Printf("Processing %s on conn[%d]", notificationType, conn.GetID()) |
102 | | - } |
103 | | - return notification, true // Continue processing |
| 77 | +type NotificationHook interface { |
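| 78 | + PreHook(ctx context.Context, notificationCtx push.NotificationHandlerContext, notificationType string, notification []interface{}) ([]interface{}, bool) |
| 79 | + PostHook(ctx context.Context, notificationCtx push.NotificationHandlerContext, notificationType string, notification []interface{}, result error) |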
104 | 80 | } |
105 | 81 |
|
106 | | -func (h *CustomHook) PostHook(ctx context.Context, notificationCtx push.NotificationHandlerContext, notificationType string, notification []interface{}, result error) { |
107 | | - if result != nil { |
108 | | - log.Printf("Failed to process %s: %v", notificationType, result) |
109 | | - } |
110 | | -} |
| 82 | +// Add custom hook |
| 83 | +manager.AddNotificationHook(&MyHook{}) |
111 | 84 | ``` |
112 | 85 |
|
113 | | -The notification context provides access to: |
114 | | -- **Client**: The Redis client instance |
115 | | -- **Pool**: The connection pool |
116 | | -- **Conn**: The specific connection that received the notification |
117 | | -- **IsBlocking**: Whether the notification was received on a blocking connection |
118 | | - |
119 | | -Hooks can track: |
120 | | -- Handoff success/failure rates |
121 | | -- Processing duration |
122 | | -- Connection-specific metrics |
123 | | -- Custom business logic |
| 86 | +### Metrics Hook Example |
124 | 87 |
|
125 | | -## Requirements |
| 88 | +```go |
| 89 | +// Create metrics hook |
| 90 | +metricsHook := hitless.NewMetricsHook() |
| 91 | +manager.AddNotificationHook(metricsHook) |
126 | 92 |
|
127 | | -- **RESP3 Protocol**: Required for push notifications |
| 93 | +// Access collected metrics |
| 94 | +metrics := metricsHook.GetMetrics() |
| 95 | +fmt.Printf("Notification counts: %v\n", metrics["notification_counts"]) |
| 96 | +fmt.Printf("Processing times: %v\n", metrics["processing_times"]) |
| 97 | +fmt.Printf("Error counts: %v\n", metrics["error_counts"]) |
| 98 | +``` |