Commit 68fe0a9
Rework retry/timeout defaults to ensure fast service failover
When the galera pod that receives database traffic becomes
unresponsive, the galera library reacts by running a script
in one of the surviving pods to elect a new endpoint. This
script uses curl to call the API server to update the selector
object responsible for balancing database traffic.
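As a rough illustration of the kind of API call described above, the sketch below builds the URL and a JSON merge-patch body for retargeting a selector. It assumes the "selector object" is a Kubernetes Service. The namespace, Service name, label key, and function names are hypothetical and are not taken from the actual script.

```shell
# Hypothetical names throughout: the real script's variables, label key,
# and API endpoint handling may differ.
APISERVER="https://kubernetes.default.svc"

# Print the API URL of a Service: $1 = namespace, $2 = Service name.
service_url() {
    printf '%s/api/v1/namespaces/%s/services/%s\n' "$APISERVER" "$1" "$2"
}

# Print a JSON merge-patch body that points the selector at one pod.
# "example/active-pod" is an illustrative label key only. $1 = pod name.
selector_patch() {
    printf '{"spec":{"selector":{"example/active-pod":"%s"}}}\n' "$1"
}
```

Under these assumptions, the body would be sent with an HTTP PATCH request and a `Content-Type: application/merge-patch+json` header, authenticated with the pod's service account token.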
If the API server becomes unresponsive or unreachable during the API
call (e.g. the API VIP fails over to another master node), the curl call
might get stuck for an unbounded period of time, which delays the
traffic failover and can cause a long database service disruption.
Add a default connect timeout and update default retry parameters
so that curl is never blocked for too long, and the endpoint
configuration can be retried until the API server becomes available.
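A minimal sketch of what bounded curl defaults could look like is shown here. The numeric values, variable names, and the `bounded_curl` wrapper are assumptions for illustration; the values actually chosen by this commit are not reproduced. The curl options themselves (`--connect-timeout`, `--max-time`, `--retry`, `--retry-delay`, `--retry-connrefused`) are standard.

```shell
# Illustrative defaults only, not the values set by the commit.
CONNECT_TIMEOUT=5   # seconds allowed to establish the connection
MAX_TIME=10         # seconds allowed for one whole attempt
RETRIES=20          # number of retries after the first failed attempt
RETRY_DELAY=1       # seconds to wait between attempts

# Run curl so that no single attempt can block longer than MAX_TIME,
# and errors curl treats as transient (timeouts, refused connections)
# are retried. Extra arguments are passed through to curl unchanged.
bounded_curl() {
    curl --silent --show-error --fail \
        --connect-timeout "$CONNECT_TIMEOUT" \
        --max-time "$MAX_TIME" \
        --retry "$RETRIES" \
        --retry-delay "$RETRY_DELAY" \
        --retry-connrefused \
        "$@"
}
```

With these example values, each attempt is capped at 10 seconds, so a hung connection is abandoned and retried instead of blocking the failover script indefinitely.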
This commit only improves the default parameters; the ability to override
those parameters will be addressed in a subsequent commit.
Jira: OSPRH-17604
1 parent 381cb0b
1 file changed: +12 -7 lines changed
| Original file lines | New file lines | Change |
|---|---|---|
| 14-16 | 14-21 | 3 lines removed, 8 lines added |
| 69 | 74 | 1 line removed, 1 line added |
| 112-113 | 117-118 | 2 lines removed, 2 lines added |
| 135 | 140 | 1 line removed, 1 line added |