Commit 4943c93

craig[bot] and pav-kv committed
Merge #151573
151573: interactive_tests: wait for liveness upreplication r=rafiss a=pav-kv

This commit fixes `test_demo_node_cmds` so that it waits for the liveness range to have 5 voters before shutting down and decommissioning a node. This avoids loss of availability and test stalls in cases when the liveness range has < 3 voters and the killed node is one of them.

Fixes #147867
Fixes #149151 (via backport)

Co-authored-by: Pavel Kalinnikov <[email protected]>
2 parents 5ca6f39 + 015b29c commit 4943c93
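
The "< 3 voters" condition in the commit message follows from majority-quorum arithmetic. The sketch below is illustrative only and not part of the commit; the proc names `quorum` and `survives_one_loss` are hypothetical, and it assumes the usual Raft rule that a group of n voters needs floor(n/2) + 1 of them to make progress.

```tcl
# Illustrative only: majority quorum for a replication group with n voters.
proc quorum {voters} {
    # Integer division, so this is floor(n/2) + 1.
    return [expr {$voters / 2 + 1}]
}

# Returns 1 if the group still has a quorum after losing one voter, else 0.
proc survives_one_loss {voters} {
    return [expr {($voters - 1) >= [quorum $voters]}]
}

foreach n {1 2 3 5} {
    puts "voters=$n quorum=[quorum $n] survives_one_loss=[survives_one_loss $n]"
}
# voters=1 quorum=1 survives_one_loss=0
# voters=2 quorum=2 survives_one_loss=0
# voters=3 quorum=2 survives_one_loss=1
# voters=5 quorum=3 survives_one_loss=1
```

With 1 or 2 voters, killing any one of them leaves the range without a quorum, which is the stall the test now avoids by waiting for 5 voters first.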

1 file changed: +21 -1 lines changed

pkg/cli/interactive_tests/test_demo_node_cmds.tcl

Lines changed: 21 additions & 1 deletion
@@ -65,7 +65,7 @@ send "\\demo restart 3\r"
 eexpect "node 3 has been restarted"
 eexpect "defaultdb>"
 
-set timeout 1
+set timeout 2
 set stmt "select node_id, draining, membership from crdb_internal.kv_node_liveness ORDER BY node_id;\r"
 send $stmt
 expect {
@@ -90,6 +90,26 @@ eexpect "4 | f | active"
 eexpect "5 | f | active"
 eexpect "defaultdb>"
 
+# Wait for the liveness range to have the default 5 voters. If its replication
+# factor is too low, shutting down the node below can cause it to lose quorum
+# and stall the decommissioning command (example: #147867).
+set timeout 2
+set stmt "select range_id, array_length(voting_replicas,1) from crdb_internal.ranges where range_id=2;\r"
+send $stmt
+expect {
+    "2 | 5" {
+        puts "\rliveness range has 5 voters"
+    }
+    timeout {
+        puts "\rliveness range does not yet have 5 voters"
+        sleep 2
+        send $stmt
+        exp_continue
+    }
+}
+# Reset timeout back to 45 to match common.tcl.
+set timeout 45
+
 # Try decommissioning commands
 send "\\demo decommission 4\r"
 eexpect "node is draining"
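
The wait loop added in this diff re-sends the query on every timeout and has no retry cap of its own, so it relies on the surrounding test harness to end a run that never converges. As a rough sketch of the same poll-until-match idea with an explicit cap, here is a pure Tcl helper. It is not part of the commit: `poll_until` and `fake_query` are hypothetical names, and `fake_query` stands in for sending the SQL statement and reading its output.

```tcl
# Hypothetical helper, not from the commit: run `script` in the caller's scope
# up to max_attempts times, pausing delay_ms between tries. Returns the
# 1-based attempt number on which the output contained `needle`, or 0 if it
# never appeared.
proc poll_until {script needle max_attempts delay_ms} {
    for {set i 1} {$i <= $max_attempts} {incr i} {
        set out [uplevel 1 $script]
        # Literal substring search, so no glob characters need escaping.
        if {[string first $needle $out] >= 0} {
            return $i
        }
        after $delay_ms
    }
    return 0
}

# Stand-in for the real query: reports 3 voters twice, then 5 voters.
set calls 0
proc fake_query {} {
    global calls
    incr calls
    if {$calls < 3} {
        return "2 | 3"
    }
    return "2 | 5"
}

puts [poll_until {fake_query} "2 | 5" 5 1]
# 3
```

Returning the attempt count rather than a boolean lets a caller log how long upreplication took. A capped loop trades the simplicity of the committed version for a clearer failure when the range never reaches 5 voters.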