Commit c6a1ae7

Add test for SST progress monitoring variables (effyis#390)
This test validates that SST progress variables are properly exposed in SHOW STATUS during State Snapshot Transfer when a node joins a cluster.

The test focuses on verifying that all SST progress variables exist:
- cluster_<name>_node_state (donor/joiner/synced)
- cluster_<name>_sst_total (0-100 or dash when complete)
- cluster_<name>_sst_stage (stage name or dash)
- cluster_<name>_sst_stage_total (0-100 or dash)
- cluster_<name>_sst_tables (count or dash)
- cluster_<name>_sst_table (name or dash)

Test creates a 2-node cluster with a simple table, triggers SST via JOIN CLUSTER, and verifies that all progress variables are present in SHOW STATUS output regardless of SST completion timing.

Also adds watchdog = 0 to base searchd config with idempotent check.
1 parent 7877958 commit c6a1ae7
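
For readers who want to watch these counters outside the test, here is a minimal shell sketch that is not part of this commit. It assumes the `mysql` client is available and reuses the port (1306) and cluster name (sst_test) from the test; the helper names `status_value`, `is_progress_value`, and `sst_progress` are hypothetical. The accepted value format (a dash or 0-100) is taken from the commit message above.

```shell
# Hypothetical helpers for illustration; not part of the commit.

# Print the first "Value:" field from `mysql -e "...\G"` output read on stdin.
status_value() {
    sed -n 's/^[[:space:]]*Value:[[:space:]]*//p' | head -n 1
}

# Succeed if $1 looks like a progress value as described in the commit
# message: a dash (shown when SST is complete) or an integer from 0 to 100.
is_progress_value() {
    case "$1" in
        -) return 0 ;;
        '' | *[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -le 3 ] && [ "$1" -le 100 ]
}

# Query overall SST progress. $1 = SQL port, $2 = cluster name (assumed values).
sst_progress() {
    mysql -h0 -P"$1" -e "SHOW STATUS LIKE 'cluster_${2}_sst_total'\G" | status_value
}
```

With a running node, `sst_progress 1306 sst_test` would print the current value of cluster_sst_test_sst_total, and `is_progress_value "$(sst_progress 1306 sst_test)"` could serve as a stricter assertion than a match-anything pattern.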

2 files changed: +160 -0 lines changed


test/clt-tests/base/searchd-with-flexible-ports.conf

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ searchd {
 query_log_format = sphinxql
 query_log_commands = 1
 server_id = ${INSTANCE}
+watchdog = 0
 }
 
 EOF
Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
+––– comment –––
+Test for SST (State Snapshot Transfer) progress monitoring during table replication
+Issue: https://github.com/manticoresoftware/effyis/issues/390
+
+Problem: When adding a large table to cluster via ALTER CLUSTER ADD, there was no way
+to track replication progress. Users couldn't determine how much was copied or remaining.
+
+Solution: Added SST progress variables to SHOW STATUS:
+- cluster_<name>_sst_total: Overall progress (0-100)
+- cluster_<name>_sst_stage: Current stage (e.g., "send files")
+- cluster_<name>_sst_stage_total: Progress of current stage (0-100)
+- cluster_<name>_sst_tables: Total tables being transferred
+- cluster_<name>_sst_table: Current table being transferred (e.g., "3 (products)")
+- cluster_<name>_node_state: Node state (donor/joiner/synced)
+
+This test validates that SST progress variables exist in SHOW STATUS output.
+––– input –––
+apt-get update -y > /dev/null; echo $?
+––– output –––
+0
+––– input –––
+apt-get install -y iproute2 procps > /dev/null; echo $?
+––– output –––
+0
+––– comment –––
+Add watchdog = 0 to base config if not already present
+––– input –––
+grep -q "watchdog" test/clt-tests/base/searchd-with-flexible-ports.conf && echo "watchdog already exists" || sed -i '/^searchd {/,/^}/ s/^\([[:space:]]*\)}$/\1\twatchdog = 0\n\1}/' test/clt-tests/base/searchd-with-flexible-ports.conf; echo $?
+––– output –––
+#!/watchdog already exists|0/!#
+––– comment –––
+Start node 1
+––– input –––
+export INSTANCE=1
+––– output –––
+––– input –––
+mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
+––– output –––
+––– input –––
+stdbuf -oL searchd --logreplication -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
+––– output –––
+––– input –––
+if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
+––– output –––
+Buddy started!
+––– comment –––
+Create cluster on node 1
+––– input –––
+mysql -h0 -P1306 -e "CREATE CLUSTER sst_test"
+––– output –––
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_status'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_status
+Value: primary
+––– comment –––
+Create simple table and add to cluster
+––– input –––
+mysql -h0 -P1306 -e "CREATE TABLE test_table (id bigint, title text)"
+––– output –––
+––– input –––
+mysql -h0 -P1306 -e "ALTER CLUSTER sst_test ADD test_table"
+––– output –––
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_indexes'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_indexes
+Value: test_table
+––– comment –––
+Start node 2
+––– input –––
+export INSTANCE=2
+––– output –––
+––– input –––
+mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
+––– output –––
+––– input –––
+stdbuf -oL searchd --logreplication -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
+––– output –––
+––– input –––
+if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
+––– output –––
+Buddy started!
+––– comment –––
+Join node 2 to cluster - this triggers SST
+––– input –––
+mysql -h0 -P2306 -e "JOIN CLUSTER sst_test AT '127.0.0.1:1312'"
+––– output –––
+––– comment –––
+Verify SST progress variables exist on donor node
+The main goal is to check that these variables are available, not their specific values
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_node_state'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_node_state
+Value: #!/.+/!#
+––– comment –––
+Verify all 5 SST progress variables exist
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_sst_%'\G" | grep "Counter:" | wc -l
+––– output –––
+5
+––– comment –––
+Check that each individual SST variable exists
+Values will be "-" if SST completed, or numbers/stage names if still active
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_sst_total'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_sst_total
+Value: #!/.*/!#
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_sst_stage'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_sst_stage
+Value: #!/.*/!#
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_sst_stage_total'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_sst_stage_total
+Value: #!/.*/!#
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_sst_tables'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_sst_tables
+Value: #!/.*/!#
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_sst_table'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_sst_table
+Value: #!/.*/!#
+––– comment –––
+Verify both nodes are synced
+––– input –––
+bash -c 'end=$((SECONDS+60)); while [ $SECONDS -lt $end ]; do all_synced=true; for port in 1306 2306; do mysql -h0 -P$port -e "SHOW STATUS LIKE '\''cluster_sst_test_status'\''\G" > /tmp/status_$port.log 2>/dev/null && grep -q "Value: primary" /tmp/status_$port.log || { all_synced=false; break; }; done; if $all_synced; then for port in 1306 2306; do echo "Port $port: Node synced"; done; exit 0; fi; sleep 1; done; echo "Timeout waiting for nodes to sync!"; exit 1'
+––– output –––
+Port 1306: Node synced
+Port 2306: Node synced
+––– comment –––
+Verify table exists on both nodes
+––– input –––
+mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_indexes'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_indexes
+Value: test_table
+––– input –––
+mysql -h0 -P2306 -e "SHOW STATUS LIKE 'cluster_sst_test_indexes'\G"
+––– output –––
+*************************** 1. row ***************************
+Counter: cluster_sst_test_indexes
+Value: test_table
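
The count-based check in this test (expecting 5 `Counter:` lines) would not say which counter is missing if it ever failed. As a sketch only, and not part of the commit, a name-based check could look like the following. The helper name `missing_sst_vars` is hypothetical; the five counter suffixes are the ones listed in the test's comment block.

```shell
# Hypothetical helper for illustration; not part of the commit.

# Read `SHOW STATUS ... \G` output on stdin and print the name of every
# expected SST counter for cluster $1 that does not appear in it.
# The body runs in a subshell so its variables stay local.
missing_sst_vars() (
    output=$(cat)
    for suffix in sst_total sst_stage sst_stage_total sst_tables sst_table; do
        name="cluster_${1}_${suffix}"
        # Anchor at end of line so sst_table does not match sst_tables.
        printf '%s\n' "$output" | grep -q "Counter: ${name}\$" || echo "$name"
    done
)
```

Piping the test's own query into it, for example `mysql -h0 -P1306 -e "SHOW STATUS LIKE 'cluster_sst_test_sst_%'\G" | missing_sst_vars sst_test`, should print nothing when all five counters are present and one name per missing counter otherwise.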
