Commit cbd27c8

add a prompt when removing nodes from the cluster
1 parent 22ab0e5 commit cbd27c8

2 files changed: 23 additions, 1 deletion

bin/remove_nodes_prompt.txt

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Does your cluster run any file system like Ceph, NFS, etc. on the GPU/HPC nodes itself using local NVMe SSDs?
+If yes, terminating nodes which store your data can result in permanent data loss, so before proceeding make sure any important data is copied to a persistent file system outside of the cluster such as to object storage, file storage, etc.
+Once data is backed up or migrated, come back and run the script. Select 2 to exit.
+Remember, once the nodes are terminated, all the data is lost forever and you won't be able to recover it.

bin/resize.sh

Lines changed: 19 additions & 1 deletion
@@ -15,6 +15,12 @@ then
   exit
 fi
 
+if [ $USER != "ubuntu" ] && [ $USER != "opc" ]
+then
+  echo "Run this script as opc or ubuntu"
+  exit
+fi
+
 if [ $# -eq 0 ]
 then
   python3 $folder/resize.py --help
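
Review note on the user check in this hunk: `$USER` is unquoted, so if it is empty or unset (which can happen under cron or some non-login shells) the `[ ... ]` test may fail with a syntax error and the guard is skipped. The bare `exit` also returns the status of the preceding `echo`, which is 0. The following is a minimal sketch of a stricter variant, not the commit's code; the function name `is_allowed_user` is hypothetical, and `id -un` is used here as an assumption in place of `$USER`.

```shell
# Hypothetical helper: succeeds only for the two accounts the script expects.
is_allowed_user() {
  case "$1" in
    opc|ubuntu) return 0 ;;
    *) return 1 ;;
  esac
}

# Demonstration with fixed names, so the sketch has no side effects.
for candidate in opc ubuntu root; do
  if is_allowed_user "$candidate"; then
    echo "$candidate: allowed"
  else
    echo "$candidate: rejected"
  fi
done

# In the real script the guard might look like this (left as a comment):
# if ! is_allowed_user "$(id -un)"; then
#   echo "Run this script as opc or ubuntu" >&2
#   exit 1
# fi
```

Using `case` avoids the quoting pitfall entirely, and `exit 1` lets callers detect that the script refused to run.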
@@ -51,6 +57,18 @@ for (( i=1; i<=$#; i++)); do
   fi
 done
 
+if [ $resize_type == "remove" ] || [ $resize_type == "remove_unreachable" ]
+then
+  echo "$(cat $folder/remove_nodes_prompt.txt)"
+  echo "Do you confirm you have done all of the above steps and wish to proceed for the termination of the nodes? Enter 1 for Yes and 2 for No (to exit)."
+  select yn in "Yes" "No"; do
+    case $yn in
+      Yes ) break;;
+      No ) exit;;
+    esac
+  done
+fi
+
 if [ $resize_type != "default" ]
 then
   if [ $permanent -eq 0 ]
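
Review note on the confirmation prompt in this hunk: in bash, `select` ends its loop when it reads end-of-file, so if stdin is closed or the script runs non-interactively, control appears to fall through past `done` and the removal proceeds without an answer. The sketch below swaps `select` for a plain `read` loop that treats end-of-input as "No". It is an illustration under that assumption, not the commit's code, and the function name `confirm_removal` is hypothetical.

```shell
# Hypothetical helper: returns 0 only on an explicit "yes" answer.
confirm_removal() {
  while true; do
    printf 'Proceed with terminating the nodes? Enter 1 for Yes, 2 for No: ' >&2
    # End-of-input (for example a closed stdin) counts as a refusal.
    IFS= read -r answer || return 1
    case "$answer" in
      1|[Yy]|[Yy][Ee][Ss]) return 0 ;;
      2|[Nn]|[Nn][Oo]) return 1 ;;
      *) echo "Please enter 1 or 2." >&2 ;;
    esac
  done
}

# Demonstration with piped input instead of a terminal.
if printf '2\n' | confirm_removal 2>/dev/null; then
  echo "proceeding"
else
  echo "aborted"   # this branch runs for the answer "2"
fi
```

Failing closed matters here because the action being guarded is irreversible: any path that does not end in an explicit "Yes" should leave the nodes untouched.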
@@ -148,5 +166,5 @@ then
     rm currently_resizing
   fi
 else
-  python3 $folder/resize.py ${@}
+  python3 $folder/resize.py ${@} &
 fi
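
Review note on the last hunk: appending `&` runs `resize.py` in the background, so `resize.sh` returns immediately and the exit status of the Python process is no longer visible to the caller. The commit does not say why this was changed. If the status is still needed, one option is to record the PID and `wait` on it, as in the sketch below. `long_task` is a hypothetical stand-in for the `python3 $folder/resize.py` call. Separately, unquoted `${@}` is subject to word splitting, so arguments containing spaces would be broken apart; `"$@"` preserves them.

```shell
# Hypothetical stand-in for: python3 "$folder/resize.py" "$@"
long_task() {
  sleep 1
  return 3
}

long_task &        # returns immediately; the task keeps running
task_pid=$!        # PID of the most recent background job

echo "started background task"

# wait blocks until the job finishes and yields its exit status.
status=0
wait "$task_pid" || status=$?
echo "task exited with status $status"
```

Whether waiting is appropriate depends on why the call was backgrounded; if the intent is fire-and-forget, redirecting output to a log file would at least keep the result inspectable.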
