This repository was archived by the owner on Jun 30, 2021. It is now read-only.

Commit 3d41299

Fix race conditions while using many nodes with docker-compose
1 parent a726a36 commit 3d41299

10 files changed: +136 -70 lines changed

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -7,6 +7,30 @@ Note image ids also change after scm-source.json has being updated which trigger
77
###### To get container versions
88
docker exec grid versions
99

10+
## TBD_DOCKER_TAG
11+
+ Date: TBD_DATE
12+
+ Fix race conditions while using many nodes with docker-compose
13+
+ Image tag details:
14+
+ Selenium: vTBD_SELENIUM_VERSION (TBD_SELENIUM_REVISION)
15+
+ Chrome stable: TBD_CHROME_STABLE
16+
+ Firefox stable: TBD_FIREFOX_STABLE
17+
+ Chromedriver: TBD_CHROME_DRIVER (TBD_CHROMEDRIVER_COMMIT)
18+
+ Java: TBD_JAVA_VENDOR Java TBD_JAVA_BUILD
19+
+ Timezone: TBD_TIME_ZONE
20+
+ FROM ubuntu:UBUNTU_FLAVOR-UBUNTU_DATE
21+
+ Python: TBD_PYTHON_VERSION
22+
+ Sauce Connect TBD_SAUCE_CONNECT_VERS, build TBD_SAUCE_CONNECT_BUILD TBD_SAUCE_CONNECT_REVISION
23+
+ BrowserStack Local version TBD_BROWSER_STACK_VERSION
24+
+ Tested on kernel dev host: 4.4.0-29-generic x86_64
25+
+ Tested on kernel CI host: TBD_HOST_UNAME
26+
+ Built at dev host with: Docker version 1.11.2, build b9f10c9
27+
+ Built at CI host with: Docker version TBD_DOCKER_VERS, build TBD_DOCKER_BUILD
28+
+ Built at dev host with: Docker Compose version 1.7.1, build 0a9ab35
29+
+ Built at CI host with: Docker Compose version TBD_DOCKER_COMPOSE_VERS, build TBD_DOCKER_COMPOSE_BUILD
30+
+ Image size: TBD_IMAGE_SIZE
31+
+ Digest: TBD_DIGEST
32+
+ Image ID: TBD_IMAGE_ID
33+
1034
## 2.53.1a
1135
+ Date: 2016-06-30
1236
+ Upgrade Selenium to 2.53.1

CONTRIBUTING.md

Lines changed: 7 additions & 7 deletions
@@ -3,24 +3,24 @@
33
## Local
44
For pull requests or local commits:
55

6-
docker rm -vf grid; ./test/before_install_build && ./test/install && ./test/script && docker tag selenium:latest elgalu/selenium:latest
6+
./test/before_install_build && ./test/install && ./test/script && docker tag selenium:latest elgalu/selenium:latest
77
docker exec grid versions && ./test/after_script
88
open ./images/grid_console.png #to verify the versions are correct
99
git checkout ./images/grid_console.png && open ./videos/chrome/test.mkv
1010
travis lint #if you changed .travis.yml
11-
git checkout -b tmp-2.53.1a #name your branch according to your changes
11+
git checkout -b tmp-2.53.1b #name your branch according to your changes
1212
#git add ... git commit ... git push ... open pull request
1313

1414
For repository owners only:
1515

16-
git commit -m "Selenium 2.53.1 & Firefox 47.0.1 #102 & docker-compose #108 & Upgrade ubuntu 20160629"
16+
git commit -m "Fix race conditions while using many nodes with docker-compose"
1717
git tag -d latest #tag latest will be updated from TravisCI
18-
git tag 2.53.1a && git push origin tmp-2.53.1a && git push --tags
18+
git tag 2.53.1b && git push origin tmp-2.53.1b && git push --tags
1919

2020
-- Wait for Travis to pass OK
2121
-- Make sure changes got merged into master by elgalubot
2222

23-
git checkout master && git pull && git branch -d tmp-2.53.1a && git push origin --delete tmp-2.53.1a
23+
git checkout master && git pull && git branch -d tmp-2.53.1b && git push origin --delete tmp-2.53.1b
2424

2525
-- Re-add TBD_* section in CHANGELOG.md starting with TBD_DOCKER_TAG
2626
-- Upgrade release tag in github.com with latest CHANGELOG.md
@@ -37,9 +37,9 @@ Keep certain bins if chrome version changed for example:
3737
## Retry
3838
Failed in Travis? retry
3939

40-
git tag -d 2.53.1a && git push origin :2.53.1a
40+
git tag -d 2.53.1b && git push origin :2.53.1b
4141
#git add ...
42-
git commit --amend && git tag 2.53.1a && git push --force origin tmp-2.53.1a && git push --tags
42+
git commit --amend && git tag 2.53.1b && git push --force origin tmp-2.53.1b && git push --tags
4343

4444
## Docker push from Travis CI
4545
Travis [steps](https://docs.travis-ci.com/user/docker/#Pushing-a-Docker-Image-to-a-Registry) involve `docker login` and docker credentials encryptions.

Dockerfile

Lines changed: 7 additions & 2 deletions
@@ -911,8 +911,8 @@ ENV DEFAULT_SELENIUM_HUB_PORT="24444" \
911911
DEFAULT_NOVNC_PORT="26080" \
912912
DEFAULT_SSHD_PORT="22222" \
913913
DEFAULT_SAUCE_LOCAL_SEL_PORT="4445" \
914-
DEFAULT_SUPERVISOR_HTTP_PORT="29001" \
915-
DEFAULT_DISP_N="10"
914+
DEFAULT_SUPERVISOR_HTTP_PORT="29001"
915+
# DEFAULT_DISP_N="10"
916916

917917
# Commented for now; all these versions are still available at
918918
# https://github.com/elgalu/docker-selenium/releases/tag/2.47.1m
@@ -935,6 +935,10 @@ ENV FIREFOX_VERSION="${FF_VER}" \
935935
MEM_JAVA_PERCENT=80 \
936936
# Max amount of time to wait on Xvfb or Xmanager while retrying
937937
WAIT_FOREGROUND_RETRY="1s" \
938+
# Supervisor process retry attempts (0 means do not retry)
939+
XVFB_STARTRETRIES=0 \
940+
XMANAGER_STARTRETRIES=0 \
941+
XMANAGER_STARTSECS=0 \
938942
# Max amount of time to wait for other processes dependencies
939943
WAIT_TIMEOUT="25s" \
940944
SCREEN_WIDTH=1900 \
@@ -1043,6 +1047,7 @@ ENV FIREFOX_VERSION="${FF_VER}" \
10431047
VIDEO_CHUNKS_MAX=999 \
10441048
VIDEOS_DIR="${NORMAL_USER_HOME}/videos" \
10451049
# You can choose what X manager to use
1050+
# fluxbox | openbox
10461051
XMANAGER="fluxbox" \
10471052
# Sauce Labs tunneling. Naming is required: SAUCE_TUNNEL_ID
10481053
SAUCE_TUNNEL="false" \

bin/entry.sh

Lines changed: 41 additions & 31 deletions
@@ -172,7 +172,7 @@ fi
172172

173173
#----------------------------------------
174174
# Remove lock files, thanks @garagepoort
175-
clear_x_locks.sh
175+
# clear_x_locks.sh
176176

177177
#--------------------------------
178178
# Improve etc/hosts and fix dirs
@@ -195,47 +195,57 @@ function get_free_display() {
195195

196196
# Get a list of socket DISPLAYs already used
197197
netstat -nlp | grep -Po '(?<=\/tmp\/\.X11-unix\/X)([0-9]+)' | sort -u > /tmp/netstatX11.log
198-
#DEBUG: cat /tmp/netstatX11.log 1>&3
199-
200-
# -s file is not zero size
201-
if [ -s /tmp/netstatX11.log ]; then
202-
# important: while loops are executed in a subshell
203-
# var assignments will be lost unless using <<<
204-
# using 11.0 12.3 1.8 and so on didn't work, left as a reference
205-
# local pythonCmd="from random import shuffle;list1 = list(range($MAX_DISPLAY_SEARCH));shuffle(list1);list2 = [x/10 for x in list1];str_res = ' '.join(str(e) for e in list2);print (str_res)"
206-
local pythonCmd="from random import shuffle;list1 = list(range($MAX_DISPLAY_SEARCH));shuffle(list1);print (' '.join(str(e) for e in list1))"
207-
local displayNums=$(python -c "${pythonCmd}")
208-
# Always find a free DISPLAY port starting with current DISP_N if it was provided
209-
[ "${DISP_N}" != "-1" ] && displayNums="${DISP_N} ${displayNums}"
210-
IFS=' ' read -r -a arrayDispNums <<< "$displayNums"
211-
for find_display_num in ${arrayDispNums[@]}; do
212-
# read -r Do not treat a backslash character in any special way.
213-
# Consider each backslash to be part of the input line.
198+
[ ! -s /tmp/netstatX11.log ] && echo "-- INFO: Empty file /tmp/netstatX11.log" 1>&3
199+
200+
# important: while loops are executed in a subshell
201+
# var assignments will be lost unless using <<<
202+
# using 11.0 12.3 1.8 and so on didn't work, left as a reference
203+
# local pythonCmd="from random import shuffle;list1 = list(range($MAX_DISPLAY_SEARCH));shuffle(list1);list2 = [x/10 for x in list1];str_res = ' '.join(str(e) for e in list2);print (str_res)"
204+
local pythonCmd="from random import shuffle;list1 = list(range($MAX_DISPLAY_SEARCH));shuffle(list1);print (' '.join(str(e) for e in list1))"
205+
local displayNums=$(python -c "${pythonCmd}")
206+
# Always find a free DISPLAY port starting with current DISP_N if it was provided
207+
[ "${DISP_N}" != "-1" ] && displayNums="${DISP_N} ${displayNums}"
208+
IFS=' ' read -r -a arrayDispNums <<< "$displayNums"
209+
for find_display_num in ${arrayDispNums[@]}; do
210+
# read -r Do not treat a backslash character in any special way.
211+
# Consider each backslash to be part of the input line.
212+
# -s file is not zero size
213+
if [ -s /tmp/netstatX11.log ]; then
214214
while read read_disp_num ; do
215215
if [ "${read_disp_num}" = "${find_display_num}" ]; then
216-
echo "-- WARN: DISPLAY=${find_display_num} is taken, searching for another..." 1>&3
216+
echo "-- WARN: DISPLAY=:${find_display_num} is taken, searching for another..." 1>&3
217217
selected_disp_num="-1"
218218
break
219219
elif [ "${selected_disp_num}" = "-1" ]; then
220220
selected_disp_num="${find_display_num}"
221-
echo "-- INFO: Possible free DISPLAY=${find_display_num}" 1>&3
221+
echo "-- INFO: Possible free DISPLAY=:${find_display_num}" 1>&3
222222
# echo "-- DEBUG: Updated selected_disp_num=$selected_disp_num" 1>&3
223223
# echo "-- DEBUG: WAS read_disp_num=$read_disp_num" 1>&3
224224
fi
225225
done <<< "$(cat /tmp/netstatX11.log)"
226-
# echo "-- DEBUG: selected_disp_num=$selected_disp_num" 1>&3
227-
[ "${selected_disp_num}" != "-1" ] && break
228-
# echo "-- DEBUG: find_display_num=$find_display_num" 1>&3
229-
if [ ${find_display_num} -gt ${MAX_DISPLAY_SEARCH} ]; then
230-
echo "-- ERROR: Entered in an infinite loop at $0 after netstat" 1>&2 1>&3
231-
break
226+
else
227+
# echo "-- INFO: Emtpy file /tmp/netstatX11.log" 1>&3
228+
# selected_disp_num="${DEFAULT_DISP_N}"
229+
selected_disp_num="${find_display_num}"
230+
fi
231+
if [ "${selected_disp_num}" != "-1" ]; then
232+
# If we can already use that display it means there is already some
233+
# Xvfb there which means we need to keep looking for a free one.
234+
export DISPLAY=":${find_display_num}"
235+
if xsetroot -cursor_name left_ptr -fg white -bg black > /dev/null 2>&1; then
236+
echo "-- WARN: DISPLAY=:${find_display_num} is already being used, skip it..." 1>&3
237+
selected_disp_num="-1"
232238
fi
233-
done
234-
else
235-
echo "-- INFO: Emtpy file /tmp/netstatX11.log" 1>&3
236-
selected_disp_num="${DEFAULT_DISP_N}"
237-
fi
238-
[ "${selected_disp_num}" = "-1" ] || echo "-- INFO: Found free DISPLAY=${selected_disp_num}" 1>&3
239+
fi
240+
if [ "${selected_disp_num}" != "-1" ]; then
241+
break
242+
elif [ ${find_display_num} -gt ${MAX_DISPLAY_SEARCH} ]; then
243+
echo "-- ERROR: Entered in an infinite loop at $0 after netstat" 1>&2 1>&3
244+
selected_disp_num="-1"
245+
break
246+
fi
247+
done
248+
[ "${selected_disp_num}" = "-1" ] || echo "-- INFO: Found free DISPLAY=:${selected_disp_num}" 1>&3
239249

240250
echo ${selected_disp_num}
241251
}
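
The reworked `get_free_display()` walks a shuffled list of candidate display numbers and rejects any that already appear in the netstat output. A minimal sketch of just that selection step follows. It assumes a hypothetical helper named `pick_free_display` and a plain file with one used display number per line. It deliberately leaves out the `xsetroot` probe and the logging to fd 3 that the real function performs.

```shell
#!/bin/sh
# pick_free_display USED_FILE CANDIDATE...
# Hypothetical helper (not part of the commit): prints the first candidate
# that is not listed in USED_FILE, or -1 when every candidate is taken.
pick_free_display() {
  pfd_used_file="$1"
  shift
  for pfd_candidate in "$@"; do
    # -x matches whole lines only, -F treats the candidate as a fixed string.
    # A missing or empty file means nothing is in use yet.
    if ! grep -qxF "$pfd_candidate" "$pfd_used_file" 2>/dev/null; then
      echo "$pfd_candidate"
      return 0
    fi
  done
  echo "-1"
  return 1
}
```

A call such as `pick_free_display /tmp/netstatX11.log 10 11 12` would print the first of those numbers that is absent from the file. Checking the list alone still leaves a window between the check and the moment Xvfb binds the socket, which is presumably why the commit adds a second probe.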

test/compose-test.sh

Lines changed: 13 additions & 7 deletions
@@ -23,14 +23,20 @@ die () {
2323
[ -z "${SELENIUM_HUB_PORT}" ] && die "Required env var SELENIUM_HUB_PORT"
2424
[ -z "${WAIT_ALL_DONE}" ] && export WAIT_ALL_DONE="40s"
2525

26+
SLEEP_LOCALLY=0
27+
SLEEP_TRAVIS=3
28+
29+
# Ensure clean
30+
docker-compose -p selenium down || true
31+
2632
# Compose up!
2733
docker-compose -p selenium scale hub=1 chrome=${NUM_NODES} firefox=${NUM_NODES}
2834

2935
# FIXME: We still need to wait a bit because the nodes registration is not
3036
# being waited on wait_all_done script :(
3137
# maybe related to issue #83
32-
sleep 3
33-
[ "${TRAVIS}" = "true" ] && sleep 3
38+
sleep ${SLEEP_LOCALLY}
39+
[ "${TRAVIS}" = "true" ] && sleep ${SLEEP_TRAVIS}
3440

3541
# Wait then show errors, if any
3642
if ! docker exec selenium_hub_1 wait_all_done ${WAIT_ALL_DONE}; then
@@ -55,8 +61,8 @@ done
5561
# FIXME: We still need to wait a bit because the nodes registration is not
5662
# being waited on wait_all_done script :(
5763
# maybe related to issue #83
58-
sleep 4
59-
[ "${TRAVIS}" = "true" ] && sleep 4
64+
sleep ${SLEEP_LOCALLY}
65+
[ "${TRAVIS}" = "true" ] && sleep ${SLEEP_TRAVIS}
6066

6167
# Tests can run anywhere, in the hub, in the host, doesn't matter
6268
for i in $(seq 1 ${PARAL_TESTS}); do
@@ -68,8 +74,8 @@ for i in $(seq 1 ${PARAL_TESTS}); do
6874
done
6975

7076
# sleep a moment to let the UI tests start
71-
sleep 4
72-
[ "${TRAVIS}" = "true" ] && sleep 4
77+
sleep ${SLEEP_LOCALLY}
78+
[ "${TRAVIS}" = "true" ] && sleep ${SLEEP_TRAVIS}
7379

7480
# not so verbose from here
7581
set +x
@@ -88,7 +94,7 @@ for i in $(seq 1 ${NUM_NODES}); do
8894
done
8995

9096
# Cleanup
91-
docker-compose down
97+
docker-compose -p selenium down
9298

9399
# Results
94100
if [ "$FAIL_COUNT" == "0" ]; then

test/install_grid

Lines changed: 3 additions & 0 deletions
@@ -3,6 +3,9 @@
33
# set -e: exit asap if a command exits with a non-zero status
44
set -e
55

6+
# Ensure clean env
7+
docker rm -vf grid || true
8+
69
docker run --name=grid -d -e VIDEO=true -v /dev/shm:/dev/shm selenium
710
docker exec grid wait_all_done 40s
811
docker exec grid versions

xmanager/bin/start-xmanager.sh

Lines changed: 29 additions & 19 deletions
@@ -23,8 +23,17 @@ die () {
2323
# Wait for this process dependencies
2424
timeout --foreground ${WAIT_TIMEOUT} wait-xvfb.sh
2525

26+
function shutdown {
27+
echo "Trapped SIGTERM/SIGINT so shutting down $0 gracefully..."
28+
exit 0
29+
}
30+
31+
# Run shutdown() when this process receives a termination signal (SIGKILL cannot be trapped)
32+
trap shutdown SIGTERM SIGINT
33+
2634
function start_fluxbox() {
27-
fluxbox -display ${DISPLAY} -verbose \
35+
# http://stackoverflow.com/a/21028200/511069
36+
fluxbox -display "${DISPLAY}.0" -verbose \
2837
1> "${LOGS_DIR}/fluxbox-tryouts-stdout.log" \
2938
2> "${LOGS_DIR}/fluxbox-tryouts-stderr.log" &
3039
}
@@ -37,25 +46,26 @@ elif [ "${XMANAGER}" = "fluxbox" ]; then
3746
i=0
3847
stat_failed=true
3948
while true ; do
40-
let i=${i}+1
41-
if ! start_fluxbox; then
42-
echo "-- WARN: start_fluxbox() failed!" 1>&3
43-
fi
44-
timeout --foreground ${WAIT_TIMEOUT} wait-xvfb.sh
45-
if timeout --foreground "${WAIT_FOREGROUND_RETRY}" wait-xmanager.sh &> "${LOGS_DIR}/wait-xmanager-stdout.log"; then
46-
stat_failed=false
47-
break
48-
else
49-
echo "-- WARN: wait-xmanager.sh failed! for DISPLAY=${DISPLAY}" 1>&3
50-
killall fluxbox || true
51-
fi
52-
if [ ${i} -gt 10 ]; then
53-
echoerr "-- ERROR: Failed to start Fluxbox at $0 after many retries."
54-
break
55-
fi
49+
while true ; do
50+
let i=${i}+1
51+
if ! start_fluxbox; then
52+
echo "-- WARN: start_fluxbox() failed!" 1>&3
53+
fi
54+
if timeout --foreground "${WAIT_FOREGROUND_RETRY}" wait-xmanager.sh &> "${LOGS_DIR}/wait-xmanager-stdout.log"; then
55+
stat_failed=false
56+
break
57+
else
58+
echo "-- WARN: wait-xmanager.sh failed! for DISPLAY=${DISPLAY}" 1>&3
59+
killall fluxbox || true
60+
fi
61+
if [ ${i} -gt 3 ]; then
62+
echoerr "-- ERROR: Failed to start Fluxbox at $0 after many retries."
63+
break
64+
fi
65+
done
66+
[ "${stat_failed}" != "true" ] && wait
67+
stat_failed=true
5668
done
57-
[ "${stat_failed}" = "true" ] && die "Failed to start_fluxbox()."
58-
wait
5969
else
6070
die "The chosen X manager is not supported: '${XMANAGER}'"
6171
fi
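
The `shutdown` handler added to `start-xmanager.sh` follows the usual trap pattern for supervised processes. The sketch below illustrates that pattern in isolation, using hypothetical names (`demo_trap`, `on_signal`). A child shell installs a handler, sends itself SIGTERM, and exits from the handler. A handler only ever sees catchable signals such as TERM and INT, because the kernel never delivers KILL or STOP to user code.

```shell
#!/bin/sh
# demo_trap: hypothetical demonstration, not part of the commit.
# Starts a child shell that traps TERM/INT and then signals itself.
demo_trap() {
  sh -c '
    on_signal() {
      echo "trapped, shutting down gracefully"
      exit 0
    }
    # Only catchable signals are listed; KILL and STOP cannot be trapped.
    trap on_signal TERM INT
    kill -TERM "$$"
    # The handler is expected to run and exit before this line.
    echo "not reached"
  '
}
```

Calling `demo_trap` should print the handler message and return status 0, which is the behaviour a supervisor expects from a clean stop.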

xmanager/etc/supervisor/conf.d/xmanager.conf

Lines changed: 2 additions & 2 deletions
@@ -29,12 +29,12 @@ autorestart=false
2929
;particular amount of time.
3030
;So using custom wait-xxxx.sh scripts to perform a more efficient
3131
;active waiting until https://github.com/Supervisor/supervisor/issues/584
32-
startsecs=0
32+
startsecs=%(ENV_XMANAGER_STARTSECS)s
3333

3434
;The number of serial failure attempts that supervisord will allow when
3535
;attempting to start the program before giving up and puting the process
3636
;into an FATAL state.
37-
startretries=0
37+
startretries=%(ENV_XMANAGER_STARTRETRIES)s
3838

3939
;Logs
4040
redirect_stderr=false

xterm/bin/start-xterm.sh

Lines changed: 9 additions & 1 deletion
@@ -1,8 +1,12 @@
11
#!/usr/bin/env bash
22

3+
# Open new file descriptors that redirect to stdout/stderr
4+
exec 3>&1
5+
exec 4>&2
6+
37
# echo fn that outputs to stderr http://stackoverflow.com/a/2990533/511069
48
echoerr() {
5-
cat <<< "$@" 1>&2;
9+
cat <<< "$@" 1>&4;
610
}
711

812
# print error and exit
@@ -26,6 +30,9 @@ shutdown () {
2630
die "Some processes failed to start so quitting."
2731
}
2832

33+
# clean status file
34+
echo "" > ${DOCKER_SELENIUM_STATUS}
35+
2936
# timeout runs the given command and kills it if it is still running
3037
# after the specified time interval:
3138
# http://www.gnu.org/software/coreutils/manual/coreutils.html#timeout-invocation
@@ -79,5 +86,6 @@ x-terminal-emulator -ls \
7986
# Join them in 1 bash line to avoid supervisor split them in debug output
8087
# this output is used to signal docker-selenium is ready for testing
8188
echo -e "\nContainer docker internal IP: $CONTAINER_IP\n"
89+
echo -e "\nContainer docker internal IP: $CONTAINER_IP\n" 1>&3
8290

8391
wait
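
Several scripts in this commit write log lines with `1>&3` after running `exec 3>&1`. A minimal sketch of why that helps, assuming a hypothetical function `pick_value`: when a function's stdout is captured by command substitution, anything it logs on stdout would end up in the captured value, while output sent to the duplicated descriptor 3 still reaches the original stdout.

```shell
#!/bin/sh
# Duplicate the current stdout onto fd 3 so log lines can bypass
# command substitution.
exec 3>&1

# pick_value is a hypothetical function: it logs on fd 3 and
# returns its result on stdout.
pick_value() {
  echo "-- INFO: choosing a value" 1>&3
  echo "42"
}

# Only the result line is captured; the INFO line goes to the terminal.
value=$(pick_value)
echo "captured: ${value}"
```

The same arrangement lets `get_free_display()` in `bin/entry.sh` print WARN and INFO messages while its caller reads only the selected display number.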

xvfb/etc/supervisor/conf.d/xvfb.conf

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ startsecs=0
3737
;The number of serial failure attempts that supervisord will allow when
3838
;attempting to start the program before giving up and puting the process
3939
;into an FATAL state.
40-
startretries=0
40+
startretries=%(ENV_XVFB_STARTRETRIES)s
4141

4242
;Logs
4343
redirect_stderr=false
