Commit 3e0afba

(tasks) Provide retry logic for install_package_file
I have seen some failures in kvm_automation_tooling CI where attempting to install the openvox8-release package with rpm fails because the rpm lock is held by some other process during vm initialization:

    2025-05-07T00:18:42 [INFO]: Executing: rpm -Uvh /tmp/tmp.jkHsr7JPYU/openvox8-release-el-9.noarch.rpm
    error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)

Package managers like dnf, yum, apt, etc. already handle waiting for the locks, but direct use of rpm or dpkg can run into these problems.

This patch wraps the use of rpm and dpkg (for manually installing the downloaded openvox release file) with a little retry logic. It only fires if the command fails and the error output matches a regex for a lock file error. This is brittle, but if the match fails, the default behavior is as if no retries were done. This is safest, since failures outside of a lock conflict are most likely for a good reason.
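As a rough illustration of the "only retry on a lock error" rule, the bash sketch below checks the rpm pattern used by this patch against the error line quoted above and against an unrelated failure. The helper name `is_lock_error` and the second sample message are hypothetical, introduced only for this example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the captured output matches the
# given regex, i.e. when the failure looks like a lock conflict.
is_lock_error() {
  local _output="$1"
  local _regex="$2"
  [[ "${_output}" =~ ${_regex} ]]
}

# Pattern used for rpm in this patch.
rpm_lock_regex='error.*rpm.*lock'

# Error line quoted in the commit message.
lock_msg="error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)"
# Hypothetical non-lock failure, for contrast.
other_msg="error: open of /tmp/missing.rpm failed: No such file or directory"

if is_lock_error "${lock_msg}" "${rpm_lock_regex}"; then
  echo "lock conflict: would retry"
fi
if ! is_lock_error "${other_msg}" "${rpm_lock_regex}"; then
  echo "other failure: would not retry"
fi
```

The regex variable is left unquoted on the right-hand side of `=~` so that bash treats it as a pattern instead of a literal string.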
1 parent 9c345a3 commit 3e0afba

1 file changed: +71 -5 lines changed
files/common.sh

@@ -44,22 +44,88 @@ exists() {
 
 # Log and execute a command in a subshell, capturing and echoing its
 # output before returning its exit status.
+#
+# Output is also captured in the variable EXEC_AND_CAPTURE_RESULT
+# for other functions to inspect.
 exec_and_capture() {
   local _cmd="$*"
 
   info "Executing: ${_cmd}"
 
-  local _result
   set +e
-  _result=$(${_cmd} 2>&1)
+  EXEC_AND_CAPTURE_RESULT=$(${_cmd} 2>&1)
   local _status=$?
   set -e
 
-  echo "${_result}"
+  echo "${EXEC_AND_CAPTURE_RESULT}"
   info "Status: ${_status}"
   return $_status
 }
 
+# If the passed command fails with output matching the given regex,
+# retry the command after delay up to given number of retries.
+#
+# Aborts retries if the command fails but output does not match the
+# given error regex.
+#
+# Returns the status of the last executed command.
+#
+# Args:
+#   $1 - retry count
+#   $2 - delay in seconds
+#   $3 - regex to match against command output
+#   Remaining args - command to execute
+with_retries_if() {
+  local _retries="$1"
+  local _delay="$2"
+  local _error_regex="$3"
+  shift 3
+
+  local _cmd="$*"
+  local _status
+
+  for ((i = 0; i < _retries; i++)); do
+    info "Attempt $((i + 1)) of $_retries: ${_cmd}"
+    exec_and_capture "${_cmd}"
+    _status=$?
+    if [[ "${_status}" == 0 ]]; then
+      break # command succeeded
+    else
+      if [[ "${EXEC_AND_CAPTURE_RESULT}" =~ ${_error_regex} ]]; then
+        info "Retrying in ${_delay} seconds..."
+        sleep "${_delay}"
+      else
+        info "Command failed but output did not match /${_error_regex}/. Aborting retries."
+        break
+      fi
+    fi
+  done
+
+  return "${_status}"
+}
+
+# Retries an rpm command if it fails with output that matches an rpm
+# lock error (rpm running in another process).
+#
+# All arguments given to the function are passed to the rpm command.
+#
+# NOTE: The higher level package managers (dnf, yum, apt, etc.)
+# already manage lock waits. This function is only used for the
+# specific case of manually installing the release package, which
+# can collide with other uses of rpm during vm initialization, for
+# example.
+rpm_with_retries() {
+  with_retries_if 5 5 'error.*rpm.*lock' rpm "$@"
+}
+
+# Variant of rpm_with_retries for dpkg.
+#
+# (I haven't seen a dpkg lock failure in CI, but the mechanism for
+# failure is the same.)
+dpkg_with_retries() {
+  with_retries_if 5 5 'error.*dpkg.*lock' dpkg "$@"
+}
+
 # Download the given url to the given local file path.
 download() {
   local _url="$1"
@@ -194,10 +260,10 @@ install_package_file() {
   info "Installing release package '${_package_file}' of type '${_package_type}'"
   case $_package_type in
     rpm)
-      exec_and_capture rpm -Uvh "$_package_file"
+      rpm_with_retries -Uvh "$_package_file"
       ;;
     deb)
-      exec_and_capture dpkg -i "$_package_file"
+      dpkg_with_retries -i "$_package_file"
       ;;
     *)
       fail "Unhandled package type: '${_package_type}'"
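
The retry behavior described in this commit can be exercised without a real package manager. The following is a simplified, self-contained bash sketch in the spirit of `with_retries_if`, not the patch's exact code. The names `retry_if_output_matches` and `fake_rpm`, the stand-in `info` logger, and the zero-second delay are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical stand-in for the logging helper used in files/common.sh.
info() { echo "[INFO]: $*" >&2; }

# Simplified retry loop: re-run the command only while it fails AND its
# output matches the given regex.
#   $1 - maximum number of attempts
#   $2 - delay in seconds between attempts
#   $3 - regex to match against command output
#   Remaining args - command to execute
retry_if_output_matches() {
  local _attempts="$1" _delay="$2" _regex="$3"
  shift 3
  local _output _status=0 _i

  for ((_i = 1; _i <= _attempts; _i++)); do
    info "Attempt ${_i} of ${_attempts}: $*"
    _status=0
    # '|| _status=$?' records a failure without tripping 'set -e'.
    _output=$("$@" 2>&1) || _status=$?
    echo "${_output}"
    if [[ "${_status}" -eq 0 ]]; then
      break # command succeeded
    fi
    if [[ ! "${_output}" =~ ${_regex} ]]; then
      info "Output did not match /${_regex}/; giving up."
      break
    fi
    if (( _i < _attempts )); then
      sleep "${_delay}"
    fi
  done
  return "${_status}"
}

# Hypothetical stub: fails with a lock-style error twice, then succeeds.
# The call counter lives in a file because the stub runs in a subshell.
_count_file=$(mktemp)
echo 0 > "${_count_file}"
fake_rpm() {
  local _n
  _n=$(( $(cat "${_count_file}") + 1 ))
  echo "${_n}" > "${_count_file}"
  if (( _n < 3 )); then
    echo "error: can't create transaction lock on /var/lib/rpm/.rpm.lock" >&2
    return 1
  fi
  echo "installed $*"
}

final_status=0
retry_if_output_matches 5 0 'error.*rpm.*lock' fake_rpm -Uvh example.rpm || final_status=$?
echo "final status: ${final_status}, calls made: $(cat "${_count_file}")"
```

With this stub the third attempt succeeds, so the last line reports a final status of 0 after 3 calls. Capturing the status with `|| _status=$?` keeps the loop running even when the calling script has `set -e` enabled, since a bare failing command would otherwise end the script before the status could be inspected.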
