Commit 8d94bf0

[build] Add migration retry logic to hue.sh
Updated the `run_syncdb_and_migrate_subcommands` function with retry logic. This is mostly to make startup resilient to migration processes running concurrently from multiple hosts. Django's `migrate` command runs in its own database transaction, so it is safe with respect to concurrency, but a concurrent run can still cause our startup script to fail. The retry mechanism waits 5 seconds and then retries, for up to 3 attempts in total. An earlier `migrate` run is not expected to take more than 5 seconds, so a subsequent `migrate` is either a no-op or completes the migration.
1 parent af4342d commit 8d94bf0

1 file changed: +32 -2 lines changed

tools/scripts/hue.sh

Lines changed: 32 additions & 2 deletions
@@ -157,9 +157,39 @@ function stop_previous_hueprocs() {
   rm -f /tmp/hue_${HUE_PORT}.pid
 }
 
+
+# Executes Django database migrations with retry logic to handle
+# concurrent migration attempts from multiple hosts.
 function run_syncdb_and_migrate_subcommands() {
-  "$HUE" makemigrations --noinput
-  "$HUE" migrate --fake-initial
+  # Run the initial command first, but allow it to fail gracefully
+  echo "INFO: Running --fake-initial to sync history for legacy databases..."
+  if ! $HUE migrate --fake-initial --noinput; then
+    echo "WARN: --fake-initial failed, but continuing with regular migration..."
+  fi
+
+  # Now, attempt the main migration in a retry loop.
+  local max_attempts=3
+  local delay_seconds=5
+
+  for ((attempt=1; attempt<=max_attempts; attempt++)); do
+    echo "INFO: Applying migrations (Attempt $attempt of $max_attempts)..."
+
+    # If the migrate command succeeds, we're done.
+    if $HUE migrate --noinput; then
+      echo "INFO: Migration successful."
+      return 0
+    fi
+
+    # If we've not reached the max attempts, wait and retry.
+    if [ $attempt -lt $max_attempts ]; then
+      echo "WARN: Migration failed, likely due to a temporary lock. Retrying in $delay_seconds seconds..."
+      sleep $delay_seconds
+    fi
+  done
+
+  # If the loop finishes, all attempts have failed.
+  echo "ERROR: All migration attempts failed after $max_attempts tries." >&2
+  exit 1
 }
 
 if [[ "$1" == "kt_renewer" ]]; then
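
The retry loop in this change could also be factored into a reusable helper. The following is only a sketch of that idea and is not part of this commit or of hue.sh: the function name `retry_with_delay` and its argument order are hypothetical, chosen for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of hue.sh): run a command up to
# max_attempts times, sleeping delay_seconds between failed attempts.
# Usage: retry_with_delay MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_with_delay() {
  local max_attempts=$1
  local delay_seconds=$2
  shift 2

  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Run the command with its arguments; stop on the first success.
    if "$@"; then
      return 0
    fi

    # Sleep only when another attempt remains.
    if ((attempt < max_attempts)); then
      echo "WARN: attempt $attempt of $max_attempts failed; retrying in ${delay_seconds}s" >&2
      sleep "$delay_seconds"
    fi
  done

  echo "ERROR: command failed after $max_attempts attempts" >&2
  return 1
}
```

With such a helper, the main migration step might be written as `retry_with_delay 3 5 "$HUE" migrate --noinput || exit 1`. The sketch returns 1 instead of calling `exit`, so the caller decides whether a final failure should terminate the whole script.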
