Description
During the cut-over operation, `gh-ost` issues a `lock tables` on the tables before they're renamed. After the rename, an `unlock tables` is issued to unlock the tables.
Today, if `gh-ost` pauses/freezes (the process remains running but is unresponsive due to a host problem) between the `lock tables` and `unlock tables`, the locks are not released. We can't yet explain what caused the host running `gh-ost` to essentially freeze execution, but we had this occur in production, and the locks were not released until the MySQL `wait_timeout` (for killing idle connections) expired.
This can theoretically be reproduced by:
- Adding a pause after the `lock tables` step in the cut-over (hand-wavy)
- Freezing the `gh-ost` process with `kill -TSTP [pid]` or `kill -STOP [pid]`
- Observing that the table locks are not released until `wait_timeout` expires (default 30 minutes in this setup; stock MySQL defaults to 28800 seconds)
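To make the "freeze" step concrete, here is a minimal Python sketch (POSIX only) that stops a stand-in child process with `SIGSTOP` and confirms it is still alive but not running. The child is just a sleeping Python interpreter, not `gh-ost`, and `freeze_and_check` is a hypothetical helper name used for illustration.

```python
import os
import signal
import subprocess
import sys


def freeze_and_check(seconds: float = 30.0) -> bool:
    """Start a stand-in child, SIGSTOP it, and report whether it is
    stopped while still being alive (the state described in this issue)."""
    child = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )
    try:
        # Equivalent of `kill -STOP [pid]`.
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid also report children that were stopped.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        # poll() returns None while the process has not exited.
        still_alive = child.poll() is None
        return stopped and still_alive
    finally:
        # Clean up the stand-in: resume it, then terminate and reap it.
        os.kill(child.pid, signal.SIGCONT)
        child.kill()
        child.wait()
```

A process in this state keeps its sockets open, so from the MySQL server's point of view the connection is merely idle, which is why only the idle timeout ends up releasing the locks.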
To address this, I plan to shorten the `wait_timeout` of the applier MySQL session during cut-over only, as this is the only time when a short idle timeout is advantageous. After the cut-over, the `wait_timeout` for the session will be restored to the server default.
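The plan could be sketched roughly as follows. This is an illustration in Python against a generic DB-API style cursor, not `gh-ost`'s actual Go code; `short_wait_timeout` and `RecordingCursor` are hypothetical names. As a simplifying assumption, the sketch restores whatever value the session had before, which equals the server default unless the session changed it earlier.

```python
from contextlib import contextmanager


@contextmanager
def short_wait_timeout(cursor, seconds):
    """Lower the session wait_timeout for the duration of the block,
    then restore the previous value even if the block raises."""
    cursor.execute("SELECT @@session.wait_timeout")
    original = int(cursor.fetchone()[0])
    cursor.execute(f"SET SESSION wait_timeout = {int(seconds)}")
    try:
        yield original
    finally:
        cursor.execute(f"SET SESSION wait_timeout = {original}")


class RecordingCursor:
    """Stand-in cursor (assumption for the demo): records statements
    instead of talking to a real MySQL server."""

    def __init__(self, session_timeout=1800):
        self.session_timeout = session_timeout
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)

    def fetchone(self):
        return (self.session_timeout,)


if __name__ == "__main__":
    cur = RecordingCursor()
    with short_wait_timeout(cur, 5):
        # The lock / rename / unlock steps of the cut-over would run here.
        pass
    print("\n".join(cur.statements))
```

With a short timeout in place, a frozen client is disconnected by the server after a few idle seconds, and the table locks held by that session are released along with the connection.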