cut-over locks not released when gh-ost pauses mid-cut-over #1407

@timvaillancourt

Description

During the cut-over operation, gh-ost issues a lock tables on the tables before they are renamed. After the rename, an unlock tables is issued to release the locks.

Today, if gh-ost pauses or freezes (the process remains running but is unresponsive due to a host problem) between the lock tables and the unlock tables, the locks are not released. We have not been able to explain what caused the host running gh-ost to essentially freeze execution, but this occurred in production, and the locks were not released until the MySQL wait_timeout (which kills idle connections) expired.

This theoretically can be reproduced by:

  1. Add a pause after the lock tables step in the cut-over (hand-wavy)
  2. Freeze the gh-ost process with kill -TSTP [pid] or kill -STOP [pid]
  3. Observe that the table locks are not released until wait_timeout expires (MySQL default: 28800 seconds, i.e. 8 hours)

To address this, I plan to shorten the wait_timeout of the applier MySQL session during cut-over only, as this is the only time when a short idle timeout is advantageous. After the cut-over, the wait_timeout for the session will be restored to the server default.
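A minimal sketch of that idea in Go, not the actual gh-ost change: the `sessionExecer` interface, `withShortWaitTimeout`, and `recordingConn` are hypothetical names made up for illustration, and the timeout value is left to the caller. It assumes the cut-over statements run on the same dedicated connection whose wait_timeout is changed.

```go
package main

import "fmt"

// sessionExecer is a hypothetical, minimal stand-in for a dedicated MySQL
// connection (for example, a *sql.Conn wrapped to match this signature).
// All statements must run on the same session for the timeout to apply.
type sessionExecer interface {
	Exec(query string) error
}

// withShortWaitTimeout lowers the session wait_timeout, runs cutOver, and
// then restores the session value from the server's global setting.
// If the process freezes inside cutOver, the restore never runs, but the
// server drops the idle session after `seconds`, which releases its locks.
func withShortWaitTimeout(conn sessionExecer, seconds int, cutOver func() error) (err error) {
	if seconds <= 0 {
		return fmt.Errorf("wait_timeout must be positive, got %d", seconds)
	}
	if err = conn.Exec(fmt.Sprintf("SET SESSION wait_timeout = %d", seconds)); err != nil {
		return fmt.Errorf("shortening wait_timeout: %w", err)
	}
	defer func() {
		// Restore even when cutOver fails; keep the first error seen.
		restoreErr := conn.Exec("SET SESSION wait_timeout = @@GLOBAL.wait_timeout")
		if err == nil && restoreErr != nil {
			err = fmt.Errorf("restoring wait_timeout: %w", restoreErr)
		}
	}()
	return cutOver()
}

// recordingConn is a test double that records every statement it receives,
// so the statement order can be inspected without a MySQL server.
type recordingConn struct {
	queries []string
}

func (c *recordingConn) Exec(query string) error {
	c.queries = append(c.queries, query)
	return nil
}
```

With a real connection, `cutOver` would wrap the lock tables, rename, and unlock tables steps. MySQL releases a session's table locks when that session ends, so a short wait_timeout bounds how long a frozen process can hold them.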

This work began in #1401 and is completed by #1406.
