@hubb hubb commented Jul 10, 2025

Description

Addresses cases where mysqld processes were not shutting down properly when vttestserver received SIGTERM signals, leading to orphaned processes and stale socket files.

Key improvements:

  • Raised the MySQL shutdown timeout from a fixed 60s to a configurable value (default 360s)
  • Added --mysql_shutdown_timeout CLI flag and MYSQL_SHUTDOWN_TIMEOUT env var
  • Implemented force shutdown with process cleanup (SIGTERM → SIGKILL)
  • Added automatic cleanup of stale socket and PID files
  • Enhanced error handling and logging for shutdown failures

The primary issue was a timeout mismatch: the vttest layer gave up after 60s, while MySQL's own shutdown timeout is 300s, so a slow but otherwise healthy shutdown was cancelled prematurely.

Changes:

  • go/vt/vttest/mysqlctl.go: Add TearDownWithTimeout() method
  • go/vt/vttest/local_cluster.go: Enhanced TearDown() with process cleanup
  • go/cmd/vttestserver/cli/main.go: Add --mysql_shutdown_timeout flag
  • docker/vttestserver/run.sh: Support MYSQL_SHUTDOWN_TIMEOUT env var
  • tools/debug_mysql_shutdown.sh: Debug script for troubleshooting
  • doc/MySQL_Shutdown_Fix.md: Comprehensive documentation

These fixes reduce the risk of data corruption and of container restart failures. The change is backward compatible, with improved defaults.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

@hubb hubb closed this Jul 21, 2025