Implement graceful shutdown with proper prediction completion
This implements a comprehensive graceful shutdown mechanism that waits for
in-flight predictions to complete before stopping runners and the service.
Key changes:
**Runner-level graceful shutdown:**
- Add shutdownWhenIdle atomic flag and readyForShutdown channel to Runner
- GracefulShutdown() signals runners to shut down when idle
- updateStatus() automatically closes readyForShutdown when becoming READY with no pending predictions
- Add nil check with warning for test compatibility
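
A minimal sketch of the runner-level mechanism described above, not the actual implementation. The `Status` type, `NewRunner`, `maybeSignalLocked`, and `isClosed` are hypothetical names introduced for illustration. The sketch also assumes that the nil check guards the `readyForShutdown` channel, and that `GracefulShutdown()` re-checks idleness immediately so an already-idle runner does not have to wait for its next status update.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Status is a hypothetical, simplified runner state.
type Status int

const (
	StatusBusy Status = iota
	StatusReady
)

// Runner is a simplified stand-in for the real Runner type.
type Runner struct {
	mu               sync.Mutex
	status           Status
	pending          int
	shutdownWhenIdle atomic.Bool
	readyForShutdown chan struct{}
	closeOnce        sync.Once
}

func NewRunner() *Runner {
	return &Runner{status: StatusReady, readyForShutdown: make(chan struct{})}
}

// GracefulShutdown asks the runner to shut down once it is idle.
func (r *Runner) GracefulShutdown() {
	r.shutdownWhenIdle.Store(true)
	r.mu.Lock()
	defer r.mu.Unlock()
	r.maybeSignalLocked() // assumption: an already-idle runner signals at once
}

// updateStatus records the new state and signals readiness if appropriate.
func (r *Runner) updateStatus(s Status, pending int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.status, r.pending = s, pending
	r.maybeSignalLocked()
}

// maybeSignalLocked closes readyForShutdown at most once. Caller holds r.mu.
func (r *Runner) maybeSignalLocked() {
	if r.readyForShutdown == nil {
		// Assumed purpose of the nil check: runners built directly in tests.
		fmt.Println("warning: readyForShutdown channel is nil")
		return
	}
	if r.shutdownWhenIdle.Load() && r.status == StatusReady && r.pending == 0 {
		r.closeOnce.Do(func() { close(r.readyForShutdown) })
	}
}

// isClosed reports whether the channel has been closed, without blocking.
func isClosed(ch chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	r := NewRunner()
	r.updateStatus(StatusBusy, 1) // one in-flight prediction
	r.GracefulShutdown()
	fmt.Println("closed while busy:", isClosed(r.readyForShutdown)) // false
	r.updateStatus(StatusReady, 0) // prediction finished
	fmt.Println("closed when idle:", isClosed(r.readyForShutdown)) // true
}
```

Guarding the close with `sync.Once` keeps repeated status updates after shutdown from panicking on a double close.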
**Handler-level prediction rejection:**
- Add gracefulShutdown atomic flag to reject new predictions during shutdown
- Handler.Stop() sets flag and waits for manager shutdown
- Predict() returns 503 Service Unavailable during shutdown
**Manager-level coordinated shutdown:**
- Manager.Stop() signals all runners for graceful shutdown
- Use WaitGroup.Go() for independent parallel runner shutdowns
- Respect RunnerShutdownGracePeriod timeout before force stopping
- Wait on runner.readyForShutdown channel or timeout
**Service-level errgroup coordination:**
- Fix errgroup goroutines to exit on shutdown signal
- Add shutdown case to force shutdown monitor goroutine
- Signal handler already had proper shutdown case
- Add contextcheck nolint for long-lived errgroup context
**Test coverage:**
- Add E2E test for 503 rejection of new predictions during shutdown
- Verify graceful shutdown waits for in-flight predictions
- Test service properly stops after shutdown completes
This restores the graceful shutdown behavior from commit 575d218 that was
lost during the server refactor, ensuring predictions complete naturally
during the grace period rather than being immediately force-killed.