Description
🐛 Bug
If a KeyboardInterrupt signal is sent during training (e.g. via ctrl-c), the trainer.fit() method returns immediately but no KeyboardInterrupt error is raised. Hence, the rest of the training script (everything after the trainer.fit() call) continues as if nothing had happened. My training script saves a status file "successful" to disk if no error occurred, so the fact that no KeyboardInterrupt is raised during training is quite problematic: an interrupted run gets marked as successful.
To Reproduce
The problem comes from the design of the _call_and_handle_interrupt() method in the Lightning Trainer class. This method wraps the entire training run and has the following structure:
    import time

    def main():
        # _call_and_handle_interrupt, which wraps the entire training, has this structure.
        # A KeyboardInterrupt is caught by the first handler, so the BaseException
        # handler is never reached for it and no error is raised.
        try:
            # training here
            for i in range(100):
                time.sleep(1)
        except KeyboardInterrupt as e:
            print('handle KeyboardInterrupt.')
            # why not re-raise here as well?
        except BaseException as e:
            # never reached when a KeyboardInterrupt is caught
            print('handle BaseException.')
            raise

    if __name__ == '__main__':
        main()
Since the KeyboardInterrupt exception handler has no raise statement, the exception is swallowed and the trainer.fit() method exits without an error. It is said that the KeyboardInterrupt exception block will be removed in Lightning version 1.7, but I think the current behavior is not what users expect. At the very least, the caught KeyboardInterrupt should be re-raised after it has been handled.
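A minimal sketch of the handle-then-re-raise pattern I have in mind. This is not Lightning's actual code: `call_and_handle_interrupt`, `train_fn`, and `on_interrupt` are hypothetical names used only for illustration, and Ctrl-C is simulated by raising KeyboardInterrupt directly.

```python
def call_and_handle_interrupt(train_fn, on_interrupt):
    """Run train_fn; on Ctrl-C, do the cleanup and then re-raise."""
    try:
        return train_fn()
    except KeyboardInterrupt:
        on_interrupt()  # graceful-shutdown work (hypothetical hook)
        raise           # bare raise propagates the same exception to the caller


def fake_training():
    # Stand-in for the training loop; simulates the user pressing Ctrl-C.
    raise KeyboardInterrupt


events = []
try:
    call_and_handle_interrupt(fake_training, lambda: events.append("cleanup"))
except KeyboardInterrupt:
    # The caller now sees the interrupt and can skip any "success" bookkeeping.
    events.append("propagated")

print(events)
```

With the bare `raise` in place, the cleanup still runs first, but the code after the call is no longer executed as if training had finished normally.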
Expected behavior
If I send a KeyboardInterrupt signal while training, I expect the trainer.fit() method to raise that error immediately, so that my script stops without executing further code that relies on no error having occurred.
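As long as fit() swallows the interrupt, a script-level guard along these lines might serve as a workaround. It is only a sketch under one assumption: that the trainer exposes an `interrupted` flag once fit() returns, which should be checked against the Lightning version in use. `StubTrainer` and `run` are hypothetical stand-ins so that the example runs without Lightning installed.

```python
import tempfile
from pathlib import Path


class StubTrainer:
    """Hypothetical stand-in for a trainer that swallows Ctrl-C inside fit()."""

    def __init__(self):
        self.interrupted = False  # assumed flag, set when fit() was interrupted

    def fit(self, steps):
        try:
            for step in range(steps):
                if step == 2:
                    raise KeyboardInterrupt  # simulate Ctrl-C mid-training
        except KeyboardInterrupt:
            self.interrupted = True  # swallowed: fit() returns normally


def run(trainer, status_file):
    trainer.fit(steps=5)
    # Guard: turn the silent interrupt back into an exception.
    if trainer.interrupted:
        raise KeyboardInterrupt("training was interrupted")
    status_file.write_text("successful")


status_file = Path(tempfile.mkdtemp()) / "status.txt"
try:
    run(StubTrainer(), status_file)
    outcome = "completed"
except KeyboardInterrupt:
    outcome = "interrupted"

print(outcome, status_file.exists())
```

The status file is written only when fit() ran to completion, which is the behavior I would expect from a raised KeyboardInterrupt in the first place.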
cc @Borda @tchaton @rohitgr7 @akihironitta @justusschock @awaelchli @ninginthecloud