Closed
Labels
bug (Something isn't working), needs triage (Waiting to be triaged by maintainers), ver: 2.4.x
Description
Bug description
A NameError (`name 'exit' is not defined`) is raised on shutdown/KeyboardInterrupt when trainer.fit is run from a Jupyter Lab notebook cell. When run from a script, everything works fine.
What version are you seeing the problem on?
v2.4
How to reproduce the bug
Run trainer.fit from a Jupyter notebook cell, then click stop in Jupyter notebook.
print("---start train---")
trainer.fit(model, train_dataloader, ckpt_path=ckpt_path)
Error messages and logs
Detected KeyboardInterrupt, attempting graceful shutdown ...
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
~/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
45 if trainer.strategy.launcher is not None:
---> 46 return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
47 return trainer_fn(*args, **kwargs)
~/.local/lib/python3.10/site-packages/lightning/pytorch/strategies/launchers/multiprocessing.py in launch(self, function, trainer, *args, **kwargs)
143 self.procs = process_context.processes
--> 144 while not process_context.join():
145 pass
~/.local/lib/python3.10/site-packages/torch/multiprocessing/spawn.py in join(self, timeout)
117 # Wait for any process to fail or all of them to succeed.
--> 118 ready = multiprocessing.connection.wait(
119 self.sentinels.keys(),
/usr/lib/python3.10/multiprocessing/connection.py in wait(object_list, timeout)
930 while True:
--> 931 ready = selector.select(timeout)
932 if ready:
/usr/lib/python3.10/selectors.py in select(self, timeout)
415 try:
--> 416 fd_event_list = self._selector.poll(timeout)
417 except InterruptedError:
KeyboardInterrupt:
During handling of the above exception, another exception occurred:
NameError Traceback (most recent call last)
/tmp/ipykernel_2824/3752444865.py in <module>
189 ckpt_path = None
190 print("---start train---")
--> 191 trainer.fit(model, train_dataloader, ckpt_path=ckpt_path)
~/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py in fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
536 self.state.status = TrainerStatus.RUNNING
537 self.training = True
--> 538 call._call_and_handle_interrupt(
539 self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
540 )
~/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
62 if isinstance(launcher, _SubprocessScriptLauncher):
63 launcher.kill(_get_sigkill_signal())
---> 64 exit(1)
65
66 except BaseException as exception:
NameError: name 'exit' is not defined
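The traceback ends at a bare `exit(1)` call. The following is a minimal sketch of why that can fail, assuming the notebook kernel does not expose `exit` as a builtin (the `site` module normally injects it, and an interactive shell may remove it). The helper names `shutdown_with_builtin_exit`, `shutdown_with_sys_exit`, and `run_without_site_exit` are hypothetical and are not part of Lightning; the sketch only simulates the missing builtin and does not reproduce the library's code path.

```python
import builtins
import sys


def shutdown_with_builtin_exit(code: int) -> None:
    # Mirrors the failing pattern from the traceback: relies on the `exit`
    # helper that the `site` module normally injects into builtins.
    exit(code)


def shutdown_with_sys_exit(code: int) -> None:
    # `sys.exit` is a regular function in the `sys` module and does not
    # depend on `site`; it raises SystemExit.
    sys.exit(code)


def run_without_site_exit(fn, code: int) -> str:
    """Call fn(code) with builtins.exit temporarily removed and report the outcome."""
    # Assumption: removing the builtin approximates the notebook environment.
    saved = builtins.__dict__.pop("exit", None)
    try:
        fn(code)
    except NameError as err:
        return f"NameError: {err}"
    except SystemExit as err:
        return f"SystemExit: {err.code}"
    finally:
        # Restore the builtin so the rest of the session is unaffected.
        if saved is not None:
            builtins.exit = saved
    return "returned"


print(run_without_site_exit(shutdown_with_builtin_exit, 1))
print(run_without_site_exit(shutdown_with_sys_exit, 1))
```

Under that assumption, the first call reports the same NameError as in the log above, while the `sys.exit` variant raises SystemExit as intended. Whether replacing `exit(1)` with `sys.exit(1)` is the right fix for Lightning is for the maintainers to confirm.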
Environment
Current environment
- CUDA:
- GPU:
- NVIDIA A100-SXM4-40GB
- available: True
- version: 12.1
- GPU:
- Lightning:
- lightning: 2.4.0
- lightning-utilities: 0.11.7
- pytorch-lightning: 2.4.0
- torch: 2.4.1
- torch-summary: 1.4.5
- torchmetrics: 1.4.2
- torchvision: 0.15.2
- Packages:
- absl-py: 0.15.0
- aiohappyeyeballs: 2.4.3
- aiohttp: 3.10.8
- aiosignal: 1.3.1
- aiosqlite: 0.19.0
- annotated-types: 0.6.0
- anyio: 4.1.0
- appdirs: 1.4.4
- argon2-cffi: 21.1.0
- arrow: 1.3.0
- astunparse: 1.6.3
- async-lru: 2.0.4
- async-timeout: 4.0.3
- attrs: 23.1.0
- automat: 20.2.0
- babel: 2.13.1
- backcall: 0.2.0
- bcrypt: 3.2.0
- beautifulsoup4: 4.10.0
- beniget: 0.4.1
- bleach: 4.1.0
- blinker: 1.4
- bottle: 0.12.19
- bottleneck: 1.3.2
- brotli: 1.0.9
- cachetools: 5.0.0
- certifi: 2020.6.20
- cffi: 1.15.0
- chardet: 4.0.0
- charset-normalizer: 3.3.2
- click: 8.0.3
- cloud-init: 23.3.3
- colorama: 0.4.4
- comm: 0.2.0
- command-not-found: 0.3
- configobj: 5.0.6
- constantly: 15.1.0
- cryptography: 3.4.8
- ctop: 1.0.0
- cycler: 0.11.0
- dacite: 1.8.1
- dbus-python: 1.2.18
- debugpy: 1.8.0
- decorator: 4.4.2
- defusedxml: 0.7.1
- distlib: 0.3.4
- distro: 1.7.0
- distro-info: 1.1+ubuntu0.1
- docker: 5.0.3
- entrypoints: 0.4
- et-xmlfile: 1.0.1
- exceptiongroup: 1.2.0
- fastjsonschema: 2.19.0
- filelock: 3.6.0
- flake8: 4.0.1
- flatbuffers: 1.12.1-git20200711.33e2d80-dfsg1-0.6
- fonttools: 4.29.1
- fqdn: 1.5.1
- frozenlist: 1.4.1
- fs: 2.4.12
- fsspec: 2024.9.0
- future: 0.18.2
- gast: 0.5.2
- glances: 3.2.4.2
- google-auth: 1.5.1
- google-auth-oauthlib: 0.4.2
- google-pasta: 0.2.0
- grpcio: 1.30.2
- h5py: 3.6.0
- h5py.-debian-h5py-serial: 3.6.0
- html5lib: 1.1
- htmlmin: 0.1.12
- httplib2: 0.20.2
- huggingface-hub: 0.25.1
- hyperlink: 21.0.0
- icdiff: 2.0.4
- idna: 3.3
- imagehash: 4.3.1
- importlib-metadata: 4.6.4
- incremental: 21.3.0
- influxdb: 5.3.1
- iniconfig: 1.1.1
- iotop: 0.6
- ipykernel: 6.7.0
- ipython: 7.31.1
- ipython-genutils: 0.2.0
- ipywidgets: 8.1.1
- isoduration: 20.11.0
- jax: 0.4.14
- jaxlib: 0.4.14
- jdcal: 1.0
- jedi: 0.18.0
- jeepney: 0.7.1
- jinja2: 3.0.3
- joblib: 0.17.0
- json5: 0.9.14
- jsonpatch: 1.32
- jsonpointer: 2.0
- jsonschema: 4.20.0
- jsonschema-specifications: 2023.11.2
- jupyter-client: 8.6.0
- jupyter-console: 6.4.0
- jupyter-core: 5.5.0
- jupyter-events: 0.9.0
- jupyter-lsp: 2.2.1
- jupyter-server: 2.12.0
- jupyter-server-fileid: 0.9.0
- jupyter-server-terminals: 0.4.4
- jupyter-ydoc: 1.1.1
- jupyterlab: 4.0.9
- jupyterlab-pygments: 0.1.2
- jupyterlab-server: 2.25.2
- jupyterlab-widgets: 3.0.9
- kaptan: 0.5.12
- keras: 2.13.1
- keyring: 23.5.0
- kiwisolver: 1.3.2
- launchpadlib: 1.10.16
- lazr.restfulclient: 0.14.4
- lazr.uri: 1.0.6
- libtmux: 0.10.1
- lightning: 2.4.0
- lightning-utilities: 0.11.7
- llvmlite: 0.41.1
- lxml: 4.8.0
- lz4: 3.1.3+dfsg
- markdown: 3.3.6
- markupsafe: 2.0.1
- matplotlib: 3.5.1
- matplotlib-inline: 0.1.3
- mccabe: 0.6.1
- mistune: 3.0.2
- ml-dtypes: 0.2.0
- more-itertools: 8.10.0
- mpmath: 0.0.0
- msgpack: 1.0.3
- multidict: 6.1.0
- multimethod: 1.10
- nbclient: 0.5.6
- nbconvert: 7.12.0
- nbformat: 5.9.2
- nest-asyncio: 1.5.4
- netifaces: 0.11.0
- networkx: 2.4
- nose: 1.3.7
- notebook: 6.4.8
- notebook-shim: 0.2.3
- numba: 0.58.1
- numexpr: 2.8.1
- numpy: 1.25.2
- nvidia-cublas-cu12: 12.1.3.1
- nvidia-cuda-cupti-cu12: 12.1.105
- nvidia-cuda-nvrtc-cu12: 12.1.105
- nvidia-cuda-runtime-cu12: 12.1.105
- nvidia-cudnn-cu12: 9.1.0.70
- nvidia-cufft-cu12: 11.0.2.54
- nvidia-curand-cu12: 10.3.2.106
- nvidia-cusolver-cu12: 11.4.5.107
- nvidia-cusparse-cu12: 12.1.0.106
- nvidia-ml-py3: 7.352.0
- nvidia-nccl-cu12: 2.20.5
- nvidia-nvjitlink-cu12: 12.6.77
- nvidia-nvtx-cu12: 12.1.105
- oauthlib: 3.2.0
- odfpy: 1.4.2
- olefile: 0.46
- openpyxl: 3.0.9
- opt-einsum: 3.3.0
- overrides: 7.4.0
- packaging: 21.3
- pandas: 1.3.5
- pandas-profiling: 3.6.6
- pandocfilters: 1.5.0
- parso: 0.8.1
- patsy: 0.5.4
- pexpect: 4.8.0
- phik: 0.12.3
- pickleshare: 0.7.5
- pillow: 9.0.1
- pip: 23.3.1
- platformdirs: 2.5.1
- pluggy: 0.13.0
- ply: 3.11
- prometheus-client: 0.9.0
- prompt-toolkit: 3.0.28
- protobuf: 4.21.12
- psutil: 5.9.0
- ptyprocess: 0.7.0
- py: 1.10.0
- pyasn1: 0.4.8
- pyasn1-modules: 0.2.1
- pycodestyle: 2.8.0
- pycparser: 2.21
- pycryptodomex: 3.11.0
- pydantic: 2.5.2
- pydantic-core: 2.14.5
- pyflakes: 2.4.0
- pygments: 2.11.2
- pygobject: 3.42.1
- pyhamcrest: 2.0.2
- pyinotify: 0.9.6
- pyjwt: 2.3.0
- pyopenssl: 21.0.0
- pyparsing: 2.4.7
- pyrsistent: 0.18.1
- pyserial: 3.5
- pysmi: 0.3.2
- pysnmp: 4.4.12
- pystache: 0.6.0
- pytest: 6.2.5
- python-apt: 2.4.0+ubuntu2
- python-dateutil: 2.8.2
- python-debian: 0.1.43+ubuntu1.1
- python-json-logger: 2.0.7
- python-magic: 0.4.24
- pythran: 0.10.0
- pytorch-lightning: 2.4.0
- pytz: 2022.1
- pywavelets: 1.5.0
- pyyaml: 5.4.1
- pyzmq: 25.1.2
- referencing: 0.31.1
- regex: 2024.9.11
- requests: 2.31.0
- requests-oauthlib: 1.3.0
- rfc3339-validator: 0.1.4
- rfc3986-validator: 0.1.1
- rpds-py: 0.13.2
- rsa: 4.8
- safetensors: 0.4.5
- scikit-learn: 0.23.2
- scipy: 1.8.0
- seaborn: 0.12.2
- secretstorage: 3.3.1
- send2trash: 1.8.2
- service-identity: 18.1.0
- setuptools: 59.6.0
- simplejson: 3.17.6
- six: 1.16.0
- sniffio: 1.3.0
- sos: 4.5.6
- soupsieve: 2.3.1
- ssh-import-id: 5.11
- statsmodels: 0.14.0
- sympy: 1.9
- systemd-python: 234
- tables: 3.7.0
- tangled-up-in-unicode: 0.2.0
- tensorboard: 2.13.0
- tensorflow: 2.13.1
- tensorflow-estimator: 2.13.0
- termcolor: 1.1.0
- terminado: 0.13.1
- testpath: 0.5.0
- threadpoolctl: 3.1.0
- tinycss2: 1.2.1
- tmuxp: 1.9.2
- tokenizers: 0.20.0
- toml: 0.10.2
- tomli: 2.0.1
- torch: 2.4.1
- torch-summary: 1.4.5
- torchmetrics: 1.4.2
- torchvision: 0.15.2
- tornado: 6.4
- tqdm: 4.66.1
- traitlets: 5.14.0
- transformers: 4.45.1
- triton: 3.0.0
- twisted: 22.1.0
- typeguard: 4.1.5
- types-python-dateutil: 2.8.19.14
- typing-extensions: 4.8.0
- ubuntu-advantage-tools: 8001
- ufolib2: 0.13.1
- ufw: 0.36.1
- unattended-upgrades: 0.1
- unicodedata2: 14.0.0
- uri-template: 1.3.0
- urllib3: 1.26.5
- virtualenv: 20.13.0+ds
- visions: 0.7.5
- wadllib: 1.3.6
- wcwidth: 0.2.5
- webcolors: 1.13
- webencodings: 0.5.1
- websocket-client: 1.2.3
- werkzeug: 2.0.2
- wheel: 0.37.1
- widgetsnbextension: 4.0.9
- wordcloud: 1.9.2
- wrapt: 1.13.3
- xlwt: 1.3.0
- y-py: 0.6.2
- yarl: 1.13.1
- ydata-profiling: 4.6.3
- ypy-websocket: 0.12.4
- zipp: 1.0.0
- zope.interface: 5.4.0
- System:
- OS: Linux
- architecture:
- 64bit
- ELF
- processor: x86_64
- python: 3.10.12
- release: 6.2.0-37-generic
- version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
#- PyTorch Lightning Version (e.g., 2.4.0): 2.4.0
#- PyTorch Version (e.g., 2.4): 2.4.1+cu121
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration: 1xA100 40GB
#- How you installed Lightning(`conda`, `pip`, source): pip install lightning
More info
No response