Releases: Lightning-AI/pytorch-lightning

Weekly patch release

15 Dec 17:19
e5d5901

App

Added

  • Added Lightning{Flow,Work}.lightningignores attributes to programmatically ignore files before uploading to the cloud (#15818)
  • Added a progress bar while connecting to an app through the CLI (#16035)
  • Support running on multiple clusters (#16016)
  • Added guards to cluster deletion from the CLI (#16053)
  • Added creation of the default .lightningignore that ignores venv (#16056)
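
As a rough illustration of what ignoring files before upload means, the sketch below filters a list of paths against ignore patterns with Python's fnmatch. It is not Lightning's implementation: the helper name filter_ignored and the matching rule (a pattern matches the whole path or any single path component) are assumptions made for the example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def filter_ignored(paths, patterns):
    """Return the paths that match none of the ignore patterns.

    Hypothetical helper: a pattern is considered a match if it matches
    the whole path or any single path component, so "venv" also ignores
    "venv/bin/python".
    """
    kept = []
    for path in paths:
        parts = PurePosixPath(path).parts
        ignored = any(
            fnmatch(path, pattern)
            or any(fnmatch(part, pattern) for part in parts)
            for pattern in patterns
        )
        if not ignored:
            kept.append(path)
    return kept


files = ["app.py", "venv/bin/python", "data/big.ckpt", "src/model.py"]
print(filter_ignored(files, ["venv", "*.ckpt"]))  # ['app.py', 'src/model.py']
```

In this sketch the pattern list plays the role of the lightningignores attribute or the lines of a .lightningignore file.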

Changed

  • Cleaned up cluster waiting (#16054)

Fixed

  • Fixed DDPStrategy import in app framework (#16029)
  • Fixed AutoScaler raising an exception when non-default cloud compute is specified (#15991)
  • Fixed and improved the login flow (#16052)
  • Fixed the debugger detection mechanism for the lightning App in VSCode (#16068)

PyTorch

  • Some minor cleaning

Full Changelog: 1.8.4.post0...1.8.5

Minor patch release

09 Dec 23:43
60b3cc9

App

  • Fixed MultiNode Component to use separate cloud computes (#15965)
  • Fixed Registration for CloudComputes of Works in L.app.structures (#15964)
  • Fixed a bug where auto-upgrading to the latest lightning via the CLI could get stuck in a loop (#15984)

PyTorch

  • Fixed the XLAProfiler not recording anything due to mismatching of action names (#15885)

Full Changelog: 1.8.4...1.8.4.post0

Dependency hotfix

09 Dec 05:02

Weekly patch release

08 Dec 18:52
7eb5ff5

App

Added

  • Add code_dir argument to tracer run (#15771)
  • Added the CLI command lightning run model to launch a LightningLite accelerated script (#15506)
  • Added the CLI command lightning delete app to delete a lightning app on the cloud (#15783)
  • Added a CloudMultiProcessBackend which enables running a child App from within the Flow in the cloud (#15800)
  • Utility for pickling work object safely even from a child process (#15836)
  • Added AutoScaler component (#15769)
  • Added the property ready of the LightningFlow to inform when the Open App should be visible (#15921)
  • Added the private work attribute _start_method to customize how to start the works (#15923)
  • Added a configure_layout method to the LightningWork which can be used to control how the work is handled in the layout of a parent flow (#15926)
  • Added the ability to run a Lightning App or Component directly from the Gallery using lightning run app organization/name (#15941)
  • Added automatic conversion of list and dict of works and flows to structures (#15961)

Changed

  • The MultiNode components now warn the user when running with num_nodes > 1 locally (#15806)
  • Cluster creation and deletion now wait by default (#15458)
  • Running an app without a UI locally no longer opens the browser (#15875)
  • Show a message when BuildConfig(requirements=[...]) is passed but a requirements.txt file is already present in the Work (#15799)
  • Show a message when BuildConfig(dockerfile="...") is passed but a Dockerfile file is already present in the Work (#15799)
  • Dropped name column from cluster list (#15721)
  • Apps without UIs no longer activate the "Open App" button when running in the cloud (#15875)
  • Wait for full file to be transferred in Path / Payload (#15934)

Removed

  • Removed the SingleProcessRuntime (#15933)

Fixed

  • Fixed SSH CLI command listing stopped components (#15810)
  • Fixed bug when launching apps on multiple clusters (#15484)
  • Fixed Sigterm Handler causing thread lock which caused KeyboardInterrupt to hang (#15881)
  • Fixed MPS error for the MultiNode component (it now defaults to CPU on MPS devices, as distributed operations are not supported by PyTorch on MPS) (#15748)
  • Fixed the work not stopped when successful when passed directly to the LightningApp (#15801)
  • Fixed the PyTorch Inference locally on GPU (#15813)
  • Fixed the enable_spawn method of the WorkRunExecutor (#15812)
  • Fixed require/import decorator (#15849)
  • Fixed a bug where using L.app.structures would cause multiple apps to be opened and fail with an error in the cloud (#15911)
  • Fixed PythonServer generating noise on M1 (#15949)
  • Fixed multiprocessing breakpoint (#15950)
  • Fixed detection of a Lightning App running in debug mode (#15951)
  • Fixed ImportError on Multinode if package not present (#15963)

Lite

  • Fixed shuffle=False having no effect when using DDP/DistributedSampler (#15931)
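
To make the expected behavior concrete, here is a minimal pure-Python sketch of strided sharding in the style of a distributed sampler. It is not the Lite or PyTorch code; shard_indices and its parameters are hypothetical names. The point is that with shuffle=False each rank should see its indices in the original dataset order.

```python
import random


def shard_indices(dataset_len, num_replicas, rank, shuffle=False, seed=0):
    """Return the sample indices one rank would iterate over.

    Hypothetical helper illustrating strided sharding: indices are
    optionally shuffled, padded so every rank gets the same count,
    then split by taking every num_replicas-th index starting at rank.
    """
    indices = list(range(dataset_len))
    if shuffle:
        # Same seed on every rank so the shards stay disjoint.
        random.Random(seed).shuffle(indices)
    # Round up to a multiple of num_replicas and pad by wrapping around.
    total = -(-dataset_len // num_replicas) * num_replicas
    indices += indices[: total - dataset_len]
    return indices[rank:total:num_replicas]


print(shard_indices(8, num_replicas=2, rank=0))  # [0, 2, 4, 6]
print(shard_indices(8, num_replicas=2, rank=1))  # [1, 3, 5, 7]
```

With shuffle=False the output is ordered and identical on every run, which is the behavior the fix restores.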

PyTorch

Changed

  • Direct support for compiled models (#15922)

Fixed

  • Fixed issue with unsupported torch.inference_mode() on hpu backends (#15918)
  • Fixed LRScheduler import for PyTorch 2.0 (#15940)
  • Fixed fit_loop.restarting to be False for lr finder (#15620)
  • Fixed torch.jit.script-ing a LightningModule causing an unintended error message about deprecated use_amp property (#15947)

Full Changelog: 1.8.3...1.8.4

Hotfix for Python Server

25 Nov 19:20
92fe188

App

Changed

  • Fixed the PyTorch Inference locally on GPU (#15813)

Full Changelog: 1.8.3...1.8.3

Hotfix for requirements

23 Nov 15:03
655ade6

Revert/s3fs (#15792)

* revert s3fs

* post

Weekly patch release

23 Nov 10:11
7d6cfb1

App

Changed

  • Deduplicated top-level lightning CLI command groups (#15761)
    • lightning add ssh-key CLI command has been transitioned to lightning create ssh-key
    • lightning remove ssh-key CLI command has been transitioned to lightning delete ssh-key
  • Set Torch inference mode for prediction (#15719)
  • Improved LightningTrainerScript start-up time (#15751)
  • Disable XSRF protection in StreamlitFrontend to support upload in localhost (#15684)

Fixed

  • Fixed debugging with VSCode IDE (#15747)
  • Fixed setting property to the LightningFlow (#15750)

Lite

Changed

  • Temporarily removed support for Hydra multi-run (#15737)

PyTorch

Changed

  • Temporarily removed support for Hydra multi-run (#15737)
  • Switch from tensorboard to tensorboardx in TensorBoardLogger (#15728)

Full Changelog: 1.8.2...1.8.3

Weekly patch release

18 Nov 00:44
8bea72b

App

Added

  • Added title and description to ServeGradio (#15639)
  • Added a friendly error message when attempting to run the default cloud compute with a custom base image configured (#14929)

Changed

  • Improved support for running apps when dependencies aren't installed (#15711)
  • Changed the root directory of the app (which gets uploaded) to be the folder containing the app file, rather than any parent folder containing a .lightning file (#15654)
  • Enabled MultiNode Components to support state broadcasting (#15607)
  • Prevent artefactual "running from outside your current environment" error (#15647)
  • Rename failed -> error in tables (#15608)

Fixed

  • Fixed a race condition when overwriting the frontend with app info (#15398)
  • Fixed bi-directional queues sending delta with Drive Component name changes (#15642)
  • Fixed CloudRuntime works collection with structures and accelerated multi node startup time (#15650)
  • Fixed catimage import (#15712)
  • Parse all lines in app file looking for shebangs to run commands (#15714)

Lite

Fixed

  • Fixed the automatic fallback from LightningLite(strategy="ddp_spawn", ...) to LightningLite(strategy="ddp", ...) when on an LSF cluster (#15103)

PyTorch

Fixed

  • Made sure save_dir can be an empty str (#15638)
  • Fixed the automatic fallback from Trainer(strategy="ddp_spawn", ...) to Trainer(strategy="ddp", ...) when on an LSF cluster (#15103)

Full Changelog: 1.8.1...1.8.2

Weekly patch release

10 Nov 20:25
18c587e

App

Added

  • Added the start method to the work (#15523)
  • Added a MultiNode Component to run with distributed computation with any frameworks (#15524)
  • Exposed RunWorkExecutor to the work and provided default ones for the MultiNode Component (#15561)
  • Added a start_with_flow flag to the LightningWork which can be disabled to prevent the work from starting at the same time as the flow (#15591)
  • Added support for running Lightning App with VSCode IDE debugger (#15590)
  • Added bi-directional delta updates between the flow and the works (#15582)
  • Added --setup flag to lightning run app CLI command allowing for dependency installation via app comments (#15577)
  • Auto-upgrade / detect environment mis-match from the CLI (#15434)
  • Added Serve component (#15609)

Changed

  • Changed flow.flows to be recursive, to align its behavior with flow.works (#15466)
  • The params argument in TracerPythonScript.run no longer prepends -- automatically to parameters (#15518)
  • Only check versions / env when not in the cloud (#15504)
  • Periodically sync database to the drive (#15441)
  • Slightly safer multi node (#15538)
  • Reuse existing commands when running connect more than once (#15471)
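
The change to the params argument of TracerPythonScript.run means callers now spell out any leading dashes themselves. The sketch below illustrates the difference with a hypothetical build_argv helper. The key=value formatting and the prepend_dashes switch are assumptions made for the example, not the library's code.

```python
def build_argv(script, params, prepend_dashes=False):
    """Build a command-line argument list from a dict of parameters.

    Hypothetical helper: prepend_dashes=True mimics the old behavior
    (keys automatically get a "--" prefix); the default mimics the new
    behavior, where keys are passed through exactly as given.
    """
    argv = [script]
    for key, value in params.items():
        flag = f"--{key}" if prepend_dashes else key
        argv.append(f"{flag}={value}")
    return argv


# Old behavior: dashes were added for you.
print(build_argv("train.py", {"lr": 0.1}, prepend_dashes=True))  # ['train.py', '--lr=0.1']
# New behavior: include the dashes in the key when you want them.
print(build_argv("train.py", {"--lr": 0.1}))  # ['train.py', '--lr=0.1']
```

Passing keys through unchanged also allows parameters that do not use a double-dash prefix at all.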

Fixed

  • Fixed writing app name and id in connect.txt file for the command CLI (#15443)
  • Fixed missing root flow among the flows of the app (#15531)
  • Fixed a bug with the MultiNode Component and added some examples (#15557)
  • Fixed a bug where payload would take a very long time locally (#15557)
  • Fixed an issue with the lightning CLI taking a long time to error out when the cloud is not reachable (#15412)

Lite

Fixed

  • Fix an issue with the SLURM srun detection causing permission errors (#15485)
  • Fixed the import of lightning_lite causing a warning 'Redirects are currently not supported in Windows or MacOs' (#15610)

PyTorch

Fixed

  • Fixed TensorBoardLogger not validating the input array type when logging the model graph (#15323)
  • Fixed an attribute error in ColossalAIStrategy at import time when torch.distributed is not available (#15535)
  • Fixed an issue when calling fs.listdir with file URI instead of path in CheckpointConnector (#15413)
  • Fixed an issue with the BaseFinetuning callback not setting the track_running_stats attribute for batch normalization layers (#15063)
  • Fixed an issue with WandbLogger(log_model=True|'all') raising an error and not being able to serialize tensors in the metadata (#15544)
  • Fixed the gradient unscaling logic when using Trainer(precision=16) and fused optimizers such as Adam(..., fused=True) (#15544)
  • Fixed model state transfer in multiprocessing launcher when running multi-node (#15567)
  • Fixed manual optimization raising AttributeError with Bagua Strategy (#12534)
  • Fixed the import of pytorch_lightning causing a warning 'Redirects are currently not supported in Windows or MacOs' (#15610)

Full Changelog: 1.8.0...1.8.1

Minor pkg stability fix

02 Nov 16:31

Full Changelog: 1.8.0...1.8.0.post1