Releases: Lightning-AI/pytorch-lightning
Weekly patch release
App
Added
- Added `Lightning{Flow,Work}.lightningignores` attributes to programmatically ignore files before uploading to the cloud (#15818)
- Added a progress bar while connecting to an app through the CLI (#16035)
- Support running on multiple clusters (#16016)
- Added guards to cluster deletion from cli (#16053)
- Added creation of the default `.lightningignore` that ignores `venv` (#16056)
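To illustrate what an ignore file does before an upload, here is a minimal sketch that filters relative paths against ignore patterns. The helper `filter_paths` and its matching rule (a pattern matches any single path component) are assumptions made for illustration; the actual `.lightningignore` semantics are defined by Lightning and may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def filter_paths(paths, patterns):
    """Return the paths that match none of the ignore patterns.

    Hypothetical helper: a pattern is applied to every component of the
    path, so "venv" ignores everything located under a venv/ directory.
    """
    kept = []
    for path in paths:
        parts = PurePosixPath(path).parts
        ignored = any(fnmatch(part, pattern) for part in parts for pattern in patterns)
        if not ignored:
            kept.append(path)
    return kept


files = ["app.py", "venv/bin/python", "src/model.py", "src/__pycache__/model.pyc"]
print(filter_paths(files, ["venv", "__pycache__"]))
```

Under these assumed rules, only `app.py` and `src/model.py` would be kept for upload.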
Changed
- Cleanup cluster waiting (#16054)
Fixed
- Fixed `DDPStrategy` import in app framework (#16029)
- Fixed `AutoScaler` raising an exception when non-default cloud compute is specified (#15991)
- Fixed and improved the login flow (#16052)
- Fixed the debugger detection mechanism for the lightning App in VSCode (#16068)
PyTorch
- Some minor cleaning
Full Changelog: 1.8.4.post0...1.8.5
Minor patch release
App
- Fixed MultiNode Component to use separate cloud computes (#15965)
- Fixed Registration for CloudComputes of Works in `L.app.structures` (#15964)
- Fixed a bug where auto-upgrading to the latest lightning via the CLI could get stuck in a loop (#15984)
PyTorch
- Fixed the `XLAProfiler` not recording anything due to mismatching of action names (#15885)
Full Changelog: 1.8.4...1.8.4.post0
Dependency hotfix
Weekly patch release
App
Added
- Add `code_dir` argument to tracer run (#15771)
- Added the CLI command `lightning run model` to launch a `LightningLite` accelerated script (#15506)
- Added the CLI command `lightning delete app` to delete a lightning app on the cloud (#15783)
- Added a CloudMultiProcessBackend which enables running a child App from within the Flow in the cloud (#15800)
- Utility for pickling work object safely even from a child process (#15836)
- Added `AutoScaler` component (#15769)
- Added the property `ready` of the LightningFlow to inform when the `Open App` should be visible (#15921)
- Added private work attribute `_start_method` to customize how to start the works (#15923)
- Added a `configure_layout` method to the `LightningWork` which can be used to control how the work is handled in the layout of a parent flow (#15926)
- Added the ability to run a Lightning App or Component directly from the Gallery using `lightning run app organization/name` (#15941)
- Added automatic conversion of list and dict of works and flows to structures (#15961)
Changed
- The `MultiNode` components now warn the user when running with `num_nodes > 1` locally (#15806)
- Cluster creation and deletion now wait by default (#15458)
- Running an app without a UI locally no longer opens the browser (#15875)
- Show a message when `BuildConfig(requirements=[...])` is passed but a `requirements.txt` file is already present in the Work (#15799)
- Show a message when `BuildConfig(dockerfile="...")` is passed but a `Dockerfile` file is already present in the Work (#15799)
- Dropped name column from cluster list (#15721)
- Apps without UIs no longer activate the "Open App" button when running in the cloud (#15875)
- Wait for full file to be transferred in Path / Payload (#15934)
Removed
- Removed the `SingleProcessRuntime` (#15933)
Fixed
- Fixed SSH CLI command listing stopped components (#15810)
- Fixed bug when launching apps on multiple clusters (#15484)
- Fixed Sigterm Handler causing thread lock which caused KeyboardInterrupt to hang (#15881)
- Fixed MPS error for multinode component (defaults to cpu on mps devices now as distributed operations are not supported by pytorch on mps) (#15748)
- Fixed the work not stopped when successful when passed directly to the LightningApp (#15801)
- Fixed the PyTorch Inference locally on GPU (#15813)
- Fixed the `enable_spawn` method of the `WorkRunExecutor` (#15812)
- Fixed require/import decorator (#15849)
- Fixed a bug where using `L.app.structures` would cause multiple apps to be opened and fail with an error in the cloud (#15911)
- Fixed PythonServer generating noise on M1 (#15949)
- Fixed multiprocessing breakpoint (#15950)
- Fixed detection of a Lightning App running in debug mode (#15951)
- Fixed `ImportError` on Multinode if package not present (#15963)
Lite
- Fixed `shuffle=False` having no effect when using DDP/DistributedSampler (#15931)
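To show why the `shuffle` flag matters under DDP, the sketch below shards dataset indices across ranks by striding, which is the general idea behind distributed samplers. The function `shard_indices` and its parameters are hypothetical and written with the standard library only; this is not Lightning's or PyTorch's implementation.

```python
import random


def shard_indices(dataset_len, num_replicas, rank, shuffle, seed=0):
    """Return the dataset indices that one rank would iterate over."""
    indices = list(range(dataset_len))
    if shuffle:
        # Every rank uses the same seed so that the shards stay disjoint.
        random.Random(seed).shuffle(indices)
    # Pad by repeating leading indices so every rank gets the same count.
    total = -(-dataset_len // num_replicas) * num_replicas
    indices += indices[: total - dataset_len]
    # Strided slicing assigns every num_replicas-th index to this rank.
    return indices[rank::num_replicas]


for rank in range(2):
    print(rank, shard_indices(5, num_replicas=2, rank=rank, shuffle=False))
```

With `shuffle=False`, each rank sees its indices in dataset order on every epoch, which is the behavior a user would expect the flag to preserve.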
PyTorch
Changed
- Direct support for compiled models (#15922)
Fixed
- Fixed issue with unsupported torch.inference_mode() on hpu backends (#15918)
- Fixed LRScheduler import for PyTorch 2.0 (#15940)
- Fixed `fit_loop.restarting` to be `False` for lr finder (#15620)
- Fixed `torch.jit.script`-ing a LightningModule causing an unintended error message about deprecated `use_amp` property (#15947)
Full Changelog: 1.8.3...1.8.4
Hotfix for Python Server
Hotfix for requirements
Revert/s3fs (#15792) * revert s3fs * post
Weekly patch release
App
Changed
- Deduplicate top-level lightning CLI command groups (#15761)
  - `lightning add ssh-key` CLI command has been transitioned to `lightning create ssh-key`
  - `lightning remove ssh-key` CLI command has been transitioned to `lightning delete ssh-key`
- Set Torch inference mode for prediction (#15719)
- Improved `LightningTrainerScript` start-up time (#15751)
- Disable XSRF protection in `StreamlitFrontend` to support upload in localhost (#15684)
Fixed
Lite
Changed
- Temporarily removed support for Hydra multi-run (#15737)
PyTorch
Changed
- Temporarily removed support for Hydra multi-run (#15737)
- Switch from `tensorboard` to `tensorboardx` in `TensorBoardLogger` (#15728)
Full Changelog: 1.8.2...1.8.3
Weekly patch release
App
Added
- Added title and description to ServeGradio (#15639)
- Added a friendly error message when attempting to run the default cloud compute with a custom base image configured (#14929)
Changed
- Improved support for running apps when dependencies aren't installed (#15711)
- Changed the root directory of the app (which gets uploaded) to be the folder containing the app file, rather than any parent folder containing a `.lightning` file (#15654)
- Enabled MultiNode Components to support state broadcasting (#15607)
- Prevent artefactual "running from outside your current environment" error (#15647)
- Rename failed -> error in tables (#15608)
Fixed
- Fixed race condition to over-write the frontend with app infos (#15398)
- Fixed bi-directional queues sending delta with Drive Component name changes (#15642)
- Fixed CloudRuntime works collection with structures and accelerated multi node startup time (#15650)
- Fixed catimage import (#15712)
- Parse all lines in app file looking for shebangs to run commands (#15714)
Lite
Fixed
- Fixed the automatic fallback from `LightningLite(strategy="ddp_spawn", ...)` to `LightningLite(strategy="ddp", ...)` when on an LSF cluster (#15103)
PyTorch
Fixed
- Make sure save_dir can be empty str (#15638)
- Fixed the automatic fallback from `Trainer(strategy="ddp_spawn", ...)` to `Trainer(strategy="ddp", ...)` when on an LSF cluster (#15103)
Full Changelog: 1.8.1...1.8.2
Weekly patch release
App
Added
- Added the `start` method to the work (#15523)
- Added a `MultiNode` Component to run with distributed computation with any frameworks (#15524)
- Expose `RunWorkExecutor` to the work and provides default ones for the `MultiNode` Component (#15561)
- Added a `start_with_flow` flag to the `LightningWork` which can be disabled to prevent the work from starting at the same time as the flow (#15591)
- Added support for running Lightning App with VSCode IDE debugger (#15590)
- Added bi-directional delta updates between the flow and the works (#15582)
- Added `--setup` flag to `lightning run app` CLI command allowing for dependency installation via app comments (#15577)
- Auto-upgrade / detect environment mis-match from the CLI (#15434)
- Added Serve component (#15609)
Changed
- Changed `flow.flows` to be recursive, to align its behavior with `flow.works` (#15466)
- The `params` argument in `TracerPythonScript.run` no longer prepends `--` automatically to parameters (#15518)
- Only check versions / env when not in the cloud (#15504)
- Periodically sync database to the drive (#15441)
- Slightly safer multi node (#15538)
- Reuse existing commands when running connect more than once (#15471)
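The `params` change listed above for `TracerPythonScript.run` can be pictured with a small sketch. The helper `build_script_args` and the `key=value` token format are assumptions made for illustration, not the library's actual code; the sketch only contrasts automatic `--` prefixing with passing the keys through unchanged.

```python
def build_script_args(params, prepend_dashes):
    """Turn a {name: value} mapping into script arguments (hypothetical helper)."""
    prefix = "--" if prepend_dashes else ""
    return [f"{prefix}{name}={value}" for name, value in params.items()]


# Assumed old behavior: dashes are added automatically.
print(build_script_args({"lr": 0.1}, prepend_dashes=True))

# Assumed new behavior: keys are passed through as given, so a caller who
# wants dashes includes them in the key.
print(build_script_args({"lr": 0.1}, prepend_dashes=False))
print(build_script_args({"--lr": 0.1}, prepend_dashes=False))
```

Passing keys through unchanged lets a caller target scripts that expect bare `key=value` overrides as well as scripts that expect `--key=value` flags.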
Fixed
- Fixed writing app name and id in connect.txt file for the command CLI (#15443)
- Fixed missing root flow among the flows of the app (#15531)
- Fixed bug with Multi Node Component and add some examples (#15557)
- Fixed a bug where payload would take a very long time locally (#15557)
- Fixed an issue with the `lightning` CLI taking a long time to error out when the cloud is not reachable (#15412)
Lite
Fixed
- Fix an issue with the SLURM `srun` detection causing permission errors (#15485)
- Fixed the import of `lightning_lite` causing a warning 'Redirects are currently not supported in Windows or MacOs' (#15610)
PyTorch
Fixed
- Fixed `TensorBoardLogger` not validating the input array type when logging the model graph (#15323)
- Fixed an attribute error in `ColossalAIStrategy` at import time when `torch.distributed` is not available (#15535)
- Fixed an issue when calling `fs.listdir` with file URI instead of path in `CheckpointConnector` (#15413)
- Fixed an issue with the `BaseFinetuning` callback not setting the `track_running_stats` attribute for batch normalization layers (#15063)
- Fixed an issue with `WandbLogger(log_model=True|'all')` raising an error and not being able to serialize tensors in the metadata (#15544)
- Fixed the gradient unscaling logic when using `Trainer(precision=16)` and fused optimizers such as `Adam(..., fused=True)` (#15544)
- Fixed model state transfer in multiprocessing launcher when running multi-node (#15567)
- Fixed manual optimization raising `AttributeError` with Bagua Strategy (#12534)
- Fixed the import of `pytorch_lightning` causing a warning 'Redirects are currently not supported in Windows or MacOs' (#15610)
Full Changelog: 1.8.0...1.8.1
Minor pkg stability fix
What's Changed
- Implement freeze batchnorm with freezing track running stats by @PososikTeam in #15063
- Pkg: fix parsing versions by @Borda in #15401
- Remove pytest as a requirement to run app by @manskx in #15449
New Contributors
- @PososikTeam made their first contribution in #15063
Full Changelog: 1.8.0...1.8.0.post1