Releases: Lightning-AI/pytorch-lightning
Standard weekly patch release
[1.1.6] - 2021-01-26
Changed
- Increased TPU check timeout from 20s to 100s (#5598)
- Ignored `step` param in Neptune logger's `log_metric` method (#5510)
- Pass batch outputs to `on_train_batch_end` instead of `epoch_end` outputs (#4369)
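For reference, the #4369 change means per-batch outputs now arrive directly in the `on_train_batch_end` hook rather than only at `epoch_end`. A minimal sketch of a callback consuming them, assuming the 1.1-series hook signature (argument order may differ in other versions):

```python
import pytorch_lightning as pl

class BatchOutputPrinter(pl.Callback):
    """Illustrative callback: inspects the per-batch outputs passed after #4369."""

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx):
        # `outputs` holds what `training_step` returned for this batch
        # (the exact wrapping of the value can vary between versions).
        print(f"batch {batch_idx}: outputs={outputs}")
```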
Fixed
- Fixed `toggle_optimizer` to reset the `requires_grad` state (#5574)
- Fixed FileNotFoundError for best checkpoint when using DDP with Hydra (#5629)
- Fixed an error when logging a progress bar metric with a reserved name (#5620)
- Fixed `Metric`'s `state_dict` not being included when child modules are used (#5614)
- Fixed Neptune logger creating multiple experiments when GPUs > 1 (#3256)
- Fixed duplicate logs appearing in console when using the python logging module (#5509)
- Fixed tensor printing in `trainer.test()` (#5138)
- Fixed not using dataloader when `hparams` present (#4559)
Contributors
@awaelchli @bryant1410 @lezwon @manipopopo @PiotrJander @psinger @rnett @SeanNaren @swethmandava @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.1.5] - 2021-01-19
Fixed
- Fixed a visual bug in the progress bar display initialization (#4579)
- Fixed logging `on_train_batch_end` in a callback with multiple optimizers (#5521)
- Fixed `reinit_scheduler_properties` with the correct optimizer (#5519)
- Fixed `val_check_interval` with `fast_dev_run` (#5540)
Contributors
@awaelchli, @carmocca, @rohitgr7
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.1.4] - 2021-01-12
Added
- Added an automatic optimization property setter to the LightningModule (#5169)
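Assuming the setter added in #5169 behaves like a plain attribute assignment, switching a module to manual optimization no longer requires overriding the `automatic_optimization` property; a hedged sketch:

```python
import torch
import pytorch_lightning as pl

class ManualOptModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)
        # With the property setter from #5169, a simple assignment replaces
        # overriding the `automatic_optimization` property (behaviour assumed).
        self.automatic_optimization = False
```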
Changed
- Changed deprecated `enable_pl_optimizer=True` (#5244)
Fixed
- Fixed `transfer_batch_to_device` for DDP with `len(devices_ids) == 1` (#5195)
- Logging only on `not should_accumulate()` during training (#5417)
- Resolved an interpolation bug with Hydra (#5406)
- Check environ before selecting a seed to prevent warning message (#4743)
Contributors
@ananthsub, @SeanNaren, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.1.3] - 2021-01-05
Added
- Added a check for optimizer attached to `lr_scheduler` (#5338)
- Added support for passing non-existing `filepaths` to `resume_from_checkpoint` (#4402)
Changed
- Skip restore from `resume_from_checkpoint` while `testing` (#5161)
- Allowed `log_momentum` for adaptive optimizers in `LearningRateMonitor` (#5333)
- Disabled checkpointing, early stopping and logging with `fast_dev_run` (#5277)
- Distributed group defaults to `WORLD` if `None` (#5125)
Fixed
- Fixed `trainer.test` returning non-test metrics (#5214)
- Fixed metric state reset (#5273)
- Fixed `--num-nodes` on `DDPSequentialPlugin` (#5327)
- Fixed invalid value for `weights_summary` (#5296)
- Fixed `Trainer.test` not using the latest `best_model_path` (#5161)
- Fixed existence check for `hparams` not using the underlying filesystem (#5250)
- Fixed `LightningOptimizer` AMP bug (#5191)
- Fixed casted key to string in `_flatten_dict` (#5354)
Contributors
@8greg8, @haven-jeon, @kandluis, @marload, @rohitgr7, @tadejsv, @tarepan, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
Detail changes
Added
- Added support for logging a plain number with `sync_dist=True` (#5080) (see the sketch after this list)
- Added an offset to the logging step when resuming for the Wandb logger (#5050)
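The `sync_dist` item above (#5080) reportedly allows plain numbers, not only tensors, to be logged with distributed reduction; a minimal sketch inside a `LightningModule`:

```python
import torch
import pytorch_lightning as pl

class LitRegressor(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(16, 1)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        # `sync_dist=True` reduces the value across processes before logging;
        # after #5080 a plain Python float is accepted as well as a tensor.
        self.log("val_loss", loss.item(), sync_dist=True)
```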
Removed
- `enable_pl_optimizer=False` by default to temporarily fix AMP issues (#5163)
Fixed
- Metric reduction with Logging (#5150)
- Remove nan loss in manual optimization (#5121)
- Un-balanced logging properly supported (#5119)
- Fix hanging in DDP HPC accelerators (#5157)
- Fix saved filename in `ModelCheckpoint` if it already exists (#4861)
- Fix reset `TensorRunningAccum` (#5106)
- Updated `DALIClassificationLoader` to not use deprecated arguments (#4925)
- Corrected call to `torch.no_grad` (#5124)
Contributors
@8greg8, @ananthsub, @borisdayma, @gan3sh500, @rohitgr7, @SeanNaren, @tchaton, @VinhLoiIT
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
Detail changes
Added
- Added a notebook example to reach a quick baseline of ~94% accuracy on CIFAR-10 using a ResNet in Lightning (#4818)
Fixed
- Fixed trainer by default `None` in `DDPAccelerator` (#4915)
- Fixed `LightningOptimizer` to expose optimizer attributes (#5095)
- Do not warn when the `name` key is used in the `lr_scheduler` dict (#5057)
- Check if optimizer supports closure (#4981)
- Extended `LightningOptimizer` to expose the underlying Optimizer attributes + updated docs (#5095)
- Added deprecated metric utility functions back to functional (#5067, #5068)
- Allow any input in `to_onnx` and `to_torchscript` (#4378)
Contributors
@Borda, @carmocca, @hemildesai, @rohitgr7, @s-rog, @tarepan, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Model Parallelism Training and More Logging Options
Overview
Lightning 1.1 is out! You can now train models with twice the parameters and zero code changes with the new sharded model training! We also have a new plugin for sequential model parallelism, more logging options, and a lot of improvements!
Release highlights: https://bit.ly/3gyLZpP
Learn more about sharded training: https://bit.ly/2W3hgI0
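As a rough illustration of the sharded training highlight (see the links above), the plugin is assumed to be enabled through the Trainer's `plugins` argument with the `'ddp_sharded'` alias; check the linked posts for the exact flag in your version:

```python
import pytorch_lightning as pl

# Hypothetical usage: shard optimizer state and gradients across GPUs.
# The 'ddp_sharded' plugin alias and the 1.1-era Trainer arguments are assumed here.
trainer = pl.Trainer(gpus=2, accelerator="ddp", plugins="ddp_sharded")
# trainer.fit(model, datamodule=dm)  # model and datamodule defined elsewhere
```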
Detail changes
Added
- Added "monitor" key to saved
ModelCheckpoints(#4383) - Added
ConfusionMatrixclass interface (#4348) - Added multiclass AUROC metric (#4236)
- Added global step indexing to the checkpoint name for a better sub-epoch checkpointing experience (#3807)
- Added optimizer hooks in callbacks (#4379)
- Added option to log momentum (#4384)
- Added
current_scoretoModelCheckpoint.on_save_checkpoint(#4721) - Added logging using
self.login train and evaluation for epoch end hooks (#4913) - Added ability for DDP plugin to modify optimizer state saving (#4675)
- Added casting to python types for NumPy scalars when logging
hparams(#4647) - Added
prefixargument in loggers (#4557) - Added printing of total num of params, trainable and non-trainable params in ModelSummary (#4521)
- Added
PrecisionRecallCurve, ROC, AveragePrecisionclass metric (#4549) - Added custom
ApexandNativeAMPasPrecision plugins(#4355) - Added
DALI MNISTexample (#3721) - Added
sharded pluginfor DDP for multi-GPU training memory optimizations (#4773) - Added
experiment_idto the NeptuneLogger (#3462) - Added
Pytorch Geometricintegration example with Lightning (#4568) - Added
all_gathermethod toLightningModulewhich allows gradient-based tensor synchronizations for use-cases such as negative sampling. (#5012) - Enabled
self.login most functions (#4969) - Added changeable extension variable for
ModelCheckpoint(#4977)
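As promised above, a hedged sketch of the new `all_gather` (#5012); the `sync_grads` behaviour and the toy objective are assumptions for illustration:

```python
import torch
import pytorch_lightning as pl

class NegativeSamplingModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.encoder = torch.nn.Linear(128, 64)

    def training_step(self, batch, batch_idx):
        emb = self.encoder(batch)
        # Gather embeddings from every process; sync_grads=True is assumed to
        # keep the operation differentiable for uses such as negative sampling.
        all_emb = self.all_gather(emb, sync_grads=True)
        # `all_emb` would feed a contrastive / negative-sampling loss here;
        # a toy stand-in keeps the sketch self-contained.
        return (emb - all_emb.mean(dim=0)).pow(2).mean()
```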
Changed
- Removed `multiclass_roc` and `multiclass_precision_recall_curve`, use `roc` and `precision_recall_curve` instead (#4549)
- Tuner algorithms will be skipped if `fast_dev_run=True` (#3903)
- `WandbLogger` no longer forces the wandb `reinit` arg to True and creates a run only when needed (#4648)
- Changed `automatic_optimization` to be a model attribute (#4602)
- Changed `Simple Profiler` report to order by percentage time spent + num calls (#4880)
- Simplified optimization logic (#4984)
- Classification metrics overhaul (#4837)
- Updated `fast_dev_run` to accept an integer representing the number of batches (#4629) (see the sketch after this list)
- Refactored optimizer (#4658)
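The integer form of `fast_dev_run` mentioned above (#4629) runs that many batches of the training and validation loops as a quick smoke test:

```python
import pytorch_lightning as pl

# fast_dev_run=True still runs a single batch of each loop; passing an
# integer (here 7) runs that many batches instead, for a quick sanity check.
trainer = pl.Trainer(fast_dev_run=7)
# trainer.fit(model)  # model defined elsewhere
```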
Deprecated
- Deprecated `prefix` argument in `ModelCheckpoint` (#4765)
- Deprecated the old way of assigning hyper-parameters through `self.hparams = ...` (#4813) (see the sketch after this list)
- Deprecated `mode='auto'` from `ModelCheckpoint` and `EarlyStopping` (#4695)
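The documented replacement for the deprecated `self.hparams = ...` assignment above is `self.save_hyperparameters()`; a minimal sketch:

```python
import torch
import pytorch_lightning as pl

class LitClassifier(pl.LightningModule):
    def __init__(self, hidden_dim: int = 128, learning_rate: float = 1e-3):
        super().__init__()
        # Replaces the deprecated `self.hparams = hparams` pattern: captures the
        # __init__ arguments into self.hparams and stores them in checkpoints.
        self.save_hyperparameters()
        self.layer = torch.nn.Linear(28 * 28, self.hparams.hidden_dim)
```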
Removed
- Removed `reorder` parameter of the `auc` metric (#5004)
Fixed
- Added feature to move tensors to CPU before saving (#4309)
- Fixed `LoggerConnector` to have logged metrics on root device in DP (#4138)
- Auto convert tensors to contiguous format when `gather_all` (#4907)
- Fixed `PYTHONPATH` for DDP test model (#4528)
- Fixed allowing logger to support indexing (#4595)
- Fixed DDP and manual_optimization (#4976)
Contributors
@ananyahjha93, @awaelchli, @blatr, @Borda, @borisdayma, @carmocca, @ddrevicky, @george-gca, @gianscarpe, @irustandi, @janhenriklambrechts, @jeremyjordan, @justusschock, @lezwon, @rohitgr7, @s-rog, @SeanNaren, @SkafteNicki, @tadejsv, @tchaton, @williamFalcon, @zippeurfou
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
Detail changes
Added
- Added casting to python types for numpy scalars when logging `hparams` (#4647)
- Added warning when progress bar refresh rate is less than 20 on Google Colab to prevent crashing (#4654)
- Added `F1` class metric (#4656)
Changed
- Consistently use `step=trainer.global_step` in `LearningRateMonitor` independently of `logging_interval` (#4376)
- Metric states are no longer added to `state_dict` by default (#4685)
- Renamed class metric `Fbeta` to `FBeta` (#4656)
- Model summary: add 1 decimal place (#4745)
- Do not override `PYTHONWARNINGS` (#4700)
Fixed
- Fixed checkpoint `hparams` dict casting when `omegaconf` is available (#4770)
- Fixed incomplete progress bars when total batches are not divisible by the refresh rate (#4577)
- Updated SSIM metric (#4566, #4656)
- Fixed `batch_arg_name` bug by passing `batch_arg_name` to all calls of `_adjust_batch_size` (#4812)
- Fixed moving `torchtext` data to GPU (#4785)
- Fixed a crash bug in the MLflow logger (#4716)
Contributors
@awaelchli, @jonashaag, @jungwhank, @M-Salti, @moi90, @pgagarinov, @s-rog, @Samyak2, @SkafteNicki, @teddykoker, @ydcjeff
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
Detail changes
Added
- Added lambda closure to `manual_optimizer_step` (#4618)
Changed
- Changed Metrics `persistent` default mode to `False` (#4685)
Fixed
- Prevent crash if `sync_dist=True` on CPU (#4626)
- Fixed average pbar Metrics (#4534)
- Fixed `setup` callback hook to correctly pass the LightningModule through (#4608)
- Allow decorating model init with saving `hparams` inside (#4662)
- Fixed `split_idx` set by `LoggerConnector` in `on_trainer_init` to `Trainer` (#4697)
Contributors
@ananthsub, @Borda, @SeanNaren, @SkafteNicki, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
Detail changes
Added
- Added metrics aggregation in Horovod and fixed early stopping (#3775)
- Added `manual_optimizer_step` which works with `AMP Native` and `accumulated_grad_batches` (#4485)
- Added `persistent(mode)` method to metrics, to enable and disable metric states being added to `state_dict` (#4482) (see the sketch after this list)
- Added congratulations at the end of our notebooks (#4555)
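A hedged sketch of the `persistent(mode)` call above (#4482), using the 1.0-era `pytorch_lightning.metrics.Accuracy` class:

```python
import pytorch_lightning as pl

# Metric states live in buffers; persistent(True) asks for them to be saved in
# state_dict, persistent(False) keeps them out (behaviour assumed from #4482).
accuracy = pl.metrics.Accuracy()
accuracy.persistent(True)
```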
Changed
- Changed `fsspec` to tuner (#4458)
- Unified SLURM/TorchElastic under the backend plugin (#4578, #4580, #4581, #4582, #4583)
Fixed
- Fixed feature-lack in `hpc_load` (#4526)
- Fixed metrics states being overridden in DDP mode (#4482)
- Fixed `lightning_getattr`, `lightning_hasattr` not finding the correct attributes in datamodule (#4347)
- Fixed automatic optimization AMP by `manual_optimization_step` (#4485)
- Replaced `MisconfigurationException` with a warning in the `ModelCheckpoint` callback (#4560)
- Fixed logged keys in the MLflow logger (#4412)
- Fixed `is_picklable` by catching `AttributeError` (#4508)
Contributors
@dscarmo, @jtamir, @kazhang, @maxjeblick, @rohitgr7, @SkafteNicki, @tarepan, @tchaton, @tgaddair, @williamFalcon
If we forgot someone due to not matching commit email with GitHub account, let us know :]