
Releases: Lightning-AI/pytorch-lightning

Standard weekly patch release

12 Jan 20:34
652df18

[1.1.4] - 2021-01-12

Added

  • Added an automatic_optimization property setter to the LightningModule (#5169)
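
A minimal sketch of what the new setter enables (the module, layer sizes, and optimizer below are illustrative; exact manual-optimization call signatures varied across 1.1.x):

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)
        # New: assign the property directly instead of overriding it.
        self.automatic_optimization = False

    def training_step(self, batch, batch_idx):
        opt = self.optimizers()
        loss = self.layer(batch).sum()
        # Manual optimization: backward/step/zero_grad are driven by the user.
        self.manual_backward(loss, opt)
        opt.step()
        opt.zero_grad()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)
```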

Changed

  • Changed deprecated enable_pl_optimizer=True (#5244)

Fixed

  • Fixed transfer_batch_to_device for DDP with len(device_ids) == 1 (#5195)
  • Fixed logging to happen only when not should_accumulate() during training (#5417)
  • Resolved an interpolation bug with Hydra (#5406)
  • Checked environment variables before selecting a seed to prevent a warning message (#4743)

Contributors

@ananthsub, @SeanNaren, @tchaton

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

06 Jan 10:17
4d9db86

[1.1.3] - 2021-01-05

Added

  • Added a check for optimizer attached to lr_scheduler (#5338)
  • Added support for passing non-existing filepaths to resume_from_checkpoint (#4402)
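
A short sketch of the resume_from_checkpoint change above (the path is illustrative):

```python
from pytorch_lightning import Trainer

# The checkpoint may not exist yet (e.g. the first run of a pre-emptible job);
# as of this release the Trainer accepts such a path instead of raising.
trainer = Trainer(resume_from_checkpoint="checkpoints/last.ckpt")
```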

Changed

  • Skipped restoring from resume_from_checkpoint while testing (#5161)
  • Allowed log_momentum for adaptive optimizers in LearningRateMonitor (#5333), as shown in the sketch after this list
  • Disabled checkpointing, early stopping and logging with fast_dev_run (#5277)
  • Distributed group defaults to WORLD if None (#5125)
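
A minimal sketch of the log_momentum change (the logging_interval value is illustrative):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import LearningRateMonitor

# log_momentum now also covers adaptive optimizers such as Adam,
# where the logged momentum comes from the optimizer's beta parameters.
lr_monitor = LearningRateMonitor(logging_interval="step", log_momentum=True)
trainer = Trainer(callbacks=[lr_monitor])
```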

Fixed

  • Fixed trainer.test returning non-test metrics (#5214)
  • Fixed metric state reset (#5273)
  • Fixed --num-nodes on DDPSequentialPlugin (#5327)
  • Fixed invalid value for weights_summary (#5296)
  • Fixed Trainer.test not using the latest best_model_path (#5161)
  • Fixed existence check for hparams not using underlying filesystem (#5250)
  • Fixed LightningOptimizer AMP bug (#5191)
  • Fixed key casting to string in _flatten_dict (#5354)

Contributors

@8greg8, @haven-jeon, @kandluis, @marload, @rohitgr7, @tadejsv, @tarepan, @tchaton

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

23 Dec 09:38
5820887

Detail changes

Added

  • Added support for logging plain numbers with sync_dist=True (#5080), as shown in the sketch below
  • Added a logging-step offset when resuming runs with the Wandb logger (#5050)
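
A minimal sketch of logging a plain number with sync_dist=True (the metric name and value are illustrative):

```python
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def validation_step(self, batch, batch_idx):
        num_skipped = 3  # a plain Python number, not a tensor
        # sync_dist=True reduces the value across processes before logging;
        # plain numbers are now accepted in addition to tensors.
        self.log("num_skipped", num_skipped, sync_dist=True)
```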

Removed

  • Set enable_pl_optimizer=False by default to temporarily fix AMP issues (#5163)

Fixed

  • Fixed metric reduction with logging (#5150)
  • Removed NaN loss in manual optimization (#5121)
  • Properly supported unbalanced logging (#5119)
  • Fixed hanging in DDP HPC accelerators (#5157)
  • Fixed the saved filename in ModelCheckpoint when it already exists (#4861)
  • Fixed TensorRunningAccum reset (#5106)
  • Updated DALIClassificationLoader to not use deprecated arguments (#4925)
  • Corrected call to torch.no_grad (#5124)

Contributors

@8greg8, @ananthsub, @borisdayma, @gan3sh500, @rohitgr7, @SeanNaren, @tchaton, @VinhLoiIT

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

15 Dec 23:32
748a74e

Detail changes

Added

  • Added a notebook example that reaches a quick baseline of ~94% accuracy on CIFAR10 using ResNet in Lightning (#4818)

Changed

  • Simplified accelerator steps (#5015)
  • Refactored loading in the checkpoint connector (#4593)

Removed

  • Dropped duplicate metrics (#5014)
  • Removed the beta argument from the F1 class and functional metric (#5076)

Fixed

  • Fixed trainer defaulting to None in DDPAccelerator (#4915)
  • Fixed LightningOptimizer to expose optimizer attributes (#5095)
  • Do not warn when the name key is used in the lr_scheduler dict (#5057)
  • Check whether the optimizer supports closures (#4981)
  • Extended LightningOptimizer to expose underlying Optimizer attributes and updated docs (#5095)
  • Added deprecated metric utility functions back to functional (#5067, #5068)
  • Allowed any input in to_onnx and to_torchscript (#4378)

Contributors

@Borda, @carmocca, @hemildesai, @rohitgr7, @s-rog, @tarepan, @tchaton

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Model Parallelism Training and More Logging Options

10 Dec 01:05
cdbddbe

Overview

Lightning 1.1 is out! You can now train models with twice as many parameters, with zero code changes, using the new sharded model training! We also have a new plugin for sequential model parallelism, more logging options, and a lot of improvements!
Release highlights: https://bit.ly/3gyLZpP

Learn more about sharded training: https://bit.ly/2W3hgI0
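
A minimal sketch of turning on sharded training (assumes the 1.1-era plugin string and a fairscale install; GPU count and precision are illustrative):

```python
from pytorch_lightning import Trainer

# Sharded training splits optimizer state and gradients across GPUs,
# so larger models fit without any change to the LightningModule.
trainer = Trainer(gpus=2, accelerator="ddp", precision=16, plugins="ddp_sharded")
```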

Detail changes

Added

  • Added "monitor" key to saved ModelCheckpoints (#4383)
  • Added ConfusionMatrix class interface (#4348)
  • Added multiclass AUROC metric (#4236)
  • Added global step indexing to the checkpoint name for a better sub-epoch checkpointing experience (#3807)
  • Added optimizer hooks in callbacks (#4379)
  • Added option to log momentum (#4384)
  • Added current_score to ModelCheckpoint.on_save_checkpoint (#4721)
  • Added logging using self.log in train and evaluation for epoch end hooks (#4913)
  • Added ability for DDP plugin to modify optimizer state saving (#4675)
  • Added casting to Python types for NumPy scalars when logging hparams (#4647)
  • Added prefix argument in loggers (#4557)
  • Added printing of the total number of parameters, plus trainable and non-trainable counts, in ModelSummary (#4521)
  • Added PrecisionRecallCurve, ROC, AveragePrecision class metric (#4549)
  • Added custom Apex and NativeAMP as Precision plugins (#4355)
  • Added DALI MNIST example (#3721)
  • Added sharded plugin for DDP for multi-GPU training memory optimizations (#4773)
  • Added experiment_id to the NeptuneLogger (#3462)
  • Added PyTorch Geometric integration example with Lightning (#4568)
  • Added all_gather method to LightningModule, which allows gradient-based tensor synchronization for use cases such as negative sampling (#5012), as shown in the sketch after this list
  • Enabled self.log in most functions (#4969)
  • Added changeable extension variable for ModelCheckpoint (#4977)
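
A minimal sketch combining two of the additions above, all_gather with gradient synchronization and self.log (the encoder, its dimensions, and the toy loss are illustrative):

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.encoder = torch.nn.Linear(32, 16)

    def training_step(self, batch, batch_idx):
        embeddings = self.encoder(batch)
        # Gather embeddings from every process; sync_grads=True keeps the
        # gather differentiable, e.g. for negative sampling.
        all_embeddings = self.all_gather(embeddings, sync_grads=True)
        loss = all_embeddings.pow(2).mean()
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)
```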

Changed

  • Removed multiclass_roc and multiclass_precision_recall_curve, use roc and precision_recall_curve instead (#4549)
  • Tuner algorithms will be skipped if fast_dev_run=True (#3903)
  • WandbLogger does not force wandb reinit arg to True anymore and creates a run only when needed (#4648)
  • Changed automatic_optimization to be a model attribute (#4602)
  • Changed Simple Profiler report to order by percentage time spent + num calls (#4880)
  • Simplified optimization logic (#4984)
  • Overhauled classification metrics (#4837)
  • Updated fast_dev_run to accept an integer representing the number of batches (#4629), as shown in the sketch after this list
  • Refactored optimizer (#4658)
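
A minimal sketch of the integer form of fast_dev_run (the batch count 7 is illustrative):

```python
from pytorch_lightning import Trainer

# fast_dev_run=True still runs a single batch; an integer now runs exactly
# that many train/val batches as a quick sanity check.
trainer = Trainer(fast_dev_run=7)
```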

Deprecated

  • Deprecated prefix argument in ModelCheckpoint (#4765)
  • Deprecated the old way of assigning hyper-parameters through self.hparams = ... (#4813); see the sketch after this list
  • Deprecated mode='auto' from ModelCheckpoint and EarlyStopping (#4695)
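
A minimal sketch of the preferred replacement for direct hparams assignment (the init arguments are illustrative):

```python
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self, hidden_dim: int = 128, lr: float = 1e-3):
        super().__init__()
        # Preferred: collects the init arguments into self.hparams and
        # stores them with checkpoints.
        self.save_hyperparameters()
        # Deprecated: assigning directly, e.g. `self.hparams = {...}`.
```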

Removed

  • Removed reorder parameter of the auc metric (#5004)

Fixed

  • Added feature to move tensors to CPU before saving (#4309)
  • Fixed LoggerConnector to have logged metrics on root device in DP (#4138)
  • Automatically convert tensors to contiguous format in gather_all (#4907)
  • Fixed PYTHONPATH for DDP test model (#4528)
  • Fixed logger to support indexing (#4595)
  • Fixed DDP and manual_optimization (#4976)

Contributors

@ananyahjha93, @awaelchli, @blatr, @Borda, @borisdayma, @carmocca, @ddrevicky, @george-gca, @gianscarpe, @irustandi, @janhenriklambrechts, @jeremyjordan, @justusschock, @lezwon, @rohitgr7, @s-rog, @SeanNaren, @SkafteNicki, @tadejsv, @tchaton, @williamFalcon, @zippeurfou

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

24 Nov 16:55

Detail changes

Added

  • Added casting to Python types for NumPy scalars when logging hparams (#4647)
  • Added warning when progress bar refresh rate is less than 20 on Google Colab to prevent crashing (#4654)
  • Added F1 class metric (#4656)

Changed

  • Consistently use step=trainer.global_step in LearningRateMonitor independently of logging_interval (#4376)
  • Metric states are no longer added to state_dict by default (#4685)
  • Renamed class metric Fbeta to FBeta (#4656)
  • Model summary: added 1 decimal place (#4745)
  • Do not override PYTHONWARNINGS (#4700)

Fixed

  • Fixed checkpoint hparams dict casting when omegaconf is available (#4770)
  • Fixed incomplete progress bars when total batches not divisible by refresh rate (#4577)
  • Updated SSIM metric (#4566, #4656)
  • Fixed batch_arg_name bug - added batch_arg_name to all calls to _adjust_batch_size (#4812)
  • Fixed moving torchtext data to the GPU (#4785)
  • Fixed a crash bug in the MLflow logger (#4716)

Contributors

@awaelchli, @jonashaag, @jungwhank, @M-Salti, @moi90, @pgagarinov, @s-rog, @Samyak2, @SkafteNicki, @teddykoker, @ydcjeff

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

17 Nov 21:57

Detail changes

Added

  • Added lambda closure to manual_optimizer_step (#4618)

Changed

  • Changed Metrics persistent default mode to False (#4685)

Fixed

  • Prevented a crash when sync_dist=True on CPU (#4626)
  • Fixed averaging of progress bar metrics (#4534)
  • Fixed setup callback hook to correctly pass the LightningModule through (#4608)
  • Allowed decorating the model's init with hparams saving inside (#4662)
  • Fixed split_idx being set on the Trainer by LoggerConnector in on_trainer_init (#4697)

Contributors

@ananthsub, @Borda, @SeanNaren, @SkafteNicki, @tchaton

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

11 Nov 13:16

Detail changes

Added

  • Added metrics aggregation in Horovod and fixed early stopping (#3775)
  • Added manual_optimizer_step, which works with native AMP and accumulated_grad_batches (#4485)
  • Added persistent(mode) method to metrics to enable and disable metric states being added to state_dict (#4482), as shown in the sketch after this list
  • Added congratulations at the end of our notebooks (#4555)
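
A minimal sketch of the new persistent(mode) toggle (Accuracy is used here only as an example metric):

```python
from pytorch_lightning.metrics import Accuracy

accuracy = Accuracy()
# persistent() toggles whether the metric's states are added to state_dict
# (and therefore saved in checkpoints).
accuracy.persistent(False)
```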

Fixed

  • Fixed missing functionality in hpc_load (#4526)
  • Fixed metrics states being overridden in DDP mode (#4482)
  • Fixed lightning_getattr, lightning_hasattr not finding the correct attributes in datamodule (#4347)
  • Fixed automatic optimization with AMP via manual_optimizer_step (#4485)
  • Replaced MisconfigurationException with a warning in the ModelCheckpoint callback (#4560)
  • Fixed logged keys in the MLflow logger (#4412)
  • Fixed is_picklable by catching AttributeError (#4508)

Contributors

@dscarmo, @jtamir, @kazhang, @maxjeblick, @rohitgr7, @SkafteNicki, @tarepan, @tchaton, @tgaddair, @williamFalcon

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

04 Nov 02:00

Detail changes

Added

  • Added PyTorch 1.7 Stable support (#3821)
  • Added timeout for tpu_device_exists to ensure process does not hang indefinitely (#4340)

Changed

  • W&B logging is now in sync with Trainer step (#4405)
  • Hook on_after_backward is called only when optimizer_step is being called (#4439)
  • Moved track_and_norm_grad into training loop and called only when optimizer_step is being called (#4439)
  • Changed type checker with explicit cast of ref_model object (#4457)

Deprecated

  • Deprecated passing ModelCheckpoint instance to checkpoint_callback Trainer argument (#4336)
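
A minimal sketch of the migration implied by this deprecation (the monitor key is illustrative):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint(monitor="val_loss")

# Deprecated: Trainer(checkpoint_callback=checkpoint)
# Preferred: pass the instance through the generic callbacks list.
trainer = Trainer(callbacks=[checkpoint])
```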

Fixed

  • Disabled saving checkpoints if the model was not trained (#4372)
  • Fixed error using auto_select_gpus=True with gpus=-1 (#4209)
  • Disabled training when limit_train_batches=0 (#4371)
  • Fixed metrics so they do not store the computational graph for all seen data (#4313)
  • Fixed AMP unscale for on_after_backward (#4439)
  • Fixed TorchScript export when module includes Metrics (#4428)
  • Fixed CSV logger warning (#4419)
  • Fixed skipping of DDP parameter sync (#4301)

Contributors

@ananthsub, @awaelchli, @borisdayma, @carmocca, @justusschock, @lezwon, @rohitgr7, @SeanNaren, @SkafteNicki, @ssaru, @tchaton, @ydcjeff

If we forgot someone because their commit email did not match their GitHub account, let us know :]

Standard weekly patch release

27 Oct 22:15
5d10a36

Detail changes

Added

  • Added dirpath and filename parameters in ModelCheckpoint (#4213); see the sketch after this list
  • Added plugins docs and DDPPlugin to customize ddp across all accelerators (#4258)
  • Added strict option to the scheduler dictionary (#3586)
  • Added fsspec support for profilers (#4162)
  • Added autogenerated helptext to Trainer.add_argparse_args (#4344)
  • Added support for string values in Trainer's profiler parameter (#3656)
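
A minimal sketch of the dirpath/filename arguments and the string profiler values added above (directory, filename template, and profiler choice are illustrative):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# dirpath + filename replace the old single filepath argument.
ckpt = ModelCheckpoint(dirpath="checkpoints/", filename="{epoch}-{val_loss:.2f}")

# The profiler parameter now accepts strings such as "simple" or "advanced".
# The scheduler dictionary's new `strict` flag is set in configure_optimizers,
# e.g. return [opt], [{"scheduler": sched, "monitor": "val_loss", "strict": True}]
trainer = Trainer(callbacks=[ckpt], profiler="simple")
```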

Changed

  • Improved error messages for invalid configure_optimizers returns (#3587)
  • Allow changing the logged step value in validation_step (#4130)
  • Allow setting replace_sampler_ddp=True with a distributed sampler already added (#4273)
  • Fixed sanitized parameters for WandbLogger.log_hyperparams (#4320)

Deprecated

  • Deprecated filepath in ModelCheckpoint (#4213)
  • Deprecated reorder parameter of the auc metric (#4237)
  • Deprecated bool values in Trainer's profiler parameter (#3656)

Fixed

  • Fixed setting device ids in DDP (#4297)
  • Fixed synchronization of best model path in ddp_accelerator (#4323)
  • Fixed WandbLogger not uploading checkpoint artifacts at the end of training (#4341)

Contributors

@ananthsub, @awaelchli, @carmocca, @ddrevicky, @louis-she, @mauvilsa, @rohitgr7, @SeanNaren, @tchaton

If we forgot someone because their commit email did not match their GitHub account, let us know :]