Proposed refactoring or deprecation
Instead of disabling shuffle / replacing RandomSampler with SequentialSampler in the train dataloader, replace the train dataset with a fixed subset of it using torch.utils.data.Subset (e.g. the first N samples of the dataset, where N is given by overfit_batches). This gives the same dataset samples as the previous implementation.
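A minimal sketch of the proposed approach, using a toy dataset and illustrative names (full_dataset, overfit_batches, batch_size are assumptions here, not Lightning internals):

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Toy dataset standing in for the user's train dataset (illustrative only).
full_dataset = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))

overfit_batches = 4  # assumed here to be an int number of batches
batch_size = 8
num_samples = overfit_batches * batch_size  # first N samples to overfit on

# Proposed: restrict the dataset itself to its first N samples...
overfit_dataset = Subset(full_dataset, range(num_samples))

# ...and leave the user's shuffle setting untouched, so batches can still
# be re-ordered every epoch within those N samples.
train_loader = DataLoader(overfit_dataset, batch_size=batch_size, shuffle=True)
```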
Motivation
This prevents the training batches from being identical in every epoch.
Pitch
Added on 12 Oct 2021:
The current implementation of overfit_batches disables shuffling by replacing RandomSampler with SequentialSampler in the train dataloader, in order to restrict the training/overfitting to the first N samples of the train dataset for every epoch. However, this gives the same sequence of batches, and non-unique batches, across epochs, which is undesirable.
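Roughly (a simplified stand-in, not the actual Lightning internals), the current behaviour corresponds to a loader like this, which yields the same batch order every epoch:

```python
import torch
from torch.utils.data import DataLoader, SequentialSampler, TensorDataset

# Toy dataset standing in for a real train dataset.
dataset = TensorDataset(torch.arange(32).float().unsqueeze(1))

# Shuffling is disabled by forcing a SequentialSampler, so each epoch
# walks the data in the same fixed order; the trainer then consumes only
# the first N batches of this loader per epoch.
loader = DataLoader(dataset, batch_size=4, sampler=SequentialSampler(dataset))
```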
We should instead allow shuffling within the N samples across epochs, according to the shuffle option of the train dataloader, in order to give a different sequence of batches across epochs and mostly unique batches throughout the training process.
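A small sketch of the intended effect (toy data, illustrative names): the same N samples are kept, but their order, and hence the batch composition, changes from epoch to epoch when shuffle=True is honoured.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

dataset = TensorDataset(torch.arange(32).float().unsqueeze(1))
subset = Subset(dataset, range(8))  # overfit on the first N = 8 samples only

# shuffle=True is honoured, so the 8 samples are re-ordered each epoch.
loader = DataLoader(subset, batch_size=4, shuffle=True)

for epoch in range(3):
    batches = [batch[0].squeeze(1).tolist() for batch in loader]
    print(f"epoch {epoch}: {batches}")  # same samples, different batch order per epoch
```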