Description & Motivation
Currently, the GradientAccumulationScheduler only supports scheduling on epoch intervals. However, during pretraining tasks, the model might only run for a single epoch. Therefore, it would be beneficial to be able to schedule gradient accumulation by the number of optimizer steps taken, i.e. trainer.global_step.
Proposal:
- add an `interval` parameter to `GradientAccumulationScheduler`, which can be `"epoch"` or `"step"`, defaulting to `"epoch"` for backwards compatibility
- add a condition to the current `on_train_epoch_start` to only trigger if `interval == "epoch"`
- add an `on_train_batch_start`/`on_after_optimizer_step` hook, triggering if `interval == "step"`
However, given the current warning that scheduling is incompatible with DeepSpeed, I am not sure whether step-based scheduling would be supported by all (or only some) strategies.
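A minimal sketch of what the step-based scheduling could look like, written as a standalone callback rather than a change to the built-in GradientAccumulationScheduler. It assumes Lightning 2.x (the import path may be `pytorch_lightning` on older versions); treating the scheduling keys as global steps is the behaviour proposed here, not an existing API, and the class name is a placeholder:

```python
from lightning.pytorch.callbacks import Callback


class StepwiseGradientAccumulationScheduler(Callback):
    def __init__(self, scheduling: dict):
        # e.g. {0: 4, 10_000: 8}: accumulate 4 batches until step 10k, then 8.
        self.scheduling = dict(sorted(scheduling.items()))

    def on_train_batch_start(self, trainer, pl_module, batch, batch_idx):
        # Apply the factor of the largest scheduled step reached so far.
        factor = trainer.accumulate_grad_batches
        for step, value in self.scheduling.items():
            if trainer.global_step >= step:
                factor = value
        # The built-in epoch-based scheduler also works by mutating this
        # Trainer attribute, so the same mechanism is reused here.
        trainer.accumulate_grad_batches = factor
```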
Pitch
I want to be able to schedule gradient accumulation by trainer.global_step instead of trainer.current_epoch.
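For context, using the step-based sketch from the proposal above (the callback name is the placeholder from that sketch, not part of the current Lightning API) could look something like:

```python
from lightning.pytorch import Trainer

# Switch from 4 to 8 accumulated batches at global step 10k, then to 16 at 50k,
# within a single-epoch pretraining run.
trainer = Trainer(
    max_epochs=1,
    callbacks=[StepwiseGradientAccumulationScheduler({0: 4, 10_000: 8, 50_000: 16})],
)
```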
Alternatives
Additional context
Could depend on having an on_optimizer_step hook for callbacks. See #11688 (comment)
cc @lantiga