ModelCheckpoint needs to monitor average loss #20652

@Johnson-yue

Description & Motivation

I know that ModelCheckpoint can monitor a metric such as "train_loss" or "val_loss" and, when "every_n_train_steps" is set, save a checkpoint whenever the monitored value reaches a new minimum. What I want instead is to accumulate "train_loss" or "val_loss" from one "every_n_train_steps" boundary to the next, and save the model only when the average loss over that interval is lower than before.
Example:
every_n_train_steps = 50, monitor = "train_loss"

From step=50 to step=100, if the average "train_loss" is lower than the average over step=[0,50], then save the model.
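
To make the request concrete, here is a minimal sketch of the bookkeeping in plain Python, independent of Lightning. The class name `WindowedLossTracker` and its `update` method are hypothetical and not part of any library. The sketch assumes two things the description leaves open: the first completed window always counts as an improvement because there is no baseline yet, and each window is compared against the best window mean seen so far rather than only the immediately preceding window.

```python
from typing import Optional


class WindowedLossTracker:
    """Hypothetical helper: averages a loss over fixed-size windows of steps
    and reports when a finished window beats the best window mean so far."""

    def __init__(self, window_size: int) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self.best_mean: Optional[float] = None  # lowest window mean seen so far
        self._sum = 0.0
        self._count = 0

    def update(self, loss: float) -> bool:
        """Record one step's loss.

        Returns True only on the step that closes a window whose mean is
        strictly lower than every earlier window mean (assumption: the very
        first window always qualifies).
        """
        self._sum += float(loss)
        self._count += 1
        if self._count < self.window_size:
            return False

        # The window is complete: compute its mean and reset the accumulator.
        mean = self._sum / self._count
        self._sum, self._count = 0.0, 0

        if self.best_mean is None or mean < self.best_mean:
            self.best_mean = mean
            return True
        return False
```

One possible way to use this inside Lightning would be a custom `Callback` that calls `update` from `on_train_batch_end` and calls `trainer.save_checkpoint(...)` whenever it returns True. That wiring is not shown here and has not been tested against a specific Lightning version.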

Pitch

No response

Alternatives

No response

Additional context

No response

cc @lantiga @Borda
