The problem shows: version incompatibility from v1.3.x to v2.4 #20308

@sunhan3787

Description

Bug description

lightning_fabric.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau conditioned on metric val_loss which is not available.
Available metrics are: ['train_loss'...]. Condition can be set using monitor key in lr scheduler dict

What version are you seeing the problem on?

v2.4

How to reproduce the bug

https://github.com/jiaor17/DiffCSP
I'm trying to run this project on PyTorch Lightning v2.4.0. I fixed the usual migration issues after reading the Lightning documentation.
Is modifying my config file the only way to fix this problem?

Error messages and logs


File "/data/coding/DiffCSP/diffcsp/run.py", line 181, in main
run(cfg)
File "/data/coding/DiffCSP/diffcsp/run.py", line 168, in run
trainer.fit(model=model, datamodule=datamodule, ckpt_path=ckpt)
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 538, in fit
call._call_and_handle_interrupt(
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 47, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 574, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 981, in _run
results = self._run_stage()
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1025, in _run_stage
self.fit_loop.run()
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 206, in run
self.on_advance_end()
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 386, in on_advance_end
self.epoch_loop.update_lr_schedulers("epoch", update_plateau_schedulers=not self.restarting)
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 349, in update_lr_schedulers
self._update_learning_rates(interval=interval, update_plateau_schedulers=update_plateau_schedulers)
File "/data/miniconda/envs/torch/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 384, in _update_learning_rates
raise MisconfigurationException(
lightning_fabric.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau conditioned on metric val_loss which is not available. Available metrics are: ['train_loss', 'train_loss_step', 'lattice_loss', 'lattice_loss_step', 'coord_loss', 'coord_loss_step', 'train_loss_epoch', 'lattice_loss_epoch', 'coord_loss_epoch']. Condition can be set using monitor key in lr scheduler dict

Environment

Current environment
- PyTorch Lightning Version: 2.4.0
- PyTorch Version: 2.3.0
- Python version: 3.10
- OS: Ubuntu 22.04.4 LTS
- CUDA/cuDNN version: CUDA 12.1
- GPU models and configuration: RTX 3060
- How you installed Lightning (`conda`, `pip`, source): pip

More info

I'm confident the code itself is OK. If I want to fix this problem simply, is changing the config file the best way?
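For context, a minimal sketch of the usual remedy for this error: either run a validation loop so `val_loss` is actually logged, or point the plateau scheduler's `monitor` key at a metric that is logged (the traceback lists `train_loss_epoch` as available). The helper name `build_scheduler_config` below is hypothetical, not part of Lightning's API; the `"monitor"` key in the `lr_scheduler` dict is what the exception message refers to.

```python
# Hypothetical helper showing the lr_scheduler dict shape that
# configure_optimizers returns so Lightning knows which logged
# metric conditions a ReduceLROnPlateau-style scheduler.
def build_scheduler_config(scheduler, monitor="train_loss_epoch"):
    """Wrap a scheduler with the keys Lightning reads from the
    lr_scheduler dict; 'monitor' must name a metric passed to
    self.log(...) (e.g. 'train_loss_epoch', or 'val_loss' once
    validation runs and logs it)."""
    return {
        "scheduler": scheduler,   # the torch scheduler instance
        "interval": "epoch",      # step once per epoch
        "monitor": monitor,       # metric the plateau check watches
    }

# Sketch of use inside a LightningModule (assumes self.parameters()):
#   def configure_optimizers(self):
#       opt = torch.optim.Adam(self.parameters())
#       sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt)
#       return {"optimizer": opt,
#               "lr_scheduler": build_scheduler_config(sched)}
```

Whether editing the config file or `configure_optimizers` is cleaner depends on how the DiffCSP repo wires its Hydra config into the module; both end up setting the same `monitor` key.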

Labels: bug (Something isn't working), needs triage (Waiting to be triaged by maintainers), ver: 2.4.x
