load data sequence is confusing #20358

@workhours

Bug description

As I understand it, Lightning consumes data in this order:
1. sanity check: calls val_dataloader
2. training: calls train_dataloader
3. validation: calls val_dataloader
From this sequence, an epoch's cycle appears to start at val_dataloader and end at train_dataloader, with the validation in step 3 reusing the data loaded by the val_dataloader call in step 1.
But trainer.current_epoch suggests otherwise: suppose current_epoch is 1 when val_dataloader is called for the sanity check; it then increases to 2 at train_dataloader. In that case the cycle seems to start at train_dataloader and end at val_dataloader.
This makes it unclear how to write val_dataloader when data is loaded dynamically. With an unbounded number of epochs there is no problem. But in the last epoch (and at that point I cannot tell that it is the last one), should I ignore that val_data is None, or should I try to load it as if another cycle were coming?

I think the sanity check and validation logic should be merged into a single data setup that is used twice for different purposes. Calling val_dataloader twice and train_dataloader once also makes data loading difficult to manage.
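
For illustration, the call order described above can be modelled in plain Python. This is a toy sketch, not Lightning's Trainer: `ToyDataModule`, `toy_fit`, and the `epoch` argument are hypothetical names, and it assumes the validation loader is built once during the sanity check and reused for every later validation run.

```python
# Toy model of the call order described in this issue.
# NOT Lightning's Trainer: every name here is hypothetical.


class ToyDataModule:
    def __init__(self):
        # (hook_name, epoch) tuples, in the order the hooks were called
        self.calls = []

    def train_dataloader(self, epoch):
        self.calls.append(("train_dataloader", epoch))
        return ["train-batch-0", "train-batch-1"]

    def val_dataloader(self, epoch):
        self.calls.append(("val_dataloader", epoch))
        return ["val-batch-0", "val-batch-1"]


def toy_fit(data, max_epochs):
    """Run sanity check, then train and validate once per epoch."""
    phases = []

    # 1. sanity check: the validation loader is built here (assumed cached)
    val_loader = data.val_dataloader(epoch=0)
    phases.append(("sanity_check", 0))

    # 2. training: the training loader is built before the first epoch
    train_loader = data.train_dataloader(epoch=0)

    for epoch in range(max_epochs):
        for _batch in train_loader:
            pass  # a training step would go here
        phases.append(("train", epoch))

        # 3. validation: reuses the loader built during the sanity check
        for _batch in val_loader:
            pass  # a validation step would go here
        phases.append(("validate", epoch))

    return phases


data = ToyDataModule()
phases = toy_fit(data, max_epochs=2)
print(data.calls)   # which loader hooks ran, and in what order
print(phases)       # which phases ran, and in what order
```

In this toy model each loader hook runs only once, so the two readings of where an epoch starts differ only in whether the sanity check is counted as part of the first cycle.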

What version are you seeing the problem on?

v2.4

How to reproduce the bug

No response

Error messages and logs

# Error messages and logs here please

Environment

Current environment
#- PyTorch Lightning Version (e.g., 2.4.0):
#- PyTorch Version (e.g., 2.4):
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):

More info

No response

cc @tchaton
