Best practices for accessing DataModule setup() info (e.g. n_classes, label encodings) from the lightning model __init__ #19208

turian · 2023-12-22T23:12:22Z


What are best practices for retrieving info from the DataModule setup in my LightningModule?

Use case: My LightningModule wants the label encoding in __init__. Okay, it doesn't strictly need it in __init__; it needs it in its training/val steps, for metrics etc. But I can't get at it, and my workaround is gnarly.

Background: I just started a new pipeline and wrote a DataModule. I put the dataset loading in my setup(), as directed. That step determines the number of classes and the label encoding deterministically, so both should be the same across all nodes.
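A minimal sketch of what a deterministic encoding can look like, assuming the class index is derived from the sorted set of label strings. The helper name `build_label_encoding` and the label values are hypothetical, chosen only for illustration, and this is not necessarily how the encoder in the pipeline above works:

```python
def build_label_encoding(labels):
    """Map each distinct label to an integer index, in sorted order."""
    classes = sorted(set(labels))
    return {name: index for index, name in enumerate(classes)}


# Hypothetical labels, seen in a different order by two processes.
rank0_labels = ["vibrato", "tremolo", "staccato", "tremolo"]
rank1_labels = ["tremolo", "staccato", "vibrato", "tremolo"]

encoding_rank0 = build_label_encoding(rank0_labels)
encoding_rank1 = build_label_encoding(rank1_labels)
n_classes = len(encoding_rank0)
```

Because the mapping depends only on which labels occur and not on the order they are read in, every process that loads the same dataset arrives at the same encoding and the same `n_classes`.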

Calling

    trainer.fit(lightning_model, datamodule)

didn't work, and neither did

    trainer.fit(lightning_model, datamodule=datamodule)

(!? Maybe I should file a bug), so I'm back to:

    datamodule.setup("fit")
    trainer.fit(
        lightning_model,
        train_dataloaders=datamodule.train_dataloader(),
        val_dataloaders=datamodule.val_dataloader(),
    )

But as I iterate on my code and discover that new features require my LightningModule to have DataModule setup information, I'm stumped. What are the best practices?

I shouldn't put state in prepare_data(); the docs warn against that. But setup(), where the docs say state belongs, runs on each individual process rather than only on the master.

I do this but it's pretty gross:

    datamodule.setup("fit")
    # WRITING A NEW ATTRIBUTE IS ESPECIALLY GROSS
    lightning_model.class_names = datamodule.technique_encoder.classes_

Is there a better pattern?
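One candidate pattern, sketched here with plain-Python stand-ins rather than real Lightning classes: run setup() before constructing the model and pass the derived values in as constructor arguments, so no attribute has to be patched on afterwards. `StandInDataModule`, `StandInModel`, and the label strings are hypothetical names for illustration only, and the sketch assumes the encoding is cheap to compute before the trainer starts.

```python
class StandInDataModule:
    """Plain-Python stand-in for a DataModule (not a Lightning class)."""

    def __init__(self, raw_labels):
        self.raw_labels = raw_labels
        self.class_names = None  # filled in by setup()

    def setup(self, stage):
        # Sorted order makes the result identical on every process.
        self.class_names = sorted(set(self.raw_labels))


class StandInModel:
    """Plain-Python stand-in for a LightningModule (not a Lightning class)."""

    def __init__(self, class_names):
        # The model receives what it needs explicitly in __init__.
        self.class_names = list(class_names)
        self.n_classes = len(self.class_names)


datamodule = StandInDataModule(["tremolo", "vibrato", "tremolo"])
datamodule.setup("fit")
lightning_model = StandInModel(class_names=datamodule.class_names)
```

The trade-off is that setup() is called by hand once before training, just as in the workaround above, but the model's dependencies are visible in its constructor signature instead of being attached afterwards.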

Replies: 0 comments
