Separation of concerns #13450

jacanchaplais · 2022-06-29T18:02:45Z

Hi everyone, I hope this post isn't too general. I'm going to keep it short and won't include code examples off the bat; I just want to hear what the developers and others think, and whether I am misunderstanding something.

I used Lightning a while back to make multi-GPU training easy with my graph neural network models. However, one thing I didn't really like was that it felt like monolithic objects with low cohesion were strongly encouraged. By this I mean that one class would define each layer, hold the code for the training, validation, and test loops, contain the metric trackers and update them, and so on.

On the one hand, I get that collating all of this functionality is what leads to the API being able to implicitly synchronise everything in parallel across all of the GPUs and workers, but on the other hand I found it much harder to manage my code base as it grew in complexity.

Are there any known best practices for keeping code in modular, cohesive chunks, while also being able to leverage the benefits of this library? Can I simply separate the bulk of the attributes included in Lightning Modules out into separate Python modules, then instantiate them in the main model, while preserving all of the synchronisation (i.e. could I define a torch.nn.Module outside of a LightningModule without needing to specify which device everything is on, as long as I instantiate the actual object within the LightningModule class)?

I am sorry if these questions sound basic or vague. I am fairly junior when it comes to NNs, and I have abandoned Lightning to use PyTorch's multiprocessing module because of these issues, but I'd really like to make use of the great stuff in here.

Let me know if you want a more systematic and concrete breakdown of what I'm asking, or examples, etc.

akihironitta · 2022-06-29T22:40:16Z

Hi @jacanchaplais, thank you for sharing your concerns here!

> I found it much harder to manage my code base as it grew in complexity.

Was there any pain point you had in particular?

> Are there any known best practices for keeping code in modular, cohesive chunks, while also being able to leverage the benefits of this library?

PyTorch Lightning provides a variety of hooks you can override in LightningModule (doc) and Callback (doc). If the logic you put in a hook is not strictly tied to your model and is reusable for other use cases, it should be implemented in a Callback rather than in the LightningModule. This way, you can share your Callback across multiple projects or with other people.

For this reason, I think PyTorch Lightning already makes code modular and cohesive. Or, do you mean something else by "keeping code in modular, cohesive chunks"?

> Can I simply separate the bulk of the attributes included in Lightning Modules out into separate Python modules, then instantiate them in the main model, while preserving all of the synchronisation (ie. could I define a torch.nn.Module outside of a LightningModule without needing to specify which device everything is on, ...)?

Yes, definitely. Related to #8648 (comment).
