boring model for DDP #6358
Unanswered
hkmztrk
asked this question in
Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment 5 replies
-
I assume you are running with ddp_spawn. In that case you need to move the definition of the model class (TestModel) out of the function and up to the top level of the main file, like so: ...
# move class here:
class TestModel(BoringModel):
    def on_train_epoch_start(self) -> None:
        print('override any method to prove your bug')

def test_run():
    # remove definition from here and move outside of function
... There is an LSF PR here: #5102
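The advice to move the class out of the function probably comes down to pickling: spawn-based launchers send objects to the worker processes by pickling them, and Python cannot pickle an instance of a class defined inside a function. The sketch below shows that behaviour with the standard library only. It does not use Lightning, and the names `ModuleLevelModel`, `make_local_model` and `can_pickle` are hypothetical stand-ins invented for this illustration, not part of the BoringModel template.

```python
import pickle


class ModuleLevelModel:
    """Stand-in for a model class defined at the top level of the file."""

    def __init__(self) -> None:
        self.weight = 1.0


def make_local_model():
    """Return an instance of a class that is defined inside a function."""

    class LocalModel:
        def __init__(self) -> None:
            self.weight = 1.0

    return LocalModel()


def can_pickle(obj) -> bool:
    """Return True if obj survives a pickle round trip, False otherwise."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, AttributeError):
        # Depending on the Python version, pickling a local class raises
        # either PicklingError or AttributeError.
        return False


if __name__ == "__main__":
    print("module-level class picklable:", can_pickle(ModuleLevelModel()))
    print("function-local class picklable:", can_pickle(make_local_model()))
```

Run as a script, the first check should succeed and the second should fail, which matches the symptom of a model class defined inside test_run().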
-
Hello,
We (together with @haimasree) are trying to run the Boring Model example in a DDP setting on an LSF cluster to test our configuration, and we have the following issue:
Is it possible to have a Boring Model example that runs with DDP? It would also be great if you could point out whether LSF needs any special configuration that differs from SLURM.
GPU: (one of the randomly assigned ones)
- Quadro RTX 6000
- Quadro RTX 6000
- Quadro RTX 6000
- Quadro RTX 6000
@williamFalcon @Borda @tchaton @awaelchli