Closed
Description
I noticed that LocalExecutor hard-codes the value of nnodes.
Is there a reason multi-node runs are disabled? The value feeds into torch_run, which seems to support multiple nodes.
I'm asking because I use this with AML, where I can usually get multi-node jobs working with torchrun.
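
To illustrate the request, here is a minimal sketch of a launcher that takes the node count as a parameter instead of a constant. It is not the project's actual code: `build_torchrun_cmd` and its arguments are hypothetical names chosen for this example, and only the `torchrun` flags themselves (`--nnodes`, `--nproc_per_node`, `--node_rank`, `--master_addr`, `--master_port`) are real. On a cluster, the rank and master address would presumably come from whatever the scheduler provides.

```python
import shlex


def build_torchrun_cmd(script, nproc_per_node, nnodes=1, node_rank=0,
                       master_addr="127.0.0.1", master_port=29500,
                       script_args=()):
    """Build a torchrun argument list (hypothetical helper, for illustration).

    nnodes is a parameter rather than a hard-coded constant, so the same
    code path covers both single-node and multi-node launches.
    """
    if nnodes < 1:
        raise ValueError("nnodes must be >= 1")
    if not 0 <= node_rank < nnodes:
        raise ValueError("node_rank must be in [0, nnodes)")

    cmd = [
        "torchrun",
        f"--nnodes={nnodes}",
        f"--nproc_per_node={nproc_per_node}",
    ]
    # Static rendezvous details are only needed when more than one node
    # participates; a single-node launch can rely on torchrun's defaults.
    if nnodes > 1:
        cmd += [
            f"--node_rank={node_rank}",
            f"--master_addr={master_addr}",
            f"--master_port={master_port}",
        ]
    cmd.append(script)
    cmd.extend(script_args)
    return cmd


if __name__ == "__main__":
    # Example: the second of two nodes, 8 processes per node.
    # The script name and address are placeholders.
    cmd = build_torchrun_cmd("train.py", 8, nnodes=2, node_rank=1,
                             master_addr="10.0.0.4")
    print(shlex.join(cmd))
```

With the default `nnodes=1` the generated command is a plain single-node launch, so exposing the parameter this way would not change existing local behavior.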