Open
Labels
distributed (Generic distributed-related topic), feature (Is an improvement or enhancement), good first issue (Good for newcomers), help wanted (Open to be worked on), let's do it! (approved to implement)
Description
🚀 Feature
We currently set some plugin attributes, such as sync_batch_norm, num_nodes, and devices, lazily in the accelerator connector, so that the user does not have to provide them when instantiating the plugin.
Example:
Trainer(gpus=4, plugins=DDPPlugin(find_unused_parameters=True)) # plugin may require gpus, num_nodes etc.
# AcceleratorConnector does this:
training_type_plugin.num_nodes = ...
This is fragile. Some attributes may have to be recomputed based on the order in which others are set.
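To illustrate the order dependence, here is a minimal sketch. ToyPlugin, its setter methods, and the world_size attribute are hypothetical stand-ins, not the actual plugin or AcceleratorConnector code. It only shows how a derived value can go stale when attributes are assigned one at a time:

```python
class ToyPlugin:
    """Hypothetical stand-in for a training type plugin (not Lightning's class)."""

    def __init__(self):
        self.num_nodes = 1
        self.num_processes = 1
        self.world_size = 1  # derived from the two attributes above

    def set_num_processes(self, num_processes):
        # Only stores the value; the derived world_size is not refreshed here.
        self.num_processes = num_processes

    def set_num_nodes(self, num_nodes):
        self.num_nodes = num_nodes
        # Derived value is computed from whatever num_processes holds right now.
        self.world_size = self.num_nodes * self.num_processes


# Order A: processes first, then nodes. The derived value is correct.
a = ToyPlugin()
a.set_num_processes(4)
a.set_num_nodes(2)

# Order B: nodes first, then processes. The derived value is stale.
b = ToyPlugin()
b.set_num_nodes(2)
b.set_num_processes(4)

print(a.world_size, b.world_size)
```

Both objects end up with the same num_nodes and num_processes, yet their derived values differ purely because of the order of assignment.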
Pitch
Provide a single lazy_init method that takes all required arguments. The plugin is then responsible for resolving dependencies between attributes in one place:
class DDPPlugin(...):
    def lazy_init(self, **kwargs):
        self.num_nodes = kwargs.get("num_nodes")
        self.num_processes = ...
With lazy_init, the _configure_launcher method (#11643) would become obsolete. It can be merged into lazy_init.
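A self-contained sketch of how such a method might look is shown below. ToyDDPPlugin, the default values, the mapping of devices to num_processes, and the world_size attribute are assumptions made for illustration only; the real plugin would decide which arguments it needs and how they depend on each other:

```python
class ToyDDPPlugin:
    """Hypothetical plugin sketch; not the actual DDPPlugin implementation."""

    def __init__(self, find_unused_parameters=False):
        # User-facing options are still passed at instantiation.
        self.find_unused_parameters = find_unused_parameters
        # Trainer-provided attributes stay unset until lazy_init is called.
        self.num_nodes = None
        self.num_processes = None
        self.sync_batch_norm = None
        self.world_size = None

    def lazy_init(self, **kwargs):
        # All inputs arrive in one call, so assignment order is fixed here
        # and cannot be influenced by the caller.
        self.num_nodes = kwargs.get("num_nodes", 1)  # assumed default
        self.num_processes = kwargs.get("devices", 1)  # assumed mapping
        self.sync_batch_norm = kwargs.get("sync_batch_norm", False)
        # Derived values are computed last, after every input is known.
        self.world_size = self.num_nodes * self.num_processes
        return self


plugin = ToyDDPPlugin(find_unused_parameters=True)
plugin.lazy_init(num_nodes=2, devices=4)
print(plugin.world_size)
```

Because the derived value is computed once at the end of lazy_init, the result no longer depends on the order in which the caller supplies the arguments.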