Replies: 1 comment
The least intrusive way is to inherit from `MMDistributedDataParallel` and override `train_step`:

```python
from typing import Dict, Union

import torch

from mmengine.model import MMDistributedDataParallel
from mmengine.model.utils import detect_anomalous_params
from mmengine.optim import OptimWrapper
from mmengine.registry import MODEL_WRAPPERS


@MODEL_WRAPPERS.register_module()
class CustomDistributedDataParallel(MMDistributedDataParallel):

    def train_step(self, data: Union[dict, tuple, list],
                   optim_wrapper: OptimWrapper) -> Dict[str, torch.Tensor]:
        # Enable automatic mixed precision training context.
        with optim_wrapper.optim_context(self):
            data = self.module.data_preprocessor(data, training=True)
            losses = self._run_forward(data, mode='loss')
        parsed_loss, log_vars = self.module.parse_losses(losses)
        loss = optim_wrapper.scale_loss(parsed_loss)
        optim_wrapper.backward(loss)
        # modify gradients here
        if optim_wrapper.should_update():
            optim_wrapper.step()
            optim_wrapper.zero_grad()
        if self.detect_anomalous_params:
            detect_anomalous_params(parsed_loss, model=self)
        return log_vars
```

And use the model wrapper through `model_wrapper_cfg` (see https://github.com/open-mmlab/mmengine/blob/237aee386669f0d69a1caf4724bdc1e826178d7d/mmengine/runner/runner.py#L825):

```python
model_wrapper_cfg = dict(
    type='CustomDistributedDataParallel',
    # other configs
)
```
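For the `# modify gradients here` part, the wrapped model is reachable as `self.module`, so the tweak could look roughly like the sketch below (`'backbone'` is just a placeholder for whatever layer name or type you want to match):

```python
# inside the custom train_step, right after optim_wrapper.backward(loss)
for name, param in self.module.named_parameters():
    if 'backbone' in name and param.grad is not None:
        param.grad.add_(1e-4)  # slightly adjust the gradients of matched layers
```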
After loss.backward(), I want to modify the gradients of some layers, e.g., add a small value to them.
However, the update_params() call sits deep inside the MMEngine framework, in the OptimWrapper class. That part could be handled by creating a custom OptimWrapper and overriding update_params(), but how can I modify the gradients by layer name or type? The OptimWrapper class does not seem to have access to the model.
Usually, I can implement such logic with plain PyTorch:
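(Roughly something like the sketch below; the toy model, the layer-name check and the `1e-4` tweak are only placeholders to show the pattern I mean.)

```python
import torch
import torch.nn as nn

# toy model and optimizer, just to illustrate the pattern
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(4, 8)
targets = torch.randint(0, 2, (4,))

loss = nn.functional.cross_entropy(model(inputs), targets)
loss.backward()

# between backward() and step() I can pick gradients by layer name (or module type)
for name, param in model.named_parameters():
    if name.startswith('0.') and param.grad is not None:  # e.g. only the first Linear
        param.grad.add_(1e-4)  # slightly add something to the gradient

optimizer.step()
optimizer.zero_grad()
```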
It is easy to access the model and modify the gradients that way, but how can I implement the code above with MMEngine? The only approach I can think of is using a Hook, but there is no mount point between loss.backward() and optimizer.step(), since they are wrapped deeper than the train loop.
The loss.backward() and optimizer.step() logic implemented by MMEngine is:
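(Paraphrased from `OptimWrapper.update_params()` in mmengine/optim/optimizer/optimizer_wrapper.py; the exact code may differ slightly between MMEngine versions.)

```python
# simplified view of mmengine's OptimWrapper.update_params()
def update_params(self, loss: torch.Tensor) -> None:
    loss = self.scale_loss(loss)
    self.backward(loss)   # loss.backward() happens inside here
    # parameters are only updated every `accumulative_counts` iterations
    if self.should_update():
        self.step()       # optimizer.step()
        self.zero_grad()  # optimizer.zero_grad()
```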
This function belongs to the OptimWrapper class, which does not have access to the model. Its optimizer is of type torch.optim.Optimizer, and the model parameters passed to the optimizer are plain tensors, so there is no way to tell which layer they belong to.
Really hope someone can help!!