Rename variable horovod_num_processes in ReturnnTrainingJob and ReturnnRasrTrainingJob #456

@christophmluscher

Description

The jobs were extended to enable multi-GPU usage for the torch backend (see #444 and #445), so the horovod_num_processes variable name is now misleading: it is no longer specific to Horovod. The rename needs to be done carefully, since it is potentially a hash-breaking change.
Analogous to distributed_launch_command, should we rename horovod_num_processes to distributed_num_processes?
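
To make the hash concern concrete, here is a minimal sketch of one way the rename could stay hash-compatible: accept both names, warn on the old one, and keep feeding the value into the hash under the old key. It is only an illustration. `resolve_num_processes`, `hash_inputs` and `toy_hash` are hypothetical helpers, and `toy_hash` is a stand-in for the real job hashing, not how the jobs actually compute their hashes.

```python
import hashlib
import warnings
from typing import Any, Dict, Optional


def resolve_num_processes(
    distributed_num_processes: Optional[int] = None,
    horovod_num_processes: Optional[int] = None,
) -> Optional[int]:
    """Accept the new name, but keep the old one working with a warning."""
    if horovod_num_processes is not None:
        if distributed_num_processes is not None:
            raise ValueError(
                "Pass only one of distributed_num_processes / horovod_num_processes"
            )
        warnings.warn(
            "horovod_num_processes is deprecated, use distributed_num_processes",
            DeprecationWarning,
            stacklevel=2,
        )
        return horovod_num_processes
    return distributed_num_processes


def hash_inputs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Build the dict that is hashed, always using the OLD key name.

    Assumption: the hash depends on the key name, so keeping the old key
    leaves existing hashes unchanged regardless of which name was passed.
    """
    d = dict(kwargs)
    num = resolve_num_processes(
        d.pop("distributed_num_processes", None),
        d.pop("horovod_num_processes", None),
    )
    if num is not None:
        d["horovod_num_processes"] = num
    return d


def toy_hash(kwargs: Dict[str, Any]) -> str:
    """Stand-in hash for illustration only (not the real job hash)."""
    payload = repr(sorted(hash_inputs(kwargs).items()))
    return hashlib.sha256(payload.encode()).hexdigest()[:12]
```

With this approach, `toy_hash({"horovod_num_processes": 4})` and `toy_hash({"distributed_num_processes": 4})` give the same value, and leaving the parameter unset hashes the same as before the rename.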

@albertz @Judyxujj @JackTemaki comments?
