Distributed training does not exit properly after completion #129

@Licko0909

Dear Author,
This is great work. I have recently been using TAPE for a sequence design task. Single-GPU training works without problems, but I ran into an issue with distributed training.

Problem Description:

  • After distributed training finishes, the program does not exit normally and remains in a running state.
  • The training process itself completes without any problems.
  • Hardware: 4 RTX 6000 GPUs on one machine.

I'm looking forward to hearing from you!
