Description
Hi, may I ask: the paper mentions that experiments used configurations of 32 A100 GPUs and 16 A100 GPUs for model training. How long does a complete training run take in such a hardware environment? Alternatively, could you share more detailed training configurations and the corresponding training durations?