Multi-gpu training lag time between epochs #6108
OmarAshkar
started this conversation in
General
Replies: 0 comments
Hi,
I have been using the HoVerNet tutorial here: https://github.com/Project-MONAI/tutorials/tree/main/pathology/hovernet
Running on multiple GPUs distributes the data and successfully shortens the training time. However, there is a gap of about 20 minutes between epochs. With a single GPU, the delay is only about 2 minutes.
I believe this is caused by the metrics/evaluation calculations. Is there any way to solve or debug that?
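To check whether the gap really comes from evaluation, one option is to time each phase separately. The following is only a minimal sketch using the Python standard library: `PhaseTimer` and the phase names are hypothetical helpers, not part of MONAI or the tutorial, and the `time.sleep` calls are placeholders for the real training and validation steps.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)  # phase name -> total seconds
        self.counts = defaultdict(int)    # phase name -> number of calls

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the wrapped code raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        # Return (name, seconds) pairs, slowest phase first.
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


timer = PhaseTimer()
for epoch in range(2):
    with timer.phase("train"):
        time.sleep(0.01)  # placeholder for one training epoch
    with timer.phase("validation"):
        time.sleep(0.03)  # placeholder for evaluation and metric computation

for name, seconds in timer.report():
    print(f"{name}: {seconds:.3f}s over {timer.counts[name]} calls")
```

On GPU, CUDA kernels run asynchronously, so calling `torch.cuda.synchronize()` before entering and before leaving a timed phase should give more reliable numbers. In a multi-GPU run it may also help to print the report on every rank, since one slow rank can hold up the others.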
CC. @KumoLiu
Thanks in advance!