Bringing together results on ddp on a single machine #5886
Unanswered
Arij-Aladel asked this question in DDP / multi-GPU / multi-node
I have the same problem as mentioned in 1 and 2, where the solution was to use dist.all_gather() inside validation_epoch_end. The difference is that each validation_step in my case returns a dictionary:
{"val_loss": float_loss, "batch_length": int_len, "preds": text, "answer": text, "doc": text}
Is there a way to collect the text results from all processes?
I can collect the loss using dist.all_gather(), but what about the text results? Any suggestions?
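
For reference, one possible direction is sketched below. It assumes torch.distributed.all_gather_object is available in the installed PyTorch version; unlike dist.all_gather(), which works on tensors, it pickles arbitrary Python objects, so strings and dictionaries can be exchanged. The helper name gather_text_outputs is hypothetical and not part of PyTorch or PyTorch Lightning. With the NCCL backend, each process is also assumed to have its CUDA device set already.

```python
import torch.distributed as dist


def gather_text_outputs(outputs):
    """Collect per-step output dicts from every DDP process into one flat list.

    `outputs` is the list of dicts returned by validation_step on this process.
    (Hypothetical helper, written for illustration only.)
    """
    # Single-process run, or process group not set up: nothing to gather.
    if not (dist.is_available() and dist.is_initialized()):
        return list(outputs)

    # One slot per rank. all_gather_object pickles arbitrary Python objects,
    # so text fields survive, unlike the tensor-only all_gather.
    gathered = [None] * dist.get_world_size()
    dist.all_gather_object(gathered, outputs)

    # Flatten the per-rank lists into a single list, in rank order.
    return [item for rank_outputs in gathered for item in rank_outputs]
```

Inside validation_epoch_end(self, outputs), this could be called as all_outputs = gather_text_outputs(outputs), after which every process holds the "preds", "answer", and "doc" entries from all ranks. If "val_loss" is a GPU tensor rather than a float, converting it with .item() before gathering avoids pickling device tensors.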