Predict on GPU and calculate metrics on CPU, DDP mode #6433
Unanswered
frapercan asked this question in DDP / multi-GPU / multi-node
Replies: 0 comments
Hello, I'm interested in calculating PRC, ROC, and other metrics for a multilabel segmentation problem (tons of pixels). The problem is the amount of VRAM those metrics require, and I would like to keep everything in the same pipeline.
So, during training, I would like the evaluation loop to calculate those metrics without running out of memory. I have seen that the metric calculation is slower when no GPU is selected, but that is mainly because the model takes longer to predict on CPU; still, running entirely on CPU is the only way I have managed to calculate the metrics in system RAM.
I have tried setting up and feeding the PrecisionRecallCurve metric with CPU tensors while inference runs on the GPU, but I have not found a way to make such a hybrid mode work: the update step runs fine, but when I compute the metric it tells me that GPU tensors are required.
I tried something like this, along with a lot of variants:
Initialization (one GPU and DDP are configured):

```python
precision_recall_curve = PrecisionRecallCurve(pos_label=1, num_classes=1, compute_on_step=False).cpu()
```

step_end:

```python
precision_recall_curve.update(patch_pred[:, defect_index].cpu(), patch_mask[:, defect_index].cpu())
```

epoch_end:

```python
precision, recall, threshold = precision_recall_curve.compute()
```
The error, as mentioned above, comes from the compute() call and complains that GPU tensors are required.
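To make this easier to reproduce, here is a minimal, self-contained sketch of the pattern I am attempting. Only the three metric calls are taken from my actual code; the module name, the stand-in Conv2d model, the hook names, and the batch layout are illustrative assumptions on my part.

```python
import torch
import pytorch_lightning as pl
from torchmetrics import PrecisionRecallCurve


class SegmentationModule(pl.LightningModule):
    """Illustrative module: only the three metric calls mirror my real code."""

    def __init__(self, defect_index: int = 0):
        super().__init__()
        self.defect_index = defect_index
        # Stand-in for my real segmentation network.
        self.model = torch.nn.Conv2d(3, 1, kernel_size=1)
        # The metric is created on CPU and meant to stay there.
        # (Note: since it is a submodule, Lightning may move it to the GPU
        # together with the rest of the module when devices are set up.)
        self.precision_recall_curve = PrecisionRecallCurve(
            pos_label=1, num_classes=1, compute_on_step=False
        ).cpu()

    def forward(self, x):
        return self.model(x)

    def validation_step(self, batch, batch_idx):
        patch, patch_mask = batch
        patch_pred = torch.sigmoid(self(patch))  # inference stays on the GPU
        return {"pred": patch_pred, "mask": patch_mask}

    def validation_step_end(self, outputs):
        # update() with CPU tensors works fine for me...
        self.precision_recall_curve.update(
            outputs["pred"][:, self.defect_index].cpu(),
            outputs["mask"][:, self.defect_index].cpu(),
        )

    def validation_epoch_end(self, outputs):
        # ...but this compute() call is where the error about GPU tensors appears.
        precision, recall, threshold = self.precision_recall_curve.compute()
        self.precision_recall_curve.reset()
```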
I would like to discuss this kind of problem and how it can be solved. I can't find much information about CPU metric usage in the documentation, and I'm still learning.
I have seen people using the scikit-learn library; it would probably fit my requirements, but I would rather not add more pieces to the stack.
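For completeness, this is roughly how I picture the scikit-learn alternative: buffer predictions and masks on the CPU during the evaluation loop and call sklearn.metrics.precision_recall_curve once per epoch. Everything here except that function (the buffer names and helper functions) is my own assumption, just to make the idea concrete.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Buffers filled during the evaluation loop, one entry per batch,
# with tensors moved to CPU and converted to numpy right away.
pred_buffer, mask_buffer = [], []

def accumulate(patch_pred, patch_mask, defect_index):
    pred_buffer.append(patch_pred[:, defect_index].detach().cpu().numpy().ravel())
    mask_buffer.append(patch_mask[:, defect_index].detach().cpu().numpy().ravel())

def compute_epoch_metrics():
    # Flatten all pixels from the whole epoch into one 1-D array each.
    preds = np.concatenate(pred_buffer)
    targets = np.concatenate(mask_buffer)
    precision, recall, thresholds = precision_recall_curve(targets, preds)
    pred_buffer.clear()
    mask_buffer.clear()
    return precision, recall, thresholds
```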
Thanks for your work, this framework is amazing :)