Logging metrics at datapoint level #16573
Unanswered
GregorySech asked this question in code help: CV
Replies: 0 comments
I'm trying to log some metrics at the datapoint (image/dataset row/sample) level.
For example, considering a semantic segmentation task I would like to log for each image its intersection over union and loss value.
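For a binary segmentation mask, per-image intersection over union is just the overlap of the foreground pixels divided by their union. A minimal pure-Python sketch (function name and the empty-union convention are my own assumptions, flattened 0/1 masks for simplicity):

```python
def binary_iou(pred, target):
    # pred, target: flat sequences of 0/1 pixel labels for one image
    intersection = sum(p and t for p, t in zip(pred, target))
    union = sum(p or t for p, t in zip(pred, target))
    # convention choice: two empty masks count as a perfect match
    return intersection / union if union else 1.0
```

In practice this would be computed on tensors (e.g. with `torchmetrics`), but the per-image value is the same quantity.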
What I've done so far is use the LightningModule.log method and assign a log key for each image:
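The snippet referenced here did not survive extraction. As a hedged sketch of the pattern described (one log key per image, with the stage as a prefix), where the helper name, key format, and parameters are my own assumptions:

```python
def log_per_image_metrics(log_fn, stage, image_ids, ious, losses):
    """Log IoU and loss under a per-image key, e.g. 'train/iou/img_001.png'.

    log_fn is any callable taking (key, value); inside a LightningModule
    you would pass self.log here.
    """
    for image_id, iou, loss in zip(image_ids, ious, losses):
        log_fn(f"{stage}/iou/{image_id}", iou)
        log_fn(f"{stage}/loss/{image_id}", loss)
```

With Lightning, each distinct key becomes its own scalar series in TensorBoard.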
This method is then called by both training_step_end and validation_step_end with an appropriate stage string. However, I've run into an unexpected issue: 50 steps after the first log, I find the same value logged again on TensorBoard.
I should add that I'm using DDP; however, this behaviour happens regardless of the number of devices. The screenshot refers to a run with a single GPU.
The Dataset is implemented using the filename as the "image_id", so there are no duplicates (and the number of images adds up to the correct dataset size).
I was wondering if this behaviour is expected and whether there is a smarter way of logging at this level of granularity.