Why is the loss value high in Anomalib's FastFlow, DRAEM, and other models that need to be trained for many epochs? #1026
Unanswered
laogonggong847 asked this question in Q&A
Replies: 2 comments 13 replies
-
You should not care about the absolute loss value but only about your image AUROC and F1, which are both 1.0, meaning that each of your test images was predicted correctly.
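For context: in the 0.x-style anomalib YAML configs, the image-level AUROC and F1 referred to here are usually configured in a metrics section roughly like the sketch below. Key names vary between anomalib versions, so treat this as an assumption rather than the poster's actual config.

```yaml
# Illustrative sketch only -- metric key names differ across anomalib versions.
metrics:
  image:        # image-level scores: the values that matter for this question
    - F1Score
    - AUROC
  pixel:        # pixel-level scores: only meaningful when ground-truth masks exist
    - F1Score
    - AUROC
```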
-
I agree with @alexriedel1. Your performance scores 100%; I guess it cannot get any better than that. Instead of looking at the loss value, you could perhaps observe how it decreases over time during training?
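One way to watch the loss over time is to enable a logger in the config. A minimal sketch, assuming a 0.x-style anomalib config with a logging section; the option names and the output directory may differ in your version.

```yaml
# Illustrative sketch only -- logger options vary per anomalib version.
logging:
  logger: [tensorboard]   # write training curves, including the training loss, to TensorBoard
  log_graph: false
# Then inspect the curves afterwards, e.g.:
#   tensorboard --logdir results/
```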
-
Describe the bug
The same error occurs when using Anomalib's FastFlow, DRAEM, and other models that need to be trained for many epochs.
Dataset
Other (please specify in the text field below)
Model
FastFlow
Steps to reproduce the behavior
OS information
Expected behavior
Many thanks to the authors for open-sourcing this great library, Anomalib. I think it is a milestone in the defect detection field; it is great work, and congratulations to them on the result.
I have successfully trained on my own dataset using PaDiM, PatchCore, etc., and have achieved good results.
But unfortunately, I'm getting a lot of the same errors when training on my own data with FastFlow and DRAEM, models that need to iterate through many updates.
Below I will use FastFlow's logs to illustrate.
I made changes to FastFlow's config.yaml, but only to the dataset section, as follows:
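The actual dataset section from the original post is not reproduced here. As a rough illustration only, a custom folder dataset section in the 0.x-style configs tends to look like the sketch below; every path, directory name, and key is a placeholder or assumption, not the poster's real configuration.

```yaml
# Illustrative sketch only -- paths and key names are placeholders.
dataset:
  name: my_dataset
  format: folder
  path: ./datasets/my_dataset
  normal_dir: good          # defect-free images
  abnormal_dir: defect      # defective images
  task: classification      # image-level task when no masks are available
  image_size: 256
  train_batch_size: 32
  test_batch_size: 32
  num_workers: 8
```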
Then I happily picked up my coffee and prepared to wait for the 500 epochs I had set for training. But it runs for a while and then reports an error directly:
So I followed the instructions, found the corresponding place in the YAML file, and made the change (changed pixel_AUROC to image_AUROC in the metric setting).
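The setting being changed is presumably the early-stopping monitor inside the model section. A minimal sketch of what that block typically looks like in the 0.x FastFlow config, with the described change applied; the patience value shown is the usual default in those configs and is an assumption here. Note that a small patience also makes training stop after only a few epochs once the monitored metric plateaus.

```yaml
# Illustrative sketch only -- based on the 0.x-style FastFlow config.
model:
  name: fastflow
  early_stopping:
    patience: 3
    metric: image_AUROC   # changed from pixel_AUROC, as described above
    mode: max
```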
But FastFlow only trained for three epochs before stopping, and its loss is still very high (whereas after the same modification, DRAEM trained for more than 40 epochs and its loss dropped to about 0.17).
I think it is extremely unreasonable that its loss is still around 7.64e+04 at the end of training.
Screenshots
No response
Pip/GitHub
pip
What version/branch did you use?
No response
Configuration YAML
.
Logs
Code of Conduct