Image Classification Thresholds and Logic #2072
Unanswered
virtualarchitectures asked this question in Q&A
Replies: 0 comments
Hi, I'm trying to use the image classification logic in Python. I first tested using the command line as per the documentation:
Doing this, I can see that Docling adds the identified class before the base64-encoded image in the Markdown output. Using the example code for Python ( https://docling-project.github.io/docling/usage/enrichments/#picture-classification ), I can access the annotations and see the scores for each class using something like the following:
However, I'm not sure what thresholds or logic Docling uses to pick the preferred class from the scores provided. I looked at the classifier code on GitHub but couldn't find the logic for selecting the preferred category: https://github.com/docling-project/docling/blob/main/docling/models/document_picture_classifier.py
I did search the wider codebase but didn't spot what I was looking for. Could you please point me to the part of the code where selection of the preferred class is implemented so I can replicate this in my Python script?
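To make the question concrete, here is a minimal sketch of the kind of selection rule I'd expect to replicate: take the class with the highest score, optionally rejecting it if it falls below a cutoff. This is an assumption on my part, not confirmed Docling behaviour. `ClassScore`, `pick_preferred_class` and `min_confidence` are hypothetical names used only for illustration, and the scores are made-up sample values.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ClassScore:
    """Hypothetical stand-in for one (class name, confidence) entry."""
    class_name: str
    confidence: float


def pick_preferred_class(
    scores: Sequence[ClassScore],
    min_confidence: float = 0.0,  # assumed cutoff; 0.0 means "no threshold"
) -> Optional[ClassScore]:
    """Return the highest-scoring class, or None if nothing qualifies."""
    if not scores:
        return None
    # Plain argmax over the confidences; ties resolve to the first entry.
    best = max(scores, key=lambda s: s.confidence)
    return best if best.confidence >= min_confidence else None


# Made-up sample scores, not real Docling output.
scores = [
    ClassScore("bar_chart", 0.71),
    ClassScore("line_chart", 0.22),
    ClassScore("logo", 0.07),
]

preferred = pick_preferred_class(scores)
print(preferred.class_name if preferred else "no class above threshold")
```

If Docling simply reports the top-scoring class with no threshold, calling this with the default `min_confidence=0.0` would match it; if a cutoff is applied somewhere, that is the value I'm trying to find.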