Commit 2636f87

Create Confidence Score
1 parent 9edad89 commit 2636f87

1 file changed

+37
-0
lines changed

@@ -0,0 +1,37 @@
1+
# Confidence Scores in Azure AI Content Understanding
2+
Understanding Confidence Scores
3+
What are confidence scores?
4+
Confidence scores estimate the probability that an extracted result is correct. For example, a confidence score of 0.95 (95%) suggests that predictions at that level are expected to be correct about 19 times out of 20. The scores are derived from several factors, including the quality of the input document, the similarity between the training data and the document being analyzed, and the model's ability to recognize patterns and features in the document.
5+
Why are confidence scores important?
6+
Confidence scores are important because they provide a measure of the model's certainty in its predictions. They help users make informed decisions about the reliability of the extracted data and determine whether human review is necessary. Confidence scores also play a crucial role in automating workflows and improving efficiency by reducing the need for manual validation.
7+
Supported fields
8+
Confidence scores are supported for various extractive fields, including text, tables, and images. The specific fields supported may vary depending on the model and the use case.
9+
10+
What are thresholds for confidence scores?
11+
Thresholds for confidence scores are predefined values that determine whether a prediction is considered reliable or requires further review. They can be set across different modalities to keep results consistent. Choosing appropriate thresholds balances the trade-off between automation and accuracy: predictions at or above the threshold are processed automatically, while predictions below it are flagged for human review. This improves the overall accuracy and reliability of the results.
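
The routing described above can be sketched in a few lines of Python. This is a minimal illustration and not part of Content Understanding: the `ExtractedField` type, the `route_fields` helper, the sample field names, and the 0.80 threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical container for one extracted field (not a CU type)."""
    name: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def route_fields(fields, threshold=0.80):
    """Split fields into auto-accepted and needs-review lists.

    The default threshold of 0.80 is an arbitrary illustrative value.
    """
    accepted, review = [], []
    for field in fields:
        if field.confidence >= threshold:
            accepted.append(field)
        else:
            review.append(field)
    return accepted, review


fields = [
    ExtractedField("invoice_number", "INV-1001", 0.98),
    ExtractedField("total", "412.50", 0.62),
]
accepted, review = route_fields(fields)
print("auto-accepted:", [f.name for f in accepted])
print("needs review:", [f.name for f in review])
```

Raising the threshold sends more fields to review and lowers the risk of accepting a wrong value; lowering it does the opposite. A suitable value depends on the cost of an error in the specific workflow.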
12+
Improving Confidence Scores
13+
What are some common challenges with confidence scores?
14+
Common challenges with confidence scores include low-quality input documents, variability in document types, complexity of the documents, and limitations of the model in recognizing certain types of content or features.
15+
Human in the Loop (HITL)
16+
What is Human in the Loop (HITL)?
17+
Human in the Loop (HITL) is a process in which people validate and correct the model's predictions. It improves accuracy and reliability by bringing in human expertise and judgment where the model is least certain: human experts intervene only when a confidence score falls below a chosen threshold, which lets them catch and correct errors and improve the model's performance without reviewing every result.
18+
It can improve the accuracy and reliability of the predictions, reduce errors, and enhance the overall quality of the results.
19+
20+
How can customers access confidence scores in Content Understanding (CU)?
21+
For every field extraction, the confidence score is listed as part of the field extraction output. You can also find the confidence score in your JSON output under "confidence".
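
As a rough sketch, the scores could be collected from the JSON output as follows. The only detail taken from this document is the "confidence" key; the surrounding layout (a top-level `fields` object keyed by field name), the sample field names, and the `confidence_by_field` helper are assumptions for illustration, and the actual response structure may differ.

```python
import json

# Hypothetical sample output. Only the "confidence" key comes from the
# documentation; the rest of the layout is assumed for this example.
sample_output = """
{
  "fields": {
    "VendorName": {"value": "Contoso", "confidence": 0.97},
    "InvoiceTotal": {"value": "412.50", "confidence": 0.12}
  }
}
"""


def confidence_by_field(raw_json):
    """Return a mapping of field name to confidence score.

    Fields without a "confidence" entry map to None.
    """
    data = json.loads(raw_json)
    return {
        name: field.get("confidence")
        for name, field in data.get("fields", {}).items()
    }


scores = confidence_by_field(sample_output)
for name, score in scores.items():
    print(name, score)
```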
22+
23+
24+
25+
Tips to improve confidence scores
26+
1. Correct the expected output so that the model understands the field definition better. Example: a field has a confidence score of 12%. To improve it, go to Label data and select Auto label, which produces predicted field labels. Correct the definition, and the corrected field label is shown. Then test the analyzer again; in this example the score jumped to 98%. The improvement varies with the complexity and nature of the document.
27+
28+
29+
30+
31+
32+
33+
2. Add more samples and label them to cover the different variations and templates the model may encounter.
34+
3. Add documents that contain a variety of input values for the schema you want to extract.
35+
4. Improve the quality of your input documents.
36+
5. Incorporate human-in-the-loop review for low-confidence results.
37+
Note: Confidence scores are only available for the document modality in the preview. Support for other modalities will be added soon.
