Commit 26e65c0

committed
update
1 parent bf7e1e6 commit 26e65c0

File tree

1 file changed: +28 −26 lines

@@ -1,18 +1,20 @@
 ---
-title: Understanding Confidence Scores in Azure AI Content Understanding
+title: Understanding Confidence Scores in Azure AI Content Understanding.
 titleSuffix: Azure AI services
-description: Learn about confidence score use-cases, and tips to improve.
-author: admaheshwari
-ms.author: lajanuar
+description: Best practices to interpret and improve Azure AI Content Understanding accuracy and confidence scores.
+author: laujan
+ms.author: admaheshwari
 manager: nitinme
 ms.service: azure-ai-content-understanding
 ms.topic: overview
 ms.date: 02/20/2025
-ms.custom: ignite-2024-understanding-release
-
-#customer intent: As a user, I want to learn more about Content Understanding confidence scores.
 ---
-# Confidence Scores in Azure AI Content Understanding
+
+# Interpret and improve accuracy and confidence scores
+
+A confidence score measures the degree of statistical certainty that an extracted result was detected correctly. Estimated accuracy is calculated by running several different combinations of the training data to predict the labeled values. This article explains how to interpret accuracy and confidence scores and shares best practices for using those scores to improve results.
+
 Understanding Confidence Scores
 What are confidence scores?
 Confidence scores represent the probability that the extracted result is correct. For example, a confidence score of 0.95 (95%) suggests that the prediction is likely correct 19 out of 20 times. These scores are derived from various factors, including the quality of the input document, the similarity between the training data and the document being analyzed, and the model's ability to recognize patterns and features in the document.
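As a rough illustration of the "0.95 means about 19 out of 20" reading above, here is a minimal sketch; the function name and sample numbers are ours for illustration, not part of the Content Understanding API:

```python
def expected_correct(confidence: float, n_predictions: int) -> float:
    """Expected number of correct results among n predictions that
    each carry the given confidence score."""
    return confidence * n_predictions

# A 0.95 confidence score over 20 predictions ~ 19 expected correct.
print(expected_correct(0.95, 20))  # → 19.0
```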
@@ -23,18 +25,18 @@ Confidence scores are supported for extractive fields, including text, tables fo
 
 JSON output for documents
 "fields": {
-"ClientProjectManager": {
-  "type": "string",
-  "valueString": "Nestor Wilke",
-  "spans": [
-    {
-      "offset": 4345,
-      "length": 12
-    }
-  ],
-  "confidence": 0.964,
-  "source": "D(2,3.5486,8.3139,4.2943,8.3139,4.2943,8.4479,3.5486,8.4479)"
-},
+  "ClientProjectManager": {
+    "type": "string",
+    "valueString": "Nestor Wilke",
+    "spans": [
+      {
+        "offset": 4345,
+        "length": 12
+      }
+    ],
+    "confidence": 0.964,
+    "source": "D(2,3.5486,8.3139,4.2943,8.3139,4.2943,8.4479,3.5486,8.4479)"
+  },
 What are thresholds for confidence scores?
 Thresholds for confidence scores are predefined values that determine whether a prediction is considered reliable or requires further review. These thresholds can be set across different modalities to ensure consistent and accurate results. Setting appropriate thresholds is important because it balances the trade-off between automation and accuracy. With the right thresholds, only high-confidence predictions are automated, while low-confidence predictions are flagged for human review. This improves the overall accuracy and reliability of the results.
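A minimal sketch of such a threshold check; the 0.80 cutoff and the second field name are assumptions for illustration, not values prescribed by the service:

```python
REVIEW_THRESHOLD = 0.80  # assumed cutoff; tune per modality and use case

def route(field_name: str, confidence: float) -> str:
    """Automate high-confidence predictions; flag the rest for review."""
    if confidence >= REVIEW_THRESHOLD:
        return "auto-accept"
    return "human-review"

print(route("ClientProjectManager", 0.964))  # auto-accept
print(route("ProjectBudget", 0.12))          # human-review
```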

@@ -47,13 +49,13 @@ Human in the Loop (HITL) is a process that involves human intervention in the mo
 It can improve the accuracy and reliability of predictions, reduce errors, and enhance the overall quality of the results.
 
 How can customers access confidence scores in Content Understanding?
-For every field extraction, confidence score is listed as part of the field extraction output. You can also check confidence score as part of your JSON output under confidence
+For every field extraction, the confidence score is listed as part of the field extraction output. You can also check the confidence score in your JSON output under "confidence".
 
 Tips to improve confidence scores
-1. Correcting an expected output so that the model can understand the definition better. Example: Here we can see the confidence score is 12%, to improve confidence score, we can go to label data, select auto label which will give us predicted field labels. Now we can correct our definition and it will show corrected field label. Test the analyzer again for better confidence score. Here it jumped to 98%. Confidence improvement will vary as per the complexity and nature of document.
+1. Correct an expected output so that the model can better understand the field definition. For example, if a field's confidence score is 12%, go to label data and select auto label to get the predicted field labels. Correct the definition, and the corrected field label is shown. Test the analyzer again for a better confidence score; in this example it jumped to 98%. The improvement varies with the complexity and nature of the document.
 
-2. Adding more samples and label them for different variation and templates the model may expect.
-3. Add documents that contains various input values for the schema you want to extract.
-4. Improve the quality of your input documents.
-5. Incorporate human in the loop for lower confidence results.
+2. Add more samples and label them for the different variations and templates the model may encounter.
+3. Add documents that contain various input values for the schema you want to extract.
+4. Improve the quality of your input documents.
+5. Incorporate human in the loop for lower-confidence results.
 Note: Confidence scores are only available for the document modality in the preview. Support for other modalities will be added soon.
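Tip 5 above can be combined with the JSON output shape shown earlier: parse the analyzer result and collect every field whose confidence falls below a threshold. A sketch, assuming a hypothetical second field ("ProjectBudget") and an assumed 0.80 cutoff:

```python
import json

# Analyzer-style output shaped like the article's JSON example
# (the "ProjectBudget" field and its values are hypothetical).
raw = """
{
  "fields": {
    "ClientProjectManager": {
      "type": "string",
      "valueString": "Nestor Wilke",
      "confidence": 0.964
    },
    "ProjectBudget": {
      "type": "string",
      "valueString": "$25,000",
      "confidence": 0.41
    }
  }
}
"""

THRESHOLD = 0.80  # assumed review threshold

fields = json.loads(raw)["fields"]
# Fields missing a score default to 0.0 so they are always reviewed.
needs_review = [name for name, f in fields.items()
                if f.get("confidence", 0.0) < THRESHOLD]
print(needs_review)  # → ['ProjectBudget']
```

Fields collected in `needs_review` would then be queued for a human-in-the-loop check rather than automated.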
