|**US Tax W-2**|You want to extract key information such as salary, wages, and taxes withheld.|[**US tax W-2 model**](concept-tax-document.md)|
|**US Tax 1098**|You want to extract mortgage interest details such as principal, points, and tax.|[**US tax 1098 model**](concept-tax-document.md)|
|**US Tax 1098-E**|You want to extract student loan interest details such as lender and interest amount.|[**US tax 1098-E model**](concept-tax-document.md)|
|**US Tax 1098-T**|You want to extract qualified tuition details such as scholarship adjustments, student status, and lender information.|[**US tax 1098-T model**](concept-tax-document.md)|
|**US Tax 1099 (variations)**|You want to extract information from `1099` forms and their variations (A, B, C, CAP, DIV, G, H, INT, K, LS, LTC, MISC, NEC, OID, PATR, Q, QA, R, S, SA, SB).|[**US tax 1099 model**](concept-tax-document.md)|
|**US Tax 1040 (variations)**|You want to extract information from `1040` forms and their variations (Schedule 1, Schedule 2, Schedule 3, Schedule 8812, Schedule A, Schedule B, Schedule C, Schedule D, Schedule E, Schedule `EIC`, Schedule F, Schedule H, Schedule J, Schedule R, Schedule `SE`, Schedule Senior).|[**US tax 1040 model**](concept-tax-document.md)|
|**Contract** (legal agreement between parties).|You want to extract contract agreement details such as parties, dates, and intervals.|[**Contract model**](concept-contract.md)|
|**Health insurance card** or health insurance ID.| You want to extract key information such as insurer, member ID, prescription coverage, and group number.|[**Health insurance card model**](./concept-health-insurance-card.md)|
|**Credit/Debit card**.|You want to extract key information from bank cards such as card number and bank name.|[**Credit/Debit card model**](concept-credit-card.md)|
| Training set | Example documents | Your best solution |
| --- | --- | --- |
|**Structured, consistent documents with a static layout**.|Structured forms such as questionnaires or applications.|[**Custom template model**](./concept-custom-template.md)|
|**Unstructured documents, documents with varying templates**.|● Unstructured documents like contracts or letters</br> ● Varying document templates like loan statements from different mortgage companies|[**Custom generative model**](concept-custom-generative.md)|
|**A collection of several models each trained on similar-type documents.**|● Supply purchase orders</br>● Equipment purchase orders</br>● Furniture purchase orders</br> **All composed into a single model**.|[**Composed custom model**](concept-composed-models.md)|

`articles/ai-services/document-intelligence/concept-accuracy-confidence.md`

---
description: Best practices to interpret the accuracy score from the train model
author: laujan
manager: nitinme
ms.service: azure-ai-document-intelligence
ms.topic: conceptual
ms.date: 08/07/2024
ms.author: lajanuar
---

# Interpret and improve model accuracy and analysis confidence scores
[!INCLUDE [applies to v4.0, v3.1, v3.0, and v2.1](includes/applies-to-v40-v31-v30-v21.md)]
A confidence score indicates probability by measuring the degree of statistical certainty that the extracted result is detected correctly. The estimated accuracy is calculated by running a few different combinations of the training data to predict the labeled values. In this article, learn how to interpret accuracy and confidence scores and the best practices for using those scores to improve them.
## Confidence scores

> [!NOTE]
> * Field-level confidence is updated to take the word confidence score into account starting with the **2024-07-31-preview** API version for **custom models**.
> * Confidence scores for tables, table rows, and table cells are available starting with the **2024-07-31-preview** API version for **custom models**.

Document Intelligence analysis results return an estimated confidence for predicted words, key-value pairs, selection marks, regions, and signatures. Currently, not all document fields return a confidence score.
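As a sketch of how these confidence values can drive automation, the snippet below walks an `analyzeResult`-style payload and flags low-confidence fields for human review. The sample payload shape and the 0.8 threshold are illustrative assumptions for this example, not real service output.

```python
# Illustrative sketch: flag low-confidence fields in an analyzeResult-style
# payload. The sample payload and the 0.8 threshold are assumptions for this
# example, not real service output.
sample_result = {
    "documents": [
        {
            "docType": "custom:invoice",
            "confidence": 0.93,
            "fields": {
                "VendorName": {"content": "Contoso", "confidence": 0.98},
                "InvoiceTotal": {"content": "110.00", "confidence": 0.62},
            },
        }
    ]
}

def low_confidence_fields(result, threshold=0.8):
    """Return the names of extracted fields whose confidence falls below threshold."""
    flagged = []
    for document in result.get("documents", []):
        for name, field in document.get("fields", {}).items():
            if field.get("confidence", 0.0) < threshold:
                flagged.append(name)
    return flagged

print(low_confidence_fields(sample_result))  # fields to route to human review
```

Any field returned by this helper is a candidate for a human review step rather than straight-through processing.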

After an analysis operation, review the JSON output. Examine the `confidence` values.

* Use forms that have different values in each field.
* For custom models, use a larger set of training documents. A larger training set teaches your model to recognize fields with greater accuracy.

## Accuracy scores for custom models

> [!NOTE]
> * **Custom neural and generative models** do not provide accuracy scores during training.

The output of a `build` (v3.0 and onward) or `train` (v2.1) custom model operation includes the estimated accuracy score. This score represents the model's ability to accurately predict the labeled value on a visually similar document. Accuracy is measured within a percentage range from 0% (low) to 100% (high). It's best to target a score of 80% or higher. For more sensitive cases, like financial or medical records, we recommend a score close to 100%. You can also add a human review stage to validate results for more critical automation workflows.

**Document Intelligence Studio** </br>
**Trained custom model (invoice)**

:::image type="content" source="media/accuracy-confidence/accuracy-studio-results.png" alt-text="Trained custom model accuracy scores":::
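The 80% and close-to-100% targets above can be encoded as a simple pre-deployment gate. The function below is a sketch of that check; the 0.99 value is an assumed stand-in for "close to 100%" on sensitive workloads, not a service-defined threshold.

```python
# Sketch: gate a trained custom model on its estimated accuracy score before
# deployment. The 0.80 target follows the guidance above; 0.99 is an assumed
# stand-in for "close to 100%" on sensitive workloads such as financial or
# medical records.
def meets_accuracy_target(estimated_accuracy, sensitive_workload=False):
    target = 0.99 if sensitive_workload else 0.80
    return estimated_accuracy >= target

print(meets_accuracy_target(0.85))                           # general workload
print(meets_accuracy_target(0.85, sensitive_workload=True))  # medical/financial
```

A model that fails the gate is a candidate for more labeled training data or a mandatory human review stage rather than deployment.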
## Interpret accuracy and confidence scores for custom models

Custom template models generate an estimated accuracy score when trained. Documents analyzed with a custom model produce a confidence score for extracted fields. When interpreting the confidence score from a custom model, you should consider all the confidence scores returned from the model. Let's start with a list of all the confidence scores.

1. **Document type confidence score**: The document type confidence is an indicator of how closely the analyzed document resembles documents in the training dataset. When the document type confidence is low, it indicates template or structural variations in the analyzed document. To improve the document type confidence, label a document with that specific variation and add it to your training dataset. Once the model is retrained, it should be better equipped to handle that class of variations.
2. **Field-level confidence**: Each labeled field extracted has an associated confidence score. This score reflects the model's confidence in the position of the extracted value. While evaluating confidence scores, you should also look at the underlying extraction confidence to generate a comprehensive confidence for the extracted result. Evaluate the `OCR` results for text extraction or selection marks, depending on the field type, to generate a composite confidence score for the field.
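One way to combine these levels, sketched below, is to scale the field-level confidence by the weakest word-level (`OCR`) confidence in the extracted value. This heuristic illustrates the composite idea described above; it isn't the service's own formula.

```python
# Illustrative heuristic, not the service's formula: combine field-level
# confidence with the OCR confidence of the words that make up the value.
def composite_confidence(field_confidence, word_confidences):
    """Scale field confidence by the weakest word's confidence; a single
    misread word can invalidate the whole extracted value."""
    if not word_confidences:
        return field_confidence
    return field_confidence * min(word_confidences)

# A confidently located field whose weakest word was read poorly still
# yields a low composite score:
print(composite_confidence(0.95, [0.99, 0.97, 0.71]))
```

Using the minimum word confidence is a conservative choice; an average would be more forgiving but can mask a badly misread token.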

The following table demonstrates how to interpret both the accuracy and confidence scores.

| Accuracy | Confidence | Interpretation |
| --- | --- | --- |
| Low | High | • This result is most unlikely.<br>• For low accuracy scores, add more labeled data or split visually distinct documents into multiple models. |
| Low | Low | • Add more labeled data.<br>• Split visually distinct documents into multiple models. |

## Ensure high model accuracy for custom models

Variances in the visual structure of your documents affect the accuracy of your model. Reported accuracy scores can be inconsistent when the analyzed documents differ from documents used in training. Keep in mind that a document set can look similar when viewed by humans but appear dissimilar to an AI model. The following is a list of best practices for training models with the highest accuracy. Following these guidelines should produce a model with higher accuracy and confidence scores during analysis and reduce the number of documents flagged for human review.

* Ensure that all variations of a document are included in the training dataset. Variations include different formats, for example, digital versus scanned PDFs.

* If you expect the model to analyze both digital and scanned PDFs, add at least five samples of each type to the training dataset.

* Separate visually distinct document types to train different models for custom template and neural models.

* As a general rule, if you remove all user-entered values and the documents look similar, you need to add more training data to the existing model.

* If the documents are dissimilar, split your training data into different folders and train a model for each variation. You can then [compose](how-to-guides/compose-custom-models.md?view=doc-intel-2.1.0&preserve-view=true#create-a-composed-model) the different variations into a single model.

* Ensure that you don't have any extraneous labels.

* Ensure that signature and region labeling doesn't include the surrounding text.

## Table, row, and cell confidence

With the addition of table, row, and cell confidence in the ```2024-02-29-preview``` API and onward, here are some common questions that should help with interpreting the table, row, and cell scores:

**Q:** Is it possible to see a high confidence score for cells, but a low confidence score for the row?<br>
For **fixed tables**, cell-level confidence already captures quite a bit of information about correctness. This means that simply going over each cell and looking at its confidence can be enough to help determine the quality of the prediction.
For **dynamic tables**, the levels are meant to build on top of each other, so the top-to-bottom approach is more important.
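A top-to-bottom check over a dynamic table might look like the sketch below. The nested-dict shape and the 0.8 threshold are illustrative assumptions for this example, not the exact API response shape.

```python
# Illustrative top-to-bottom check for a dynamic table: accept the table only
# if the table, every row, and every cell clear the threshold. The dict shape
# and the 0.8 threshold are assumptions, not the exact API response shape.
def table_is_reliable(table, threshold=0.8):
    if table.get("confidence", 0.0) < threshold:
        return False
    for row in table.get("rows", []):
        if row.get("confidence", 0.0) < threshold:
            return False
        if any(cell.get("confidence", 0.0) < threshold
               for cell in row.get("cells", [])):
            return False
    return True

sample_table = {
    "confidence": 0.95,
    "rows": [
        {"confidence": 0.91, "cells": [{"confidence": 0.97}, {"confidence": 0.88}]},
        {"confidence": 0.55, "cells": [{"confidence": 0.97}, {"confidence": 0.96}]},
    ],
}
print(table_is_reliable(sample_table))  # the low-confidence second row fails the check
```

Checking levels in this order means a structural problem (table or row) is caught before individual cell values are trusted.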
## Next step
> [!div class="nextstepaction"]
> [Learn more about custom models](concept-custom.md)