Commit 0f2c249

Merge pull request #4025 from MicrosoftDocs/main
Merged by Learn.Build PR Management system
2 parents a9279bd + 3e29a68 commit 0f2c249

9 files changed: +109 -41 lines changed

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
1+
---
2+
title: Understand and improve confidence scores in Azure AI Content Understanding
3+
titleSuffix: Azure AI services
4+
description: Tips for interpreting and improving Azure AI Content Understanding accuracy and confidence scores.
5+
author: laujan
6+
ms.author: admaheshwari
7+
manager: nitinme
8+
ms.service: azure-ai-content-understanding
9+
ms.topic: reference
10+
ms.date: 04/09/2025
11+
---
12+
13+
# Interpret and improve confidence and accuracy scores
14+
15+
> [!NOTE]
16+
>
17+
> * Azure AI Content Understanding is currently available in preview.
18+
>
19+
> * While the service is in active development, confidence scores are only available for the document modality.
20+
21+
Confidence scores quantify how likely it is that an extracted result was detected correctly, expressed as a degree of statistical certainty. Estimated accuracy is derived by evaluating various combinations of the training data to predict the labeled values. This article explains how to interpret accuracy and confidence scores and shares best practices for using those scores to improve your results.
22+
23+
Confidence scores matter because they indicate the model's level of certainty in its predictions. They let you assess the reliability of extracted data and decide whether human review is necessary. They also help streamline workflows by reducing the need for manual validation.
24+
25+
## Supported fields
26+
27+
Confidence scores are supported for various extractive fields, including text, tables, and images. The specific fields supported can vary depending on the model and the use case.
28+
29+
## Confidence scores
30+
31+
Confidence scores are listed for every field as part of the field extraction output:
32+
33+
:::image type="content" source="../media/confidence-accuracy/field-extraction-score.png" alt-text="Screenshot of field extraction scores from Azure AI Foundry.":::
34+
35+
Confidence scores are also part of the extraction output JSON file:
36+
37+
:::image type="content" source="../media/confidence-accuracy/json-output.png" alt-text="Screenshot of field extraction JSON output.":::
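
As a rough illustration of reading those scores programmatically, the following Python sketch collects a confidence value per field from an output document. The JSON shape used here (`result` > `contents` > `fields`, with a `confidence` property on each field) and the field names `VendorName` and `InvoiceTotal` are assumptions made for the example, not a specification of the service's output; check them against your own analyzer results.

```python
import json

# Hypothetical sample output. The nesting and property names are assumptions
# for illustration only; compare them with your actual extraction output.
SAMPLE_OUTPUT = """
{
  "result": {
    "contents": [
      {
        "fields": {
          "VendorName": {"type": "string", "valueString": "Contoso", "confidence": 0.93},
          "InvoiceTotal": {"type": "number", "valueNumber": 110.0, "confidence": 0.58}
        }
      }
    ]
  }
}
"""


def field_confidences(output_json: str) -> dict[str, float]:
    """Return a mapping of field name to confidence score.

    Fields without a "confidence" property are skipped.
    """
    data = json.loads(output_json)
    scores: dict[str, float] = {}
    for content in data.get("result", {}).get("contents", []):
        for name, field in content.get("fields", {}).items():
            if "confidence" in field:
                scores[name] = field["confidence"]
    return scores


if __name__ == "__main__":
    for name, score in field_confidences(SAMPLE_OUTPUT).items():
        print(f"{name}: {score:.2f}")
```

Skipping fields that carry no score, rather than defaulting them to zero, keeps "no score reported" distinct from "low score".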
38+
39+
## Improving accuracy results
40+
41+
Common challenges that affect confidence scores include input document quality, diversity of document types, document complexity, and the model's limited ability to recognize certain types of content or features. These limitations underscore the need to continually improve and adapt the modeling process to enhance reliability and accuracy. Here are some tips:
42+
43+
* **Establish appropriate thresholds**. Thresholds are predefined values that determine whether a prediction is considered reliable or requires further review. With the right thresholds, only high-confidence predictions are automated, while low-confidence predictions are flagged for human review. This approach helps increase the overall accuracy and reliability of results.
44+
45+
* **Incorporate human review into workflows**. Human in the loop (`HITL`) is a process in which people validate and correct the model's predictions. `HITL` helps identify and correct errors, improves the model's performance, and raises the overall quality of predictions, while involving human experts only when confidence scores fall below a specified threshold.
46+
47+
* **Include diverse input values for the schema you aim to extract**. To enrich the dataset and account for different variations and templates the model might encounter, use forms with unique values in each field and add labeled samples.
48+
49+
* **Improve the quality of your input documents**. Clear, well-structured forms with consistent formatting typically result in higher confidence scores.
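
The threshold and human-review tips can be combined in a small routing step. The following Python sketch is one possible approach, not a prescribed implementation: the function name `route_fields`, the `0.80` default threshold, and the sample field names are all hypothetical. Tune the threshold against your own labeled data.

```python
from typing import Optional

# Hypothetical default; pick a value based on how costly an error is
# in your workflow and on results measured against your own documents.
REVIEW_THRESHOLD = 0.80


def route_fields(
    scores: dict[str, Optional[float]],
    threshold: float = REVIEW_THRESHOLD,
) -> tuple[dict[str, float], dict[str, Optional[float]]]:
    """Split fields into auto-accepted and needs-human-review groups.

    A field is accepted only when it has a score at or above the threshold.
    Fields with a missing score (None) are sent to review.
    """
    accepted: dict[str, float] = {}
    needs_review: dict[str, Optional[float]] = {}
    for name, score in scores.items():
        if score is not None and score >= threshold:
            accepted[name] = score
        else:
            needs_review[name] = score
    return accepted, needs_review


if __name__ == "__main__":
    sample = {"VendorName": 0.93, "InvoiceTotal": 0.58, "DueDate": None}
    accepted, needs_review = route_fields(sample)
    print("Automated:", sorted(accepted))
    print("Human review:", sorted(needs_review))
```

Sending fields with no score to review is a deliberately conservative choice: it avoids automating values whose reliability is unknown.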
50+
51+
## Related content
52+
53+
* [Best practices for Content Understanding](best-practices.md)
54+
55+
* [Document Intelligence accuracy and confidence scores](../../document-intelligence/concept/accuracy-confidence.md)
56+
57+

articles/ai-services/content-understanding/toc.yml

Lines changed: 10 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@ items:
4141
href: concepts/capabilities.md
4242
- name: Analyzer templates
4343
displayName: analyzer, templates, document, text, images, video, audio, multimodal, visual, structured, content, field, extraction
44-
href: concepts/analyzer-templates.md
44+
href: concepts/analyzer-templates.md
4545
- name: Document
4646
displayName: document, text, images, video, audio, visual, structured, content, field, extraction
4747
href: document/overview.md
@@ -54,9 +54,14 @@ items:
5454
- name: Video
5555
href: video/overview.md
5656
displayName: video, audio, voice, recognition, synthesis, speaker, identification, verification, diarization, transcription, translation, language, understanding, sentiment, analysis, emotion, detection, pronunciation, model
57-
- name: Best practices
58-
displayName: best practices, analyzers, optimization, fields
59-
href: concepts/best-practices.md
57+
- name: Concepts
58+
items:
59+
- name: Best practices
60+
displayName: best practices, analyzers, optimization, fields
61+
href: concepts/best-practices.md
62+
- name: Accuracy and confidence
63+
displayName: accuracy, confidence, analyzers, optimization, fields, scores
64+
href: concepts/accuracy-confidence.md
6065
- name: Responsible AI
6166
items:
6267
- name: Transparency note
@@ -68,4 +73,4 @@ items:
6873
- name: REST API
6974
displayName: quota, tiers, throttle, max, adjustments, requests, support, ocr
7075
href: /rest/api/contentunderstanding/operation-groups?view=rest-contentunderstanding-2024-12-01-preview&preserve-view=true
71-
76+
