Commit 26ee686

Fix typos and improve clarity in best practices doc
1 parent 9692458 commit 26ee686

1 file changed, +11 -11 lines changed

articles/ai-services/content-understanding/best-practices.md

Lines changed: 11 additions & 11 deletions
@@ -1,5 +1,5 @@
 ---
-title: Best practicies for using Content Understanding
+title: Best practices for using Content Understanding
 titleSuffix: Azure AI services
 description: Learn about how to best use Azure AI Content Understanding for content and field extractions on documents, images, video and audio files.
 author: jfilcik
@@ -21,7 +21,7 @@ This document provides guidance on how to effectively use Content Understanding
 ---
 
 ## Use field descriptions to guide output
-When defining a schema, it is essential to provide detailed field descriptions. Clear and concise descriptions guide the model to focus on the correct information, improving the accuracy of the output.
+When defining a schema, it's essential to provide detailed field descriptions. Clear and concise descriptions guide the model to focus on the correct information, improving the accuracy of the output.
 
 ### Example 1:
 If you want to extract the date from an invoice, in addition to naming the field `"Date"`, provide a description such as:
@@ -45,26 +45,26 @@ This extra context guides the model to the right location in the document.
 ---
 
 ## Use classification fields for specific outputs
-When you need the system to choose from a set of predefined options (e.g., document type, product category, or status), use classification fields. When there's ambiguity with the options, provide clear descriptions for each option, enabling the model to categorize the data accurately.
+When you need the system to choose from a set of predefined options (for example, document type, product category, or status), use classification fields. When there's ambiguity with the options, provide clear descriptions for each option, enabling the model to categorize the data accurately.
 
 ### Example 1:
 If you need to classify documents as either `"Invoice"`, `"Claim"`, or `"Report"`, create a classification field with these words as category names.
 
 ### Example 2:
 When processing product images, you might need to assign them to categories like `"AlcoholicDrinks"`, `"SoftDrinks"`, `"Snacks"`, and `"DairyProducts"`. Since some items may appear similar, providing precise definitions for close-call cases can help. For example:
 
-- **`"Alcoholic Drinks"`**: Beverages containing alcohol, such as beer, wine, and spirits. This excludes soft drinks or non-alcoholic beverages.
-- **`"Soft Drinks"`**: Carbonated non-alcoholic beverages, such as soda and sparkling water. This does not include juices or alcoholic drinks.
+- **`"Alcoholic Drinks"`**: Beverages containing alcohol, such as beer, wine, and spirits. This category excludes soft drinks or other nonalcoholic beverages..
+- **`"Soft Drinks"`**: Carbonated nonalcoholic beverages, such as soda and sparkling water. This category doesn't include juices or alcoholic drinks..
 
 By clearly defining each category, you ensure that the system correctly classifies products while minimizing misclassification.
 
 ---
 
 ## Use confidence scores to determine when human review is needed
-Confidence scores help you decide when to involve human reviewers. Customers can interpret confidence scores using thresholds to decide which results need more review, minimizing the risk of errors.
+Confidence scores help you decide when to involve human reviewers. Customers can interpret confidence scores using thresholds to decide which results need more reviews, minimizing the risk of errors.
 
 ### Example:
-For an Invoice review use case, if a key extracted field like `"TotalInvoiceAmount"` has a confidence score under **0.80**, route that document to manual review. This ensures that critical fields like invoice totals or legal statements are verified by a human when necessary.
+For an Invoice review use case, if a key extracted field like `"TotalInvoiceAmount"` has a confidence score under **0.80**, route that document to manual review. This helps ensure that a human verifies critical fields like invoice totals or legal statements when necessary.
 
 You might set different confidence thresholds based on the type of field. For instance, a lower threshold for a `"Comments"` field that’s less critical and a higher one for `"ContractTerminationDate"` to ensure no mistakes.
 
@@ -73,12 +73,12 @@ You might set different confidence thresholds based on the type of field. For in
 ## Reduce errors by narrowing language selection for audio and video
 When working with audio and video content, selecting a narrow set of languages for transcription can potentially reduce errors. The more languages you include, the more the system has to guess which language is being spoken, which may increase misrecognition.
 
-### Example 1:
-If you are certain that the content only contains English and Spanish, configuring your transcription to these two languages only may improve quality. But if the content accidentally includes another languages, such configuration may actually degrade overall quality.
+### Example:
+If you're certain that the content only contains English and Spanish, configuring your transcription to these two languages only may improve quality. But if the content accidentally includes other languages, such configuration may actually degrade overall quality.
 
 ---
 
-## Transcript, OCR text, and speaker data do not require fields
-By default, Content Extraction information such as transcripts, OCR results, and video key frames can be accessed directly from the analyzer output for immediate review or custom processing. There is no need to define a field in the schema for these items. Fields can be used when additional processing is needed (e.g., summarizing transcripts, identifying entities, or extracting specific items from OCR). Each field can instruct the system to extract or generate the content you need.
+## Transcript, document text, and speaker data don't require fields
+By default, Content Extraction information such as speech transcripts, document text extracted by OCR, and video key frames can be accessed directly from the analyzer output for immediate review or custom processing. There's no need to define a field in the schema for these items. Fields can be used when additional processing is needed (e.g., summarizing transcripts, identifying entities, or extracting specific items from OCR). Each field can instruct the system to extract or generate the content you need.
 
 ---
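
The confidence-score guidance in the changed doc (a 0.80 cutoff for `"TotalInvoiceAmount"`, with a lower cutoff for `"Comments"` and a higher one for `"ContractTerminationDate"`) could be sketched roughly as follows. This is only an illustration, not the Content Understanding API: the shape of the `fields` dictionary, the `needs_review` helper, and every threshold other than 0.80 are assumptions made for the example.

```python
# Hypothetical per-field thresholds. Only the 0.80 value for
# "TotalInvoiceAmount" appears in the doc; the rest are assumed.
DEFAULT_THRESHOLD = 0.80
FIELD_THRESHOLDS = {
    "TotalInvoiceAmount": 0.80,
    "Comments": 0.50,                 # assumed: less critical field
    "ContractTerminationDate": 0.95,  # assumed: more critical field
}


def needs_review(fields):
    """Return the sorted names of fields that should go to manual review.

    `fields` maps a field name to a dict with a "confidence" entry
    (an assumed shape, not the service's actual response format).
    A missing confidence is treated as needing review.
    """
    flagged = []
    for name, result in fields.items():
        threshold = FIELD_THRESHOLDS.get(name, DEFAULT_THRESHOLD)
        confidence = result.get("confidence")
        if confidence is None or confidence < threshold:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    extracted = {
        "TotalInvoiceAmount": {"value": "120.00", "confidence": 0.72},
        "Comments": {"value": "Net 30", "confidence": 0.60},
    }
    # Route the whole document to a reviewer if any field is flagged.
    print(needs_review(extracted))
```

Keeping the thresholds in one table makes it easy to tighten a single critical field without touching the routing logic.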
