articles/ai-services/content-understanding/concepts/best-practices.md
Lines changed: 7 additions & 7 deletions
@@ -22,15 +22,15 @@ This document provides guidance and best practices to effectively utilize Conten
When defining a schema, it's essential to provide detailed field descriptions. Clear and concise descriptions guide the model to focus on the correct information, improving the accuracy of the output.
-#####  ***Example***
+#####  ***Example 1***
* If you want to extract the date from an invoice, in addition to naming the field `Date`, provide a description such as:
> `The date when the invoice was issued, typically found at the top right corner of the document.`
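As a minimal sketch of the guidance above, the field and its description could be kept together in a plain Python dictionary and checked before the schema is submitted. The key names (`type`, `description`) and the helper `fields_missing_description` are assumptions for illustration, not the confirmed Content Understanding schema format; verify them against the service's schema reference.

```python
import json

# Hypothetical field definition: the key names are illustrative assumptions,
# not a confirmed Content Understanding schema.
invoice_fields = {
    "Date": {
        "type": "date",
        "description": (
            "The date when the invoice was issued, typically found "
            "at the top right corner of the document."
        ),
    },
}


def fields_missing_description(fields: dict) -> list[str]:
    """Return the names of fields whose description is absent or blank."""
    return [
        name
        for name, spec in fields.items()
        if not spec.get("description", "").strip()
    ]


print(json.dumps(invoice_fields, indent=2))
print(fields_missing_description(invoice_fields))
```

A check like this makes it harder to ship a schema in which a field has a name but no guidance for the model.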
-#####  ***Example***
+#####  ***Example 2***
* Suppose you want to extract the `Customer Name` from an invoice. Your description might read:
@@ -41,7 +41,7 @@ When defining a schema, it's essential to provide detailed field descriptions. C
If the system's output isn't meeting expectations, the first step is to try refining and updating the field descriptions. Clarifying the context and being more explicit about what you need reduces ambiguity and improves accuracy.
-#####  ***Example***
+#####  ***Example 3***
* If the `Shipping date` field yields inconsistent or incorrect extractions, and the value often appears after a `Dispatch Date` label, update the description to something more precise, like:
@@ -54,11 +54,11 @@ If the system's output isn't meeting expectations, the first step is to try refi
When you need the system to choose from a set of predefined options, for example, document type, product category, or status, use classification fields. Where the options are ambiguous, provide a clear description for each one so that the model can categorize the data accurately.
-#####  ***Example***
+#####  ***Example 4***
* If you need to classify documents as either `Invoice`, `Claim`, or `Report`, create a classification field with these words as category names.
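A classification field of this kind might be sketched as follows. The key names (`type`, `method`, `enum`), the description text, and the helper `is_valid_category` are assumptions made for illustration; the exact schema shape should be confirmed against the service documentation.

```python
# Hypothetical classification field: key names and the description text are
# illustrative assumptions, not a confirmed Content Understanding schema.
document_type_field = {
    "type": "string",
    "method": "classify",
    "description": "The kind of document being processed.",
    "enum": ["Invoice", "Claim", "Report"],
}


def is_valid_category(field: dict, value: str) -> bool:
    """Check that a returned value is one of the predefined category names."""
    return value in field["enum"]


print(is_valid_category(document_type_field, "Invoice"))  # True
print(is_valid_category(document_type_field, "Memo"))     # False
```

Validating the returned value against the category list is a cheap guard in downstream code, in case a result ever falls outside the predefined options.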
-#####  ***Example***
+#####  ***Example 5***
* When processing product images, you might need to assign them to categories like `AlcoholicDrinks`, `SoftDrinks`, `Snacks`, and `DairyProducts`. Since some items can appear similar, providing precise definitions for close-call cases can help. For example:
@@ -72,7 +72,7 @@ When you need the system to choose from a set of predefined options, for example
Confidence scores help you decide when to involve human reviewers. Customers can apply thresholds to confidence scores to decide which results need further review, minimizing the risk of errors.
-#####  ***Example***
+#####  ***Example 6***
* For an invoice review use case, if a key extracted field like `TotalInvoiceAmount` has a confidence score under **0.80**, route that document to manual review. This helps ensure that a human verifies critical fields like invoice totals or legal statements when necessary.
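The routing rule in this example can be sketched in a few lines of Python. The result shape (a dictionary of field names mapping to `value` and `confidence`) and the function name `needs_manual_review` are assumptions for illustration; adapt them to the actual response format you receive.

```python
REVIEW_THRESHOLD = 0.80  # threshold taken from the example above


def needs_manual_review(
    fields: dict[str, dict],
    key_fields: list[str],
    threshold: float = REVIEW_THRESHOLD,
) -> bool:
    """Return True if any key field is missing or scores below the threshold."""
    for name in key_fields:
        field = fields.get(name)
        # A missing field or missing score is treated as low confidence.
        if field is None or field.get("confidence") is None:
            return True
        if field["confidence"] < threshold:
            return True
    return False


# Hypothetical extraction result for one invoice.
sample = {"TotalInvoiceAmount": {"value": 1250.0, "confidence": 0.72}}
print(needs_manual_review(sample, ["TotalInvoiceAmount"]))  # True
```

Treating a missing score as low confidence is a deliberately conservative choice in this sketch: it sends uncertain documents to a reviewer rather than letting them pass silently.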
@@ -82,7 +82,7 @@ Confidence scores help you decide when to involve human reviewers. Customers can
When you're working with audio and video content, selecting a narrow set of languages for transcription can reduce errors. The more languages you include, the more the system has to guess which language is being spoken, which can increase misrecognition.
-#####  ***Example***
+#####  ***Example 7***
* If you're certain that the content contains only English and Spanish, restricting transcription to these two languages can improve quality. But if the content unexpectedly includes other languages, this configuration can degrade overall quality.
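One way to keep the language list deliberately narrow is to build it through a small helper. The `locales` key, the locale codes `en-US` and `es-ES`, and the function `build_transcription_config` are all assumptions for illustration; check the service documentation for the actual setting name and supported values.

```python
def build_transcription_config(locales: list[str]) -> dict:
    """Build a transcription config restricted to an explicit locale list."""
    if not locales:
        raise ValueError("Provide at least one locale.")
    # Remove duplicates while preserving the caller's order.
    unique_locales = list(dict.fromkeys(locales))
    # Hypothetical key name: confirm the real setting in the service docs.
    return {"locales": unique_locales}


# Assumed locale codes for English and Spanish content.
config = build_transcription_config(["en-US", "es-ES", "en-US"])
print(config)  # {'locales': ['en-US', 'es-ES']}
```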