Commit 32564bd

Merge pull request #260091 from laujan/187687-revisit-di-language-support
revisit language support
2 parents e38e09c + 001ba59 commit 32564bd

5 files changed, with 166 additions and 111 deletions.

articles/ai-services/document-intelligence/concept-model-overview.md

Lines changed: 4 additions & 4 deletions
@@ -48,15 +48,15 @@ The following table shows the available models for each current preview and stab
 |Model|[2023-10-31-preview](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-10-31-preview&preserve-view=true&tabs=HTTP)|[2023-07-31 (GA)](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-07-31&preserve-view=true&tabs=HTTP)|[2022-08-31 (GA)](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2022-08-31/operations/AnalyzeDocument)|[v2.1 (GA)](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeBusinessCardAsync)|
 |----------------|-----------|---|--|---|
 |[Add-on capabilities](concept-add-on-capabilities.md) | ✔️| ✔️| n/a| n/a|
-|[Business Card](concept-business-card.md) | deprecated|✔️|✔️|✔️ |
+|[Business card](concept-business-card.md) | deprecated|✔️|✔️|✔️ |
 |[Contract](concept-contract.md) | ✔️| ✔️| n/a| n/a|
 |[Custom classifier](concept-custom-classifier.md) | ✔️| ✔️| n/a| n/a|
 |[Custom composed](concept-composed-models.md) | ✔️| ✔️| ✔️| ✔️|
 |[Custom neural](concept-custom-neural.md) | ✔️| ✔️| ✔️| n/a|
 |[Custom template](concept-custom-template.md) | ✔️| ✔️| ✔️| ✔️|
-|[General Document](concept-general-document.md) | deprecated| ✔️| ✔️| n/a|
-|[Health Insurance Card](concept-health-insurance-card.md)| ✔️| ✔️| ✔️| n/a|
-|[ID Document](concept-id-document.md) | ✔️| ✔️| ✔️| ✔️|
+|[General document](concept-general-document.md) | deprecated| ✔️| ✔️| n/a|
+|[Health insurance card](concept-health-insurance-card.md)| ✔️| ✔️| ✔️| n/a|
+|[ID document](concept-id-document.md) | ✔️| ✔️| ✔️| ✔️|
 |[Invoice](concept-invoice.md) | ✔️| ✔️| ✔️| ✔️|
 |[Layout](concept-layout.md) | ✔️| ✔️| ✔️| ✔️|
 |[Read](concept-read.md) | ✔️| ✔️| ✔️| n/a|

articles/ai-services/document-intelligence/language-support-custom.md

Lines changed: 39 additions & 36 deletions
@@ -37,17 +37,17 @@ ms.date: 11/15/2023
 
 Azure AI Document Intelligence models provide multilingual document processing support. Our language support capabilities enable your users to communicate with your applications in natural ways and empower global outreach. Custom models are trained using your labeled datasets to extract distinct data from structured, semi-structured, and unstructured documents specific to your use cases. Standalone custom models can be combined to create composed models. The following tables list the available language and locale support by model and feature:
 
-## [Custom classifier](#tab/custom-classifier)
-
-***custom classifier model***
+## Custom classifier
 
 :::moniker range="doc-intel-3.1.0"
+
 | Language—Locale code | Default |
 |:----------------------|:---------|
 | English (United States)—en-US| English (United States)—en-US|
 :::moniker-end
 
 :::moniker range="doc-intel-4.0.0"
+
 |Language| Code (optional) |
 |:-----|:----:|
 |Afrikaans| `af`|
@@ -97,25 +97,14 @@ Azure AI Document Intelligence models provide multilingual document processing s
 |Ukrainian|`uk`|
 |Urdu|`ur`|
 |Vietnamese|`vi`|
-:::moniker-end
 
-## [Custom neural](#tab/custom-neural)
-
-***custom neural model***
-
-#### Handwritten text
+:::moniker-end
 
-The following table lists the supported languages for extracting handwritten texts.
+## Custom neural
 
-|Language| Language code (optional) | Language| Language code (optional) |
-|:-----|:----:|:-----|:----:|
-|English|`en`|Japanese |`ja`|
-|Chinese Simplified |`zh-Hans`|Korean |`ko`|
-|French |`fr`|Portuguese |`pt`|
-|German |`de`|Spanish |`es`|
-|Italian |`it`|
+:::moniker range=">=doc-intel-3.1.0"
 
-#### Printed text
+## [**Printed text**](#tab/printed)
 
 The following table lists the supported languages for printed text.
 
@@ -125,8 +114,8 @@ The following table lists the supported languages for printed text.
 |Albanian| `sq`|
 |Arabic|`ar`|
 |Bulgarian|`bg`|
-|Chinese (Han (Simplified variant))| `zh-Hans`|
-|Chinese (Han (Traditional variant))|`zh-Hant`|
+|Chinese Simplified| `zh-Hans`|
+|Chinese Traditional|`zh-Hant`|
 |Croatian|`hr`|
 |Czech|`cs`|
 |Danish|`da`|
@@ -169,7 +158,19 @@ The following table lists the supported languages for printed text.
 |Urdu|`ur`|
 |Vietnamese|`vi`|
 
-:::moniker range=">=doc-intel-3.1.0"
+## [**Handwritten text**](#tab/handwritten)
+
+The following table lists the supported languages for extracting **handwritten** texts.
+
+|Language| Language code (optional) | Language| Language code (optional) |
+|:-----|:----:|:-----|:----:|
+|English|`en`|Japanese |`ja`|
+|Chinese Simplified |`zh-Hans`|Korean |`ko`|
+|French |`fr`|Portuguese |`pt`|
+|German |`de`|Spanish |`es`|
+|Italian |`it`|
+
+---
 
 Neural models support added languages for the `v3.1` and later APIs.
 
@@ -184,25 +185,14 @@ Neural models support added languages for the `v3.1` and later APIs.
 
 :::moniker-end
 
-## [Custom template](#tab/custom-template)
-
-***custom template model***
+## Custom template
 
-#### Handwritten text
+:::moniker range=">=doc-intel-3.0.0"
 
-The following table lists the supported languages for extracting handwritten texts.
-
-|Language| Language code (optional) | Language| Language code (optional) |
-|:-----|:----:|:-----|:----:|
-|English|`en`|Japanese |`ja`|
-|Chinese Simplified |`zh-Hans`|Korean |`ko`|
-|French |`fr`|Portuguese |`pt`|
-|German |`de`|Spanish |`es`|
-|Italian |`it`|
+## [**Printed**](#tab/printed)
 
-#### Printed text
+The following table lists the supported languages for **printed** text.</br>
 
-The following table lists the supported languages for printed text.
 :::row:::
 :::column span="":::
 |Language| Code (optional) |
@@ -522,4 +512,17 @@ The following table lists the supported languages for printed text.
 :::column-end:::
 :::row-end:::
 
+## [**Handwritten**](#tab/handwritten)
+
+The following table lists the supported languages for extracting handwritten texts.
+
+|Language| Language code (optional) | Language| Language code (optional) |
+|:-----|:----:|:-----|:----:|
+|English|`en`|Japanese |`ja`|
+|Chinese Simplified |`zh-Hans`|Korean |`ko`|
+|French |`fr`|Portuguese |`pt`|
+|German |`de`|Spanish |`es`|
+|Italian |`it`|
+
 ---
+:::moniker-end

articles/ai-services/document-intelligence/language-support-ocr.md

Lines changed: 13 additions & 12 deletions
@@ -53,19 +53,20 @@ Azure AI Document Intelligence models provide multilingual document processing s
 
 ::: moniker-end
 
-## Read model
-
-##### Model ID: **prebuilt-read**
-
 > [!NOTE]
 > **Language code optional**
 >
 > * Document Intelligence's deep learning based universal models extract all multi-lingual text in your documents, including text lines with mixed languages, and don't require specifying a language code.
-> * Don't provide the language code as the parameter unless you are sure about the language and want to force the service to apply only the relevant model. Otherwise, the service may return incomplete and incorrect text.
+>
+> * Don't provide the language code as the parameter unless you are sure of the language and want to force the service to apply only the relevant model. Otherwise, the service may return incomplete and incorrect text.
 >
 > * Also, It's not necessary to specify a locale. This is an optional parameter. The Document Intelligence deep-learning technology will auto-detect the text language in your image.
 
-### [Read: handwritten text](#tab/read-hand)
+## Read model
+
+##### Model ID: **prebuilt-read**
+
+### [**Read: handwritten text**](#tab/read-hand)
 
 :::moniker range="doc-intel-4.0.0"
 
@@ -107,15 +108,15 @@ The following table lists read model language support for extracting and analyzi
 
 :::moniker-end
 
-### [Read: printed text](#tab/read-print)
+### [**Read: printed text**](#tab/read-print)
 
 :::moniker range=">=doc-intel-3.1.0"
 
 The following table lists read model language support for extracting and analyzing **printed** text. </br>
 
 :::row:::
 :::column span="":::
-|Language| Code (optional) |
+|Language| Code (optional) |
 |:-----|:----:|
 |Abaza|abq|
 |Abkhazian|ab|
@@ -194,7 +195,7 @@ The following table lists read model language support for extracting and analyzi
 |Finnish|fi|
 :::column-end:::
 :::column span="":::
-|Language| Code (optional) |
+|Language| Code (optional) |
 |:-----|:----:|
 |Fon|fon|
 |French|fr|
@@ -622,7 +623,7 @@ The following table lists read model language support for extracting and analyzi
 
 :::moniker-end
 
-### [Read: language detection](#tab/read-detection)
+### [**Read: language detection**](#tab/read-detection)
 
 The [Read model API](concept-read.md) supports **language detection** for the following languages in your documents. This list can include languages not currently supported for text extraction.
 
@@ -768,7 +769,7 @@ The [Read model API](concept-read.md) supports **language detection** for the fo
 
 ##### Model ID: **prebuilt-layout**
 
-### [Layout: handwritten text](#tab/layout-hand)
+### [**Layout: handwritten text**](#tab/layout-hand)
 
 :::moniker range="doc-intel-4.0.0"
 
@@ -820,7 +821,7 @@ The following table lists layout model language support for extracting and analy
 |Thai (preview) | `th` | Arabic (preview) | `ar` |
 :::moniker-end
 
-### [Layout: printed text](#tab/layout-print)
+### [**Layout: printed text**](#tab/layout-print)
 
 :::moniker range=">=doc-intel-3.1.0"
 