@@ -17,14 +17,15 @@ Azure AI Content Understanding provides multilingual support in multiple geograp
17
17
18
18
## Region support
19
19
20
- To use the Azure AI Content Understanding service, you must create your Azure AI Service resource in a supported region. The Content Understanding features are available in the following regions:
20
+ To use Azure AI Content Understanding, create your Azure AI Service resource in a supported region. All data at rest is stored in the selected region. For lower latency or increased capacity, you can specify the [processing location](./concepts/analyzers-overview.md#data-processing-location) where analysis occurs. Content Understanding is available in the following regions. The Geography and Data Zone columns show where analysis occurs when the processing location is set to `geography` or `data zone`.
21
21
22
- | Identifier | Region | Geography | Data Zone |
23
- | --- | --- | --- | --- |
24
- | ` westus ` | West US | United States | United States |
25
- | ` swedencentral ` | Sweden Central | Sweden | European Union |
26
- | ` australiaeast ` | Australia East | Australia | N/A |
22
+ | Identifier | Region | Geography | Data Zone |
23
+ | ----------------- | ---------------- | ----------------- | ------------------ |
24
+ | ` westus ` | West US | United States | United States |
25
+ | ` swedencentral ` | Sweden Central | Sweden | European Union |
26
+ | ` australiaeast ` | Australia East | Australia | N/A † |
27
27
28
+ † Australia East doesn't support data zone as a processing location.
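
As a minimal sketch of how the region table above could be consulted in client code, the following Python snippet maps each identifier to its geography and data zone. The `RegionInfo` type, the `resolve_processing_location` helper, and the mode strings are hypothetical names chosen for illustration; they aren't part of the Content Understanding API.

```python
from typing import NamedTuple, Optional


class RegionInfo(NamedTuple):
    region: str
    geography: str
    data_zone: Optional[str]  # None where the table lists N/A


# Values copied from the region table; the structure itself is an assumption.
SUPPORTED_REGIONS = {
    "westus": RegionInfo("West US", "United States", "United States"),
    "swedencentral": RegionInfo("Sweden Central", "Sweden", "European Union"),
    "australiaeast": RegionInfo("Australia East", "Australia", None),
}


def resolve_processing_location(identifier: str, mode: str) -> str:
    """Return the location listed in the table for a region and mode.

    `mode` is "geography" or "data zone", mirroring the table columns.
    """
    info = SUPPORTED_REGIONS.get(identifier.lower())
    if info is None:
        raise ValueError(f"{identifier!r} is not a supported region")
    if mode == "geography":
        return info.geography
    if mode == "data zone":
        if info.data_zone is None:
            raise ValueError(f"{info.region} doesn't support data zone processing")
        return info.data_zone
    raise ValueError(f"unknown processing location mode: {mode!r}")
```

Raising an error for Australia East with `data zone` mirrors the footnote, so a misconfiguration surfaces before any request is made.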
28
29
29
30
## Language support
30
31
@@ -39,97 +40,89 @@ Content Understanding applies [Azure OpenAI models](../openai/overview.md) which
39
40
> * The following supported languages have locale-aware normalization for words enabled in post-processing.
40
41
> * Content Understanding supports different languages so we encourage you to try it out and focus on the content and not the value itself.
41
42
42
- | **Language** | **Language code** | **Language** | **Language code** |
43
- | :-----| :-----| :-----| :-----|
44
- | Afrikaans| ` af ` | Kurdish (Arabic)| ` ku-arab ` |
45
- | Albanian| ` sq ` | Kurdish (Latin)| ` ku ` , ` ku-latn ` |
46
- | Angika| ` anp ` | Kurukh| ` kru ` |
47
- | Arabic| ` ar ` | Kölsch| ` ksh ` |
48
- | Asturian| ` ast ` | Lakota| ` lkt ` |
49
- | Awadhi| ` awa ` | Latin| ` la ` |
50
- | Azerbaijani| ` az ` | Lithuanian| ` lt ` |
51
- | Bagheli| ` bfy ` | Lower Sorbian| ` dsb ` |
52
- | Basque| ` eu ` | Lule Sami| ` smj ` |
53
- | Belarusian (Cyrillic)| ` be ` , ` be-cyrl ` | Luxembourgish| ` lb ` |
54
- | Belarusian (Latin)| ` be-latn ` | Mahasu Pahari| ` bfz ` |
55
- | Bhojpuri| ` bho ` | Malay| ` ms ` |
56
- | Bislama| ` bi ` | Malto| ` kmj ` |
57
- | Bodo| ` brx ` | Manx| ` gv ` |
58
- | Bosnian| ` bs ` | Maori| ` mi ` |
59
- | Braj| ` bra ` | Marathi| ` mr ` |
60
- | Breton| ` br ` | Mongolian| ` mn ` |
61
- | Bulgarian| ` bg ` | Montenegrin (Cyrillic)| ` cnr-cyrl ` |
62
- | Bundeli| ` bns ` | Montenegrin (Latin)| ` cnr ` , ` cnr-latn ` |
63
- | Buriat| ` bua ` | Neapolitan| ` nap ` |
64
- | Camling| ` rab ` | Nepali| ` ne ` |
65
- | Catalan| ` ca ` | Niuean| ` niu ` |
66
- | Cebuano| ` ceb ` | Nogai| ` nog ` |
67
- | Chamorro| ` ch ` | Northern Sami| ` sme ` |
68
- | Chhattisgarhi| ` hne ` | Norwegian| ` no ` |
69
- | Chinese (Simplified)| ` zh ` , ` zh-hans ` | Occitan| ` oc ` |
70
- | Chinese (Traditional)| ` zh-hant ` | Ossetian| ` os ` |
71
- | Cornish| ` kw ` | Panjabi| ` pa ` |
72
- | Corsican| ` co ` | Persian| ` fa ` |
73
- | Crimean Tatar| ` crh ` | Polish| ` pl ` |
74
- | Croatian| ` hr ` | Portuguese| ` pt ` |
75
- | Czech| ` cs ` | Pushto| ` ps ` |
76
- | Danish| ` da ` | Romanian| ` ro ` |
77
- | Dari| ` prs ` | Romansh| ` rm ` |
78
- | Dhimal| ` dhi ` | Russian| ` ru ` |
79
- | Dogri| ` doi ` | Sadri| ` sck ` |
80
- | Dutch| ` nl ` | Samoan| ` sm ` |
81
- | English| ` en-US ` , ` en-AU ` , ` en-CA ` ,` en-GB ` , ` en-IN ` | Sanskrit| ` sa ` |
82
- | Erzya| ` myv ` | Santali| ` sat ` |
83
- | Estonian| ` et ` | Scots| ` sco ` |
84
- | Faroese| ` fo ` | Scottish Gaelic| ` gd ` |
85
- | Fijian| ` fj ` | Serbian (Latin)| ` sr ` , ` sr-latn ` |
86
- | Filipino| ` fil ` | Sirmauri| ` srx ` |
87
- | Finnish| ` fi ` | Skolt Sami| ` sms ` |
88
- | French| ` fr ` | Slovak| ` sk ` |
89
- | Friulian| ` fur ` | Slovenian| ` sl ` |
90
- | Gagauz| ` gag ` | Somali| ` so ` |
91
- | Galician| ` gl ` | Southern Sami| ` sma ` |
92
- | German| ` de ` | Spanish| ` es ` |
93
- | Gilbertese| ` gil ` | Swahili| ` sw ` |
94
- | Gondi| ` gon ` | Swedish| ` sv ` |
95
- | Gurung| ` gvr ` | Tajik| ` tg ` |
96
- | Haitian| ` ht ` | Tatar| ` tt ` |
97
- | Halbi| ` hlb ` | Tetum| ` tet ` |
98
- | Hani| ` hni ` | Thangmi| ` thf ` |
99
- | Haryanvi| ` bgc ` | Thai| ` th ` |
100
- | Hawaiian| ` haw ` | Tonga| ` to ` |
101
- | Hindi| ` hi ` | Turkish| ` tr ` |
102
- | Hmong Daw| ` mww ` | Tuvinian| ` tyv ` |
103
- | Ho| ` hoc ` | Uighur| ` ug ` |
104
- | Hungarian| ` hu ` | Upper Sorbian| ` hsb ` |
105
- | Icelandic| ` is ` | Urdu| ` ur ` |
106
- | Inari Sami| ` smn ` | Uzbek (Arabic)| ` uz-arab ` |
107
- | Indonesian| ` id ` | Uzbek (Cyrillic)| ` uz-cyrl ` |
108
- | Interlingua| ` ia ` | Uzbek (Latin)| ` uz ` , ` uz-latn ` |
109
- | Inuktitut| ` iu ` | Volapük| ` vo ` |
110
- | Irish| ` ga ` | Walser| ` wae ` |
111
- | Italian| ` it ` | Welsh| ` cy ` |
112
- | Japanese| ` ja ` | Western Frisian| ` fy ` |
113
- | Jaunsari| ` jns ` | Yucateco| ` yua ` |
114
- | Javanese| ` jv ` | Zhuang| ` za ` |
115
- | K'iche'| ` quc ` | Zulu| ` zu ` |
116
- | Kabuverdianu| ` kea ` ||
117
- | Kachin| ` kac ` ||
118
- | Kalaallisut| ` kl ` ||
119
- | Kangri| ` xnr ` ||
120
- | Kara-Kalpak (Cyrillic)| ` kaa-cyrl ` ||
121
- | Kara-Kalpak (Latin)| ` kaa ` , ` kaa-latn ` ||
122
- | Karachay-Balkar| ` krc ` ||
123
- | Kashubian| ` csb ` ||
124
- | Kazakh (Cyrillic)| ` kk-cyrl ` ||
125
- | Kazakh (Latin)| ` kk ` , ` kk-latn ` ||
126
- | Khaling| ` klr ` ||
127
- | Khasi| ` kha ` ||
128
- | Kirghiz| ` ky ` ||
129
- | Korean| ` ko ` ||
130
- | Korku| ` kfq ` ||
131
- | Koryak| ` kpy ` ||
132
- | Kosraean| ` kos ` ||
43
+ | **Language** | **Language code** | **Language** | **Language code** |
44
+ | :-----| :-----| :-----| :-----|
45
+ | Afrikaans| ` af ` | Kazakh (Latin)| ` kk, kk-latn ` |
46
+ | Albanian| ` sq ` | Khaling| ` klr ` |
47
+ | Angika| ` anp ` | Khasi| ` kha ` |
48
+ | Arabic| ` ar ` | Kirghiz| ` ky ` |
49
+ | Asturian| ` ast ` | Korean| ` ko ` |
50
+ | Awadhi| ` awa ` | Korku| ` kfq ` |
51
+ | Azerbaijani| ` az ` | Koryak| ` kpy ` |
52
+ | Bagheli| ` bfy ` | Kosraean| ` kos ` |
53
+ | Basque| ` eu ` | Kurdish (Arabic)| ` ku-arab ` |
54
+ | Belarusian (Cyrillic)| ` be, be-cyrl ` | Kurdish (Latin)| ` ku, ku-latn ` |
55
+ | Belarusian (Latin)| ` be-latn ` | Kurukh| ` kru ` |
56
+ | Bhojpuri| ` bho ` | Kölsch| ` ksh ` |
57
+ | Bislama| ` bi ` | Lakota| ` lkt ` |
58
+ | Bodo| ` brx ` | Latin| ` la ` |
59
+ | Bosnian| ` bs ` | Lithuanian| ` lt ` |
60
+ | Braj| ` bra ` | Lower Sorbian| ` dsb ` |
61
+ | Breton| ` br ` | Lule Sami| ` smj ` |
62
+ | Bulgarian| ` bg ` | Luxembourgish| ` lb ` |
63
+ | Bundeli| ` bns ` | Mahasu Pahari| ` bfz ` |
64
+ | Buriat| ` bua ` | Malay| ` ms ` |
65
+ | Camling| ` rab ` | Malto| ` kmj ` |
66
+ | Catalan| ` ca ` | Manx| ` gv ` |
67
+ | Cebuano| ` ceb ` | Maori| ` mi ` |
68
+ | Chamorro| ` ch ` | Marathi| ` mr ` |
69
+ | Chhattisgarhi| ` hne ` | Mongolian| ` mn ` |
70
+ | Chinese (Simplified)| ` zh, zh-hans ` | Montenegrin (Cyrillic)| ` cnr-cyrl ` |
71
+ | Chinese (Traditional)| ` zh-hant ` | Montenegrin (Latin)| ` cnr, cnr-latn ` |
72
+ | Cornish| ` kw ` | Neapolitan| ` nap ` |
73
+ | Corsican| ` co ` | Nepali| ` ne ` |
74
+ | Crimean Tatar| ` crh ` | Niuean| ` niu ` |
75
+ | Croatian| ` hr ` | Nogai| ` nog ` |
76
+ | Czech| ` cs ` | Northern Sami| ` sme ` |
77
+ | Danish| ` da ` | Norwegian| ` no ` |
78
+ | Dari| ` prs ` | Occitan| ` oc ` |
79
+ | Dhimal| ` dhi ` | Ossetian| ` os ` |
80
+ | Dogri| ` doi ` | Panjabi| ` pa ` |
81
+ | Dutch| ` nl ` | Persian| ` fa ` |
82
+ | English| ` en-US, en-AU, en-CA, en-GB, en-IN ` | Polish| ` pl ` |
83
+ | Erzya| ` myv ` | Portuguese| ` pt ` |
84
+ | Estonian| ` et ` | Pushto| ` ps ` |
85
+ | Faroese| ` fo ` | Romanian| ` ro ` |
86
+ | Fijian| ` fj ` | Romansh| ` rm ` |
87
+ | Filipino| ` fil ` | Russian| ` ru ` |
88
+ | Finnish| ` fi ` | Sadri| ` sck ` |
89
+ | French| ` fr ` | Samoan| ` sm ` |
90
+ | Friulian| ` fur ` | Sanskrit| ` sa ` |
91
+ | Gagauz| ` gag ` | Santali| ` sat ` |
92
+ | Galician| ` gl ` | Scots| ` sco ` |
93
+ | German| ` de ` | Scottish Gaelic| ` gd ` |
94
+ | Gilbertese| ` gil ` | Serbian (Latin)| ` sr, sr-latn ` |
95
+ | Gondi| ` gon ` | Sirmauri| ` srx ` |
96
+ | Gurung| ` gvr ` | Skolt Sami| ` sms ` |
97
+ | Haitian| ` ht ` | Slovak| ` sk ` |
98
+ | Halbi| ` hlb ` | Slovenian| ` sl ` |
99
+ | Hani| ` hni ` | Somali| ` so ` |
100
+ | Haryanvi| ` bgc ` | Southern Sami| ` sma ` |
101
+ | Hawaiian| ` haw ` | Spanish| ` es ` |
102
+ | Hindi| ` hi ` | Swahili| ` sw ` |
103
+ | Hmong Daw| ` mww ` | Swedish| ` sv ` |
104
+ | Ho| ` hoc ` | Tajik| ` tg ` |
105
+ | Hungarian| ` hu ` | Tatar| ` tt ` |
106
+ | Icelandic| ` is ` | Tetum| ` tet ` |
107
+ | Inari Sami| ` smn ` | Thangmi| ` thf ` |
108
+ | Indonesian| ` id ` | Thai| ` th ` |
109
+ | Interlingua| ` ia ` | Tonga| ` to ` |
110
+ | Inuktitut| ` iu ` | Turkish| ` tr ` |
111
+ | Irish| ` ga ` | Tuvinian| ` tyv ` |
112
+ | Italian| ` it ` | Uighur| ` ug ` |
113
+ | Japanese| ` ja ` | Upper Sorbian| ` hsb ` |
114
+ | Jaunsari| ` jns ` | Urdu| ` ur ` |
115
+ | Javanese| ` jv ` | Uzbek (Arabic)| ` uz-arab ` |
116
+ | K'iche'| ` quc ` | Uzbek (Cyrillic)| ` uz-cyrl ` |
117
+ | Kabuverdianu| ` kea ` | Uzbek (Latin)| ` uz, uz-latn ` |
118
+ | Kachin| ` kac ` | Volapük| ` vo ` |
119
+ | Kalaallisut| ` kl ` | Walser| ` wae ` |
120
+ | Kangri| ` xnr ` | Welsh| ` cy ` |
121
+ | Kara-Kalpak (Cyrillic)| ` kaa-cyrl ` | Western Frisian| ` fy ` |
122
+ | Kara-Kalpak (Latin)| ` kaa, kaa-latn ` | Yucateco| ` yua ` |
123
+ | Karachay-Balkar| ` krc ` | Zhuang| ` za ` |
124
+ | Kashubian| ` csb ` | Zulu| ` zu ` |
125
+ | Kazakh (Cyrillic)| ` kk-cyrl ` |||
133
126
134
127
The following table lists the supported languages/locales for **handwritten** text.
135
128
@@ -145,61 +138,129 @@ The following table lists the supported languages/locales for **handwritten** te
145
138
146
139
### Speech transcription
147
140
148
- Content Understanding supports the full set of [Azure AI speech to text languages](../speech-service/language-support.md). Content Understanding uses [fast transcriptions](../speech-service/speech-to-text.md#fast-transcription) for supported languages to reduce processing latency.
141
+ Content Understanding applies [Azure AI speech to text](../speech-service/speech-to-text.md) to transcribe spoken words in the input. For a subset of supported languages, it uses [fast transcription](../speech-service/speech-to-text.md#fast-transcription) to reduce processing latency.
142
+
143
+ The following table lists the supported languages/locales for fast transcription.
144
+
145
+ | **Language** | **Language code** | **Language** | **Language code** |
146
+ | :-----| :----:| :-----| :----:|
147
+ | Chinese (Mandarin, Simplified) | ` zh-CN ` | Indonesian (Indonesia) | ` id-ID ` |
148
+ | Danish (Denmark) | ` da-DK ` | Italian (Italy) | ` it-IT ` |
149
+ | English (India) | ` en-IN ` | Japanese (Japan) | ` ja-JP ` |
150
+ | English (United Kingdom) | ` en-GB ` | Korean (Korea) | ` ko-KR ` |
151
+ | English (United States) | ` en-US ` | Polish (Poland) | ` pl-PL ` |
152
+ | Finnish (Finland) | ` fi-FI ` | Portuguese (Brazil) | ` pt-BR ` |
153
+ | French (France) | ` fr-FR ` | Portuguese (Portugal) | ` pt-PT ` |
154
+ | German (Germany) | ` de-DE ` | Spanish (Mexico) | ` es-MX ` |
155
+ | Hebrew (Israel) | ` he-IL ` | Spanish (Spain) | ` es-ES ` |
156
+ | Hindi (India) | ` hi-IN ` | Swedish (Sweden) | ` sv-SE ` |
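
To illustrate how the fast transcription subset above might be checked before submitting audio, here's a small Python sketch. The locale codes come from the table; the `uses_fast_transcription` helper is a hypothetical name, and the service decides the transcription path itself, so this is only a client-side estimate.

```python
# Locale codes copied from the fast transcription table.
FAST_TRANSCRIPTION_LOCALES = frozenset({
    "zh-CN", "da-DK", "en-IN", "en-GB", "en-US", "fi-FI", "fr-FR", "de-DE",
    "he-IL", "hi-IN", "id-ID", "it-IT", "ja-JP", "ko-KR", "pl-PL", "pt-BR",
    "pt-PT", "es-MX", "es-ES", "sv-SE",
})

# Lowercased copy so lookups ignore letter case.
_FAST_LOWER = frozenset(code.lower() for code in FAST_TRANSCRIPTION_LOCALES)


def uses_fast_transcription(locale: str) -> bool:
    """Return True if the locale appears in the fast transcription table.

    Accepts either "-" or "_" as the separator and ignores letter case.
    """
    normalized = locale.strip().replace("_", "-").lower()
    return normalized in _FAST_LOWER
```

A locale such as `fr-CA` appears in the full list but not in this subset, so the helper returns `False` for it.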
157
+
158
+ The following table lists all supported languages/locales.
149
159
150
- > [!NOTE]
151
- > Only spoken words are transcribed. Music, sound effects, and ambient noise are ignored.
160
+ | **Language** | **Language code** | **Language** | **Language code** |
161
+ | :-----| :----:| :-----| :----:|
162
+ | Afrikaans (South Africa) | ` af-ZA ` | Hungarian (Hungary) | ` hu-HU ` |
163
+ | Albanian (Albania) | ` sq-AL ` | Icelandic (Iceland) | ` is-IS ` |
164
+ | Amharic (Ethiopia) | ` am-ET ` | Indonesian (Indonesia) | ` id-ID ` |
165
+ | Arabic (Algeria) | ` ar-DZ ` | Irish (Ireland) | ` ga-IE ` |
166
+ | Arabic (Bahrain) | ` ar-BH ` | isiZulu (South Africa) | ` zu-ZA ` |
167
+ | Arabic (Egypt) | ` ar-EG ` | Italian (Italy) | ` it-IT ` |
168
+ | Arabic (Iraq) | ` ar-IQ ` | Italian (Switzerland) | ` it-CH ` |
169
+ | Arabic (Israel) | ` ar-IL ` | Japanese (Japan) | ` ja-JP ` |
170
+ | Arabic (Jordan) | ` ar-JO ` | Javanese (Latin, Indonesia) | ` jv-ID ` |
171
+ | Arabic (Kuwait) | ` ar-KW ` | Kannada (India) | ` kn-IN ` |
172
+ | Arabic (Lebanon) | ` ar-LB ` | Kazakh (Kazakhstan) | ` kk-KZ ` |
173
+ | Arabic (Libya) | ` ar-LY ` | Khmer (Cambodia) | ` km-KH ` |
174
+ | Arabic (Morocco) | ` ar-MA ` | Kiswahili (Kenya) | ` sw-KE ` |
175
+ | Arabic (Oman) | ` ar-OM ` | Kiswahili (Tanzania) | ` sw-TZ ` |
176
+ | Arabic (Palestinian Authority) | ` ar-PS ` | Korean (Korea) | ` ko-KR ` |
177
+ | Arabic (Qatar) | ` ar-QA ` | Lao (Laos) | ` lo-LA ` |
178
+ | Arabic (Saudi Arabia) | ` ar-SA ` | Latvian (Latvia) | ` lv-LV ` |
179
+ | Arabic (Syria) | ` ar-SY ` | Lithuanian (Lithuania) | ` lt-LT ` |
180
+ | Arabic (Tunisia) | ` ar-TN ` | Macedonian (North Macedonia) | ` mk-MK ` |
181
+ | Arabic (United Arab Emirates) | ` ar-AE ` | Malay (Malaysia) | ` ms-MY ` |
182
+ | Arabic (Yemen) | ` ar-YE ` | Malayalam (India) | ` ml-IN ` |
183
+ | Armenian (Armenia) | ` hy-AM ` | Maltese (Malta) | ` mt-MT ` |
184
+ | Assamese (India) | ` as-IN ` | Marathi (India) | ` mr-IN ` |
185
+ | Azerbaijani (Latin, Azerbaijan) | ` az-AZ ` | Mongolian (Mongolia) | ` mn-MN ` |
186
+ | Basque | ` eu-ES ` | Nepali (Nepal) | ` ne-NP ` |
187
+ | Bengali (India) | ` bn-IN ` | Norwegian Bokmål (Norway) | ` nb-NO ` |
188
+ | Bosnian (Bosnia and Herzegovina) | ` bs-BA ` | Odia (India) | ` or-IN ` |
189
+ | Bulgarian (Bulgaria) | ` bg-BG ` | Pashto (Afghanistan) | ` ps-AF ` |
190
+ | Burmese (Myanmar) | ` my-MM ` | Persian (Iran) | ` fa-IR ` |
191
+ | Catalan | ` ca-ES ` | Polish (Poland) | ` pl-PL ` |
192
+ | Chinese (Cantonese, Simplified) | ` yue-CN ` | Portuguese (Brazil) | ` pt-BR ` |
193
+ | Chinese (Cantonese, Traditional) | ` zh-HK ` | Portuguese (Portugal) | ` pt-PT ` |
194
+ | Chinese (Jilu Mandarin, Simplified) | ` zh-CN-shandong ` | Punjabi (India) | ` pa-IN ` |
195
+ | Chinese (Mandarin, Simplified) | ` zh-CN ` | Romanian (Romania) | ` ro-RO ` |
196
+ | Chinese (Southwestern Mandarin, Simplified) | ` zh-CN-sichuan ` | Russian (Russia) | ` ru-RU ` |
197
+ | Chinese (Taiwanese Mandarin, Traditional) | ` zh-TW ` | Serbian (Cyrillic, Serbia) | ` sr-RS ` |
198
+ | Chinese (Wu, Simplified) | ` wuu-CN ` | Sinhala (Sri Lanka) | ` si-LK ` |
199
+ | Croatian (Croatia) | ` hr-HR ` | Slovak (Slovakia) | ` sk-SK ` |
200
+ | Czech (Czechia) | ` cs-CZ ` | Slovenian (Slovenia) | ` sl-SI ` |
201
+ | Danish (Denmark) | ` da-DK ` | Somali (Somalia) | ` so-SO ` |
202
+ | Dutch (Belgium) | ` nl-BE ` | Spanish (Argentina) | ` es-AR ` |
203
+ | Dutch (Netherlands) | ` nl-NL ` | Spanish (Bolivia) | ` es-BO ` |
204
+ | English (Australia) | ` en-AU ` | Spanish (Chile) | ` es-CL ` |
205
+ | English (Canada) | ` en-CA ` | Spanish (Colombia) | ` es-CO ` |
206
+ | English (Ghana) | ` en-GH ` | Spanish (Costa Rica) | ` es-CR ` |
207
+ | English (Hong Kong SAR) | ` en-HK ` | Spanish (Cuba) | ` es-CU ` |
208
+ | English (India) | ` en-IN ` | Spanish (Dominican Republic) | ` es-DO ` |
209
+ | English (Ireland) | ` en-IE ` | Spanish (Ecuador) | ` es-EC ` |
210
+ | English (Kenya) | ` en-KE ` | Spanish (El Salvador) | ` es-SV ` |
211
+ | English (New Zealand) | ` en-NZ ` | Spanish (Equatorial Guinea) | ` es-GQ ` |
212
+ | English (Nigeria) | ` en-NG ` | Spanish (Guatemala) | ` es-GT ` |
213
+ | English (Philippines) | ` en-PH ` | Spanish (Honduras) | ` es-HN ` |
214
+ | English (Singapore) | ` en-SG ` | Spanish (Mexico) | ` es-MX ` |
215
+ | English (South Africa) | ` en-ZA ` | Spanish (Nicaragua) | ` es-NI ` |
216
+ | English (Tanzania) | ` en-TZ ` | Spanish (Panama) | ` es-PA ` |
217
+ | English (United Kingdom) | ` en-GB ` | Spanish (Paraguay) | ` es-PY ` |
218
+ | English (United States) | ` en-US ` | Spanish (Peru) | ` es-PE ` |
219
+ | Estonian (Estonia) | ` et-EE ` | Spanish (Puerto Rico) | ` es-PR ` |
220
+ | Filipino (Philippines) | ` fil-PH ` | Spanish (Spain) | ` es-ES ` |
221
+ | Finnish (Finland) | ` fi-FI ` | Spanish (United States)<sup>1</sup> | ` es-US ` |
222
+ | French (Belgium) | ` fr-BE ` | Spanish (Uruguay) | ` es-UY ` |
223
+ | French (Canada)<sup>1</sup> | ` fr-CA ` | Spanish (Venezuela) | ` es-VE ` |
224
+ | French (France) | ` fr-FR ` | Swedish (Sweden) | ` sv-SE ` |
225
+ | French (Switzerland) | ` fr-CH ` | Tamil (India) | ` ta-IN ` |
226
+ | Galician | ` gl-ES ` | Telugu (India) | ` te-IN ` |
227
+ | Georgian (Georgia) | ` ka-GE ` | Thai (Thailand) | ` th-TH ` |
228
+ | German (Austria) | ` de-AT ` | Turkish (Türkiye) | ` tr-TR ` |
229
+ | German (Germany) | ` de-DE ` | Ukrainian (Ukraine) | ` uk-UA ` |
230
+ | German (Switzerland) | ` de-CH ` | Urdu (India) | ` ur-IN ` |
231
+ | Greek (Greece) | ` el-GR ` | Uzbek (Latin, Uzbekistan) | ` uz-UZ ` |
232
+ | Gujarati (India) | ` gu-IN ` | Vietnamese (Vietnam) | ` vi-VN ` |
233
+ | Hebrew (Israel) | ` he-IL ` | Welsh (United Kingdom) | ` cy-GB ` |
234
+ | Hindi (India) | ` hi-IN ` |||
152
235
153
236
154
237
### Field value normalization
155
238
156
239
Different locales represent numbers, dates, and times in different ways. Content Understanding supports normalizing these different representations into standardized ISO forms for the following locales.
157
240
158
- | **Language** | **Language code** |
159
- | :-----------| :-----------------|
160
- | Arabic| ` ar-AE ` , ` ar-EG ` , ` ar-SA ` |
161
- | Bengla| ` bn-IN ` |
162
- | Bulgarian| ` bg-BG ` |
163
- | Catalan| ` ca-ES ` |
164
- | Chinese (Simplified) | ` zh-CN ` |
165
- | Chinese (Traditional)| ` zh-TW ` |
166
- | Croatian| ` hr-HR ` |
167
- | Czech| ` cs-CZ ` |
168
- | Danish| ` da-DK ` |
169
- | Dutch| ` nl-NL ` |
170
- | English| ` en-AU ` , ` en-CA ` , ` en-GB ` , ` en-IL ` , ` en-IN ` , ` en-MY ` , ` en-US ` |
171
- | Estonian| ` et-EE ` |
172
- | Finnish| ` fi-FI ` |
173
- | French| ` fr-CA ` , ` fr-FR ` |
174
- | Galician| ` gl-ES ` |
175
- | German| ` de-DE ` |
176
- | Greek| ` el-GR ` |
177
- | Hebrew| ` he-IL ` |
178
- | Hindi| ` hi-IN ` |
179
- | Hungarian| ` hu-HU ` |
180
- | Icelandic| ` is-IS ` |
181
- | Indonesian| ` id-ID ` |
182
- | Italian| ` it-IT ` |
183
- | Japanese| ` ja-JP ` |
184
- | Korean| ` ko-KR ` |
185
- | Latvian| ` lv-LV ` |
186
- | Lithuanian| ` lt-LT ` |
187
- | Malay| ` ms-MY ` |
188
- | Marathi| ` mr-IN ` |
189
- | Nepali| ` ne-IN ` |
190
- | Norwegian| ` no-NO ` |
191
- | Polish| ` pl-PL ` |
192
- | Portuguese| ` pt-BR ` , ` pt-PT ` |
193
- | Romanian| ` ro-RO ` |
194
- | Russian| ` ru-RU ` |
195
- | Serbian| ` sr-RS ` |
196
- | Slovak| ` sk-SK ` |
197
- | Slovenian| ` sl-SI ` |
198
- | Spanish| ` es-AR ` , ` es-ES ` , ` es-MX ` |
199
- | Swedish| ` sv-SE ` |
200
- | Tamil| ` ta-IN ` |
201
- | Thai| ` th-TH ` |
202
- | Turkish| ` tr-TR ` |
203
- | Ukrainian| ` uk-UA ` |
204
- | Vietnamese| ` vi-VN ` |
241
+ | **Language** | **Language code** | **Language** | **Language code** |
242
+ | :-----| :----:| :-----| :----:|
243
+ | Arabic| ` ar-AE ` , ` ar-EG ` , ` ar-SA ` | Japanese| ` ja-JP ` |
244
+ | Bengali| ` bn-IN ` | Korean| ` ko-KR ` |
245
+ | Bulgarian| ` bg-BG ` | Latvian| ` lv-LV ` |
246
+ | Catalan| ` ca-ES ` | Lithuanian| ` lt-LT ` |
247
+ | Chinese (Simplified) | ` zh-CN ` | Malay| ` ms-MY ` |
248
+ | Chinese (Traditional)| ` zh-TW ` | Marathi| ` mr-IN ` |
249
+ | Croatian| ` hr-HR ` | Nepali| ` ne-IN ` |
250
+ | Czech| ` cs-CZ ` | Norwegian| ` no-NO ` |
251
+ | Danish| ` da-DK ` | Polish| ` pl-PL ` |
252
+ | Dutch| ` nl-NL ` | Portuguese| ` pt-BR ` , ` pt-PT ` |
253
+ | English| ` en-AU ` , ` en-CA ` , ` en-GB ` , ` en-IL ` , ` en-IN ` , ` en-MY ` , ` en-US ` | Romanian| ` ro-RO ` |
254
+ | Estonian| ` et-EE ` | Russian| ` ru-RU ` |
255
+ | Finnish| ` fi-FI ` | Serbian| ` sr-RS ` |
256
+ | French| ` fr-CA ` , ` fr-FR ` | Slovak| ` sk-SK ` |
257
+ | Galician| ` gl-ES ` | Slovenian| ` sl-SI ` |
258
+ | German| ` de-DE ` | Spanish| ` es-AR ` , ` es-ES ` , ` es-MX ` |
259
+ | Greek| ` el-GR ` | Swedish| ` sv-SE ` |
260
+ | Hebrew| ` he-IL ` | Tamil| ` ta-IN ` |
261
+ | Hindi| ` hi-IN ` | Thai| ` th-TH ` |
262
+ | Hungarian| ` hu-HU ` | Turkish| ` tr-TR ` |
263
+ | Icelandic| ` is-IS ` | Ukrainian| ` uk-UA ` |
264
+ | Indonesian| ` id-ID ` | Vietnamese| ` vi-VN ` |
265
+ | Italian| ` it-IT ` |||
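
The idea of field value normalization can be illustrated with a short Python sketch. It isn't how Content Understanding implements normalization. The `LOCALE_CONVENTIONS` table and both helpers are hypothetical, and the separators and date formats are simplified assumptions for three of the listed locales.

```python
from datetime import datetime
from decimal import Decimal

# Simplified, assumed conventions per locale:
# (thousands separator, decimal separator, strptime date format).
LOCALE_CONVENTIONS = {
    "en-US": (",", ".", "%m/%d/%Y"),
    "de-DE": (".", ",", "%d.%m.%Y"),
    "fr-FR": (" ", ",", "%d/%m/%Y"),
}


def normalize_number(text: str, locale: str) -> Decimal:
    """Convert a locale-formatted number into a Decimal."""
    thousands, decimal_sep, _ = LOCALE_CONVENTIONS[locale]
    # Drop grouping separators first, then switch to "." as the decimal point.
    canonical = text.replace(thousands, "").replace(decimal_sep, ".")
    return Decimal(canonical)


def normalize_date(text: str, locale: str) -> str:
    """Convert a locale-formatted date into an ISO 8601 date string."""
    _, _, date_format = LOCALE_CONVENTIONS[locale]
    return datetime.strptime(text, date_format).date().isoformat()
```

With these assumptions, `normalize_number("1.234,56", "de-DE")` and `normalize_number("1,234.56", "en-US")` both yield `Decimal("1234.56")`, and `normalize_date("31.12.2024", "de-DE")` yields `"2024-12-31"`.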
205
266