articles/search/index-add-language-analyzers.md
+9 -5 (9 additions & 5 deletions)
@@ -12,7 +12,7 @@ ms.date: 09/08/2021
 ---
 # Add language analyzers to string fields in an Azure Cognitive Search index
 
-A *language analyzer* is a specific type of [text analyzer](search-analyzers.md) that performs lexical analysis using the linguistic rules of the target language. Every searchable field has an **analyzer** property. If your content consists of translated strings, such as separate fields for English and Chinese text, you could specify language analyzers on each field to access the rich linguistic capabilities of those analyzers.
+A *language analyzer* is a specific type of [text analyzer](search-analyzers.md) that performs lexical analysis using the linguistic rules of the target language. Every searchable string field has an **analyzer** property. If your content consists of translated strings, such as separate fields for English and Chinese text, you could specify language analyzers on each field to access the rich linguistic capabilities of those analyzers.
 
 ## When to use a language analyzer
 
@@ -49,13 +49,17 @@ The default analyzer is Standard Lucene, which works well for English, but perha
 
 ## How to specify a language analyzer
 
-A language analyzer is specified on field definitions in the index schema *when the field is created* and before it's loaded with data.
+Set the analyzer during index creation, before it's loaded with data.
 
-Set a language analyzer on "searchable" fields of type Edm.String during field definition, using the "analyzer" property only. Although field definitions have several analyzer-related properties, only the "analyzer" property can be used for language analyzers. The value of "analyzer" must be one of the language analyzers from the [supported analyzers list](#language-analyzer-list).
+1. In the field definition, make sure the field is attributed as "searchable" and is of type Edm.String.
 
-Language analyzers are used as-is and cannot be customized. If you can't find an analyzer that meets your requirements, you can create a [custom analyzer](cognitive-search-working-with-skillsets.md) with the microsoft_language_tokenizer or microsoft_language_stemming_tokenizer, and add filters for pre- and post-tokenization processing.
+1. Set the "analyzer" property to one of the language analyzers from the [supported analyzers list](#language-analyzer-list).
 
-The following example illustrates a language analyzer specification:
+The "analyzer" property is the only property that will accept a language analyzer, and it's used for both indexing and queries. Other analyzer-related properties ("searchAnalyzer" and "indexAnalyzer") will not accept a language analyzer.
+
+Language analyzers cannot be customized. If an analyzer isn't meeting your requirements, you can try creating a [custom analyzer](cognitive-search-working-with-skillsets.md) with the microsoft_language_tokenizer or microsoft_language_stemming_tokenizer, and add filters for pre- and post-tokenization processing.
+
+The following example illustrates a language analyzer specification in an index:
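
The hunk stops before the example it announces, and that example is not reproduced here. As a stand-in, the rules stated in the added lines can be sketched in Python, with a dict mirroring the JSON body of an index schema. The index name `hotels-translated`, the field names, and the helper `find_analyzer_problems` are hypothetical, and `LANGUAGE_ANALYZERS` is only a small subset of the supported analyzers list. The sketch also assumes "searchable" is set explicitly on every field, which is a simplification.

```python
# Illustrative subset of language analyzer names; the complete list is in the
# article's supported analyzers section.
LANGUAGE_ANALYZERS = {"en.microsoft", "en.lucene", "zh-Hans.microsoft", "zh-Hans.lucene"}

# Hypothetical index definition with separate English and Chinese fields.
index_definition = {
    "name": "hotels-translated",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "searchable": False},
        {"name": "description_en", "type": "Edm.String", "searchable": True,
         "analyzer": "en.microsoft"},
        {"name": "description_zh", "type": "Edm.String", "searchable": True,
         "analyzer": "zh-Hans.microsoft"},
    ],
}


def find_analyzer_problems(definition):
    """Return a list of messages for fields that break the rules in the text."""
    problems = []
    for field in definition["fields"]:
        name = field["name"]
        if field.get("analyzer") in LANGUAGE_ANALYZERS:
            # Step 1: the field must be a searchable Edm.String.
            if field.get("type") != "Edm.String":
                problems.append(f"{name}: language analyzers require type Edm.String")
            if field.get("searchable") is not True:
                problems.append(f"{name}: field must be searchable")
        # Only "analyzer" accepts a language analyzer.
        for prop in ("searchAnalyzer", "indexAnalyzer"):
            if field.get(prop) in LANGUAGE_ANALYZERS:
                problems.append(f"{name}: {prop} does not accept a language analyzer")
    return problems


if __name__ == "__main__":
    print(find_analyzer_problems(index_definition))
```

Running a check like this before the index is created fits the constraint that the analyzer is set during index creation, before data is loaded.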
articles/search/search-analyzers.md
+1 -1 (1 addition & 1 deletion)
@@ -33,7 +33,7 @@ For more background on lexical analysis, listen to the following video clip for
 
 ## Default analyzer
 
-In Azure Cognitive Search queries, an analyzer is automatically invoked on all string fields marked as searchable.
+In Azure Cognitive Search, an analyzer is automatically invoked on all string fields marked as searchable.
 
 By default, Azure Cognitive Search uses the [Apache Lucene Standard analyzer (standard lucene)](https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/analysis/standard/StandardAnalyzer.html), which breaks text into elements following the ["Unicode Text Segmentation"](https://unicode.org/reports/tr29/) rules. Additionally, the standard analyzer converts all characters to their lower case form. Both indexed documents and search terms go through the analysis during indexing and query processing.
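
To make the effect of running the same analysis at indexing and at query time concrete, here is a rough Python approximation. It is not the Lucene Standard analyzer and does not implement the Unicode Text Segmentation rules; it only lowercases the text and splits it into word-like tokens. The function name `approximate_standard_analysis` and the sample strings are hypothetical.

```python
import re


def approximate_standard_analysis(text):
    """Lowercase the text, then split it into runs of word characters.

    A crude stand-in for the default analyzer; real UAX #29 segmentation
    handles many cases (apostrophes, numbers, scripts) differently.
    """
    return re.findall(r"\w+", text.lower())


# The same function is applied to document text and to the search terms,
# so a query in a different case still lines up with the indexed terms.
document_terms = approximate_standard_analysis("Luxury Hotel, Ocean-View Rooms")
query_terms = approximate_standard_analysis("OCEAN view")

if __name__ == "__main__":
    print(document_terms)
    print(query_terms)
```

Because both sides pass through the same lowercasing and splitting, every term in `query_terms` appears in `document_terms`, which is the property the paragraph above describes.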