Commit 6872e0a

reformatted a numbered procedure
1 parent e85c9e6 commit 6872e0a

2 files changed: +10 -6 lines changed


articles/search/index-add-language-analyzers.md

Lines changed: 9 additions & 5 deletions
@@ -12,7 +12,7 @@ ms.date: 09/08/2021
 ---
 # Add language analyzers to string fields in an Azure Cognitive Search index
 
-A *language analyzer* is a specific type of [text analyzer](search-analyzers.md) that performs lexical analysis using the linguistic rules of the target language. Every searchable field has an **analyzer** property. If your content consists of translated strings, such as separate fields for English and Chinese text, you could specify language analyzers on each field to access the rich linguistic capabilities of those analyzers.
+A *language analyzer* is a specific type of [text analyzer](search-analyzers.md) that performs lexical analysis using the linguistic rules of the target language. Every searchable string field has an **analyzer** property. If your content consists of translated strings, such as separate fields for English and Chinese text, you could specify language analyzers on each field to access the rich linguistic capabilities of those analyzers.
 
 ## When to use a language analyzer
 
@@ -49,13 +49,17 @@ The default analyzer is Standard Lucene, which works well for English, but perha
 
 ## How to specify a language analyzer
 
-A language analyzer is specified on field definitions in the index schema *when the field is created* and before it's loaded with data.
+Set the analyzer during index creation, before it's loaded with data.
 
-Set a language analyzer on "searchable" fields of type Edm.String during field definition, using the "analyzer" property only. Although field definitions have several analyzer-related properties, only the "analyzer" property can be used for language analyzers. The value of "analyzer" must be one of the language analyzers from the [supported analyzers list](#language-analyzer-list).
+1. In the field definition, make sure the field is attributed as "searchable" and is of type Edm.String.
 
-Language analyzers are used as-is and cannot be customized. If you can't find an analyzer that meets your requirements, you can create a [custom analyzer](cognitive-search-working-with-skillsets.md) with the microsoft_language_tokenizer or microsoft_language_stemming_tokenizer, and add filters for pre- and post-tokenization processing.
+1. Set the "analyzer" property to one of the language analyzers from the [supported analyzers list](#language-analyzer-list).
 
-The following example illustrates a language analyzer specification:
+The "analyzer" property is the only property that will accept a language analyzer, and it's used for both indexing and queries. Other analyzer-related properties ("searchAnalyzer" and "indexAnalyzer") will not accept a language analyzer.
+
+Language analyzers cannot be customized. If an analyzer isn't meeting your requirements, you can try creating a [custom analyzer](cognitive-search-working-with-skillsets.md) with the microsoft_language_tokenizer or microsoft_language_stemming_tokenizer, and add filters for pre- and post-tokenization processing.
+
+The following example illustrates a language analyzer specification in an index:
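
A minimal sketch of what the two numbered steps describe, not the example from the commit itself: an index definition with two searchable Edm.String fields, each assigned a language analyzer through the "analyzer" property. The index name and field names are hypothetical; `en.microsoft` and `fr.microsoft` are assumed here as representative entries from the supported analyzers list.

```json
{
  "name": "hotels-example",
  "fields": [
    { "name": "HotelId", "type": "Edm.String", "key": true },
    { "name": "Description", "type": "Edm.String", "searchable": true, "analyzer": "en.microsoft" },
    { "name": "Description_fr", "type": "Edm.String", "searchable": true, "analyzer": "fr.microsoft" }
  ]
}
```

Neither field sets "searchAnalyzer" or "indexAnalyzer", since the "analyzer" property covers both indexing and queries and is the only one that accepts a language analyzer.
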
 
 ```json
 {

articles/search/search-analyzers.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ For more background on lexical analysis, listen to the following video clip for
 
 ## Default analyzer
 
-In Azure Cognitive Search queries, an analyzer is automatically invoked on all string fields marked as searchable.
+In Azure Cognitive Search, an analyzer is automatically invoked on all string fields marked as searchable.
 
 By default, Azure Cognitive Search uses the [Apache Lucene Standard analyzer (standard lucene)](https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/analysis/standard/StandardAnalyzer.html), which breaks text into elements following the ["Unicode Text Segmentation"](https://unicode.org/reports/tr29/) rules. Additionally, the standard analyzer converts all characters to their lower case form. Both indexed documents and search terms go through the analysis during indexing and query processing.
 