Commit 3abf9c9

Merge pull request #171737 from HeidiSteen/heidist-gh
[azure search] GH issue (language analyzer)
2 parents 5545b6d + a729f2d commit 3abf9c9

2 files changed: +9 -5 lines changed

articles/search/index-add-language-analyzers.md

Lines changed: 8 additions & 4 deletions
@@ -8,7 +8,7 @@ manager: nitinme
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 03/17/2021
+ms.date: 09/08/2021
 ---
 # Add language analyzers to string fields in an Azure Cognitive Search index

@@ -18,7 +18,7 @@ A *language analyzer* is a specific type of [text analyzer](search-analyzers.md)

 You should consider a language analyzer when awareness of word or sentence structure adds value to text parsing. A common example is the association of irregular verb forms ("bring" and "brought) or plural nouns ("mice" and "mouse"). Without linguistic awareness, these strings are parsed on physical characteristics alone, which fails to catch the connection. Since large chunks of text are more likely to have this content, fields consisting of descriptions, reviews, or summaries are good candidates for a language analyzer.

-You should also consider language analyzers when content consists of non-Western language strings. While the [default analyzer](search-analyzers.md#default-analyzer) is language-agnostic, the concept of using spaces and special characters (hyphens and slashes) to separate strings tends is more applicable to Western languages than non-Western ones.
+You should also consider language analyzers when content consists of non-Western language strings. While the [default analyzer (Standard Lucene)](search-analyzers.md#default-analyzer) is language-agnostic, the concept of using spaces and special characters (hyphens and slashes) to separate strings is more applicable to Western languages than non-Western ones.

 For example, in Chinese, Japanese, Korean (CJK), and other Asian languages, a space is not necessarily a word delimiter. Consider the following Japanese string. Because it has no spaces, a language-agnostic analyzer would likely analyze the entire string as one token, when in fact the string is actually a phrase.

@@ -49,9 +49,13 @@ The default analyzer is Standard Lucene, which works well for English, but perha

 ## How to specify a language analyzer

-Set a language analyzer on "searchable" fields of type Edm.String during field definition.
+A language analyzer is specified on field definitions in the index schema *when the field is created* and before it's loaded with data.

-Although field definitions have several analyzer-related properties, only the "analyzer" property can be used for language analyzers. The value of "analyzer" must be one of the language analyzers from the support analyzers list.
+Set a language analyzer on "searchable" fields of type Edm.String during field definition, using the "analyzer" property only. Although field definitions have several analyzer-related properties, only the "analyzer" property can be used for language analyzers. The value of "analyzer" must be one of the language analyzers from the [supported analyzers list](#language-analyzer-list).
+
+Language analyzers are used as-is and cannot be customized. If you can't find an analyzer that meets your requirements, you can create a [custom analyzer](cognitive-search-working-with-skillsets.md) with the microsoft_language_tokenizer or microsoft_language_stemming_tokenizer, and add filters for pre- and post-tokenization processing.
+
+The following example illustrates a language analyzer specification:

 ```json
 {

articles/search/knowledge-store-connect-power-bi.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ ms.date: 09/07/2021

 In this article, learn how to connect to and explore a knowledge store using Power Query in the Power BI Desktop app. You can get started faster with templates, or build a custom dashboard from scratch.

-A knowledge store composed of tables in Azure Storage works best in Power BI. If the tables contain projections from the same skillset and projection group, you can easily build table visualizations that combine fields from related tables.
+A knowledge store that's composed of tables in Azure Storage work best in Power BI. If the tables contain projections from the same skillset and projection group, you can easily "join" them to build table visualizations that include fields from related tables.

 Follow along with the steps in this article using the sample data and knowledge store [created in the Azure portal](knowledge-store-create-portal.md) or through [Postman and REST APIs](knowledge-store-create-rest.md).
