articles/search/semantic-answers.md (5 additions & 5 deletions)
@@ -59,7 +59,7 @@ The "searchFields" parameter is critical to returning a high quality answer, bot
+ A query string must not be null and should be formulated as a question. In this preview, the "queryType" and "queryLanguage" must be set exactly as shown in the example.
- The "searchFields" parameter determines which fields provide tokens to the extraction model. Be sure to set this parameter. You must have at least one string field, but include any string field that you think is useful in providing an answer. Only about 8,000 tokens per document are passed into the model. Start the field list with concise fields, and then progress to text-rich fields. For precise guidance on how to set this field, see [Set searchFields](semantic-how-to-query-request.md#searchfields).
+ The "searchFields" parameter determines which fields provide tokens to the extraction model. Be sure to set this parameter. You must have at least one string field, but include any string field that you think is useful in providing an answer. Collectively across all fields in searchFields, only about 8,000 tokens per document are passed into the model. Start the field list with concise fields, and then progress to text-rich fields. For precise guidance on how to set this field, see [Set searchFields](semantic-how-to-query-request.md#searchfields).
+ For "answers", the basic parameter construction is `"answers": "extractive"`, where the default number of answers returned is one. You can increase the number of answers by adding a count, up to a maximum of five. Whether you need more than one answer depends on the user experience of your app, and how you want to render results.
@@ -111,15 +111,15 @@ Given the query "how do clouds form", the following answer is returned in the re
For best results, return semantic answers on a document corpus having the following characteristics:
- "searchFields" should include one or more fields that provide sufficient text in which an answer is likely to be found.
-
- Semantic extraction and summarization have limits over how much content can be analyzed in a timely fashion. Collectively, only the first 20,000 tokens are analyzed. Anything beyond that is ignored. In practical terms, if you have large documents that run into hundreds of pages, you should try to break the content up into manageable parts first.
+
+ "searchFields" must provide fields that offer sufficient text in which an answer is likely to be found. Only verbatim text from a document can appear as an answer.
+
+ Query strings must not be null (search=`*`) and the string should have the characteristics of a question, as opposed to a keyword search (a sequential list of arbitrary terms or phrases). If the query string does not appear to be a question, answer processing is skipped, even if the request specifies "answers" as a query parameter.
+
+ Semantic extraction and summarization have limits on how many tokens per document can be analyzed in a timely fashion. In practical terms, if you have large documents that run into hundreds of pages, you should try to break the content up into smaller documents first.
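Because only verbatim text can appear as an answer, the response surfaces answers as literal passages under "@search.answers". The following is a hedged sketch of that fragment — the document key, passage text, and score are invented, not taken from the article:

```json
"@search.answers": [
    {
        "key": "4123",
        "text": "Clouds form when rising air cools and water vapor condenses around tiny airborne particles.",
        "highlights": "Clouds form when <em>rising air cools</em> and water vapor condenses around tiny airborne particles.",
        "score": 0.94
    }
]
```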
articles/search/semantic-how-to-query-request.md (4 additions & 4 deletions)
@@ -136,7 +136,7 @@ Follow these guidelines to ensure optimum results when two or more searchFields
+ Follow those fields with descriptive fields where the answer to semantic queries may be found, such as the main content of a document.
- If only one field is specified, use a descriptive field where the answer to semantic queries may be found, such as the main content of a document. Choose a field that provides sufficient content. To ensure timely processing, only about 8,000 tokens of the collective contents of searchFields undergo semantic evaluation and ranking.
+ If only one field is specified, use a descriptive field where the answer to semantic queries may be found, such as the main content of a document. Choose a field that provides sufficient content. To ensure timely processing, only about 8,000 tokens of the aggregate contents of searchFields undergo semantic evaluation and ranking.
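To make the ordering concrete, a hedged sketch with invented field names — concise fields first, the text-rich field last, so the concise fields fall inside the roughly 8,000-token window:

```json
"searchFields": "title,category,content"
```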
#### Step 3: Remove orderBy clauses
@@ -186,7 +186,7 @@ The response for the above example query returns the following match as the top
Recall that semantic ranking and responses are built over an initial result set. Any logic that improves the quality of the initial results will carry forward to semantic search. As a next step, review the features that contribute to initial results, including analyzers that affect how strings are tokenized, scoring profiles that can tune results, and the default relevance algorithm.
+ [Analyzers for text processing](search-analyzers.md)
-
+ [Similarity and scoring in Cognitive Search](index-similarity-and-scoring.md)
articles/search/semantic-ranking.md (20 additions & 10 deletions)
@@ -24,25 +24,35 @@ The semantic ranking is both resource and time intensive. In order to complete p
For semantic ranking, the model uses both machine reading comprehension and transfer learning to re-score the documents based on how well each one matches the intent of the query.
- 1. For each document, the semantic ranker evaluates the fields in the searchFields parameter in order, consolidating the contents into one large string.
+ ### Preparation (passage extraction) phase
- 1. The string is then trimmed to ensure the overall length is not more than 8,000 tokens. If you have very large documents, with a content field or merged_content field that has many pages of content, anything after the token limit is ignored.
+ For each document in the initial results, there is a passage extraction exercise that identifies key passages. This is a downsizing exercise that reduces content to an amount that can be processed swiftly.
- 1. Each of the 50 documents is now represented by a single long string. This string is sent to the summarization model. The summarization model produces captions (and answers), using machine reading comprehension to identify passages that appear to summarize the content or answer the question. The output of the summarization model is a further reduced string, which will be at most 128 tokens.
+ 1. For each of the 50 documents, each field in the searchFields parameter is evaluated in consecutive order. Contents from each field are consolidated into one long string.
- 1. The smaller string becomes the caption of the document, and it represents the most relevant passages found in the larger string. The set of 50 (or fewer) captions is then ranked in order of relevance.
+ 1. The long string is then trimmed to ensure the overall length is not more than 8,000 tokens. For this reason, it's recommended that you position concise fields first so that they are included in the string. If you have very large documents with text-heavy fields, anything after the token limit is ignored.
- Conceptual and semantic relevance is established through vector representation and term clusters. Whereas a keyword similarity algorithm might give equal weight to any term in the query, the semantic model has been trained to recognize the interdependency and relationships among words that are otherwise unrelated on the surface. As a result, if a query string includes terms from the same cluster, a document containing both will rank higher than one that doesn't.
+ 1. Each document is now represented by a single long string that is up to 8,000 tokens. These strings are sent to the summarization model, which will reduce the string further. The summarization model evaluates the long string for key sentences or passages that best summarize the document or answer the question.
- :::image type="content" source="media/semantic-search-overview/semantic-vector-representation.png" alt-text="Vector representation for context" border="true":::
+ 1. The output of this phase is a caption (and optionally, an answer). The caption is at most 128 tokens per document, and it is considered the most representative of the document.
- ## Next steps
+ ### Scoring and ranking phases
+
+ In this phase, all 50 captions are evaluated to assess relevance.
+
+ 1. Scoring is determined by evaluating each caption for conceptual and semantic relevance, relative to the query provided.
+
+ The following diagram provides an illustration of what "semantic relevance" means. Consider the term "capital", which could be used in the context of finance, law, geography, or grammar. If a query includes terms from the same vector space (for example, "capital" and "investment"), a document that also includes tokens in the same cluster will score higher than one that doesn't.
- Semantic ranking is offered on Standard tiers, in specific regions. For more information and to sign up, see [Availability and pricing](semantic-search-overview.md#availability-and-pricing).
+ :::image type="content" source="media/semantic-search-overview/semantic-vector-representation.png" alt-text="Vector representation for context" border="true":::
+
+ 1. The output of this phase is an @search.rerankerScore assigned to each document. Once all documents are scored, they are listed in descending order and included in the query response payload.
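To make the output of the two phases concrete, here is a hedged sketch of one scored document in the response payload — the scores, caption text, and `title` field are invented; the property names (`@search.rerankerScore`, `@search.captions`) follow the preview response shape these docs describe:

```json
{
    "@search.score": 11.37,
    "@search.rerankerScore": 2.87,
    "@search.captions": [
        {
            "text": "Clouds form when rising air cools and water vapor condenses around airborne particles.",
            "highlights": "Clouds form when <em>rising air cools</em> and water vapor condenses around airborne particles."
        }
    ],
    "title": "How clouds form"
}
```

Documents in the response are listed by descending `@search.rerankerScore`.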
+
+ ## Next steps
- A new query type enables the relevance ranking and response structures of semantic search. [Create a semantic query](semantic-how-to-query-request.md) to get started.
+ Semantic ranking is offered on Standard tiers, in specific regions. For more information and to sign up, see [Availability and pricing](semantic-search-overview.md#availability-and-pricing). A new query type enables the relevance ranking and response structures of semantic search. To get started, [create a semantic query](semantic-how-to-query-request.md).
Alternatively, review either of the following articles for related information.
-
+ [Add spell check to query terms](speller-how-to-add.md)
articles/search/semantic-search-overview.md (5 additions & 3 deletions)
@@ -8,13 +8,13 @@ author: HeidiSteen
ms.author: heidist
ms.service: cognitive-search
ms.topic: conceptual
- ms.date: 03/12/2021
+ ms.date: 03/18/2021
ms.custom: references_regions
---
# Semantic search in Azure Cognitive Search
> [!IMPORTANT]
- > Semantic search features are in public preview, available through the preview REST API only. Preview features are offered as-is, under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/), and are not guaranteed to have the same implementation at general availability. For more information, see [Availability and pricing](semantic-search-overview.md#availability-and-pricing).
+ > Semantic search is in public preview, available through the preview REST API only. Preview features are offered as-is, under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/), and are not guaranteed to have the same implementation at general availability. These features are billable. For more information, see [Availability and pricing](semantic-search-overview.md#availability-and-pricing).
Semantic search is a collection of query-related features that support a higher-quality, more natural query experience.
@@ -67,4 +67,6 @@ A new query type enables the relevance ranking and response structures of semant
+ [Add spell check to query terms](speller-how-to-add.md)
+ [Find meaningful insights using semantic capabilities (AI Show video)](https://channel9.msdn.com/Shows/AI-Show/Find-meaningful-insights-using-semantic-capabilities-in-Azure-Cognitive-Search)