Commit 9dc877e

Merge pull request #267045 from HeidiSteen/heidist-fresh
[azure search] Addressed verbatim feedback
2 parents 707e3ea + 748ab87 commit 9dc877e

9 files changed

+82
-42
lines changed

articles/search/cognitive-search-skill-azure-openai-embedding.md

Lines changed: 10 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -8,7 +8,7 @@ ms.service: cognitive-search
88
ms.custom:
99
- ignite-2023
1010
ms.topic: reference
11-
ms.date: 12/21/2023
11+
ms.date: 02/21/2024
1212
---
1313

1414
# Azure OpenAI Embedding skill
@@ -18,6 +18,8 @@ ms.date: 12/21/2023
1818
1919
The **Azure OpenAI Embedding** skill connects to a deployed embedding model on your [Azure OpenAI](/azure/ai-services/openai/overview) resource to generate embeddings.
2020

21+
The [Import and vectorize data](search-get-started-portal-import-vectors.md) wizard uses the **Azure OpenAI Embedding** skill to vectorize content. You can run the wizard and review the generated skillset to see how the wizard builds it.
22+
2123
> [!NOTE]
2224
> This skill is bound to Azure OpenAI and is charged at the existing [Azure OpenAI pay-as-you-go price](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/#pricing).
2325
>
@@ -28,24 +30,24 @@ Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill
2830

2931
## Data limits
3032

31-
The maximum size of a text input should be 8,000 tokens. If input exceeds the maximum allowed, the model throws an invalid request error. For more information, see the [tokens](/azure/ai-services/openai/overview#tokens) key concept in the Azure OpenAI documentation.
33+
The maximum size of a text input should be 8,000 tokens. If input exceeds the maximum allowed, the model throws an invalid request error. For more information, see the [tokens](/azure/ai-services/openai/overview#tokens) key concept in the Azure OpenAI documentation. Consider using the [Text Split skill](cognitive-search-skill-textsplit.md) if you need data chunking.
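
To make the chunking idea concrete, here's a minimal Python sketch of splitting text into overlapping pages before vectorization. It isn't how the Text Split skill is implemented. The function name and the `max_len` and `overlap` parameters are hypothetical, and the sketch measures length in characters, which is only a rough proxy for the token limit described above.

```python
def split_into_pages(text: str, max_len: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into pages of at most max_len characters.

    Consecutive pages share `overlap` characters so that content cut at a
    page boundary still appears intact in one of the two pages.
    Both parameters are illustrative assumptions, not service settings.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")

    pages: list[str] = []
    step = max_len - overlap  # how far the window advances each time
    start = 0
    while start < len(text):
        pages.append(text[start:start + max_len])
        if start + max_len >= len(text):
            break  # the current page already reaches the end of the text
        start += step
    return pages
```

Each returned page would then be sent to the embedding model separately, which is the role `/document/pages/*` plays when a skillset handles the chunking.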
3234

3335
## Skill parameters
3436

3537
Parameters are case-sensitive.
3638

3739
| Inputs | Description |
3840
|---------------------|-------------|
39-
| `resourceUri` | The URI where a valid Azure OpenAI model is deployed. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models. |
40-
| `apiKey` | The secret key pertaining to a valid Azure OpenAI `resourceUri.` If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
41-
| `deploymentId` | The name of the deployed Azure OpenAI embedding model.|
41+
| `resourceUri` | The URI of a model provider, such as an Azure OpenAI resource or an OpenAI URL. |
42+
| `apiKey` | The secret key used to access the model. If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
43+
| `deploymentId` | The name of the deployed Azure OpenAI embedding model. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models.|
4244
| `authIdentity` | A user-managed identity used by the search service for connecting to Azure OpenAI. You can use either a [system-managed or user-managed identity](search-howto-managed-identities-data-sources.md). To use a system-managed identity, leave `apiKey` and `authIdentity` blank. The system-managed identity is used automatically. A managed identity must have [Cognitive Services OpenAI User](/azure/ai-services/openai/how-to/role-based-access-control#azure-openai-roles) permissions to send text to Azure OpenAI. |
4345

4446
## Skill inputs
4547

4648
| Input | Description |
4749
|--------------------|-------------|
48-
| `text` | The input text to be vectorized.|
50+
| `text` | The input text to be vectorized. If you're using data chunking, the source might be `/document/pages/*`. |
4951

5052
## Skill outputs
5153

@@ -100,13 +102,13 @@ For the given input text, a vectorized embedding output is produced.
100102
}
101103
```
102104

103-
The output resides in memory. To send this output to a field in the search index, you must define an [outputFieldMapping](cognitive-search-output-field-mapping.md) that maps the vectorized embedding output (which is an array) to the single index field which is of Collection(Edm.Single) type. Following the example above and assuming the index field in which you want to store the results of the vectorized embedding output is called **embeddingindexfield**, the outputFieldMapping to include in the definition of the indexer would look like the following:
105+
The output resides in memory. To send this output to a field in the search index, you must define an [outputFieldMapping](cognitive-search-output-field-mapping.md) that maps the vectorized embedding output (which is an array) to a [vector field](vector-search-how-to-create-index.md). Assuming the skill output resides in the document's **embedding** node, and **content_vector** is the field in the search index, the outputFieldMapping in the indexer should look like:
104106

105107
```json
106108
"outputFieldMappings": [
107109
{
108110
"sourceFieldName": "/document/embedding/*",
109-
"targetFieldName": "embeddingindexfield"
111+
"targetFieldName": "content_vector"
110112
}
111113
]
112114
```

articles/search/cognitive-search-skill-shaper.md

Lines changed: 16 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -8,16 +8,24 @@ ms.service: cognitive-search
88
ms.custom:
99
- ignite-2023
1010
ms.topic: reference
11-
ms.date: 08/12/2021
11+
ms.date: 02/22/2024
1212
---
1313

1414
# Shaper cognitive skill
1515

16-
The **Shaper** skill consolidates several inputs into a [complex type](search-howto-complex-data-types.md) that can be referenced later in the enrichment pipeline. The **Shaper** skill allows you to essentially create a structure, define the name of the members of that structure, and assign values to each member. Examples of consolidated fields useful in search scenarios include combining a first and last name into a single structure, city and state into a single structure, or name and birthdate into a single structure to establish unique identity.
16+
The **Shaper** skill is used to reshape or modify the structure of the [in-memory enrichment tree](cognitive-search-working-with-skillsets.md#enrichment-tree) created by a skillset. If skill outputs can't be mapped directly to search fields, you can add a **Shaper** skill to create the data shape you need for your search index or knowledge store.
1717

18-
Additionally, the **Shaper** skill illustrated in [scenario 3](#nested-complex-types) adds an optional *sourceContext* property to the input. The *source* and *sourceContext* properties are mutually exclusive. If the input is at the context of the skill, simply use *source*. If the input is at a *different* context than the skill context, use the *sourceContext*. The *sourceContext* requires you to define a nested input with the specific element being addressed as the source.
18+
Primary use-cases for this skill include:
1919

20-
The output name is always "output". Internally, the pipeline can map a different name, such as "analyzedText" as shown in the examples below, but the **Shaper** skill itself returns "output" in the response. This might be important if you are debugging enriched documents and notice the naming discrepancy, or if you build a custom skill and are structuring the response yourself.
20+
+ You're populating a knowledge store. The physical structure of the tables and objects of a knowledge store is defined through projections. A **Shaper** skill adds granularity by creating data shapes that can be pushed to the projections.
21+
22+
+ You want to map multiple skill outputs into a single structure in your search index, usually a [complex type](search-howto-complex-data-types.md), as described in [scenario 1](#scenario-1-complex-types).
23+
24+
+ Skills produce multiple outputs, but you want to combine them into a single field (it doesn't have to be a complex type), as described in [scenario 2](#scenario-2-input-consolidation). For example, combining titles and authors into a single field.
25+
26+
+ Skills produce multiple outputs with child elements, and you want to combine them. This use-case is illustrated in [scenario 3](#nested-complex-types).
27+
28+
The output name of a **Shaper** skill is always "output". Internally, the pipeline can map a different name, such as "analyzedText" as shown in the examples below, but the **Shaper** skill itself returns "output" in the response. This might be important if you are debugging enriched documents and notice the naming discrepancy, or if you build a custom skill and are structuring the response yourself.
2129

2230
> [!NOTE]
2331
> This skill isn't bound to Azure AI services. It is non-billable and has no Azure AI services key requirement.
@@ -101,7 +109,6 @@ An incoming JSON document providing usable input for this **Shaper** skill could
101109
}
102110
```
103111

104-
105112
### Skill output
106113

107114
The **Shaper** skill generates a new element called *analyzedText* with the combined elements of *text* and *sentiment*. This output conforms to the index schema. It will be imported and indexed in an Azure AI Search index.
@@ -126,7 +133,7 @@ The **Shaper** skill generates a new element called *analyzedText* with the comb
126133

127134
## Scenario 2: input consolidation
128135

129-
In another example, imagine that at different stages of pipeline processing, you have extracted the title of a book, and chapter titles on different pages of the book. You could now create a single structure composed of these various inputs.
136+
In another example, imagine that at different stages of pipeline processing, you have extracted the title of a book, and chapter titles on different pages of the book. You could now create a single structure composed of these various outputs.
130137

131138
The **Shaper** skill definition for this scenario might look like the following example:
132139

@@ -180,7 +187,9 @@ In this case, the **Shaper** flattens all chapter titles to create a single arra
180187

181188
## Scenario 3: input consolidation from nested contexts
182189

183-
Imagine you have the title, chapters, and contents of a book and have run entity recognition and key phrases on the contents and now need to aggregate results from the different skills into a single shape with the chapter name, entities, and key phrases.
190+
Imagine you have the chapter titles and chapter numbers of a book, and you have run entity recognition and key phrase extraction on the contents. You now need to aggregate results from the different skills into a single shape with the chapter name, entities, and key phrases.
191+
192+
This example adds an optional `sourceContext` property to the "chapterTitles" input. The `source` and `sourceContext` properties are mutually exclusive. If the input is at the context of the skill, you can use `source`. If the input is at a *different* context than the skill context, use `sourceContext`. The `sourceContext` requires you to define a nested input, where each input has a `source` that identifies the specific element used to populate the named node.
184193

185194
The **Shaper** skill definition for this scenario might look like the following example:
186195

articles/search/hybrid-search-how-to-query.md

Lines changed: 3 additions & 3 deletions
@@ -9,12 +9,12 @@ ms.service: cognitive-search
99
ms.custom:
1010
- ignite-2023
1111
ms.topic: how-to
12-
ms.date: 11/15/2023
12+
ms.date: 02/22/2024
1313
---
1414

1515
# Create a hybrid query in Azure AI Search
1616

17-
Hybrid search consists of keyword queries and vector queries in a single search request.
17+
Hybrid search combines one or more keyword queries with vector queries in a single search request.
1818

1919
The response includes the top results ordered by search score. Both vector queries and free text queries are assigned an initial search score from their respective scoring or similarity algorithms. Those scores are merged using [Reciprocal Rank Fusion (RRF)](hybrid-search-ranking.md) to return a single ranked result set.
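
As a rough illustration of how rank fusion merges the two result lists, here's a minimal Python sketch of the general RRF formula, where each document scores the sum of `1 / (k + rank)` over every list it appears in. It's a sketch of the published technique, not the service's implementation. The function name is hypothetical, and `k = 60` is an assumed constant commonly used in RRF descriptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge several ranked lists of document IDs into one ranked list.

    Each input list is ordered best-first. A document's fused score is the
    sum of 1 / (k + rank) across the lists that contain it (rank starts at 1).
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results: one list from the keyword query, one from the vector query.
keyword_results = ["a", "b", "c"]
vector_results = ["b", "c", "d"]
fused = reciprocal_rank_fusion([keyword_results, vector_results])
```

Documents that appear in both lists accumulate two contributions, so they tend to rise above documents that rank well in only one list.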
2020

@@ -24,7 +24,7 @@ The response includes the top results ordered by search score. Both vector queri
2424

2525
+ A search index containing vector and non-vector fields. See [Create an index](search-how-to-create-search-index.md) and [Add vector fields to a search index](vector-search-how-to-create-index.md).
2626

27-
+ Use [**Search Post REST API version 2023-11-01**](/rest/api/searchservice/documents/search-post), Search Explorer in the Azure portal, or packages in the Azure SDKs that have been updated to use this feature.
27+
+ Use [**Search Post REST API version 2023-11-01**](/rest/api/searchservice/documents/search-post) or **REST API 2023-10-01-preview**, Search Explorer in the Azure portal, or packages in the Azure SDKs that have been updated to use this feature.
2828

2929
+ (Optional) If you want to also use [semantic ranking](semantic-search-overview.md) and vector search together, your search service must be Basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).
3030

articles/search/query-lucene-syntax.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ ms.service: cognitive-search
1010
ms.custom:
1111
- ignite-2023
1212
ms.topic: conceptual
13-
ms.date: 01/17/2024
13+
ms.date: 02/22/2024
1414
---
1515

1616
# Lucene query syntax in Azure AI Search
@@ -140,10 +140,12 @@ The following example helps illustrate the differences. Suppose that there's a s
140140

141141
For example, to find documents containing `motel` or `hotel`, specify `/[mh]otel/`. Regular expression searches are matched against single words.
142142

143-
Some tools and languages impose other escape character requirements. For JSON, strings that include a forward slash are escaped with a backward slash: `microsoft.com/azure/` becomes `search=/.*microsoft.com\/azure\/.*/` where `search=/.* <string-placeholder>.*/` sets up the regular expression, and `microsoft.com\/azure\/` is the string with an escaped forward slash.
143+
Some tools and languages impose extra escape character requirements beyond the [escape rules](#escaping-special-characters) imposed by Azure AI Search. For JSON, strings that include a forward slash are escaped with a backward slash: `microsoft.com/azure/` becomes `search=/.*microsoft.com\/azure\/.*/` where `search=/.* <string-placeholder>.*/` sets up the regular expression, and `microsoft.com\/azure\/` is the string with an escaped forward slash.
144144

145145
Two common symbols in regex queries are `.` and `*`. A `.` matches any one character and a `*` matches the previous character zero or more times. For example, `/be./` matches the terms `bee` and `bet` while `/be*/` would match `be`, `bee`, and `beee` but not `bet`. Together, `.*` allows you to match any series of characters so `/be.*/` would match any term that starts with `be` such as `better`.
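
The `.` and `*` behavior can be tried locally with a short Python sketch. Python's `re` module is a different regex engine from the one used by the search service, so this only approximates the behavior for simple patterns like these. The `matching_terms` helper is hypothetical, and `fullmatch` is used here to imitate matching against a whole single term.

```python
import re


def matching_terms(pattern: str, terms: list[str]) -> list[str]:
    """Return the terms that the pattern matches in full.

    fullmatch requires the whole term to match, which imitates a regex
    query being evaluated against single words rather than substrings.
    """
    compiled = re.compile(pattern)
    return [term for term in terms if compiled.fullmatch(term)]


terms = ["be", "bee", "beee", "bet", "better"]

single_char = matching_terms("be.", terms)      # `.` stands for exactly one character
repeated_e = matching_terms("be*", terms)       # `*` repeats the preceding `e`
any_suffix = matching_terms("be.*", terms)      # `.*` stands for any run of characters
hotels = matching_terms("[mh]otel", ["motel", "hotel", "hostel"])
```

In a query, the same patterns would be wrapped in forward slashes, as in `/be.*/`.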
146146

147+
If you get syntax errors in your regular expression, review the [escape rules](#escaping-special-characters) for special characters. You might also try a different client to confirm whether the problem is tool-specific.
148+
147149
## <a name="bkmk_wildcard"></a> Wildcard search
148150

149151
You can use generally recognized syntax for multiple (`*`) or single (`?`) character wildcard searches. Full Lucene syntax supports prefix, infix, and suffix matching.

articles/search/search-explorer.md

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ author: HeidiSteen
77
ms.author: heidist
88
ms.service: cognitive-search
99
ms.topic: quickstart
10-
ms.date: 01/18/2024
10+
ms.date: 02/22/2024
1111
ms.custom:
1212
- mode-ui
1313
- ignite-2023
@@ -43,11 +43,19 @@ Before you begin, have the following prerequisites in place:
4343

4444
:::image type="content" source="media/search-explorer/search-explorer-tab.png" alt-text="Screenshot of the Search explorer tab." border="true":::
4545

46-
1. To specify query parameters and an API version, switch to **JSON view**. The examples in this article assume JSON view throughout. You can paste JSON examples from this article into the text area.
46+
## Query two ways
47+
48+
There are two approaches for querying in Search explorer.
49+
50+
+ The default search bar accepts an empty query or a free text query with boolean operators. For example, `seattle condo +parking`.
51+
52+
+ JSON view supports parameterized queries. Filters, orderby, select, count, searchFields, and all other parameters must be set in JSON view.
53+
54+
Switch to **JSON view** for parameterized queries. The examples in this article assume JSON view throughout. You can paste JSON examples from this article into the text area.
4755

4856
:::image type="content" source="media/search-explorer/search-explorer-json-view.png" alt-text="Screenshot of the JSON view selector." border="true":::
4957

50-
## Unspecified query
58+
## Run an unspecified query
5159

5260
In Search explorer, POST requests are formulated internally using the [Search POST REST API](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2023-10-01-preview&preserve-view=true), with responses returned as verbose JSON documents.
5361

@@ -69,6 +77,8 @@ Equivalent syntax for an empty search is `*` or `"search": "*"`.
6977

7078
Free-form queries, with or without operators, are useful for simulating user-defined queries sent from a custom app to Azure AI Search. Only those fields attributed as "searchable" in the index definition are scanned for matches.
7179

80+
You don't need JSON view for a free text query, but we provide it in JSON for consistency with other examples in this article.
81+
7282
Notice that when you provide search criteria, such as query terms or expressions, search rank comes into play. The following example illustrates a free text search. The "@search.score" is a relevance score computed for the match using the [default scoring algorithm](index-ranking-similarity.md#default-scoring-algorithm).
7383

7484
```json

articles/search/search-get-started-semantic.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -29,9 +29,7 @@ This quickstart walks you through the index and query modifications that invoke
2929

3030
+ Azure AI Search, at Basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).
3131

32-
+ An API key and search service endpoint:
33-
34-
Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
32+
+ An API key and search service endpoint. Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
3533

3634
In **Overview**, copy the URL and save it to Notepad for a later step. An example endpoint might look like `https://mydemo.search.windows.net`.
3735

articles/search/search-get-started-text.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -33,9 +33,7 @@ This quickstart has [steps](#create-load-and-query-an-index) for the following S
3333

3434
+ An Azure AI Search service. [Create a service](search-create-service-portal.md) if you don't have one. You can use a free tier for this quickstart.
3535

36-
+ An API key and service endpoint:
37-
38-
Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
36+
+ An API key and service endpoint. Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
3937

4038
In **Overview**, copy the URL and save it to Notepad for a later step. An example endpoint might look like `https://mydemo.search.windows.net`.
4139

articles/search/semantic-how-to-query-request.md

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,22 @@ In this step, add parameters to the query request. To be successful, your query
7171

7272
:::image type="content" source="./media/semantic-search-overview/semantic-portal-json-query.png" alt-text="Screenshot showing JSON query syntax in the Azure portal." border="true":::
7373

74+
Here's some JSON text that you can paste into the view:
75+
76+
```json
77+
{
78+
"queryType": "semantic",
79+
"search": "historic hotel with good food",
80+
"semanticConfiguration": "my-semantic-config",
81+
"answers": "extractive|count-3",
82+
"captions": "extractive|highlight-true",
83+
"highlightPreTag": "<strong>",
84+
"highlightPostTag": "</strong>",
85+
"select": "HotelId,HotelName,Description,Category",
86+
"count": true
87+
}
88+
```
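
If you'd rather send the same query from code than paste it into the portal, the following Python sketch builds the POST request with the standard library. It assumes the Search POST endpoint shape and the `api-key` header. The helper name, the index name `hotels-sample-index`, and the placeholder key are hypothetical, and the endpoint reuses the `https://mydemo.search.windows.net` example. The sketch only constructs the request and doesn't send it.

```python
import json
import urllib.request


def build_semantic_query_request(
    endpoint: str,
    index_name: str,
    api_key: str,
    body: dict,
    api_version: str = "2023-11-01",
) -> urllib.request.Request:
    """Build (but don't send) a POST request for a search query."""
    url = f"{endpoint}/indexes/{index_name}/docs/search?api-version={api_version}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# Same parameters as the JSON example above.
query_body = {
    "queryType": "semantic",
    "search": "historic hotel with good food",
    "semanticConfiguration": "my-semantic-config",
    "answers": "extractive|count-3",
    "captions": "extractive|highlight-true",
    "highlightPreTag": "<strong>",
    "highlightPostTag": "</strong>",
    "select": "HotelId,HotelName,Description,Category",
    "count": True,
}

request = build_semantic_query_request(
    endpoint="https://mydemo.search.windows.net",  # example endpoint
    index_name="hotels-sample-index",              # hypothetical index name
    api_key="<your-query-api-key>",                # placeholder, not a real key
    body=query_body,
)
```

To run the query, pass `request` to `urllib.request.urlopen` and parse the response as JSON. The field names in `select` must exist in your index.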
89+
7490
### [**REST API**](#tab/rest-query)
7591

7692
Use [Search Documents](/rest/api/searchservice/documents/search-post) to formulate the request.
