Merge pull request #267045 from HeidiSteen/heidist-fresh

prmerger-automator[bot] · web-flow · commit 9dc877ea2396 · 2024-02-23T03:19:13.000Z
[azure search] Addressed verbatim feedback
diff --git a/articles/search/cognitive-search-skill-azure-openai-embedding.md b/articles/search/cognitive-search-skill-azure-openai-embedding.md
@@ -8,7 +8,7 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: reference
-ms.date: 12/21/2023
+ms.date: 02/21/2024
 ---
 
 #	Azure OpenAI Embedding skill
@@ -18,6 +18,8 @@ ms.date: 12/21/2023
 
 The **Azure OpenAI Embedding** skill connects to a deployed embedding model on your [Azure OpenAI](/azure/ai-services/openai/overview) resource to generate embeddings.
 
+The [Import and vectorize data](search-get-started-portal-import-vectors.md) wizard uses the **Azure OpenAI Embedding** skill to vectorize content. You can run the wizard and review the generated skillset to see how the wizard builds it.
+
 > [!NOTE]
 > This skill is bound to Azure OpenAI and is charged at the existing [Azure OpenAI pay-as-you-go price](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/#pricing).
 >
@@ -28,24 +30,24 @@ Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill
 
 ## Data limits
 
-The maximum size of a text input should be 8,000 tokens. If input exceeds the maximum allowed, the model throws an invalid request error. For more information, see the [tokens](/azure/ai-services/openai/overview#tokens) key concept in the Azure OpenAI documentation.
+The maximum size of a text input should be 8,000 tokens. If input exceeds the maximum allowed, the model throws an invalid request error. For more information, see the [tokens](/azure/ai-services/openai/overview#tokens) key concept in the Azure OpenAI documentation. Consider using the [Text Split skill](cognitive-search-skill-textsplit.md) if you need data chunking.
 
 ## Skill parameters
 
 Parameters are case-sensitive.
 
 | Inputs | Description |
 |---------------------|-------------|
-| `resourceUri` | The URI where a valid Azure OpenAI model is deployed. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models. |
-| `apiKey`   |  The secret key pertaining to a valid Azure OpenAI `resourceUri.` If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
-| `deploymentId`   | The name of the deployed Azure OpenAI embedding model.|
+| `resourceUri` | The URI of a model provider, such as an Azure OpenAI resource or an OpenAI URL.  |
+| `apiKey`   |  The secret key used to access the model. If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
+| `deploymentId`   | The name of the deployed Azure OpenAI embedding model. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models.|
 | `authIdentity`   | A user-managed identity used by the search service for connecting to Azure OpenAI. You can use either a [system or user managed identity](search-howto-managed-identities-data-sources.md). To use a system-managed identity, leave `apiKey` and `authIdentity` blank. The system-managed identity is used automatically. A managed identity must have [Cognitive Services OpenAI User](/azure/ai-services/openai/how-to/role-based-access-control#azure-openai-roles) permissions to send text to Azure OpenAI. |
 
 ## Skill inputs
 
 | Input	 | Description |
 |--------------------|-------------|
-| `text` | The input text to be vectorized.|
+| `text` | The input text to be vectorized. If you're using data chunking, the source might be `/document/pages/*`. |
 
 ## Skill outputs
 
@@ -100,13 +102,13 @@ For the given input text, a vectorized embedding output is produced.
 }
 ```
 
-The output resides in memory. To send this output to a field in the search index, you must define an [outputFieldMapping](cognitive-search-output-field-mapping.md) that maps the vectorized embedding output (which is an array) to the single index field which is of Collection(Edm.Single) type. Following the example above and assuming the index field in which you want to store the results of the vectorized embedding output is called **embeddingindexfield**, the outputFieldMapping to include in the definition of the indexer would look like the following:
+The output resides in memory. To send this output to a field in the search index, you must define an [outputFieldMapping](cognitive-search-output-field-mapping.md) that maps the vectorized embedding output (which is an array) to a [vector field](vector-search-how-to-create-index.md). Assuming the skill output resides in the document's **embedding** node, and **content_vector** is the field in the search index, the outputFieldMapping in the indexer should look like:
 
 ```json
   "outputFieldMappings": [
     {
       "sourceFieldName": "/document/embedding/*",
-      "targetFieldName": "embeddingindexfield"
+      "targetFieldName": "content_vector"
     }
   ]
 ```
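Reviewer's sketch (not part of this commit): the parameters and the output field mapping described in the hunks above can be assembled as plain dictionaries. This is a minimal illustration that assumes a skill `context` of `/document`, an input source of `/document/content`, an output named `embedding`, and a placeholder resource URI; none of those values come from the diff. Leaving out both `apiKey` and `authIdentity` corresponds to the system-managed identity case described in the parameter table.

```python
import json


def build_embedding_skill(resource_uri, deployment_id, api_key=None):
    """Return a skill definition dict that uses the documented parameters."""
    skill = {
        "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
        "context": "/document",  # assumption: one embedding per document
        "resourceUri": resource_uri,
        "deploymentId": deployment_id,
        "inputs": [{"name": "text", "source": "/document/content"}],  # assumed source path
        "outputs": [{"name": "embedding"}],
    }
    if api_key:
        # When a key is provided, authIdentity is left unset.
        skill["apiKey"] = api_key
    return skill


def build_output_field_mappings(target_field="content_vector"):
    """Map the in-memory embedding array to a vector field in the index."""
    return [
        {"sourceFieldName": "/document/embedding/*", "targetFieldName": target_field}
    ]


# Placeholder URI and deployment name, for illustration only.
skill = build_embedding_skill(
    "https://my-demo-openai.openai.azure.com", "text-embedding-ada-002"
)
print(json.dumps({"skill": skill, "outputFieldMappings": build_output_field_mappings()}, indent=2))
```

If the skill runs over chunks (for example, a `text` source of `/document/pages/*`), the context and the mapping's source path would need to change accordingly.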
diff --git a/articles/search/cognitive-search-skill-shaper.md b/articles/search/cognitive-search-skill-shaper.md
@@ -8,16 +8,24 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: reference
-ms.date: 08/12/2021
+ms.date: 02/22/2024
 ---
 
 # Shaper cognitive skill
 
-The **Shaper** skill consolidates several inputs into a [complex type](search-howto-complex-data-types.md) that can be referenced later in the enrichment pipeline. The **Shaper** skill allows you to essentially create a structure, define the name of the members of that structure, and assign values to each member. Examples of consolidated fields useful in search scenarios include combining a first and last name into a single structure, city and state into a single structure, or name and birthdate into a single structure to establish unique identity.
+The **Shaper** skill is used to reshape or modify the structure of the [in-memory enrichment tree](cognitive-search-working-with-skillsets.md#enrichment-tree) created by a skillset. If skill outputs can't be mapped directly to search fields, you can add a **Shaper** skill to create the data shape you need for your search index or knowledge store.
 
-Additionally, the **Shaper** skill illustrated in [scenario 3](#nested-complex-types) adds an optional *sourceContext* property to the input. The *source* and *sourceContext* properties are mutually exclusive. If the input is at the context of the skill, simply use *source*. If the input is at a *different* context than the skill context, use the *sourceContext*. The *sourceContext* requires you to define a nested input with the specific element being addressed as the source. 
+Primary use-cases for this skill include:
 
-The output name is always "output". Internally, the pipeline can map a different name, such as "analyzedText" as shown in the examples below, but the **Shaper** skill itself returns "output" in the response. This might be important if you are debugging enriched documents and notice the naming discrepancy, or if you build a custom skill and are structuring the response yourself.
++ You're populating a knowledge store. The physical structure of the tables and objects of a knowledge store is defined through projections. A **Shaper** skill adds granularity by creating data shapes that can be pushed to the projections.
+
++ You want to map multiple skill outputs into a single structure in your search index, usually a [complex type](search-howto-complex-data-types.md), as described in [scenario 1](#scenario-1-complex-types). 
+
++ Skills produce multiple outputs, but you want to combine them into a single field (it doesn't have to be a complex type), as described in [scenario 2](#scenario-2-input-consolidation). For example, you might combine titles and authors into a single field.
+
++ Skills produce multiple outputs with child elements, and you want to combine them. This use-case is illustrated in [scenario 3](#nested-complex-types).
+
+The output name of a **Shaper** skill is always "output". Internally, the pipeline can map a different name, such as "analyzedText" as shown in the examples below, but the **Shaper** skill itself returns "output" in the response. This might be important if you are debugging enriched documents and notice the naming discrepancy, or if you build a custom skill and are structuring the response yourself.
 
 > [!NOTE]
 > This skill isn't bound to Azure AI services. It is non-billable and has no Azure AI services key requirement.
@@ -101,7 +109,6 @@ An incoming JSON document providing usable input for this **Shaper** skill could
 }
 ```
 
-
 ###	Skill output
 
 The **Shaper** skill generates a new element called *analyzedText* with the combined elements of *text* and *sentiment*. This output conforms to the index schema. It will be imported and indexed in an Azure AI Search index.
@@ -126,7 +133,7 @@ The **Shaper** skill generates a new element called *analyzedText* with the comb
 
 ## Scenario 2: input consolidation
 
-In another example, imagine that at different stages of pipeline processing, you have extracted the title of a book, and chapter titles on different pages of the book. You could now create a single structure composed of these various inputs.
+In another example, imagine that at different stages of pipeline processing, you have extracted the title of a book, and chapter titles on different pages of the book. You could now create a single structure composed of these various outputs.
 
 The **Shaper** skill definition for this scenario might look like the following example:
 
@@ -180,7 +187,9 @@ In this case, the **Shaper** flattens all chapter titles to create a single arra
 
 ## Scenario 3: input consolidation from nested contexts
 
-Imagine you have the title, chapters, and contents of a book and have run entity recognition and key phrases on the contents and now need to aggregate results from the different skills into a single shape with the chapter name, entities, and key phrases.
+Imagine you have the chapter titles and chapter numbers of a book, have run entity recognition and key phrase extraction on the contents, and now need to aggregate results from the different skills into a single shape with the chapter name, entities, and key phrases.
+
+This example adds an optional `sourceContext` property to the "chapterTitles" input. The `source` and `sourceContext` properties are mutually exclusive. If the input is at the context of the skill, you can use `source`. If the input is at a *different* context than the skill context, use `sourceContext`. The `sourceContext` requires you to define a nested input, where each input has a `source` that identifies the specific element used to populate the named node. 
 
 The **Shaper** skill definition for this scenario might look like the following example:
 
diff --git a/articles/search/hybrid-search-how-to-query.md b/articles/search/hybrid-search-how-to-query.md
@@ -9,12 +9,12 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 11/15/2023
+ms.date: 02/22/2024
 ---
 
 # Create a hybrid query in Azure AI Search
 
-Hybrid search consists of keyword queries and vector queries in a single search request. 
+Hybrid search combines one or more keyword queries with vector queries in a single search request. 
 
 The response includes the top results ordered by search score. Both vector queries and free text queries are assigned an initial search score from their respective scoring or similarity algorithms. Those scores are merged using [Reciprocal Rank Fusion (RRF)](hybrid-search-ranking.md) to return a single ranked result set. 
 
@@ -24,7 +24,7 @@ The response includes the top results ordered by search score. Both vector queri
 
 + A search index containing vector and non-vector fields. See [Create an index](search-how-to-create-search-index.md) and [Add vector fields to a search index](vector-search-how-to-create-index.md).
 
-+ Use [**Search Post REST API version 2023-11-01**](/rest/api/searchservice/documents/search-post), Search Explorer in the Azure portal, or packages in the Azure SDKs that have been updated to use this feature.
++ Use [**Search Post REST API version 2023-11-01**](/rest/api/searchservice/documents/search-post) or **REST API 2023-10-01-preview**, Search Explorer in the Azure portal, or packages in the Azure SDKs that have been updated to use this feature.
 
 + (Optional) If you want to also use [semantic ranking](semantic-search-overview.md) and vector search together, your search service must be Basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).
 
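Reviewer's sketch (not part of this commit): a hybrid request carries a keyword query and at least one vector query in the same body. The helper below builds such a body as a dictionary. The property names (`search`, `vectorQueries`, `kind`, `fields`, `k`, `top`) follow my reading of the 2023-11-01 Search POST API and should be checked against the reference; the field name `content_vector` and the three-element vector are placeholders.

```python
import json


def build_hybrid_query(text, vector, vector_field, k=10, top=10):
    """Combine a keyword query and one vector query in a single request body."""
    return {
        "search": text,  # keyword (free text) part
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(vector),  # a real embedding has many more dimensions
                "fields": vector_field,
                "k": k,
            }
        ],
        "top": top,
    }


body = build_hybrid_query(
    "historic hotel with good food", [0.01, -0.02, 0.03], "content_vector"
)
print(json.dumps(body, indent=2))
```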
diff --git a/articles/search/query-lucene-syntax.md b/articles/search/query-lucene-syntax.md
@@ -10,7 +10,7 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: conceptual
-ms.date: 01/17/2024
+ms.date: 02/22/2024
 ---
 
 # Lucene query syntax in Azure AI Search
@@ -140,10 +140,12 @@ The following example helps illustrate the differences. Suppose that there's a s
 
  For example, to find documents containing `motel` or `hotel`, specify `/[mh]otel/`. Regular expression searches are matched against single words.
 
-Some tools and languages impose other escape character requirements. For JSON, strings that include a forward slash are escaped with a backward slash: `microsoft.com/azure/` becomes `search=/.*microsoft.com\/azure\/.*/` where `search=/.* <string-placeholder>.*/` sets up the regular expression, and `microsoft.com\/azure\/` is the string with an escaped forward slash.
+Some tools and languages impose extra escape character requirements beyond the [escape rules](#escaping-special-characters) imposed by Azure AI Search. For JSON, strings that include a forward slash are escaped with a backslash: `microsoft.com/azure/` becomes `search=/.*microsoft.com\/azure\/.*/` where `search=/.* <string-placeholder>.*/` sets up the regular expression, and `microsoft.com\/azure\/` is the string with an escaped forward slash. 
 
 Two common symbols in regex queries are `.` and `*`. A `.` matches any one character and a `*` matches the previous character zero or more times. For example, `/be./` matches the terms `bee` and `bet` while `/be*/` would match `be`, `bee`, and `beee` but not `bet`. Together, `.*` allows you to match any series of characters so `/be.*/` would match any term that starts with `be` such as `better`.
 
+If you get syntax errors in your regular expression, review the [escape rules](#escaping-special-characters) for special characters. You might also try a different client to confirm whether the problem is tool-specific.
+
 ##  <a name="bkmk_wildcard"></a> Wildcard search
 
 You can use generally recognized syntax for multiple (`*`) or single (`?`) character wildcard searches. Full Lucene syntax supports prefix, infix, and suffix matching. 
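Reviewer's sketch (not part of this commit): the `.` and `*` examples in the regular expression hunk above can be checked locally. Python's `re` module is not the Lucene regex engine, so this only illustrates the simple constructs the text mentions; `re.fullmatch` is used as a stand-in for Lucene's whole-term matching, and `matches_term` is a hypothetical helper name.

```python
import re


def matches_term(pattern, term):
    """Return True when the pattern matches the entire term, as a term-level regex would."""
    return re.fullmatch(pattern, term) is not None


# /be./ : "be" followed by exactly one character
assert matches_term(r"be.", "bee")
assert matches_term(r"be.", "bet")

# /be*/ : "b" followed by zero or more "e"
assert matches_term(r"be*", "be")
assert matches_term(r"be*", "beee")
assert not matches_term(r"be*", "bet")

# /be.*/ : any term that starts with "be"
assert matches_term(r"be.*", "better")

# /[mh]otel/ : "motel" or "hotel"
assert matches_term(r"[mh]otel", "motel")
assert matches_term(r"[mh]otel", "hotel")
assert not matches_term(r"[mh]otel", "botel")

print("all regex examples behave as described")
```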
diff --git a/articles/search/search-explorer.md b/articles/search/search-explorer.md
@@ -7,7 +7,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: quickstart
-ms.date: 01/18/2024
+ms.date: 02/22/2024
 ms.custom:
   - mode-ui
   - ignite-2023
@@ -43,11 +43,19 @@ Before you begin, have the following prerequisites in place:
 
    :::image type="content" source="media/search-explorer/search-explorer-tab.png" alt-text="Screenshot of the Search explorer tab." border="true":::
 
-1. To specify query parameters and an API version, switch to **JSON view**. The examples in this article assume JSON view throughout. You can paste JSON examples from this article into the text area.
+## Query two ways
+
+There are two approaches for querying in Search explorer. 
+
++ The default search bar accepts an empty query or free text query with booleans. For example, `seattle condo +parking`.
+
++ JSON view supports parameterized queries. Filters, orderby, select, count, searchFields, and all other parameters must be set in JSON view.
+
+  Switch to **JSON view** for parameterized queries. The examples in this article assume JSON view throughout. You can paste JSON examples from this article into the text area.
 
    :::image type="content" source="media/search-explorer/search-explorer-json-view.png" alt-text="Screenshot of the JSON view selector." border="true":::
 
-## Unspecified query
+## Run an unspecified query
 
 In Search explorer, POST requests are formulated internally using the [Search POST REST API](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2023-10-01-preview&preserve-view=true), with responses returned as verbose JSON documents.
 
@@ -69,6 +77,8 @@ Equivalent syntax for an empty search is `*` or `"search": "*"`.
 
 Free-form queries, with or without operators, are useful for simulating user-defined queries sent from a custom app to Azure AI Search. Only those fields attributed as "searchable" in the index definition are scanned for matches. 
 
+You don't need JSON view for a free text query, but we provide it in JSON for consistency with other examples in this article.
+
 Notice that when you provide search criteria, such as query terms or expressions, search rank comes into play. The following example illustrates a free text search. The "@search.score" is a relevance score computed for the match using the [default scoring algorithm](index-ranking-similarity.md#default-scoring-algorithm).
 
    ```json
diff --git a/articles/search/search-get-started-semantic.md b/articles/search/search-get-started-semantic.md
@@ -29,9 +29,7 @@ This quickstart walks you through the index and query modifications that invoke
 
 + Azure AI Search, at Basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).
 
-+ An API key and search service endpoint:
-
-  Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
++ An API key and search service endpoint. Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
 
   In **Overview**, copy the URL and save it to Notepad for a later step. An example endpoint might look like `https://mydemo.search.windows.net`.
 
diff --git a/articles/search/search-get-started-text.md b/articles/search/search-get-started-text.md
@@ -33,9 +33,7 @@ This quickstart has [steps](#create-load-and-query-an-index) for the following S
 
 + An Azure AI Search service. [Create a service](search-create-service-portal.md) if you don't have one. You can use a free tier for this quickstart.
 
-+ An API key and service endpoint:
-
-  Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
++ An API key and service endpoint. Sign in to the [Azure portal](https://portal.azure.com) and [find your search service](https://portal.azure.com/#blade/HubsExtension/BrowseResourceBlade/resourceType/Microsoft.Search%2FsearchServices).
 
   In **Overview**, copy the URL and save it to Notepad for a later step. An example endpoint might look like `https://mydemo.search.windows.net`.
 
diff --git a/articles/search/semantic-how-to-query-request.md b/articles/search/semantic-how-to-query-request.md
@@ -71,6 +71,22 @@ In this step, add parameters to the query request. To be successful, your query
 
    :::image type="content" source="./media/semantic-search-overview/semantic-portal-json-query.png" alt-text="Screenshot showing JSON query syntax in the Azure portal." border="true":::
 
+   Here's some JSON text that you can paste into the view:
+
+   ```json
+    {
+        "queryType": "semantic",
+        "search": "historic hotel with good food",
+        "semanticConfiguration": "my-semantic-config",
+        "answers": "extractive|count-3",
+        "captions": "extractive|highlight-true",
+        "highlightPreTag": "<strong>",
+        "highlightPostTag": "</strong>",
+        "select": "HotelId,HotelName,Description,Category",
+        "count": true
+    }
+   ```
+   
 ### [**REST API**](#tab/rest-query)
 
 Use [Search Documents](/rest/api/searchservice/documents/search-post) to formulate the request.
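Reviewer's sketch (not part of this commit): the JSON added for the portal tab can also be sent as a Search Documents request. The helper below only builds the request object and doesn't send it. The URL shape, the `api-key` header, the API version default, and the index name `hotels-sample-index` are assumptions to verify against the REST reference; the body mirrors the JSON added in the hunk above.

```python
import json
import urllib.request


def build_semantic_request(endpoint, index_name, api_key, api_version="2023-11-01"):
    """Build (but don't send) a POST request for a semantic query."""
    url = f"{endpoint}/indexes/{index_name}/docs/search?api-version={api_version}"
    body = {
        "queryType": "semantic",
        "search": "historic hotel with good food",
        "semanticConfiguration": "my-semantic-config",
        "answers": "extractive|count-3",
        "captions": "extractive|highlight-true",
        "highlightPreTag": "<strong>",
        "highlightPostTag": "</strong>",
        "select": "HotelId,HotelName,Description,Category",
        "count": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


request = build_semantic_request(
    "https://mydemo.search.windows.net", "hotels-sample-index", "<your-query-key>"
)
print(request.get_method(), request.full_url)
```

To run the query, pass the request to `urllib.request.urlopen` with a real endpoint, index name, and key.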
diff --git a/articles/search/vector-search-how-to-chunk-documents.md b/articles/search/vector-search-how-to-chunk-documents.md