
Commit fb8766d

Merge pull request #275211 from MicrosoftDocs/release-build-azure-search
Release build Azure AI search
2 parents df89aa3 + d97f040 commit fb8766d

36 files changed: +1701 -182 lines changed

articles/search/TOC.yml

Lines changed: 20 additions & 2 deletions
```diff
@@ -25,6 +25,8 @@
 items:
 - name: Create an index
   href: search-get-started-portal.md
+- name: Data chunking and vectorization in Azure portal (preview)
+  href: search-get-started-portal-import-vectors.md
 - name: Create a demo app
   href: search-create-app-portal.md
 - name: Create a skillset
@@ -286,6 +288,8 @@
   href: search-howto-connecting-azure-sql-mi-to-azure-search-using-indexers.md
 - name: Azure SQL Server VMs
   href: search-howto-connecting-azure-sql-iaas-to-azure-search-using-indexers.md
+- name: OneLake files
+  href: search-how-to-index-onelake-files.md
 - name: SharePoint in Microsoft 365
   href: search-howto-index-sharepoint-online.md
 - name: Skillsets
@@ -322,6 +326,8 @@
 items:
 - name: Create a vector index
   href: vector-search-how-to-create-index.md
+- name: Index binary data for vector search
+  href: vector-search-how-to-index-binary-data.md
 - name: Query vectors
   href: vector-search-how-to-query.md
 - name: Filter vectors
@@ -336,8 +342,8 @@
   href: vector-search-how-to-chunk-documents.md
 - name: Generate embeddings
   href: vector-search-how-to-generate-embeddings.md
-- name: Chunk and embed in Azure portal (preview)
-  href: search-get-started-portal-import-vectors.md
+- name: Integrated vectorization with Azure AI Studio models
+  href: vector-search-integrated-vectorization-ai-studio.md
 - name: Keyword search
   items:
   - name: Full text query
@@ -623,6 +629,8 @@
   href: cognitive-search-skill-sentiment-v3.md
 - name: Text Translation
   href: cognitive-search-skill-text-translation.md
+- name: AI Vision multimodal embeddings
+  href: cognitive-search-skill-vision-vectorize.md
 - name: Azure AI Search utility skills (nonbillable)
   items:
   - name: Conditional
@@ -654,6 +662,16 @@
   href: cognitive-search-skill-entity-recognition.md
 - name: Sentiment (v2)
   href: cognitive-search-skill-sentiment.md
+- name: Vectorizers reference
+  items:
+  - name: Azure OpenAI
+    href: vector-search-vectorizer-azure-open-ai.md
+  - name: Azure AI Vision
+    href: vector-search-vectorizer-ai-services-vision.md
+  - name: Azure AI Studio model catalog
+    href: vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md
+  - name: Custom Web API
+    href: vector-search-vectorizer-custom-web-api.md
 - name: Resources
   items:
   - name: Stack Overflow
```

articles/search/cognitive-search-aml-skill.md

Lines changed: 5 additions & 2 deletions
```diff
@@ -9,7 +9,7 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: reference
-ms.date: 12/01/2022
+ms.date: 05/08/2024
 ---
 
 # AML skill in an Azure AI Search enrichment pipeline
@@ -19,7 +19,9 @@ ms.date: 12/01/2022
 
 The **AML** skill allows you to extend AI enrichment with a custom [Azure Machine Learning](../machine-learning/overview-what-is-azure-machine-learning.md) (AML) model. Once an AML model is [trained and deployed](../machine-learning/concept-azure-machine-learning-architecture.md#workspace), an **AML** skill integrates it into AI enrichment.
 
-Like built-in skills, an **AML** skill has inputs and outputs. The inputs are sent to your deployed AML online endpoint as a JSON object, which outputs a JSON payload as a response along with a success status code. The response is expected to have the outputs specified by your **AML** skill. Any other response is considered an error and no enrichments are performed.
+Like other built-in skills, an **AML** skill has inputs and outputs. The inputs are sent to your deployed AML online endpoint as a JSON object, which outputs a JSON payload as a response along with a success status code. The response is expected to have the outputs specified by your **AML** skill. Any other response is considered an error and no enrichments are performed.
+
+If you're using the [Azure AI Studio model catalog vectorizer (preview)](vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md) for integrated vectorization at query time, you should also use the **AML** skill for integrated vectorization during indexing. See [How to implement integrated vectorization using models from Azure AI Studio](vector-search-integrated-vectorization-ai-studio.md) for instructions. This scenario is supported through the 2024-05-01-preview REST API and the Azure portal.
 
 > [!NOTE]
 > The indexer will retry twice for certain standard HTTP status codes returned from the AML online endpoint. These HTTP status codes are:
@@ -165,3 +167,4 @@ For cases when the AML online endpoint is unavailable or returns an HTTP error,
 
 + [How to define a skillset](cognitive-search-defining-skillset.md)
 + [AML online endpoint troubleshooting](../machine-learning/how-to-troubleshoot-online-endpoints.md)
++ [Integrated vectorization with models from Azure AI Studio](vector-search-integrated-vectorization-ai-studio.md)
```
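
For orientation, a minimal **AML** skill definition that matches the request/response contract described above might look like the following sketch. This is not taken from the commit: the endpoint URI, key, and the input/output names are hypothetical and must match whatever JSON schema your deployed model actually accepts and returns.

```json
{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "description": "Sketch: calls a deployed AML online endpoint; names are hypothetical.",
  "context": "/document",
  "uri": "https://my-aml-endpoint.eastus.inference.ml.azure.com/score",
  "key": "<your-AML-endpoint-key>",
  "timeout": "PT30S",
  "inputs": [
    { "name": "text", "source": "/document/content" }
  ],
  "outputs": [
    { "name": "embedding", "targetName": "aml_vector" }
  ]
}
```

The indexer serializes the `inputs` into the JSON request body, and only a success status code accompanied by the response fields named in `outputs` counts as a successful enrichment, per the behavior described in the diff above.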

articles/search/cognitive-search-predefined-skills.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -41,6 +41,7 @@ Skills that call the Azure AI are billed at the pay-as-you-go rate when you [att
 | [Microsoft.Skills.Text.TranslationSkill](cognitive-search-skill-text-translation.md) | This skill uses a pretrained model to translate the input text into various languages for normalization or localization use cases. | Azure AI services ([pricing](https://azure.microsoft.com/pricing/details/cognitive-services/)) |
 | [Microsoft.Skills.Vision.ImageAnalysisSkill](cognitive-search-skill-image-analysis.md) | This skill uses an image detection algorithm to identify the content of an image and generate a text description. | Azure AI services ([pricing](https://azure.microsoft.com/pricing/details/cognitive-services/)) |
 | [Microsoft.Skills.Vision.OcrSkill](cognitive-search-skill-ocr.md) | Optical character recognition. | Azure AI services ([pricing](https://azure.microsoft.com/pricing/details/cognitive-services/)) |
+| [Microsoft.Skills.Vision.VectorizeSkill](cognitive-search-skill-vision-vectorize.md) | Multimodal image and text vectorization. | Azure AI services ([pricing](https://azure.microsoft.com/pricing/details/cognitive-services/)) |
 
 ## Azure OpenAI skills
```
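
Going by the naming in the new table row, a definition for this skill might look like the sketch below. The `modelVersion` value and the `image`/`vector` input and output names are assumptions drawn from the Azure AI Vision multimodal embeddings API, not details confirmed by this commit; the cognitive-search-skill-vision-vectorize.md article added in this release is the authoritative reference.

```json
{
  "@odata.type": "#Microsoft.Skills.Vision.VectorizeSkill",
  "description": "Sketch: embeds each normalized image; property values are assumptions.",
  "context": "/document/normalized_images/*",
  "modelVersion": "2023-04-15",
  "inputs": [
    { "name": "image", "source": "/document/normalized_images/*" }
  ],
  "outputs": [
    { "name": "vector" }
  ]
}
```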

articles/search/cognitive-search-skill-annotation-language.md

Lines changed: 24 additions & 8 deletions
```diff
@@ -1,7 +1,7 @@
 ---
 title: Skill context and input annotation reference language
 titleSuffix: Azure AI Search
-description: Annotation syntax reference for annotation in the context, inputs and outputs of a skillset in an AI enrichment pipeline in Azure AI Search.
+description: Annotation syntax reference for annotation in the context, inputs, and outputs of a skillset in an AI enrichment pipeline in Azure AI Search.
 
 author: BertrandLeRoy
 ms.author: beleroy
@@ -27,7 +27,7 @@ The enriched data structure can be [inspected from debug sessions](cognitive-sea
 Expressions querying the structure can also be [tested from debug sessions](cognitive-search-debug-session.md#expression-evaluator).
 
 Throughout the article, we'll use the following enriched data as an example.
-This data is typical of the kind of structure you would get when enriching a document using a skillset with [OCR](cognitive-search-skill-ocr.md), [key phrase extraction](cognitive-search-skill-keyphrases.md), [text translation](cognitive-search-skill-text-translation.md), [language detection](cognitive-search-skill-language-detection.md), [entity recognition](cognitive-search-skill-entity-recognition-v3.md) skills and a custom tokenizer skill.
+This data is typical of the kind of structure you would get when enriching a document using a skillset with [OCR](cognitive-search-skill-ocr.md), [key phrase extraction](cognitive-search-skill-keyphrases.md), [text translation](cognitive-search-skill-text-translation.md), [language detection](cognitive-search-skill-language-detection.md), and [entity recognition](cognitive-search-skill-entity-recognition-v3.md) skills, as well as a custom tokenizer skill.
 
 |Path|Value|
 |---|---|
@@ -134,7 +134,7 @@ The `'#'` token expresses that the array should be treated as a single value ins
 
 ### Enumerating arrays in context
 
-It is often useful to process each element of an array in isolation and have a different set of skill inputs and outputs for each.
+It's often useful to process each element of an array in isolation and have a different set of skill inputs and outputs for each.
 This can be done by setting the context of the skill to an enumeration instead of the default `"/document"`.
 
 In the following example, we use one of the input expressions we used before, but with a different context that changes the resulting value.
@@ -143,10 +143,10 @@ In the following example, we use one of the input expressions we used before, bu
 |---|---|---|
 |`/document/normalized_images/*`|`/document/normalized_images/*/text/words/*`|`["Study", "of", "BMN", "110" ...]`<br/>`["it", "is", "certainly" ...]`<br>...|
 
-For this combination of context and input, the skill will get executed once for each normalized image: once for `"/document/normalized_images/0"` and once for `"/document/normalized_images/1"`. The two input values corresponding to each skill execution are detailed in the values column.
+For this combination of context and input, the skill gets executed once for each normalized image: once for `"/document/normalized_images/0"` and once for `"/document/normalized_images/1"`. The two input values corresponding to each skill execution are detailed in the values column.
 
 When enumerating an array in context, any outputs the skill produces will also be added to the document as enrichments of the context.
-In the above example, an output named `"out"` will have its values for each execution added to the document respectively under `"/document/normalized_images/0/out"` and `"/document/normalized_images/1/out"`.
+In the above example, an output named `"out"` has its values for each execution added to the document respectively under `"/document/normalized_images/0/out"` and `"/document/normalized_images/1/out"`.
 
 ## Literal values
@@ -162,9 +162,25 @@ String values can be enclosed in single `'` or double `"` quotes.
 |`="unicod\u0065"`|`"unicode"`|
 |`=false`|`false`|
 
+### Inline arrays
+
+If a certain skill input requires an array of data, but the data is currently represented as a single value, or you need to combine multiple single values into an array field, you can create an array value inline as part of a skill input expression by wrapping a comma-separated list of expressions in brackets (`[` and `]`). The array value can be a combination of expression paths and literal values as needed. You can also create nested arrays within arrays this way.
+
+|Expression|Value|
+|---|---|
+|`=['item']`|["item"]|
+|`=[$(/document/merged_content/entities/0/text), 'item']`|["BMN", "item"]|
+|`=[1, 3, 5]`|[1, 3, 5]|
+|`=[true, true, false]`|[true, true, false]|
+|`=[[$(/document/merged_content/entities/0/text), 'item'],['item2', $(/document/merged_content/keyphrases/1)]]`|[["BMN", "item"], ["item2", "Syndrome"]]|
+
+If the skill has a context that runs it once per element of an array input (for example, `"context": "/document/pages/*"` means the skill runs once per "page" in `pages`), then passing that enumeration expression inside an inline array uses one of those values at a time.
+
+For an example with our sample enriched data, if your skill's `context` is `/document/merged_content/keyphrases/*` and you create the inline array `=['key phrase', $(/document/merged_content/keyphrases/*)]` on an input of that skill, then the skill is executed three times: once with a value of ["key phrase", "Study of BMN"], once with ["key phrase", "Syndrome"], and finally with ["key phrase", "Pediatric Patients"]. The literal "key phrase" value stays the same each time, but the value of the expression path changes with each skill execution.
+
 ## Composite expressions
 
-It's possible to combine values together using unary, binary and ternary operators.
+It's possible to combine values together using unary, binary, and ternary operators.
 Operators can combine literal values and values resulting from path evaluation.
 When used inside an expression, paths should be enclosed between `"$("` and `")"`.
@@ -225,7 +241,7 @@ When used inside an expression, paths should be enclosed between `"$("` and `")"
 |`=15>4`|`true`|
 |`=1>=2`|`false`|
 
-### Equality and non-equality `'=='` `'!='`
+### Equality and nonequality `'=='` `'!='`
 
 |Expression|Value|
 |---|---|
@@ -248,7 +264,7 @@ When used inside an expression, paths should be enclosed between `"$("` and `")"
 
 ### Ternary operator `'?:'`
 
-It is possible to give an input different values based on the evaluation of a Boolean expression using the ternary operator.
+It's possible to give an input different values based on the evaluation of a Boolean expression using the ternary operator.
 
 |Expression|Value|
 |---|---|
```
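
To see where an inline-array expression would actually live, here's a sketch of a skill input using the last example above. The custom Web API skill, its URI, and the `labeledPhrase` input name are hypothetical; only the `context` and `source` expressions come from the new section.

```json
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "description": "Hypothetical skill: runs once per key phrase via the enumerated context.",
  "context": "/document/merged_content/keyphrases/*",
  "uri": "https://example.com/api/label",
  "inputs": [
    {
      "name": "labeledPhrase",
      "source": "=['key phrase', $(/document/merged_content/keyphrases/*)]"
    }
  ],
  "outputs": [
    { "name": "out" }
  ]
}
```

Against the sample enriched data, this skill executes three times, receiving `["key phrase", "Study of BMN"]`, `["key phrase", "Syndrome"]`, and `["key phrase", "Pediatric Patients"]` in turn, and each execution's `out` value is added under the corresponding `/document/merged_content/keyphrases/<n>/out` path.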

articles/search/cognitive-search-skill-azure-openai-embedding.md

Lines changed: 27 additions & 4 deletions
```diff
@@ -8,17 +8,17 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: reference
-ms.date: 03/28/2024
+ms.date: 05/08/2024
 ---
 
 # Azure OpenAI Embedding skill
 
 > [!IMPORTANT]
-> This feature is in public preview under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [2023-10-01-Preview REST API](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2023-10-01-preview&preserve-view=true) supports this feature.
+> This feature is in public preview under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [2023-10-01-preview REST API](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2023-10-01-preview&preserve-view=true) supports the first iteration of this feature. The [2024-05-01-preview REST API](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true) adds more properties and supports more text embedding models on Azure OpenAI.
 
-The **Azure OpenAI Embedding** skill connects to a deployed embedding model on your [Azure OpenAI](/azure/ai-services/openai/overview) resource to generate embeddings.
+The **Azure OpenAI Embedding** skill connects to a deployed embedding model on your [Azure OpenAI](/azure/ai-services/openai/overview) resource to generate embeddings during indexing.
 
-The [Import and vectorize data](search-get-started-portal-import-vectors.md) uses the **Azure OpenAI Embedding** skill to vectorize content. You can run the wizard and review the generated skillset to see how the wizard builds it.
+The [Import and vectorize data wizard](search-get-started-portal-import-vectors.md) in the Azure portal uses the **Azure OpenAI Embedding** skill to vectorize content. You can run the wizard and review the generated skillset to see how the wizard builds the skill for the text-embedding-ada-002 model.
 
 > [!NOTE]
 > This skill is bound to Azure OpenAI and is charged at the existing [Azure OpenAI pay-as-you-go price](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/#pricing).
@@ -42,6 +42,18 @@ Parameters are case-sensitive.
 | `apiKey` | The secret key used to access the model. If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
 | `deploymentId` | The name of the deployed Azure OpenAI embedding model. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models.|
 | `authIdentity` | A user-managed identity used by the search service for connecting to Azure OpenAI. You can use either a [system or user managed identity](search-howto-managed-identities-data-sources.md). To use a system managed identity, leave `apiKey` and `authIdentity` blank. The system-managed identity is used automatically. A managed identity must have [Cognitive Services OpenAI User](/azure/ai-services/openai/how-to/role-based-access-control#azure-openai-roles) permissions to send text to Azure OpenAI. |
+| `modelName` | This property is required if your skillset is created using the 2024-05-01-preview REST API. Set this property to the deployment name of an Azure OpenAI embedding model deployed on the provider specified through `resourceUri` and identified through `deploymentId`. Currently, the supported values are `text-embedding-ada-002`, `text-embedding-3-large`, and `text-embedding-3-small`. |
+| `dimensions` | (Optional, introduced in the 2024-05-01-preview REST API). The dimensions of embeddings that you would like to generate if the model supports reducing the embedding dimensions. Supported ranges are listed below. Defaults to the maximum dimensions for each model if not specified. For skillsets created using the 2023-10-01-preview, dimensions are fixed at 1536. |
+
+## Supported dimensions by `modelName`
+
+The supported dimensions for an Azure OpenAI Embedding skill depend on the `modelName` that is configured.
+
+| `modelName` | Minimum dimensions | Maximum dimensions |
+|--------------------|-------------|-------------|
+| text-embedding-ada-002 | 1536 | 1536 |
+| text-embedding-3-large | 1 | 3072 |
+| text-embedding-3-small | 1 | 1536 |
 
 ## Skill inputs
 
@@ -73,6 +85,8 @@ Then your skill definition might look like this:
     "description": "Connects a deployed embedding model.",
     "resourceUri": "https://my-demo-openai-eastus.openai.azure.com/",
     "deploymentId": "my-text-embedding-ada-002-model",
+    "modelName": "text-embedding-ada-002",
+    "dimensions": 1536,
     "inputs": [
       {
         "name": "text",
@@ -116,11 +130,20 @@ The output resides in memory. To send this output to a field in the search index
 ## Best practices
 
 The following are some best practices you need to consider when utilizing this skill:
+
 - If you are hitting your Azure OpenAI TPM (Tokens per minute) limit, consider the [quota limits advisory](../ai-services/openai/quotas-limits.md) so you can address accordingly. Refer to the [Azure OpenAI monitoring](../ai-services/openai/how-to/monitoring.md) documentation for more information about your Azure OpenAI instance performance.
+
 - The Azure OpenAI embeddings model deployment you use for this skill should be ideally separate from the deployment used for other use cases, including the [query vectorizer](vector-search-how-to-configure-vectorizer.md). This helps each deployment to be tailored to its specific use case, leading to optimized performance and identifying traffic from the indexer and the index embedding calls easily.
+
 - Your Azure OpenAI instance should be in the same region or at least geographically close to the region where your AI Search service is hosted. This reduces latency and improves the speed of data transfer between the services.
+
 - If you have a larger than default Azure OpenAI TPM (Tokens per minute) limit as published in [quotas and limits](../ai-services/openai/quotas-limits.md) documentation, open a [support case](../azure-portal/supportability/how-to-create-azure-support-request.md) with the Azure AI Search team, so this can be adjusted accordingly. This helps your indexing process not being unnecessarily slowed down by the documented default TPM limit, if you have higher limits.
 
+- For examples and working code samples using this skill, see the following links:
+
+  - [Integrated vectorization (Python)](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/integrated-vectorization/readme.md)
+  - [Integrated vectorization (C#)](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-dotnet/DotNetIntegratedVectorizationDemo/readme.md)
+  - [Integrated vectorization (Java)](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-java/demo-integrated-vectorization/readme.md)
 
 ## Errors and warnings
```