articles/ai-foundry/how-to/develop/langchain.md (+9 −9)
@@ -7,7 +7,7 @@ ms.service: azure-ai-foundry
 ms.custom:
   - ignite-2024
 ms.topic: how-to
-ms.date: 11/04/2024
+ms.date: 03/11/2025
 ms.reviewer: fasantia
 ms.author: sgilley
 author: sdgilley
@@ -21,7 +21,7 @@ Models deployed to [Azure AI Foundry](https://ai.azure.com) can be used with Lan

 - **Using the Azure AI model inference API:** All models deployed to Azure AI Foundry support the [Azure AI model inference API](../../../ai-foundry/model-inference/reference/reference-model-inference-api.md), which offers a common set of functionalities that can be used for most of the models in the catalog. The benefit of this API is that, since it's the same for all the models, changing from one to another is as simple as changing the model deployment being used. No further changes are required in the code. When working with LangChain, install the extension `langchain-azure-ai`.

-- **Using the model's provider specific API:** Some models, like OpenAI, Cohere, or Mistral, offer their own set of APIs and extensions for LangChain. Those extensions may include specific functionalities that the model supports and hence are suitable if you want to exploit them. When working with LangChain, install the extension specific to the model you want to use, like `langchain-openai` or `langchain-cohere`.
+- **Using the model's provider specific API:** Some models, like OpenAI, Cohere, or Mistral, offer their own set of APIs and extensions for LangChain. Those extensions might include specific functionalities that the model supports and hence are suitable if you want to exploit them. When working with LangChain, install the extension specific to the model you want to use, like `langchain-openai` or `langchain-cohere`.

 In this tutorial, you learn how to use the package `langchain-azure-ai` to build applications with LangChain.
@@ -38,7 +38,7 @@ To run this tutorial, you need:
 pip install langchain-core
 ```

-* In this example, we are working with the Azure AI model inference API, hence we install the following packages:
+* In this example, we're working with the Azure AI model inference API, hence we install the following packages:
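The package list itself is truncated by this hunk. Based on the article's earlier instruction to install the `langchain-azure-ai` extension, the block that follows presumably reads along these lines:

```bash
# Assumed package set for the Azure AI model inference API path.
pip install langchain-azure-ai
```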
-Once configured, create a client to connect to the endpoint. In this case, we are working with a chat completions model hence we import the class `AzureAIChatCompletionsModel`.
+Once configured, create a client to connect to the endpoint. In this case, we're working with a chat completions model, hence we import the class `AzureAIChatCompletionsModel`.

 ```python
 import os
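The hunk cuts off after `import os`. For context, a minimal client-creation sketch consistent with the imports visible elsewhere in this diff might look like the following; the environment-variable names and the model name are assumptions, not the article's exact values:

```python
import os
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel

# Hedged sketch: endpoint/credential env-var names are assumptions;
# model_name is the parameter the article references for multi-model endpoints.
model = AzureAIChatCompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model_name="mistral-large-2407",  # placeholder model name
)
```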
@@ -98,7 +98,7 @@ model = AzureAIChatCompletionsModel(
 > [!NOTE]
 > When using Microsoft Entra ID, make sure that the endpoint was deployed with that authentication method and that you have the required permissions to invoke it.

-If you are planning to use asynchronous calling, it's a best practice to use the asynchronous version for the credentials:
+If you're planning to use asynchronous calling, it's a best practice to use the asynchronous version of the credentials:

 ```python
 from azure.identity.aio import (
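For context, a hedged sketch of the async pattern this sentence introduces; `DefaultAzureCredential` is one of the credentials exposed by `azure.identity.aio`, and the endpoint and model name are placeholders:

```python
from azure.identity.aio import DefaultAzureCredential
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel

# The async credential supports awaited calls such as `await model.ainvoke(...)`.
model = AzureAIChatCompletionsModel(
    endpoint="https://<resource>.services.ai.azure.com/models",  # placeholder
    credential=DefaultAzureCredential(),
    model_name="mistral-large-2407",  # placeholder
)
```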
@@ -127,7 +127,7 @@ model = AzureAIChatCompletionsModel(

 ## Use chat completions models

-Let's first use the model directly. `ChatModels` are instances of LangChain `Runnable`, which means they expose a standard interface for interacting with them. To simply call the model, we can pass in a list of messages to the `invoke` method.
+Let's first use the model directly. `ChatModels` are instances of LangChain `Runnable`, which means they expose a standard interface for interacting with them. To call the model, we can pass in a list of messages to the `invoke` method.

 ```python
 from langchain_core.messages import HumanMessage, SystemMessage
@@ -140,7 +140,7 @@ messages = [
 model.invoke(messages)
 ```
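The hunks show this call pattern only in fragments. Assembled, it reads roughly like the following sketch; the message contents are illustrative, not the article's exact text:

```python
from langchain_core.messages import HumanMessage, SystemMessage

# Pass a list of messages directly to the Runnable's invoke method.
messages = [
    SystemMessage(content="Translate the following from English into Italian."),
    HumanMessage(content="hi!"),
]
model.invoke(messages)
```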
-You can also compose operations as needed in what's called **chains**. Let's now use a prompt template to translate sentences:
+You can also compose operations as needed in **chains**. Let's now use a prompt template to translate sentences:

 ```python
 from langchain_core.output_parsers import StrOutputParser
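A hedged sketch of the kind of translation chain this sentence describes; the prompt wording and variable names are assumptions rather than the article's exact code:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Compose prompt -> model -> parser with the LCEL pipe operator.
prompt = ChatPromptTemplate.from_messages([
    ("system", "Translate the user's sentence into {language}."),
    ("user", "{text}"),
])
chain = prompt | model | StrOutputParser()
chain.invoke({"language": "Italian", "text": "I love programming."})
```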
 Models deployed to Azure AI Foundry support the Azure AI model inference API, which is standard across all the models. Chain multiple LLM operations based on the capabilities of each model, so that each step can use the right model.

-In the following example, we create two model clients, one is a producer and another one is a verifier. To make the distinction clear, we are using a multi-model endpoint like the [Azure AI model inference service](../../model-inference/overview.md) and hence we are passing the parameter `model_name` to use a `Mistral-Large` and a `Mistral-Small` model, quoting the fact that **producing content is more complex than verifying it**.
+In the following example, we create two model clients. One is a producer and the other is a verifier. To make the distinction clear, we're using a multi-model endpoint like the [Azure AI model inference service](../../model-inference/overview.md) and hence we're passing the parameter `model_name` to use a `Mistral-Large` and a `Mistral-Small` model, given that **producing content is more complex than verifying it**.

 ```python
 from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel
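For context, the two-client pattern described here reduces to something like the following sketch; `endpoint` and `credential` are the values configured earlier, and the exact model-name strings are placeholders:

```python
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel

# Larger model for the harder task (producing), smaller model for verifying.
producer = AzureAIChatCompletionsModel(
    endpoint=endpoint, credential=credential, model_name="Mistral-Large"
)
verifier = AzureAIChatCompletionsModel(
    endpoint=endpoint, credential=credential, model_name="Mistral-Small"
)
```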
@@ -254,7 +254,7 @@ chain.invoke({"topic": "living in a foreign country"})

 ## Use embeddings models

-In the same way, you create an LLM client, you can connect to an embeddings model. In the following example, we are setting the environment variable to now point to an embeddings model:
+In the same way you create an LLM client, you can connect to an embeddings model. In the following example, we're setting the environment variable to point to an embeddings model:
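A hedged sketch of the embeddings setup this sentence introduces; the `AzureAIEmbeddingsModel` import path, environment-variable names, and model name are assumptions based on the `langchain-azure-ai` naming seen elsewhere in this diff:

```python
import os
# Assumed import path for the embeddings client in langchain-azure-ai.
from langchain_azure_ai.embeddings import AzureAIEmbeddingsModel

embed_model = AzureAIEmbeddingsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],      # now an embeddings endpoint
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model_name="text-embedding-3-large",                  # placeholder
)
```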
articles/ai-foundry/how-to/develop/llama-index.md (+8 −8)
@@ -7,7 +7,7 @@ ms.service: azure-ai-foundry
 ms.custom:
   - ignite-2024
 ms.topic: how-to
-ms.date: 11/04/2024
+ms.date: 03/11/2025
 ms.reviewer: fasantia
 ms.author: sgilley
 author: sdgilley
@@ -21,9 +21,9 @@ Models deployed to [Azure AI Foundry](https://ai.azure.com) can be used with Lla

 - **Using the Azure AI model inference API:** All models deployed to Azure AI Foundry support the [Azure AI model inference API](../../../ai-foundry/model-inference/reference/reference-model-inference-api.md), which offers a common set of functionalities that can be used for most of the models in the catalog. The benefit of this API is that, since it's the same for all the models, changing from one to another is as simple as changing the model deployment being used. No further changes are required in the code. When working with LlamaIndex, install the extensions `llama-index-llms-azure-inference` and `llama-index-embeddings-azure-inference`.

-- **Using the model's provider specific API:** Some models, like OpenAI, Cohere, or Mistral, offer their own set of APIs and extensions for LlamaIndex. Those extensions may include specific functionalities that the model supports and hence are suitable if you want to exploit them. When working with `llama-index`, install the extension specific to the model you want to use, like `llama-index-llms-openai` or `llama-index-llms-cohere`.
+- **Using the model's provider specific API:** Some models, like OpenAI, Cohere, or Mistral, offer their own set of APIs and extensions for LlamaIndex. Those extensions might include specific functionalities that the model supports and hence are suitable if you want to exploit them. When working with `llama-index`, install the extension specific to the model you want to use, like `llama-index-llms-openai` or `llama-index-llms-cohere`.

-In this example, we are working with the **Azure AI model inference API**.
+In this example, we're working with the **Azure AI model inference API**.

 ## Prerequisites
@@ -42,7 +42,7 @@ To run this tutorial, you need:
 pip install llama-index
 ```

-* In this example, we are working with the Azure AI model inference API, hence we install the following packages:
+* In this example, we're working with the Azure AI model inference API, hence we install the following packages:
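The package list is truncated by the diff; based on the extensions named at the top of this article, the block presumably reads:

```bash
# Package names taken from the bullet at the top of this article.
pip install llama-index-llms-azure-inference llama-index-embeddings-azure-inference
```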
 > When using Microsoft Entra ID, make sure that the endpoint was deployed with that authentication method and that you have the required permissions to invoke it.

-If you are planning to use asynchronous calling, it's a best practice to use the asynchronous version for the credentials:
+If you're planning to use asynchronous calling, it's a best practice to use the asynchronous version of the credentials:
 ### Azure OpenAI models and Azure AI model inference service

-If you are using Azure OpenAI service or [Azure AI model inference service](../../model-inference/overview.md), ensure you have at least version `0.2.4` of the LlamaIndex integration. Use the `api_version` parameter in case you need to select a specific `api_version`.
+If you're using Azure OpenAI service or [Azure AI model inference service](../../model-inference/overview.md), ensure you have at least version `0.2.4` of the LlamaIndex integration. Use the `api_version` parameter in case you need to select a specific `api_version`.

 For the [Azure AI model inference service](../../model-inference/overview.md), you need to pass the `model_name` parameter:
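A hedged sketch combining both points, assuming the `AzureAICompletionsModel` class from the `llama-index-llms-azure-inference` extension; the environment-variable names, model name, and version string are placeholders:

```python
import os
from llama_index.llms.azure_inference import AzureAICompletionsModel

llm = AzureAICompletionsModel(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=os.environ["AZURE_INFERENCE_CREDENTIAL"],
    model_name="mistral-large-2407",   # required for multi-model endpoints
    api_version="2024-05-01-preview",  # pin only if you need a specific version
)
```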
@@ -216,7 +216,7 @@ The `complete` method is still available for model of type `chat-completions`. O

 ## Use embeddings models

-In the same way you create an LLM client, you can connect to an embeddings model. In the following example, we are setting the environment variable to now point to an embeddings model:
+In the same way you create an LLM client, you can connect to an embeddings model. In the following example, we're setting the environment variable to point to an embeddings model:
-However, there are scenarios where you want to use a general model for most of the operations but a specific one for a given task. In those cases, it's useful to set the LLM or embedding model you are using for each LlamaIndex construct. In the following example, we set a specific model:
+However, there are scenarios where you want to use a general model for most of the operations but a specific one for a given task. In those cases, it's useful to set the LLM or embedding model you're using for each LlamaIndex construct. In the following example, we set a specific model:

 ```python
 from llama_index.core.evaluation import RelevancyEvaluator
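For context, setting a construct-specific model reduces to passing the client directly, roughly as in this sketch; `llm` is the client created earlier, and using `RelevancyEvaluator` here mirrors the import visible in the hunk:

```python
from llama_index.core.evaluation import RelevancyEvaluator

# Pass a construct-specific LLM instead of relying on the global Settings object.
evaluator = RelevancyEvaluator(llm=llm)
```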
articles/ai-services/openai/how-to/function-calling.md (+3 −3)
@@ -11,7 +11,7 @@ ms.date: 02/28/2025
 manager: nitinme
 ---

-# How to use function calling with Azure OpenAI Service (Preview)
+# How to use function calling with Azure OpenAI Service

 The latest versions of gpt-35-turbo and gpt-4 are fine-tuned to work with functions and are able to both determine when and how a function should be called. If one or more functions are included in your request, the model determines if any of the functions should be called based on the context of the prompt. When the model determines that a function should be called, it responds with a JSON object including the arguments for the function.
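As context for this paragraph, the request/response shape it describes looks roughly like the following sketch with the OpenAI Python client; the endpoint, key, API version, deployment name, and the `get_current_weather` function are all placeholders:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com",  # placeholder
    api_key="<api-key>",                                   # placeholder
    api_version="2024-06-01",                              # placeholder
)

# Hypothetical function definition; the model returns a JSON tool call
# with arguments when it decides the function should be invoked.
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",  # your deployment name
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```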
articles/search/cognitive-search-predefined-skills.md (+2 −2)
@@ -10,14 +10,14 @@ ms.custom:
 - build-2024
 - ignite-2024
 ms.topic: concept-article
-ms.date: 09/19/2024
+ms.date: 03/11/2025
 ---

 # Skills for extra processing during indexing (Azure AI Search)

 This article describes the skills in Azure AI Search that you can include in a [skillset](cognitive-search-working-with-skillsets.md) to access external processing.

-A *skill* provides an atomic operation that transforms content in some way. Often, it's an operation that recognizes or extracts text, but it can also be a utility skill that reshapes the enrichments that are already created. Typically, the output is text-based so that it can be used in [full text search](search-lucene-query-architecture.md) or vectors used in [vector search](vector-search-overview.md).
+A *skill* is an atomic operation that transforms content in some way. Often, it's an operation that recognizes or extracts text, but it can also be a utility skill that reshapes the enrichments that are already created. Typically, the output is either text-based so that it can be used in [full text search](search-lucene-query-architecture.md), or vectors used in [vector search](vector-search-overview.md).
articles/search/hybrid-search-how-to-query.md (+2 −8)
@@ -9,7 +9,7 @@ ms.service: azure-ai-search
 ms.custom:
 - ignite-2023
 ms.topic: how-to
-ms.date: 10/01/2024
+ms.date: 03/11/2025
 ---

 # Create a hybrid query in Azure AI Search
@@ -19,19 +19,13 @@ ms.date: 10/01/2024
 In this article, learn how to:

 + Set up a basic request
-+ Formulate hybrid queries with more parameters and filters
++ Add parameters and filters
 + Improve relevance using semantic ranking or vector weights
 + Optimize query behaviors by controlling text and vector inputs

 > [!NOTE]
 > New in [**2024-09-01-preview**](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-09-01-preview&preserve-view=true) is the ability to target filters to just the vector subqueries in a hybrid request. This gives you more precision over how filters are applied. For more information, see [targeting filters to vector subqueries](#hybrid-search-with-filters-targeting-vector-subqueries-preview) in this article.

-<!-- To improve relevance in a hybrid query, use these parameters:
-
-+ [vector.queries.weight](vector-search-how-to-query.md#vector-weighting) lets you set the relative weight of the vector query. This feature is particularly useful in complex queries where two or more distinct result sets need to be combined, as is the case for hybrid search. This feature is generally available.
-
-+ [hybridsearch.maxTextRecallSize and countAndFacetMode (preview)](#set-maxtextrecallsize-and-countandfacetmode) give you more control over text inputs into a hybrid query. This feature requires a preview API version.
--->

 ## Prerequisites

 + A search index containing `searchable` vector and nonvector fields. We recommend the [Import and vectorize data wizard](search-import-data-portal.md) to create an index quickly. Otherwise, see [Create an index](search-how-to-create-search-index.md) and [Add vector fields to a search index](vector-search-how-to-create-index.md).
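As context for the basic request this article sets up, a hybrid query carries a text query and a vector subquery in one call. A hedged sketch using the `azure-search-documents` Python SDK; the service, index, field names, and the embedding vector are placeholders:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

client = SearchClient(
    endpoint="https://<service>.search.windows.net",  # placeholder
    index_name="<index-name>",                        # placeholder
    credential=AzureKeyCredential("<api-key>"),       # placeholder
)

# One request with both a full-text query and a vector subquery;
# the service fuses the two ranked result sets with RRF.
results = client.search(
    search_text="historic hotel with a pool",
    vector_queries=[
        VectorizedQuery(
            vector=[0.0] * 1536,     # placeholder embedding from your model
            k_nearest_neighbors=50,
            fields="contentVector",  # placeholder vector field name
        )
    ],
    top=10,
)
for doc in results:
    print(doc["@search.score"])
```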
articles/search/hybrid-search-ranking.md (+2 −2)
@@ -9,12 +9,12 @@ ms.service: azure-ai-search
 ms.custom:
 - ignite-2023
 ms.topic: conceptual
-ms.date: 10/01/2024
+ms.date: 03/11/2025
 ---

 # Relevance scoring in hybrid search using Reciprocal Rank Fusion (RRF)

-Reciprocal Rank Fusion (RRF) is an algorithm that evaluates the search scores from multiple, previously ranked results to produce a unified result set. In Azure AI Search, RRF is used whenever there are two or more queries that execute in parallel. Each query produces a ranked result set, and RRF is used to merge and homogenize the rankings into a single result set, returned in the query response. Examples of scenarios where RRF is always used include [*hybrid search*](hybrid-search-overview.md) and multiple vector queries executing concurrently.
+Reciprocal Rank Fusion (RRF) is an algorithm that evaluates the search scores from multiple, previously ranked results to produce a unified result set. In Azure AI Search, RRF is used whenever there are two or more queries that execute in parallel. Each query produces a ranked result set, and RRF merges and homogenizes the rankings into a single result set for the query response. Examples of scenarios where RRF is always used include [*hybrid search*](hybrid-search-overview.md) and multiple vector queries executing concurrently.

 RRF is based on the concept of *reciprocal rank*, which is the inverse of the rank of the first relevant document in a list of search results. The goal of the technique is to take into account the position of the items in the original rankings, and give higher importance to items that are ranked higher in multiple lists. This can help improve the overall quality and reliability of the final ranking, making it more useful for the task of fusing multiple ordered search results.
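The fusion step can be made concrete with a short sketch. This shows the general RRF formula (each document's fused score is the sum of 1/(k + rank) over the ranked lists it appears in), not Azure AI Search's internal implementation; k = 60 is the constant commonly cited for RRF:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Fuse several ranked lists: each document scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Higher fused score = ranked higher in more of the input lists.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

# Example: fuse a text ranking with a vector ranking.
print(rrf([["docA", "docB", "docC"], ["docB", "docA", "docD"]]))
```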