Commit 4a1251e

Merge branch 'MicrosoftDocs:main' into heidist-freshness
2 parents e608027 + 0019dfc commit 4a1251e

20 files changed: +262 −47 lines changed

articles/ai-foundry/agents/how-to/tools/azure-ai-search.md

Lines changed: 7 additions & 0 deletions
@@ -166,6 +166,13 @@ You can add the Azure AI Search tool to an agent programmatically using the code
 
 1. The index is created and connected to the [Azure AI Search](/azure/search/) service. You can now use the index with the Azure AI Search tool in your agent. You can also use the index outside of the agent, such as with the Azure AI Search REST API or SDKs.
 
+## Limitations
+
+Currently, if you want to use the Azure AI Search tool in the Azure AI Foundry portal behind a virtual network, you must create an agent by using the SDK or REST API. After you create the agent in code, you can use it in the portal.
+
+The Azure AI Search tool can include only one search index. If you want to use multiple indexes, consider using [connected agents](../connected-agents.md), with an Azure AI Search index configured for each agent.
+
 ## Next steps
 
 * See examples on how to use the [Azure AI Search tool](azure-ai-search-samples.md).

articles/ai-foundry/concepts/prompt-flow.md

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,7 @@ ms.custom:
 - ignite-2023
 - build-2024
 - ignite-2024
+- hub-only
 ms.topic: concept-article
 ms.date: 06/30/2025
 ms.reviewer: none
@@ -24,6 +25,8 @@ Prompt flow is a development tool designed to streamline the entire development
 
 Prompt flow is available independently as an open-source project on [GitHub](https://github.com/microsoft/promptflow), with its own SDK and [VS Code extension](https://marketplace.visualstudio.com/items?itemName=prompt-flow.prompt-flow). Prompt flow is also available, and recommended for use, as a feature within both [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs) and [Azure Machine Learning studio](https://ml.azure.com). This set of documentation focuses on prompt flow in the Azure AI Foundry portal.
 
+[!INCLUDE [hub-only-prereq](../includes/uses-hub-only.md)]
+
 Definitions:
 
 - *Prompt flow* is a feature that can be used to generate, customize, or run a flow.

articles/ai-foundry/foundry-models/quotas-limits.md

Lines changed: 35 additions & 27 deletions
@@ -6,19 +6,19 @@ author: msakande
 ms.service: azure-ai-model-inference
 ms.custom: ignite-2024, github-universe-2024
 ms.topic: concept-article
-ms.date: 05/19/2025
+ms.date: 08/14/2025
 ms.author: mopeakande
-ms.reviewer: fasantia
-reviewer: santiagxf
+ms.reviewer: shiyingfu
+reviewer: swingfu
 ---
 
 # Azure AI Foundry Models quotas and limits
 
-This article contains a quick reference and a detailed description of the quotas and limits for Azure AI Foundry Models. For quotas and limits specific to the Azure OpenAI in Foundry Models, see [Quota and limits in Azure OpenAI](../openai/quotas-limits.md).
+This article provides a quick reference and detailed description of the quotas and limits for Azure AI Foundry Models. For quotas and limits specific to Azure OpenAI in Foundry Models, see [Quota and limits in Azure OpenAI](../openai/quotas-limits.md).
 
 ## Quotas and limits reference
 
-Azure uses quotas and limits to prevent budget overruns due to fraud, and to honor Azure capacity constraints. Consider these limits as you scale for production workloads. The following sections provide you with a quick guide to the default quotas and limits that apply to Azure AI model's inference service in Azure AI Foundry:
+Azure uses quotas and limits to prevent budget overruns due to fraud and to honor Azure capacity constraints. Consider these limits as you scale for production workloads. The following sections provide a quick guide to the default quotas and limits that apply to the Azure AI model inference service in Azure AI Foundry:
 
 ### Resource limits

@@ -30,58 +30,66 @@ Azure uses quotas and limits to prevent budget overruns due to fraud and to hon
 
 ### Rate limits
 
-| Limit name | Applies to | Limit value |
-| -------------------- | ------------------- | ----------- |
-| Tokens per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../openai/quotas-limits.md). |
-| Requests per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../openai/quotas-limits.md). |
-| Tokens per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000,000 |
-| Requests per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000 |
-| Concurrent requests | DeepSeek-R1<br />DeepSeek-V3-0324 | 300 |
-| Tokens per minute | Rest of models | 400,000 |
-| Requests per minute | Rest of models | 1,000 |
-| Concurrent requests | Rest of models | 300 |
+The following table lists limits for Foundry Models for the following rates:
 
-You can [request increases to the default limits](#request-increases-to-the-default-limits). Due to high demand, limit increase requests can be submitted and evaluated per request.
+- Tokens per minute
+- Requests per minute
+- Concurrent requests
+
+| Models | Tokens per minute | Requests per minute | Concurrent requests |
+| ------ | ----------------- | ------------------- | ------------------- |
+| Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../openai/quotas-limits.md). | Varies per model and SKU. See [limits for Azure OpenAI](../openai/quotas-limits.md). | Not applicable |
+| - DeepSeek-R1<br />- DeepSeek-V3-0324 | 5,000,000 | 5,000 | 300 |
+| - Llama 3.3 70B Instruct<br />- Llama-4-Maverick-17B-128E-Instruct-FP8<br />- Grok 3<br />- Grok 3 mini | 400,000 | 1,000 | 300 |
+| - Flux-Pro 1.1<br />- Flux.1-Kontext Pro | Not applicable | 2 capacity units (6 requests per minute) | Not applicable |
+| Rest of models | 400,000 | 1,000 | 300 |
+
+To increase your quota:
+
+- For Azure OpenAI, use [Azure AI Foundry Service: Request for Quota Increase](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR4xPXO648sJKt4GoXAed-0pUMFE1Rk9CU084RjA0TUlVSUlMWEQzVkJDNCQlQCN0PWcu) to submit your request.
+- For other models, see [Request increases to the default limits](#request-increases-to-the-default-limits).
+
+Due to high demand, we evaluate limit increase requests individually.
 
 ### Other limits
 
 | Limit name | Limit value |
 |--|--|
 | Max number of custom headers in API requests<sup>1</sup> | 10 |
 
-<sup>1</sup> Our current APIs allow up to 10 custom headers, which are passed through the pipeline, and returned. We have noticed some customers now exceed this header count resulting in HTTP 431 errors. There is no solution for this error, other than to reduce header volume. **In future API versions we will no longer pass through custom headers**. We recommend customers not depend on custom headers in future system architectures.
+<sup>1</sup> Our current APIs allow up to 10 custom headers, which the pipeline passes through and returns. If you exceed this header count, your request results in an HTTP 431 error. To resolve this error, reduce the header volume. **Future API versions won't pass through custom headers.** We recommend that you don't depend on custom headers in future system architectures.
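The footnote above caps custom headers at 10 and describes the HTTP 431 error you get beyond that. A minimal client-side guard might look like the following sketch; the `check_custom_headers` helper and the `x-` prefix convention for identifying custom headers are illustrative assumptions, not part of the service API.

```python
def check_custom_headers(headers, limit=10):
    """Guard against the 10-custom-header cap before sending a request.

    Assumption: in this app, custom headers are the ones prefixed with "x-".
    """
    custom = [name for name in headers if name.lower().startswith("x-")]
    if len(custom) > limit:
        raise ValueError(
            f"{len(custom)} custom headers exceed the limit of {limit} "
            "(the service would return HTTP 431)"
        )
    return headers
```

Validating locally fails fast instead of surfacing an opaque 431 from the service.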

 ## Usage tiers
 
-Global Standard deployments use Azure's global infrastructure, dynamically routing customer traffic to the data center with best availability for the customer's inference requests. This enables more consistent latency for customers with low to medium levels of traffic. Customers with high sustained levels of usage might see more variabilities in response latency.
+Global Standard deployments use Azure's global infrastructure to dynamically route customer traffic to the data center with the best availability for the customer's inference requests. This infrastructure enables more consistent latency for customers with low to medium levels of traffic. Customers with high sustained levels of usage might see more variability in response latency.
 
 The Usage Limit determines the level of usage above which customers might see larger variability in response latency. A customer's usage is defined per model and is the total tokens consumed across all deployments in all subscriptions in all regions for a given tenant.
 
 ## Request increases to the default limits
 
-Limit increase requests can be submitted and evaluated per request. [Open an online customer support request](https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade/newsupportrequest/). When requesting for endpoint limit increase, provide the following information:
+You can submit limit increase requests, which we evaluate one at a time. [Open an online customer support request](https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade/newsupportrequest/). When you request an endpoint limit increase, provide the following information:
 
-1. When opening the support request, select **Service and subscription limits (quotas)** as the **Issue type**.
+1. Select **Service and subscription limits (quotas)** as the **Issue type** when you open the support request.
 
-1. Select the subscription of your choice.
+1. Select the subscription you want to use.
 
 1. Select **Cognitive Services** as **Quota type**.
 
 1. Select **Next**.
 
-1. On the **Additional details** tab, you need to provide detailed reasons for the limit increase in order for your request to be processed. Be sure to add the following information into the reason for limit increase:
+1. On the **Additional details** tab, provide detailed reasons for the limit increase so that your request can be processed. Be sure to add the following information to the reason for the limit increase:
 
 * Model name, model version (if applicable), and deployment type (SKU).
 * Description of your scenario and workload.
 * Rationale for the requested increase.
-* Provide the target throughput: Tokens per minute, requests per minute, etc.
-* Provide planned time plan (by when you need increased limits).
+* Target throughput: tokens per minute, requests per minute, and other relevant metrics.
+* Planned timeline (by when you need the increased limits).
 
-1. Finally, select **Save and continue** to continue.
+1. Select **Save and continue**.
 
-## General best practices to remain within rate limits
+## General best practices to stay within rate limits
 
-To minimize issues related to rate limits, it's a good idea to use the following techniques:
+To minimize issues related to rate limits, use the following techniques:
 
 - Implement retry logic in your application.
 - Avoid sharp changes in the workload. Increase the workload gradually.
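The retry advice above can be sketched as follows. This is a minimal illustration, not the SDK's built-in retry policy: `call_with_retries` is a hypothetical helper that expects a zero-argument callable returning a response object with `status_code` and `headers`.

```python
import random
import time

def call_with_retries(send_request, max_retries=5, base_delay=1.0):
    """Call send_request() and retry on HTTP 429 with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        response = send_request()
        if response.status_code != 429:
            return response
        # Prefer the service's Retry-After hint when it is present.
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else base_delay * (2 ** attempt)
        time.sleep(delay * (1 + 0.1 * random.random()))  # jitter spreads retry bursts
    raise RuntimeError(f"request still rate limited after {max_retries} attempts")
```

Combined with a gradual ramp-up of traffic, backoff like this keeps bursts from repeatedly hitting the same rate-limit window.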

articles/ai-foundry/how-to/flow-bulk-test-evaluation.md

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,7 @@ ms.service: azure-ai-foundry
 ms.custom:
 - ignite-2023
 - build-2024
+- hub-only
 ms.topic: how-to
 ms.date: 5/21/2024
 ms.reviewer: none
@@ -32,6 +33,8 @@ In this article you learn to:
 
 ## Prerequisites
 
+[!INCLUDE [hub-only-prereq](../includes/hub-only-prereq.md)]
+
 For a batch run and to use an evaluation method, you need to have the following ready:
 
 - A test dataset for batch run. Your dataset should be in one of these formats: `.csv`, `.tsv`, or `.jsonl`. Your data should also include headers that match the input names of your flow. If your flow inputs include a complex structure like a list or dictionary, use `jsonl` format to represent your data.
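The dataset requirement above (headers matching flow input names; `jsonl` for nested structures) can be illustrated with a short sketch. The input names `question` and `chat_history` are hypothetical examples, not names any particular flow requires:

```python
import json

# Hypothetical flow with inputs named "question" and "chat_history"; each
# JSON Lines row uses keys that match the flow's input names exactly.
rows = [
    {"question": "What formats does a batch run accept?", "chat_history": []},
    {
        "question": "Why use jsonl for nested inputs?",
        # Nested structures like this are why jsonl is preferred over csv/tsv.
        "chat_history": [{"role": "user", "content": "hi"}],
    },
]

with open("batch_test.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```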

articles/ai-foundry/how-to/flow-develop-evaluation.md

Lines changed: 5 additions & 0 deletions
@@ -6,6 +6,7 @@ ms.service: azure-ai-foundry
 ms.custom:
 - ignite-2023
 - build-2024
+- hub-only
 ms.topic: how-to
 ms.date: 3/31/2025
 ms.reviewer: mithigpe
@@ -26,6 +27,10 @@ In prompt flow, you can customize or create your own evaluation flow tailored to
 - How to develop an evaluation method.
 - Understand inputs, outputs, and logging metrics for prompt flow evaluations.
 
+## Prerequisites
+
+[!INCLUDE [hub-only-prereq](../includes/hub-only-prereq.md)]
+
 ## Starting to develop an evaluation method
 
 There are two ways to develop your own evaluation methods:
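The inputs, outputs, and logged metrics mentioned in the hunk above follow a common pattern: an evaluation flow computes a per-line score, then aggregates the scores into metrics for the run. The sketch below illustrates that pattern in plain Python; `exact_match` is an example metric of our own, not a built-in prompt flow node:

```python
def exact_match(groundtruth: str, answer: str) -> int:
    """Per-line score: 1 if the answer matches the ground truth, else 0."""
    return int(groundtruth.strip().lower() == answer.strip().lower())

def aggregate(per_line_scores: list) -> dict:
    """Aggregation step: reduce per-line scores to the metrics the run logs."""
    accuracy = sum(per_line_scores) / len(per_line_scores) if per_line_scores else 0.0
    return {"accuracy": accuracy}
```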

articles/ai-foundry/how-to/flow-process-image.md

Lines changed: 3 additions & 0 deletions
@@ -5,6 +5,7 @@ description: Learn how to use images in prompt flow.
 ms.service: azure-ai-foundry
 ms.custom:
 - build-2024
+- hub-only
 ms.topic: how-to
 ms.date: 06/30/2025
 ms.reviewer: none
@@ -28,6 +29,8 @@ In this article, you learn:
 > - How to create a batch run using image data.
 > - How to consume an online endpoint with image data.
 
+[!INCLUDE [uses-hub-only](../includes/uses-hub-only.md)]
+
 ## Image type in prompt flow
 
 Prompt flow input and output support Image as a new data type.

articles/ai-foundry/how-to/flow-tune-prompts-using-variants.md

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,7 @@ ms.service: azure-ai-foundry
 ms.custom:
 - ignite-2023
 - build-2024
+- hub-only
 ms.topic: how-to
 ms.date: 3/31/2025
 ms.reviewer: none
@@ -25,6 +26,8 @@ Crafting a good prompt is a challenging task that requires much creativity, clar
 
 Variants can help you test the model's behavior under different conditions, such as different wording, formatting, context, temperature, or top-k. You can compare them to find the prompt and configuration that best maximizes the model's accuracy, diversity, or coherence.
 
+[!INCLUDE [uses-hub-only](../includes/uses-hub-only.md)]
+
 ## Variants in Prompt flow
 
 With prompt flow, you can use variants to tune your prompt. A variant refers to a specific version of a tool node that has distinct settings. Currently, variants are supported only in the [LLM tool](prompt-flow-tools/llm-tool.md). For example, in the LLM tool, a new variant can represent either different prompt content or different connection settings.

articles/ai-foundry/how-to/prompt-flow-tools/embedding-tool.md

Lines changed: 6 additions & 1 deletion
@@ -6,6 +6,7 @@ ms.service: azure-ai-foundry
 ms.custom:
 - ignite-2023
 - build-2024
+- hub-only
 ms.topic: reference
 ms.date: 6/30/2025
 ms.reviewer: none
@@ -21,9 +22,13 @@ ms.update-cycle: 180-days
 
 The prompt flow Embedding tool enables you to convert text into dense vector representations for various natural language processing tasks.
 
-> [!NOTE]
+> [!TIP]
 > For chat and completion tools, learn more about the large language model [(LLM) tool](llm-tool.md).
 
+## Prerequisites
+
+[!INCLUDE [hub-only-prereq](../../includes/hub-only-prereq.md)]
+
 ## Build with the Embedding tool
 
 1. Create or open a flow in [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs). For more information, see [Create a flow](../flow-develop.md).
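As background for the dense vectors the Embedding tool produces: downstream tasks usually compare embeddings with cosine similarity. A minimal, library-free sketch of that comparison (not part of the Embedding tool itself):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two dense embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0; orthogonal vectors score 0.0.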

articles/ai-foundry/how-to/prompt-flow-tools/index-lookup-tool.md

Lines changed: 5 additions & 0 deletions
@@ -5,6 +5,7 @@ description: This article introduces you to the Index Lookup tool for flows in A
 ms.service: azure-ai-foundry
 ms.custom:
 - build-2024
+- hub-only
 ms.topic: reference
 ms.date: 6/30/2025
 ms.reviewer: none
@@ -20,6 +21,10 @@ ms.update-cycle: 180-days
 
 The prompt flow Index Lookup tool enables the use of common vector indices (such as Azure AI Search, Faiss, and Pinecone) for retrieval-augmented generation in prompt flow. The tool automatically detects the indices in the workspace and lets you select the index to use in the flow.
 
+## Prerequisites
+
+[!INCLUDE [hub-only-prereq](../../includes/hub-only-prereq.md)]
+
 ## Build with the Index Lookup tool
 
 1. Create or open a flow in [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs). For more information, see [Create a flow](../flow-develop.md).
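As background for the Index Lookup tool: a vector index lookup ranks stored entries by similarity to the query embedding. The brute-force sketch below only illustrates the idea; real indices such as Azure AI Search, Faiss, or Pinecone use approximate-nearest-neighbor structures, and the `lookup` helper and entry schema here are assumptions for this example:

```python
import math

def lookup(query, index, top=2):
    """Brute-force vector index lookup: return the top entries ranked by
    cosine similarity between the query vector and each stored vector."""
    def score(vec):
        dot = sum(q * v for q, v in zip(query, vec))
        return dot / (
            math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(v * v for v in vec))
        )
    ranked = sorted(index, key=lambda entry: score(entry["vector"]), reverse=True)
    return ranked[:top]
```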

articles/ai-foundry/how-to/prompt-flow-tools/llm-tool.md

Lines changed: 4 additions & 1 deletion
@@ -6,6 +6,7 @@ ms.service: azure-ai-foundry
 ms.custom:
 - ignite-2023
 - build-2024
+- hub-only
 ms.topic: reference
 ms.date: 6/30/2025
 ms.reviewer: none
@@ -21,11 +22,13 @@ ms.update-cycle: 180-days
 
 To use large language models (LLMs) for natural language processing, you use the prompt flow LLM tool.
 
-> [!NOTE]
+> [!TIP]
 > For embeddings to convert text into dense vector representations for various natural language processing tasks, see [Embedding tool](embedding-tool.md).
 
 ## Prerequisites
 
+[!INCLUDE [hub-only-prereq](../../includes/hub-only-prereq.md)]
+
 Prepare a prompt as described in the [Prompt tool](prompt-tool.md#prerequisites) documentation. The LLM tool and Prompt tool both support [Jinja](https://jinja.palletsprojects.com/en/stable/) templates. For more information and best practices, see [Prompt engineering techniques](../../openai/concepts/advanced-prompt-engineering.md).
 
 ## Build with the LLM tool
