
Commit 24f9c9b

Merge pull request #6225 from MicrosoftDocs/main
Auto Publish – main to live - 2025-07-25 17:05 UTC
2 parents: ca98a5c + 63a9b26

7 files changed: +22 additions, −13 deletions

articles/ai-foundry/agents/concepts/threads-runs-messages.md

Lines changed: 2 additions & 2 deletions

@@ -34,11 +34,11 @@ A custom AI that uses AI models in conjunction with tools.

## Threads

- Threads are conversation sessions between an agent and a user. They store messages and automatically handle truncation to fit content into a model’s context. When you create a thread, you can append new messages to it as users respond.
+ Threads are conversation sessions between an agent and a user. They store messages and automatically handle truncation to fit content into a model’s context. When you create a thread, you can append new messages (maximum of 100,000 per thread) to it as users respond.

## Messages

- Messages are the individual pieces of communication within a thread. They can be created by either the agent or the user and can include text, or other files. Messages are stored as a list within the thread, allowing for a structured and organized conversation flow.
+ Messages are the individual pieces of communication within a thread. They can be created by either the agent or the user and can include text, or other files. Messages are stored as a list within the thread, allowing for a structured and organized conversation flow. You can attach up to 100,000 messages to a single thread.

## Runs
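The new per-thread message limit is easiest to see against the API calls it constrains. Below is a minimal curl sketch, assuming an Assistants-compatible Azure OpenAI REST surface; the resource name, `api-version`, and thread ID are placeholders, not values from this commit.

```bash
# Sketch: create a thread, then append a user message to it.
# Endpoint, api-version, and thread ID are placeholders/assumptions.

# Create an empty thread
curl "https://<your-resource>.openai.azure.com/openai/threads?api-version=2024-05-01-preview" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}'

# Append a message to the thread; each message counts toward the
# 100,000-message ceiling described in the updated article text.
curl "https://<your-resource>.openai.azure.com/openai/threads/<thread_id>/messages?api-version=2024-05-01-preview" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"role": "user", "content": "What are the store hours on weekends?"}'
```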

articles/ai-foundry/agents/quotas-limits.md

Lines changed: 7 additions & 4 deletions

@@ -21,10 +21,13 @@ The following sections provide you with a guide to the default quotas and limits

| Limit Name | Limit Value |
|--|--|
- | Max files per agent/thread | 10,000 |
- | Max file size for agents & fine-tuning | 512 MB |
- | Max size for all uploaded files for agents |200 GB |
- | agents token limit | 2,000,000 token limit |
+ | Maximum number of files per agent/thread | 10,000 |
+ | Maximum file size for agents & fine-tuning | 512 MB |
+ | Maximum size for all uploaded files for agents | 300 GB |
+ | Maximum file size in tokens for attaching to a vector store | 2,000,000 tokens |
+ | Maximum number of messages per thread | 100,000 |
+ | Maximum size of `text` content per message | 1,500,000 characters |
+ | Maximum number of tools registered per agent | 128 |

## Quotas and limits for Azure OpenAI models
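As a quick illustration of how two of the new limits could be checked client-side before an upload, here is a hedged shell sketch; the file names are placeholders and the thresholds come from the table above.

```bash
# Sketch: sanity-check local inputs against two limits from the table
# (512 MB per file; 1,500,000 characters of text per message).
file=my_data.jsonl          # placeholder file to be uploaded
size_bytes=$(wc -c < "$file")
if [ "$size_bytes" -gt $((512 * 1024 * 1024)) ]; then
  echo "File exceeds the 512 MB per-file limit for agents & fine-tuning." >&2
fi

chars=$(wc -m < message.txt)   # placeholder message body
if [ "$chars" -gt 1500000 ]; then
  echo "Message text exceeds the 1,500,000 character per-message limit." >&2
fi
```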

articles/ai-foundry/openai/how-to/fine-tuning-deploy.md

Lines changed: 8 additions & 2 deletions

@@ -6,7 +6,7 @@ manager: nitinme

ms.service: azure-ai-openai
ms.custom: build-2023, build-2023-dataai, devx-track-python, references_regions
ms.topic: how-to
- ms.date: 07/02/2025
+ ms.date: 07/25/2025
author: mrbullwinkle
ms.author: mbullwin
---

@@ -272,7 +272,7 @@ curl -X PUT "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceG

}'
```

- ### Deploy a model with Azure CLI
+ ## [CLI](#tab/cli)

The following example shows how to use the Azure CLI to deploy your customized model. With the Azure CLI, you must specify a name for the deployment of your customized model. For more information about how to use the Azure CLI to deploy customized models, see [`az cognitiveservices account deployment`](/cli/azure/cognitiveservices/account/deployment).

@@ -297,6 +297,7 @@ az cognitiveservices account deployment create

--sku-capacity "1"
--sku-name "Standard"
```
+
---

[!INCLUDE [Fine-tuning deletion](../includes/fine-tune.md)]

@@ -343,6 +344,11 @@ curl $AZURE_OPENAI_ENDPOINT/openai/deployments/<deployment_name>/chat/completion

-H "api-key: $AZURE_OPENAI_API_KEY" \
-d '{"messages":[{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},{"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},{"role": "user", "content": "Do other Azure services support this too?"}]}'
```
+
+ ## [CLI](#tab/cli)
+
+ Azure CLI is only for control plane operations such as resource creation and [model deployment](/cli/azure/cognitiveservices/account/deployment). For inference operations, use the [REST API](/azure/ai-foundry/openai/reference-preview-latest), or the [language based SDKs](../supported-languages.md).
+
---

### Prompt caching
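The hunk above shows only the tail of the CLI deployment command. For reference, here is a hedged sketch of a complete `az cognitiveservices account deployment create` call; the resource group, account, deployment, and model names are placeholders, and the exact flags should be confirmed against the published article.

```bash
# Sketch: deploy a customized (fine-tuned) model with the Azure CLI.
# All names in angle brackets are placeholders.
az cognitiveservices account deployment create \
  --resource-group "<resource-group>" \
  --name "<azure-openai-resource>" \
  --deployment-name "<deployment-name>" \
  --model-name "<fine-tuned-model-name>" \
  --model-version "1" \
  --model-format OpenAI \
  --sku-capacity "1" \
  --sku-name "Standard"
```

As the new tab content notes, this CLI path covers control plane work only; inference still goes through the REST API or the language SDKs.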

articles/ai-services/speech-service/quickstarts/setup-platform.md

Lines changed: 2 additions & 2 deletions

@@ -6,7 +6,7 @@ author: eric-urban

manager: nitinme
ms.service: azure-ai-speech
ms.topic: quickstart
- ms.date: 7/12/2025
+ ms.date: 7/25/2025
ms.author: eur
ms.custom: devx-track-python, devx-track-js, devx-track-csharp, mode-other, devx-track-dotnet, devx-track-extended-java, devx-track-go, ignite-2023, linux-related-content
zone_pivot_groups: programming-languages-ai-services

@@ -49,7 +49,7 @@ zone_pivot_groups: programming-languages-ai-services

## Code samples

- In depth samples are available in the [Azure-Samples/cognitive-services-speech-sdk](https://aka.ms/csspeech/samples) repository on GitHub. There are samples for C# (including UWP and Unity), C++, Java, JavaScript (including Browser and Node.js), Objective-C, Python, and Swift. Code samples for Go are available in the [Microsoft/cognitive-services-speech-sdk-go](https://github.com/Microsoft/cognitive-services-speech-sdk-go) repository on GitHub.
+ Code samples are available in the [Azure-Samples/cognitive-services-speech-sdk](https://aka.ms/csspeech/samples) repository on GitHub. There are samples for C# (including Universal Windows Platform (UWP) and Unity), C++, Java, JavaScript (including Browser and Node.js), Objective-C, Python, and Swift. Code samples for Go are available in the [Microsoft/cognitive-services-speech-sdk-go](https://github.com/Microsoft/cognitive-services-speech-sdk-go) repository on GitHub.

## Related content
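A small setup sketch tied to the repositories named in the revised paragraph; the pip package name is the one used by the Python Speech SDK quickstarts and is an assumption here rather than part of this commit.

```bash
# Sketch: clone the sample repositories referenced above and install the
# Python Speech SDK package (package name assumed from the Python quickstart).
git clone https://github.com/Azure-Samples/cognitive-services-speech-sdk.git
git clone https://github.com/Microsoft/cognitive-services-speech-sdk-go.git   # Go samples
pip install azure-cognitiveservices-speech
```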

articles/machine-learning/data-science-virtual-machine/dsvm-tutorial-resource-manager.md

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@ If your environment meets the prerequisites and you know how to use ARM template

## Prerequisites

- * An Azure subscription. If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/services/machine-learning/) before you begin.
+ * An Azure subscription. If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/products/machine-learning/) before you begin.

* You need the [Azure CLI](/cli/azure/install-azure-cli) to use the CLI commands in this document from your **local environment**.
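The prerequisites above boil down to a short CLI flow; here is a hedged sketch, where the resource group name, location, and template file name are placeholders rather than values from this commit.

```bash
# Sketch: sign in, create a resource group, and deploy an ARM template
# from the local environment described in the prerequisites.
az login
az group create --name <resource-group> --location eastus
az deployment group create \
  --resource-group <resource-group> \
  --template-file azuredeploy.json   # placeholder template file name
```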

articles/machine-learning/data-science-virtual-machine/reference-ubuntu-vm.md

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@ available in the `py38_pytorch` environment.

H2O is a fast, in-memory, distributed machine learning and predictive analytics platform. A Python package is installed in both the root and py35 Anaconda environments. An R package is also installed.

- To open H2O from the command line, run `java -jar /dsvm/tools/h2o/current/h2o.jar`. You can configure various available[command-line options](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/starting-h2o.html#from-the-command-line). Browse to the Flow web UI to `http://localhost:54321` to get started. JupyterHub offers sample notebooks.
+ To open H2O from the command line, run `java -jar /dsvm/tools/h2o/current/h2o.jar`. You can configure various available command-line options. Browse to the Flow web UI to `http://localhost:54321` to get started. JupyterHub offers sample notebooks.

### TensorFlow
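Since the revised sentence drops the link to the command-line options, a hedged example of passing one such option may help; it assumes the jar path quoted in the article and H2O's standard `-port` startup flag.

```bash
# Sketch: launch H2O on the DSVM and pin the Flow UI port explicitly.
java -jar /dsvm/tools/h2o/current/h2o.jar -port 54321
# Then browse to http://localhost:54321 to open the Flow web UI.
```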

articles/machine-learning/prompt-flow/troubleshoot-guidance.md

Lines changed: 1 addition & 1 deletion

@@ -146,7 +146,7 @@ You may encounter 409 error from Azure OpenAI, it means you have reached the rat

In this case, if you find the message `request canceled` in the logs, it might be because the OpenAI API call is taking too long and exceeding the timeout limit.

- An OpenAI API timeout could be caused by a network issue or a complex request that requires more processing time. For more information, see [OpenAI API timeout](https://help.openai.com/en/articles/6897186-timeout).
+ An OpenAI API timeout could be caused by a network issue or a complex request that requires more processing time. For more information, see [OpenAI API timeout](https://platform.openai.com/docs/actions/production#timeouts).

Wait a few seconds and retry your request. This action usually resolves any network issues.
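The "wait a few seconds and retry" guidance maps to a simple retry loop; below is a hedged shell sketch, where the endpoint, deployment name, and `api-version` are placeholders.

```bash
# Sketch: retry a chat completions call a few times with a short pause,
# matching the "wait and retry" advice above. Endpoint, deployment name,
# and api-version are placeholders.
for attempt in 1 2 3; do
  if curl --fail --max-time 60 \
    "$AZURE_OPENAI_ENDPOINT/openai/deployments/<deployment_name>/chat/completions?api-version=2024-10-21" \
    -H "Content-Type: application/json" \
    -H "api-key: $AZURE_OPENAI_API_KEY" \
    -d '{"messages":[{"role": "user", "content": "Hello"}]}'; then
    break
  fi
  echo "Attempt $attempt failed; retrying in 5 seconds..." >&2
  sleep 5
done
```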
