articles/ai-services/document-intelligence/studio-overview.md (3 additions, 3 deletions)
@@ -27,7 +27,7 @@ The studio is an online tool to visually explore, understand, train, and integra
* Train custom extraction models to extract fields from documents.
* Get sample code for the language-specific `SDKs` to integrate into your applications.
- Currently, we're undergoing the migration of features from the [Document Intelligence Studio](https://documentintelligence.ai.azure.com/studio) to the new [AI Foundry](https://ai.azure.com/explore/aiservices/vision). There are some differences in the offerings for the two studios, which determine the correct studio for your use case.
+ Currently, we're undergoing the migration of features from the [Document Intelligence Studio](https://documentintelligence.ai.azure.com/studio) to the new [AI Foundry portal](https://ai.azure.com/explore/aiservices/vision). There are some differences in the offerings for the two studios, which determine the correct studio for your use case.
## Choosing the correct studio experience
@@ -37,7 +37,7 @@ There are currently two studios, the [Azure AI Foundry](https://ai.azure.com/exp
Document Intelligence Studio is the legacy experience that contains all features released on or before July 2024. For any of the v2.1, v3.0, or v3.1 features, continue to use the Document Intelligence Studio. Studios provide a visual experience for labeling, training, and validating custom models. For custom document field extraction models, use the Document Intelligence Studio for template and neural models. Custom classification models can only be trained and used on Document Intelligence Studio. Use Document Intelligence Studio if you want to try out GA versions of the models from v2.1, v3.0, and v3.1.
- ### When to use [AI Foundry](https://ai.azure.com/explore/aiservices/vision)
+ ### When to use [AI Foundry portal](https://ai.azure.com/explore/aiservices/vision)
Start with the new Azure AI Foundry and try any of the prebuilt document models from the `2024-02-29-preview` version, including general extraction models like Read or Layout. If you want to build and test a new [Document Field Extraction](https://ai.azure.com/explore/aiservices/vision/document/extraction) model, try our generative AI model, only available in the new AI Foundry.
@@ -210,5 +210,5 @@ Learn how to [connect your AI services hub](../../ai-studio/ai-services/how-to/c
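To make the prebuilt-model guidance in this file concrete, here's a minimal sketch of calling a GA prebuilt model (Read or Layout) outside either studio, using the `azure-ai-formrecognizer` Python SDK. The endpoint, key, and document URL are placeholders, not values from this PR.

```python
# Minimal sketch: analyze a document with a GA prebuilt model (v3.x).
# Assumes `pip install azure-ai-formrecognizer`; endpoint/key/URL are placeholders.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# "prebuilt-read" extracts text only; "prebuilt-layout" adds tables and structure.
poller = client.begin_analyze_document_from_url(
    "prebuilt-layout",
    "https://example.com/sample.pdf",
)
result = poller.result()

for page in result.pages:
    print(f"Page {page.page_number}: {len(page.lines)} lines")
```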
articles/ai-services/openai/assistants-quickstart.md (2 additions, 2 deletions)
@@ -18,9 +18,9 @@ recommendations: false
Azure OpenAI Assistants (Preview) allows you to create AI assistants tailored to your needs through custom instructions and augmented by advanced tools like code interpreter and custom functions.
articles/ai-services/openai/gpt-v-quickstart.md (2 additions, 2 deletions)
@@ -22,9 +22,9 @@ Get started using GPT-4 Turbo with images with the Azure OpenAI Service.
>
> The latest vision-capable models are `gpt-4o` and `gpt-4o mini`. These are in public preview. The latest available GA model is `gpt-4` version `turbo-2024-04-09`.
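As a companion to the quickstart hunk above, here's a minimal sketch of a chat completions call with image input. It assumes a vision-capable deployment such as `gpt-4o`; the resource name, key, API version, deployment name, and image URL are placeholders.

```python
# Minimal sketch: send an image to a vision-capable Azure OpenAI deployment.
# Assumes `pip install openai`; all identifiers below are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="<your-gpt-4o-deployment>",  # deployment name, not model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```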
articles/ai-services/openai/how-to/batch.md (2 additions, 2 deletions)
@@ -89,9 +89,9 @@ In the AI Foundry portal the deployment type will appear as `Global-Batch`.
> [!TIP]
> We recommend enabling **dynamic quota** for all global batch model deployments to help avoid job failures due to insufficient enqueued token quota. Dynamic quota allows your deployment to opportunistically take advantage of more quota when extra capacity is available. When dynamic quota is set to off, your deployment will only be able to process requests up to the enqueued token limit that was defined when you created the deployment.
articles/ai-services/openai/how-to/prompt-caching.md (6 additions, 8 deletions)
@@ -14,7 +14,9 @@ recommendations: false
# Prompt caching
- Prompt caching allows you to reduce overall request latency and cost for longer prompts that have identical content at the beginning of the prompt. *"Prompt"* in this context is referring to the input you send to the model as part of your chat completions request. Rather than reprocess the same input tokens over and over again, the model is able to retain a temporary cache of processed input data to improve overall performance. Prompt caching has no impact on the output content returned in the model response beyond a reduction in latency and cost. For supported models, cached tokens are billed at a [50% discount on input token pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/).
+ Prompt caching allows you to reduce overall request latency and cost for longer prompts that have identical content at the beginning of the prompt. *"Prompt"* in this context refers to the input you send to the model as part of your chat completions request. Rather than reprocess the same input tokens over and over again, the service is able to retain a temporary cache of processed input token computations to improve overall performance. Prompt caching has no impact on the output content returned in the model response beyond a reduction in latency and cost. For supported models, cached tokens are billed at a [50% discount on input token pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) for Standard deployment types and up to [100% discount on input tokens](/azure/ai-services/openai/concepts/provisioned-throughput) for Provisioned deployment types.
+
+ Caches are typically cleared within 5-10 minutes of inactivity and are always removed within one hour of the cache's last use. Prompt caches are not shared between Azure subscriptions.
## Supported models
@@ -28,7 +30,7 @@ Currently only the following models support prompt caching with Azure OpenAI:
## API support
- Official support for prompt caching was first added in API version `2024-10-01-preview`. At this time, only `o1-preview-2024-09-12` and `o1-mini-2024-09-12` models support the `cached_tokens` API response parameter.
+ Official support for prompt caching was first added in API version `2024-10-01-preview`. At this time, only the o1 model family supports the `cached_tokens` API response parameter.
## Getting started
@@ -37,7 +39,7 @@ For a request to take advantage of prompt caching the request must be both:
- A minimum of 1,024 tokens in length.
- The first 1,024 tokens in the prompt must be identical.
- When a match is found between a prompt and the current content of the prompt cache, it's referred to as a cache hit. Cache hits will show up as [`cached_tokens`](/azure/ai-services/openai/reference-preview#cached_tokens) under [`prompt_token_details`](/azure/ai-services/openai/reference-preview#properties-for-prompt_tokens_details) in the chat completions response.
+ When a match is found between the token computations in a prompt and the current content of the prompt cache, it's referred to as a cache hit. Cache hits will show up as [`cached_tokens`](/azure/ai-services/openai/reference-preview#cached_tokens) under [`prompt_tokens_details`](/azure/ai-services/openai/reference-preview#properties-for-prompt_tokens_details) in the chat completions response.
```json
{
@@ -83,8 +85,4 @@ To improve the likelihood of cache hits occurring, you should structure your req
## Can I disable prompt caching?
- Prompt caching is enabled by default. There is no opt-out option.
-
- ## How does prompt caching work for Provisioned deployments?
-
- For supported models on provisioned deployments, we discount up to 100% of cached input tokens. For more information, see our [Provisioned Throughput documentation](/azure/ai-services/openai/concepts/provisioned-throughput).
+ Prompt caching is enabled by default for all supported models. There is no opt-out support for prompt caching.
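To illustrate the behavior this file documents, here's a minimal sketch of observing `cached_tokens`: two requests sharing an identical prefix longer than 1,024 tokens, with the static content placed first as the article recommends. It assumes an o1-family deployment and API version `2024-10-01-preview`; the endpoint, key, and deployment name are placeholders, and older SDK versions may return `None` for `prompt_tokens_details`.

```python
# Minimal sketch: observe prompt-cache hits via usage.prompt_tokens_details.
# Assumes `pip install openai`; endpoint/key/deployment are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-10-01-preview",
)

# Identical static prefix, comfortably past the 1,024-token minimum.
static_prefix = "You are a meticulous document-review assistant. " * 200

for question in ("Summarize the policy.", "List any open risks."):
    response = client.chat.completions.create(
        model="<your-o1-deployment>",  # deployment name, not model name
        messages=[{"role": "user", "content": static_prefix + question}],
    )
    details = response.usage.prompt_tokens_details
    # Expect cached_tokens == 0 on the first call, > 0 on the second (a cache hit).
    print("cached_tokens:", details.cached_tokens if details else "n/a")
```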
articles/ai-services/openai/how-to/weights-and-biases-integration.md (1 addition, 1 deletion)
@@ -87,7 +87,7 @@ Give your Azure OpenAI resource the **Key Vault Secrets Officer** role.
## Link Weights & Biases with Azure OpenAI
- 1. Navigate to [AI Foundry](https://ai.azure.com) and select your Azure OpenAI fine-tuning resource.
+ 1. Navigate to [AI Foundry portal](https://ai.azure.com) and select your Azure OpenAI fine-tuning resource.
:::image type="content" source="../media/how-to/weights-and-biases/manage-integrations.png" alt-text="Screenshot of the manage integrations button." lightbox="../media/how-to/weights-and-biases/manage-integrations.png":::
articles/ai-services/openai/includes/assistants-studio.md (1 addition, 1 deletion)
@@ -41,7 +41,7 @@ Use the **Assistant setup** pane to create a new AI assistant or to select an ex
|**Deployment**| This is where you set which model deployment to use with your assistant. |
|**Functions**| Create custom function definitions for the models to formulate API calls and structure data outputs based on your specifications. |
|**Code interpreter**| Code interpreter provides access to a sandboxed Python environment that can be used to allow the model to test and execute code. |
- |**Files**| You can upload up to 20 files, with a max file size of 512 MB to use with tools. You can upload up to 10,000 files using [AI Foundry](../assistants-quickstart.md?pivots=programming-language-ai-studio). |
+ |**Files**| You can upload up to 20 files, with a max file size of 512 MB to use with tools. You can upload up to 10,000 files using [AI Foundry portal](../assistants-quickstart.md?pivots=ai-foundry-portal). |
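For reference alongside the setup-pane table, here's a minimal sketch of the same configuration done programmatically: an assistant with code interpreter and one uploaded file. It assumes the Assistants v2 preview surface (`api_version="2024-05-01-preview"`); the endpoint, key, deployment name, and file path are placeholders.

```python
# Minimal sketch: create an assistant with code interpreter plus one file.
# Assumes `pip install openai`; all names/paths below are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-05-01-preview",
)

# Files used by tools are uploaded with the "assistants" purpose (512 MB max each).
data_file = client.files.create(file=open("sales.csv", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    name="data-analyst",
    instructions="Answer questions by running Python against the attached file.",
    model="<your-deployment>",  # deployment name, not model name
    tools=[{"type": "code_interpreter"}],
    tool_resources={"code_interpreter": {"file_ids": [data_file.id]}},
)
print(assistant.id)
```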
articles/ai-services/openai/includes/batch/batch-studio.md (1 addition, 1 deletion)
@@ -70,7 +70,7 @@ For this article, we'll create a file named `test.jsonl` and will copy the conte
Once your input file is prepared, you first need to upload it before you can kick off a batch job. File upload can be done either programmatically or via the Studio.
- 1. Sign in to [AI Foundry](https://ai.azure.com).
+ 1. Sign in to [AI Foundry portal](https://ai.azure.com).
2. Select the Azure OpenAI resource where you have a global batch model deployment available.
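Since the hunk above notes that file upload can also be done programmatically, here's a minimal sketch of that path: uploading `test.jsonl` and starting a batch job against a Global-Batch deployment. The endpoint, key, and API version are placeholders.

```python
# Minimal sketch: upload a batch input file and start the job programmatically.
# Assumes `pip install openai`; endpoint/key are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-10-01-preview",
)

# Batch input files are uploaded with the "batch" purpose.
batch_file = client.files.create(file=open("test.jsonl", "rb"), purpose="batch")

# The deployment is named inside each line of test.jsonl, not here.
batch_job = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/chat/completions",
    completion_window="24h",
)
print(batch_job.id, batch_job.status)
```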