articles/ai-services/document-intelligence/studio-overview.md (3 additions, 3 deletions)
@@ -27,7 +27,7 @@ The studio is an online tool to visually explore, understand, train, and integra
* Train custom extraction models to extract fields from documents.
* Get sample code for the language-specific `SDKs` to integrate into your applications.
- Currently, we're undergoing the migration of features from the [Document Intelligence Studio](https://documentintelligence.ai.azure.com/studio) to the new [AI Foundry](https://ai.azure.com/explore/aiservices/vision). There are some differences in the offerings for the two studios, which determine the correct studio for your use case.
+ Currently, we're undergoing the migration of features from the [Document Intelligence Studio](https://documentintelligence.ai.azure.com/studio) to the new [AI Foundry portal](https://ai.azure.com/explore/aiservices/vision). There are some differences in the offerings for the two studios, which determine the correct studio for your use case.
## Choosing the correct studio experience
@@ -37,7 +37,7 @@ There are currently two studios, the [Azure AI Foundry](https://ai.azure.com/exp
Document Intelligence Studio is the legacy experience that contains all features released on or before July 2024. For any of the v2.1, v3.0, or v3.1 features, continue to use the Document Intelligence Studio. Studios provide a visual experience for labeling, training, and validating custom models. For custom document field extraction models, use the Document Intelligence Studio for template and neural models. Custom classification models can only be trained and used on Document Intelligence Studio. Use Document Intelligence Studio if you want to try out GA versions of the models from v2.1, v3.0, and v3.1.
- ### When to use [AI Foundry](https://ai.azure.com/explore/aiservices/vision)
+ ### When to use [AI Foundry portal](https://ai.azure.com/explore/aiservices/vision)
Start with the new Azure AI Foundry and try any of the prebuilt document models from the `2024-02-29-preview` version, including general extraction models like Read or Layout. If you want to build and test a new [Document Field Extraction](https://ai.azure.com/explore/aiservices/vision/document/extraction) model, try our generative AI model, only available in the new AI Foundry.
@@ -210,5 +210,5 @@ Learn how to [connect your AI services hub](../../ai-studio/ai-services/how-to/c
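To make the prebuilt-model guidance in this file concrete, here's a minimal sketch of calling a GA prebuilt model (Read or Layout) outside either studio, using the `azure-ai-formrecognizer` Python SDK. The endpoint, key, and document URL are placeholders, not values from this PR.

```python
# Minimal sketch: analyze a document with a GA prebuilt model (v3.x).
# Assumes `pip install azure-ai-formrecognizer`; endpoint/key/URL are placeholders.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# "prebuilt-read" extracts text only; "prebuilt-layout" adds tables and structure.
poller = client.begin_analyze_document_from_url(
    "prebuilt-layout",
    "https://example.com/sample.pdf",
)
result = poller.result()

for page in result.pages:
    print(f"Page {page.page_number}: {len(page.lines)} lines")
```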
articles/ai-services/openai/assistants-quickstart.md (2 additions, 2 deletions)
@@ -18,9 +18,9 @@ recommendations: false
Azure OpenAI Assistants (Preview) allows you to create AI assistants tailored to your needs through custom instructions and augmented by advanced tools like code interpreter and custom functions.
articles/ai-services/openai/gpt-v-quickstart.md (2 additions, 2 deletions)
@@ -22,9 +22,9 @@ Get started using GPT-4 Turbo with images with the Azure OpenAI Service.
>
> The latest vision-capable models are `gpt-4o` and `gpt-4o mini`. These are in public preview. The latest available GA model is `gpt-4` version `turbo-2024-04-09`.
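As a companion to the quickstart hunk above, here's a minimal sketch of a chat completions call with image input. It assumes a vision-capable deployment such as `gpt-4o`; the resource name, key, API version, deployment name, and image URL are placeholders.

```python
# Minimal sketch: send an image to a vision-capable Azure OpenAI deployment.
# Assumes `pip install openai`; all identifiers below are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="<your-gpt-4o-deployment>",  # deployment name, not model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```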
articles/ai-services/openai/how-to/batch.md (2 additions, 2 deletions)
@@ -89,9 +89,9 @@ In the AI Foundry portal the deployment type will appear as `Global-Batch`.
> [!TIP]
> We recommend enabling **dynamic quota** for all global batch model deployments to help avoid job failures due to insufficient enqueued token quota. Dynamic quota allows your deployment to opportunistically take advantage of more quota when extra capacity is available. When dynamic quota is set to off, your deployment will only be able to process requests up to the enqueued token limit that was defined when you created the deployment.
articles/ai-services/openai/how-to/prompt-caching.md (6 additions, 8 deletions)
@@ -14,7 +14,9 @@ recommendations: false
# Prompt caching
- Prompt caching allows you to reduce overall request latency and cost for longer prompts that have identical content at the beginning of the prompt. *"Prompt"* in this context is referring to the input you send to the model as part of your chat completions request. Rather than reprocess the same input tokens over and over again, the model is able to retain a temporary cache of processed input data to improve overall performance. Prompt caching has no impact on the output content returned in the model response beyond a reduction in latency and cost. For supported models, cached tokens are billed at a [50% discount on input token pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/).
+ Prompt caching allows you to reduce overall request latency and cost for longer prompts that have identical content at the beginning of the prompt. *"Prompt"* in this context refers to the input you send to the model as part of your chat completions request. Rather than reprocess the same input tokens over and over again, the service is able to retain a temporary cache of processed input token computations to improve overall performance. Prompt caching has no impact on the output content returned in the model response beyond a reduction in latency and cost. For supported models, cached tokens are billed at a [50% discount on input token pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) for Standard deployment types and up to [100% discount on input tokens](/azure/ai-services/openai/concepts/provisioned-throughput) for Provisioned deployment types.
+
+ Caches are typically cleared within 5-10 minutes of inactivity and are always removed within one hour of the cache's last use. Prompt caches are not shared between Azure subscriptions.
## Supported models
@@ -28,7 +30,7 @@ Currently only the following models support prompt caching with Azure OpenAI:
## API support
- Official support for prompt caching was first added in API version `2024-10-01-preview`. At this time, only `o1-preview-2024-09-12` and `o1-mini-2024-09-12` models support the `cached_tokens` API response parameter.
+ Official support for prompt caching was first added in API version `2024-10-01-preview`. At this time, only the o1 model family supports the `cached_tokens` API response parameter.
## Getting started
@@ -37,7 +39,7 @@ For a request to take advantage of prompt caching the request must be both:
- A minimum of 1,024 tokens in length.
- The first 1,024 tokens in the prompt must be identical.
- When a match is found between a prompt and the current content of the prompt cache, it's referred to as a cache hit. Cache hits will show up as [`cached_tokens`](/azure/ai-services/openai/reference-preview#cached_tokens) under [`prompt_token_details`](/azure/ai-services/openai/reference-preview#properties-for-prompt_tokens_details) in the chat completions response.
+ When a match is found between the token computations in a prompt and the current content of the prompt cache, it's referred to as a cache hit. Cache hits will show up as [`cached_tokens`](/azure/ai-services/openai/reference-preview#cached_tokens) under [`prompt_tokens_details`](/azure/ai-services/openai/reference-preview#properties-for-prompt_tokens_details) in the chat completions response.
```json
{
@@ -83,8 +85,4 @@ To improve the likelihood of cache hits occurring, you should structure your req
## Can I disable prompt caching?
- Prompt caching is enabled by default. There is no opt-out option.
-
- ## How does prompt caching work for Provisioned deployments?
-
- For supported models on provisioned deployments, we discount up to 100% of cached input tokens. For more information, see our [Provisioned Throughput documentation](/azure/ai-services/openai/concepts/provisioned-throughput).
+ Prompt caching is enabled by default for all supported models. There is no opt-out support for prompt caching.
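To illustrate the behavior this file documents, here's a minimal sketch of observing `cached_tokens`: two requests sharing an identical prefix longer than 1,024 tokens, with the static content placed first as the article recommends. It assumes an o1-family deployment and API version `2024-10-01-preview`; the endpoint, key, and deployment name are placeholders, and older SDK versions may return `None` for `prompt_tokens_details`.

```python
# Minimal sketch: observe prompt-cache hits via usage.prompt_tokens_details.
# Assumes `pip install openai`; endpoint/key/deployment are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-10-01-preview",
)

# Identical static prefix, comfortably past the 1,024-token minimum.
static_prefix = "You are a meticulous document-review assistant. " * 200

for question in ("Summarize the policy.", "List any open risks."):
    response = client.chat.completions.create(
        model="<your-o1-deployment>",  # deployment name, not model name
        messages=[{"role": "user", "content": static_prefix + question}],
    )
    details = response.usage.prompt_tokens_details
    # Expect cached_tokens == 0 on the first call, > 0 on the second (a cache hit).
    print("cached_tokens:", details.cached_tokens if details else "n/a")
```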
articles/ai-services/openai/how-to/weights-and-biases-integration.md (1 addition, 1 deletion)
@@ -87,7 +87,7 @@ Give your Azure OpenAI resource the **Key Vault Secrets Officer** role.
## Link Weights & Biases with Azure OpenAI
- 1. Navigate to [AI Foundry](https://ai.azure.com) and select your Azure OpenAI fine-tuning resource.
+ 1. Navigate to [AI Foundry portal](https://ai.azure.com) and select your Azure OpenAI fine-tuning resource.
:::image type="content" source="../media/how-to/weights-and-biases/manage-integrations.png" alt-text="Screenshot of the manage integrations button." lightbox="../media/how-to/weights-and-biases/manage-integrations.png":::
articles/ai-services/openai/includes/assistants-studio.md (1 addition, 1 deletion)
@@ -41,7 +41,7 @@ Use the **Assistant setup** pane to create a new AI assistant or to select an ex
|**Deployment**| This is where you set which model deployment to use with your assistant. |
|**Functions**| Create custom function definitions for the models to formulate API calls and structure data outputs based on your specifications. |
|**Code interpreter**| Code interpreter provides access to a sandboxed Python environment that can be used to allow the model to test and execute code. |
- |**Files**| You can upload up to 20 files, with a max file size of 512 MB to use with tools. You can upload up to 10,000 files using [AI Foundry](../assistants-quickstart.md?pivots=programming-language-ai-studio). |
+ |**Files**| You can upload up to 20 files, with a max file size of 512 MB to use with tools. You can upload up to 10,000 files using [AI Foundry portal](../assistants-quickstart.md?pivots=ai-foundry-portal). |
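For reference alongside the setup-pane table, here's a minimal sketch of the same configuration done programmatically: an assistant with code interpreter and one uploaded file. It assumes the Assistants v2 preview surface (`api_version="2024-05-01-preview"`); the endpoint, key, deployment name, and file path are placeholders.

```python
# Minimal sketch: create an assistant with code interpreter plus one file.
# Assumes `pip install openai`; all names/paths below are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-05-01-preview",
)

# Files used by tools are uploaded with the "assistants" purpose (512 MB max each).
data_file = client.files.create(file=open("sales.csv", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    name="data-analyst",
    instructions="Answer questions by running Python against the attached file.",
    model="<your-deployment>",  # deployment name, not model name
    tools=[{"type": "code_interpreter"}],
    tool_resources={"code_interpreter": {"file_ids": [data_file.id]}},
)
print(assistant.id)
```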
articles/ai-services/openai/includes/batch/batch-studio.md (1 addition, 1 deletion)
@@ -70,7 +70,7 @@ For this article, we'll create a file named `test.jsonl` and will copy the conte
Once your input file is prepared, you first need to upload it before you can kick off a batch job. File upload can be done either programmatically or via the Studio.
- 1. Sign in to [AI Foundry](https://ai.azure.com).
+ 1. Sign in to [AI Foundry portal](https://ai.azure.com).
2. Select the Azure OpenAI resource where you have a global batch model deployment available.
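Since the hunk above notes that file upload can also be done programmatically, here's a minimal sketch of that path: uploading `test.jsonl` and starting a batch job against a Global-Batch deployment. The endpoint, key, and API version are placeholders.

```python
# Minimal sketch: upload a batch input file and start the job programmatically.
# Assumes `pip install openai`; endpoint/key are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-10-01-preview",
)

# Batch input files are uploaded with the "batch" purpose.
batch_file = client.files.create(file=open("test.jsonl", "rb"), purpose="batch")

# The deployment is named inside each line of test.jsonl, not here.
batch_job = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/chat/completions",
    completion_window="24h",
)
print(batch_job.id, batch_job.status)
```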