articles/ai-foundry/agents/how-to/use-your-own-resources.md (25 additions, 3 deletions)
@@ -6,7 +6,7 @@ services: cognitive-services
 manager: nitinme
 ms.service: azure-ai-agent-service
 ms.topic: how-to
-ms.date: 06/18/2025
+ms.date: 07/23/2025
 author: aahill
 ms.author: aahi
 ms.reviewer: fosteramanda
@@ -19,7 +19,29 @@ Use this article if you want to set up your Foundry project with your own resour
 
 ## Limitations
 
-**Use Azure Cosmos DB for NoSQL to store threads**
+There are some limitations you should be aware of when you plan to use existing resources with the Azure AI Foundry Agent Service.
+
+### If you are using a hub-based project or Azure OpenAI Assistants
+
+At this time, there is no direct upgrade path to migrate existing agents or their associated data assets such as files, threads, or vector stores from a hub-based project to an Azure AI Foundry project. There is also no upgrade path to convert existing Azure OpenAI Assistants into Foundry Agents, nor a way to automatically migrate Assistants' files, threads, or vector stores.
+
+You can reuse your existing model deployments and quota from Azure AI Services or Azure OpenAI resources within a Foundry project.
+
+### SDK usage with hub-based projects
+
+Starting in May 2025, the Azure AI Agent Service uses an endpoint for [Foundry projects](../../what-is-azure-ai-foundry.md#project-types) instead of the connection string that was used for hub-based projects before this time. Connection strings are no longer supported in current versions of the SDKs and REST API. We recommend creating a new Foundry project.
+
+If you want to continue using your hub-based project and connection string, you will need to:
+* Use the connection string for your project, located under **Connection string** in the overview of your project.
+
+    :::image type="content" source="../../media/quickstarts/azure-ai-sdk/connection-string.png" alt-text="A screenshot showing the legacy connection string for a hub-based project.":::
+
+* Use one of the previous versions of the SDK and the associated sample code:
+    * [C#](https://github.com/Azure/azure-sdk-for-net/tree/feature/azure-ai-agents/sdk/ai/Azure.AI.Projects/samples): `1.0.0-beta.2` or earlier
+    * [Python](https://github.com/Azure/azure-sdk-for-python/tree/feature/azure-ai-projects-beta10/sdk/ai/azure-ai-projects/samples/agents): `1.0.0b10` or earlier
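For illustration, a minimal Python sketch of this legacy connection-string pattern, assuming `azure-ai-projects` `1.0.0b10` or earlier and `azure-identity` are installed; the `PROJECT_CONNECTION_STRING` variable name and the agent-listing call are illustrative, not prescribed by the article:

```python
import os

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

# The connection string is copied from the hub-based project's overview page.
project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ["PROJECT_CONNECTION_STRING"],
)

# Sanity check: list existing agents in the hub-based project.
with project_client:
    agents = project_client.agents.list_agents()
    for agent in agents.data:
        print(agent.id, agent.name)
```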
+
+### Azure Cosmos DB for NoSQL to store threads
+
 - Your existing Azure Cosmos DB for NoSQL account used in a [standard setup](#choose-basic-or-standard-agent-setup) must have a total throughput limit of at least 3000 RU/s. Both provisioned throughput and serverless are supported.
 - Three containers will be provisioned in your existing Cosmos DB account, each requiring 1000 RU/s
@@ -70,7 +92,7 @@ Includes everything in the basic setup and fine-grained control over your data b
 
 ## Basic agent setup: Use an existing Azure OpenAI resource
 
-Replace the parameter value for `existingAoaiResourceId` with the full arm resource ID of the Azure OpenAI resource you want to use.
+Replace the parameter value for `existingAoaiResourceId` in the [template](https://github.com/azure-ai-foundry/foundry-samples/tree/main/samples/microsoft/infrastructure-setup/42-basic-agent-setup-with-customization) with the full ARM resource ID of the Azure OpenAI resource you want to use.
 
 1. To get the Azure OpenAI account resource ID, sign in to the Azure CLI and select the subscription with your AI Services account:
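For orientation, a small Python sketch that assembles the ARM resource ID shape the template expects, using placeholder values (substitute your own subscription, resource group, and account name):

```python
# Placeholder values; replace with your own.
subscription_id = "<SUBSCRIPTION>"
resource_group = "<RESOURCE_GROUP>"
account_name = "<RESOURCE_NAME>"

# ARM resource ID format for an Azure OpenAI / AI Services account.
existing_aoai_resource_id = (
    f"/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}"
    f"/providers/Microsoft.CognitiveServices/accounts/{account_name}"
)
print(existing_aoai_resource_id)
```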
 > Starting in May 2025, the Azure AI Agent Service uses an endpoint for [Foundry projects](../../what-is-azure-ai-foundry.md#project-types) instead of the connection string that was previously used for hub-based projects. If you're using a hub-based project, you won't be able to use the current versions of the SDK and REST API. For more information, see [SDK usage with hub-based projects](../how-to/use-your-own-resources.md#sdk-usage-with-hub-based-projects).
 Set this endpoint as an environment variable named `PROJECT_ENDPOINT` in a `.env` file.
+Save your model deployment name as an environment variable named `MODEL_DEPLOYMENT_NAME`.
 
 > [!IMPORTANT]
 > * This quickstart code uses environment variables for sensitive configuration. Never commit your `.env` file to version control; make sure `.env` is listed in your `.gitignore` file.
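A minimal sketch of consuming these variables, assuming the endpoint-based `azure-ai-projects` package, `azure-identity`, and `python-dotenv`; the verification output is illustrative:

```python
import os

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

load_dotenv()  # reads PROJECT_ENDPOINT and MODEL_DEPLOYMENT_NAME from .env

project_client = AIProjectClient(
    endpoint=os.environ["PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential(),
)
model_deployment_name = os.environ["MODEL_DEPLOYMENT_NAME"]
print(f"Using model deployment: {model_deployment_name}")
```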
articles/ai-foundry/concepts/rbac-azure-ai-foundry.md (4 additions, 8 deletions)
@@ -270,7 +270,7 @@ Here's a table of the built-in roles and their permissions for the hub:
 | --- | --- |
 | Owner | Full access to the hub, including the ability to manage and create new hubs and assign permissions. This role is automatically assigned to the hub creator|
 | Contributor | User has full access to the hub, including the ability to create new hubs, but isn't able to manage hub permissions on the existing resource. |
-| Azure AI Administrator (preview) | This role is automatically assigned to the system-assigned managed identity for the hub. The Azure AI Administrator role has the minimum permissions needed for the managed identity to perform its tasks. For more information, see [Azure AI Administrator role (preview)](#azure-ai-administrator-role-preview). |
+| Azure AI Administrator | This role is automatically assigned to the system-assigned managed identity for the hub. The Azure AI Administrator role has the minimum permissions needed for the managed identity to perform its tasks. For more information, see [Azure AI Administrator role](#azure-ai-administrator-role). |
 | Azure AI Developer | Perform all actions except create new hubs and manage the hub permissions. For example, users can create projects, compute, and connections. Users can assign permissions within their project. Users can interact with existing Azure AI resources such as Azure OpenAI, Azure AI Search, and Azure AI services. |
 | Azure AI Inference Deployment Operator | Perform all actions required to create a resource deployment within a resource group. |
 | Reader | Read only access to the hub. This role is automatically assigned to all project members within the hub. |
@@ -279,14 +279,10 @@ The key difference between Contributor and Azure AI Developer is the ability to
 
 Only the Owner and Contributor roles allow you to make a hub. At this time, custom roles can't grant you permission to make hubs.
 
-### Azure AI Administrator role (preview)
+### Azure AI Administrator role
 
 Before 11/19/2024, the system-assigned managed identity created for the hub was automatically assigned the __Contributor__ role for the resource group that contains the hub and projects. Hubs created after this date have the system-assigned managed identity assigned to the __Azure AI Administrator__ role. This role is more narrowly scoped to the minimum permissions needed for the managed identity to perform its tasks.
 
-The __Azure AI Administrator__ role is currently in public preview.
 The __Azure AI Administrator__ role has the following permissions:
 
 ```json
@@ -419,7 +415,7 @@ Here's a table of the built-in roles and their permissions for the project:
 | --- | --- |
 | Owner | Full access to the project, including the ability to assign permissions to project users. |
 | Contributor | User has full access to the project but can't assign permissions to project users. |
-| Azure AI Administrator (preview) | This role is automatically assigned to the system-assigned managed identity for the hub. The Azure AI Administrator role has the minimum permissions needed for the managed identity to perform its tasks. For more information, see [Azure AI Administrator role (preview)](#azure-ai-administrator-role-preview). |
+| Azure AI Administrator | This role is automatically assigned to the system-assigned managed identity for the hub. The Azure AI Administrator role has the minimum permissions needed for the managed identity to perform its tasks. For more information, see [Azure AI Administrator role](#azure-ai-administrator-role). |
 | Azure AI Developer | User can perform most actions, including create deployments, but can't assign permissions to project users. |
 | Azure AI Inference Deployment Operator | Perform all actions required to create a resource deployment within a resource group. |
 | Reader | Read only access to the project. |
@@ -767,4 +763,4 @@ If you create a new hub and encounter errors with the new default role assignmen
 - [How to create an Azure AI Foundry project](../how-to/create-projects.md)
 - [How to create a connection in Azure AI Foundry portal](../how-to/connections-add.md)
@@ -194,7 +194,7 @@ When developing with the OpenAI SDK, you can instrument your code so traces are
 
 ## Trace to console
 
-It may be useful to also trace your application and send the traces to the local execution console. Such approach may result beneficial when running unit tests or integration tests in your application using an automated CI/CD pipeline. Traces can be sent to the console and captured by your CI/CD tool to further analysis.
+It may be useful to also trace your application and send the traces to the local execution console. Such an approach may be beneficial when running unit tests or integration tests in your application using an automated CI/CD pipeline. Traces can be sent to the console and captured by your CI/CD tool for further analysis.
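One possible sketch of the console approach, using the OpenTelemetry SDK for Python; the exporter wiring and span name are assumptions rather than code from the article:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Route every span to stdout so a CI/CD tool can capture it from the build log.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("integration-test"):
    print("spans for this block are printed to the console")
```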
articles/ai-foundry/openai/how-to/fine-tune-test.md (2 additions, 2 deletions)
@@ -90,7 +90,7 @@ The following example shows how to use the REST API to create a model deployment
 
 
 ```bash
-curl -X POST "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>api-version=2025-04-01-preview" \
+curl -X POST "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>?api-version=2025-04-01-preview" \
 -H "Authorization: Bearer <TOKEN>" \
 -H "Content-Type: application/json" \
 -d '{
@@ -203,7 +203,7 @@ To use the [Deployments - Delete REST API](/rest/api/aiservices/accountmanagemen
 Below is the REST API example to delete a deployment:
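For comparison, a Python sketch of the same delete call using the `requests` package; the URL shape and `api-version` mirror the create example above, and every value is a placeholder:

```python
import requests

subscription = "<SUBSCRIPTION>"
resource_group = "<RESOURCE_GROUP>"
resource_name = "<RESOURCE_NAME>"
deployment_name = "<MODEL_DEPLOYMENT_NAME>"
token = "<TOKEN>"  # an Azure access token for the management endpoint

url = (
    "https://management.azure.com"
    f"/subscriptions/{subscription}/resourceGroups/{resource_group}"
    f"/providers/Microsoft.CognitiveServices/accounts/{resource_name}"
    f"/deployments/{deployment_name}?api-version=2025-04-01-preview"
)

response = requests.delete(url, headers={"Authorization": f"Bearer {token}"})
print(response.status_code)
```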
articles/ai-foundry/openai/how-to/fine-tuning-deploy.md (3 additions, 3 deletions)
@@ -197,7 +197,7 @@ The following example shows how to use the REST API to create a model deployment
 
 
 ```bash
-curl -X POST "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>api-version=2024-10-21" \
+curl -X POST "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>?api-version=2024-10-21" \
 -H "Authorization: Bearer <TOKEN>" \
 -H "Content-Type: application/json" \
 -d '{
@@ -231,7 +231,7 @@ The only limitations are that the new region must also support fine-tuning and w
 Below is an example of deploying a model that was fine-tuned in one subscription/region to another.
 
 ```bash
-curl -X PUT "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>api-version=2024-10-21" \
+curl -X PUT "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>?api-version=2024-10-21" \
 -H "Authorization: Bearer <TOKEN>" \
 -H "Content-Type: application/json" \
 -d '{
@@ -401,7 +401,7 @@ To delete a deployment, use the [Deployments - Delete REST API](/rest/api/aiserv
 Below is the REST API example to delete a deployment:
articles/ai-foundry/openai/how-to/prompt-caching.md (7 additions, 3 deletions)
@@ -6,15 +6,15 @@ services: cognitive-services
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: how-to
-ms.date: 07/23/2025
+ms.date: 07/24/2025
 author: mrbullwinkle
 ms.author: mbullwin
 recommendations: false
 ---
 
 # Prompt caching
 
-Prompt caching allows you to reduce overall request latency and cost for longer prompts that have identical content at the beginning of the prompt. *"Prompt"* in this context is referring to the input you send to the model as part of your chat completions request. Rather than reprocess the same input tokens over and over again, the service is able to retain a temporary cache of processed input token computations to improve overall performance. Prompt caching has no impact on the output content returned in the model response beyond a reduction in latency and cost. For supported models, cached tokens are billed at a [discount on input token pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) for Standard deployment types and up to [100% discount on input tokens](/azure/ai-services/openai/concepts/provisioned-throughput) for Provisioned deployment types. If you provide the `user` parameter, it's combined with a prefix hash, allowing you to influence routing and improve cache hit rates. This is especially beneficial when many requests share long, common prefixes.
+Prompt caching allows you to reduce overall request latency and cost for longer prompts that have identical content at the beginning of the prompt. *"Prompt"* in this context is referring to the input you send to the model as part of your chat completions request. Rather than reprocess the same input tokens over and over again, the service is able to retain a temporary cache of processed input token computations to improve overall performance. Prompt caching has no impact on the output content returned in the model response beyond a reduction in latency and cost. For supported models, cached tokens are billed at a [discount on input token pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) for Standard deployment types and up to [100% discount on input tokens](/azure/ai-services/openai/concepts/provisioned-throughput) for Provisioned deployment types.
 
 Caches are typically cleared within 5-10 minutes of inactivity and are always removed within one hour of the cache's last use. Prompt caches aren't shared between Azure subscriptions.
 
@@ -27,13 +27,15 @@ Caches are typically cleared within 5-10 minutes of inactivity and are always re
 
 Official support for prompt caching was first added in API version `2024-10-01-preview`. At this time, only the o-series model family supports the `cached_tokens` API response parameter.
 
-## Get started
+## Getting started
 
 For a request to take advantage of prompt caching the request must be both:
 
 - A minimum of 1,024 tokens in length.
 - The first 1,024 tokens in the prompt must be identical.
 
+Requests are routed based on a hash of the initial prefix of a prompt.
+
 When a match is found between the token computations in a prompt and the current content of the prompt cache, it's referred to as a cache hit. Cache hits will show up as [`cached_tokens`](/azure/ai-services/openai/reference-preview#cached_tokens) under [`prompt_tokens_details`](/azure/ai-services/openai/reference-preview#properties-for-prompt_tokens_details) in the chat completions response.
 
 ```json
@@ -63,6 +65,8 @@ After the first 1,024 tokens cache hits will occur for every 128 additional iden
 
 A single character difference in the first 1,024 tokens will result in a cache miss which is characterized by a `cached_tokens` value of 0. Prompt caching is enabled by default with no additional configuration needed for supported models.
 
+If you provide the [`user`](/azure/ai-foundry/openai/reference-preview-latest#request-body-2) parameter, it's combined with the prefix hash, allowing you to influence routing and improve cache hit rates. This is especially beneficial when many requests share long, common prefixes.
+
 
 ## What is cached?
 
 o1-series models feature support varies by model. For more information, see our dedicated [reasoning models guide](./reasoning.md).
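Bringing the above together, a minimal Python sketch that sends a request with the `user` parameter and reads `cached_tokens` from the response, assuming the `openai` package against an Azure OpenAI deployment; the endpoint, key, deployment, and user values are placeholders:

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-01-preview",
)

response = client.chat.completions.create(
    model="<deployment-name>",
    messages=[
        # A shared prefix of at least 1,024 identical tokens is required for a hit.
        {"role": "system", "content": "<long, stable system prompt>"},
        {"role": "user", "content": "<unique question>"},
    ],
    user="customer-1234",  # combined with the prefix hash to influence routing
)

# On a cache hit, cached_tokens is nonzero under prompt_tokens_details.
details = response.usage.prompt_tokens_details
print("cached tokens:", details.cached_tokens if details else 0)
```

A stable per-caller `user` value keeps requests that share the same long prefix routed together, which is what improves the hit rate.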