Commit 9e5ebac

Initial pass at introducing Developer Tier for FT AOAI models.
1 parent cfb815f commit 9e5ebac

File tree

5 files changed: +230 −86 lines changed

articles/ai-services/openai/how-to/deployment-types.md

Lines changed: 11 additions & 0 deletions
@@ -14,6 +14,8 @@ ms.author: mbullwin

Azure OpenAI provides customers with choices on the hosting structure that fits their business and usage patterns. The service offers two main types of deployments: **standard** and **provisioned**. For a given deployment type, customers can align their workloads with their data processing requirements by choosing an Azure geography (`Standard` or `Provisioned-Managed`), a Microsoft-specified data zone (`DataZone-Standard` or `DataZone Provisioned-Managed`), or Global (`Global-Standard` or `Global Provisioned-Managed`) processing options.
+
+ For fine-tuned models, an additional `Developer` deployment type provides a cost-efficient means of evaluating custom models, but without data residency.

All deployments can perform the exact same inference operations; however, the billing, scale, and performance are substantially different. As part of your solution design, you need to make two key decisions:

- **Data processing location**
@@ -147,6 +149,15 @@ You can use the following policy to disable access to any Azure OpenAI deployment

}
```

+ ## Developer (for fine-tuned models)
+
+ > [!IMPORTANT]
+ > Data stored at rest remains in the designated Azure geography, while data may be processed for inferencing in any Azure OpenAI location. [Learn more about data residency](https://azure.microsoft.com/explore/global-infrastructure/data-residency/).
+
+ **SKU name in code:** `Developer`
+
+ Fine-tuned models support a Developer deployment designed specifically for custom model evaluation. It offers no data residency guarantees and no SLA. To learn more about using the Developer deployment type, see the [fine-tuning guide](./fine-tuning-test.md).
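For illustration, this is where the SKU name goes in the body of an ARM deployment request (mirroring the request shown in the fine-tuning guide; the capacity value is just an example):

```json
{
  "sku": { "name": "developer", "capacity": 50 },
  "properties": {
    "model": { "format": "OpenAI", "name": "<FINE_TUNED_MODEL>", "version": "1" }
  }
}
```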
## Deploy models

:::image type="content" source="../media/deployment-types/deploy-models-new.png" alt-text="Screenshot that shows the model deployment dialog in Azure AI Foundry portal with three deployment types highlighted." lightbox="../media/deployment-types/deploy-models-new.png":::
Lines changed: 210 additions & 0 deletions
@@ -0,0 +1,210 @@
---
title: 'Test a fine-tuned model'
titleSuffix: Azure OpenAI
description: Learn how to test your fine-tuned model with Azure OpenAI Service by using Python, the REST APIs, or Azure AI Foundry portal.
manager: nitinme
ms.service: azure-ai-openai
ms.custom: build-2025
ms.topic: how-to
ms.date: 05/20/2025
author: voutilad
ms.author: davevoutila
---
# Deploy a fine-tuned model for testing (Preview)

After you've fine-tuned a model, you may want to test its quality via the Chat Completions API or the [Evaluations](./evaluations.md) service.

A Developer Tier deployment allows you to deploy your new model without the hourly hosting fee incurred by Standard or Global deployments. The only charges incurred are per-token. Consult the [pricing page](https://aka.ms/aoaipricing) for the most up-to-date pricing.

> [!IMPORTANT]
> Developer Tier offers no availability SLA and no [data residency](https://aka.ms/data-residency). If you require an SLA or data residency, choose an alternative [deployment type](./deployment-types.md) for testing your model.
## Deploy your fine-tuned model

## [Portal](#tab/portal)

To deploy your model candidate, select the fine-tuned model to deploy, and then select **Deploy**.

The **Deploy model** dialog box opens. In the dialog box, enter your **Deployment name** and then select **Developer** from the deployment type drop-down. Select **Create** to start the deployment of your custom model.

:::image type="content" source="../media/fine-tuning/fine-tuning-deploy/deploy-dev-dialogue.png" alt-text="Screenshot that shows how to deploy a custom model in Azure AI Foundry portal." lightbox="../media/fine-tuning/fine-tuning-deploy/deploy-dev-dialogue.png":::

You can monitor the progress of your new deployment on the **Deployments** pane in the Azure AI Foundry portal.
## [Python](#tab/python)

```python
import json
import os
import requests

token = os.getenv("TOKEN")  # see the variable table below for how to generate a token
subscription = "<YOUR_SUBSCRIPTION_ID>"
resource_group = "<YOUR_RESOURCE_GROUP_NAME>"
resource_name = "<YOUR_AZURE_OPENAI_RESOURCE_NAME>"
model_deployment_name = "gpt41-mini-candidate-01" # custom deployment name that you will use to reference the model when making inference calls.

deploy_params = {'api-version': "2024-10-21"}
deploy_headers = {'Authorization': 'Bearer {}'.format(token), 'Content-Type': 'application/json'}

deploy_data = {
    "sku": {"name": "developer", "capacity": 50},
    "properties": {
        "model": {
            "format": "OpenAI",
            "name": "<FINE_TUNED_MODEL>", # retrieve this value from the previous call; it will look like gpt41-mini-candidate-01.ft-b044a9d3cf9c4228b5d393567f693b83
            "version": "1"
        }
    }
}
deploy_data = json.dumps(deploy_data)

request_url = f'https://management.azure.com/subscriptions/{subscription}/resourceGroups/{resource_group}/providers/Microsoft.CognitiveServices/accounts/{resource_name}/deployments/{model_deployment_name}'

print('Creating a new deployment...')

r = requests.put(request_url, params=deploy_params, headers=deploy_headers, data=deploy_data)

print(r)
print(r.reason)
print(r.json())
```
|variable | Definition|
|--------------|-----------|
| token | There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the [Azure portal](https://portal.azure.com). Then run [`az account get-access-token`](/cli/azure/account#az-account-get-access-token()). You can use this token as your temporary authorization token for API testing. We recommend storing this in a new environment variable. |
| subscription | The subscription ID for the associated Azure OpenAI resource. |
| resource_group | The resource group name for your Azure OpenAI resource. |
| resource_name | The Azure OpenAI resource name. |
| model_deployment_name | The custom name for your new fine-tuned model deployment. This is the name that's referenced in your code when making chat completion calls. |
| fine_tuned_model | Retrieve this value from your fine-tuning job results in the previous step. It will look like `gpt41-mini-candidate-01.ft-b044a9d3cf9c4228b5d393567f693b83`. You need to add that value to the deploy_data JSON. Alternatively, you can deploy a checkpoint by passing the checkpoint ID, which appears in the format `ftchkpt-e559c011ecc04fc68eaa339d8227d02d`. |
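Deployment creation is asynchronous, so the `PUT` call above returns before the deployment finishes provisioning. If you want to wait on it from a script, one option is to poll the same management endpoint with a `GET`. The sketch below is illustrative, not part of the official guide: the helper names are made up here, and you should verify the `provisioningState` values against your own responses.

```python
import time


def deployment_url(subscription: str, resource_group: str,
                   resource_name: str, deployment_name: str) -> str:
    # Builds the same ARM resource path used by the PUT request above.
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.CognitiveServices"
        f"/accounts/{resource_name}"
        f"/deployments/{deployment_name}"
    )


def wait_for_deployment(token: str, url: str, timeout_s: int = 900) -> str:
    # Polls provisioningState until the deployment leaves its transitional states.
    import requests  # same dependency as the deployment script above

    headers = {"Authorization": f"Bearer {token}"}
    params = {"api-version": "2024-10-21"}
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        resp = requests.get(url, params=params, headers=headers)
        resp.raise_for_status()
        state = resp.json()["properties"]["provisioningState"]
        if state not in ("Creating", "Accepted"):
            return state  # for example "Succeeded" or "Failed"
        time.sleep(15)
    raise TimeoutError(f"Deployment still provisioning after {timeout_s}s")


# Example usage (fill in real values before running):
# url = deployment_url("<YOUR_SUBSCRIPTION_ID>", "<YOUR_RESOURCE_GROUP_NAME>",
#                      "<YOUR_AZURE_OPENAI_RESOURCE_NAME>", "gpt41-mini-candidate-01")
# print(wait_for_deployment(token, url))
```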
## [REST](#tab/rest)

The following example shows how to use the REST API to create a model deployment for your customized model. You specify the name for the deployment in the request URL.

```bash
curl -X PUT "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>?api-version=2024-10-21" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "sku": {"name": "developer", "capacity": 50},
    "properties": {
        "model": {
            "format": "OpenAI",
            "name": "<FINE_TUNED_MODEL>",
            "version": "1"
        }
    }
}'
```
|variable | Definition|
|--------------|-----------|
| token | There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the [Azure portal](https://portal.azure.com). Then run [`az account get-access-token`](/cli/azure/account#az-account-get-access-token()). You can use this token as your temporary authorization token for API testing. We recommend storing this in a new environment variable. |
| subscription | The subscription ID for the associated Azure OpenAI resource. |
| resource_group | The resource group name for your Azure OpenAI resource. |
| resource_name | The Azure OpenAI resource name. |
| model_deployment_name | The custom name for your new fine-tuned model deployment. This is the name that's referenced in your code when making chat completion calls. |
| fine_tuned_model | Retrieve this value from your fine-tuning job results in the previous step. It will look like `gpt41-mini-candidate-01.ft-b044a9d3cf9c4228b5d393567f693b83`. You need to add that value to the deploy_data JSON. Alternatively, you can deploy a checkpoint by passing the checkpoint ID, which appears in the format `ftchkpt-e559c011ecc04fc68eaa339d8227d02d`. |
### Deploy a model with Azure CLI

The following example shows how to use the Azure CLI to deploy your customized model. With the Azure CLI, you must specify a name for the deployment of your customized model. For more information about how to use the Azure CLI to deploy customized models, see [`az cognitiveservices account deployment`](/cli/azure/cognitiveservices/account/deployment).

To run this Azure CLI command in a console window, you must replace the following _\<placeholders>_ with the corresponding values for your customized model:

| Placeholder | Value |
| --- | --- |
| _\<YOUR_AZURE_SUBSCRIPTION>_ | The name or ID of your Azure subscription. |
| _\<YOUR_RESOURCE_GROUP>_ | The name of your Azure resource group. |
| _\<YOUR_RESOURCE_NAME>_ | The name of your Azure OpenAI resource. |
| _\<YOUR_DEPLOYMENT_NAME>_ | The name you want to use for your model deployment. |
| _\<YOUR_FINE_TUNED_MODEL_ID>_ | The name of your customized model. |
```azurecli
az cognitiveservices account deployment create \
    --resource-group <YOUR_RESOURCE_GROUP> \
    --name <YOUR_RESOURCE_NAME> \
    --deployment-name <YOUR_DEPLOYMENT_NAME> \
    --model-name <YOUR_FINE_TUNED_MODEL_ID> \
    --model-version "1" \
    --model-format OpenAI \
    --sku-capacity "50" \
    --sku-name "Developer"
```

---
[!INCLUDE [Fine-tuning deletion](../includes/fine-tune.md)]

## Use your deployed fine-tuned model

## [Portal](#tab/portal)

After your custom model deploys, you can use it like any other deployed model. You can use the **Playgrounds** in the [Azure AI Foundry portal](https://ai.azure.com) to experiment with your new deployment. You can continue to use the same parameters with your custom model, such as `temperature` and `max_tokens`, as with other deployed models.

:::image type="content" source="../media/quickstarts/playground-load-new.png" alt-text="Screenshot of the Playground pane in Azure AI Foundry portal, with sections highlighted." lightbox="../media/quickstarts/playground-load-new.png":::

You can also use the [Evaluations](./evaluations.md) service to create and run model evaluations against your deployed model candidate, as well as other model versions.
## [Python](#tab/python)

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-02-01"
)

response = client.chat.completions.create(
    model="gpt41-mini-candidate-01", # model = "Custom deployment name you chose for your fine-tuning model"
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
        {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
        {"role": "user", "content": "Do other Azure AI services support this too?"}
    ]
)

print(response.choices[0].message.content)
```
## [REST](#tab/rest)

```bash
curl "$AZURE_OPENAI_ENDPOINT/openai/deployments/<deployment_name>/chat/completions?api-version=2024-10-21" \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -d '{"messages":[{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},{"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},{"role": "user", "content": "Do other Azure AI services support this too?"}]}'
```

---
## Clean up your deployment

Developer deployments are automatically deleted, regardless of activity. To delete a deployment manually, use the [Deployments - Delete REST API](/rest/api/aiservices/accountmanagement/deployments/delete?view=rest-aiservices-accountmanagement-2024-10-01&tabs=HTTP&preserve-view=true) and send an HTTP DELETE to the deployment resource. As with creating deployments, you must include the following parameters:

- Azure subscription ID
- Azure resource group name
- Azure OpenAI resource name
- Name of the deployment to delete

Below is the REST API example to delete a deployment:

```bash
curl -X DELETE "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>?api-version=2024-10-21" \
  -H "Authorization: Bearer <TOKEN>"
```

You can also delete a deployment in the Azure AI Foundry portal, or use the [Azure CLI](/cli/azure/cognitiveservices/account/deployment?preserve-view=true#az-cognitiveservices-account-deployment-delete).

## Next steps

- [Azure OpenAI Quotas & limits](./quota.md)
- [Azure OpenAI deployment types](./deployment-types.md)

articles/ai-services/openai/includes/fine-tuning-python.md

Lines changed: 3 additions & 51 deletions
@@ -274,59 +274,11 @@ Look for your loss to decrease over time, and your accuracy to increase. If you

## Deploy a fine-tuned model

- When the fine-tuning job succeeds, the value of the `fine_tuned_model` variable in the response body is set to the name of your customized model. Your model is now also available for discovery from the [list Models API](/rest/api/azureopenai/models/list). However, you can't issue completion calls to your customized model until your customized model is deployed. You must deploy your customized model to make it available for use with completion calls.
+ Once you're satisfied with the metrics from your fine-tuning job, or you just want to move on to inference, you must deploy the model.

- Unlike the previous SDK commands, deployment must be done using the control plane API which requires separate authorization, a different API path, and a different API version.
-
- |variable | Definition|
- |--------------|-----------|
- | token | There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the [Azure portal](https://portal.azure.com). Then run [`az account get-access-token`](/cli/azure/account#az-account-get-access-token()). You can use this token as your temporary authorization token for API testing. We recommend storing this in a new environment variable. |
- | subscription | The subscription ID for the associated Azure OpenAI resource. |
- | resource_group | The resource group name for your Azure OpenAI resource. |
- | resource_name | The Azure OpenAI resource name. |
- | model_deployment_name | The custom name for your new fine-tuned model deployment. This is the name that will be referenced in your code when making chat completion calls. |
- | fine_tuned_model | Retrieve this value from your fine-tuning job results in the previous step. It will look like `gpt-35-turbo-0125.ft-b044a9d3cf9c4228b5d393567f693b83`. You will need to add that value to the deploy_data json. Alternatively you can also deploy a checkpoint, by passing the checkpoint ID which will appear in the format `ftchkpt-e559c011ecc04fc68eaa339d8227d02d` |
-
- ```python
- import json
- import os
- import requests
-
- token= os.getenv("<TOKEN>")
- subscription = "<YOUR_SUBSCRIPTION_ID>"
- resource_group = "<YOUR_RESOURCE_GROUP_NAME>"
- resource_name = "<YOUR_AZURE_OPENAI_RESOURCE_NAME>"
- model_deployment_name ="gpt-35-turbo-ft" # custom deployment name that you will use to reference the model when making inference calls.
-
- deploy_params = {'api-version': "2024-10-01"} # control plane API version rather than dataplane API for this call
- deploy_headers = {'Authorization': 'Bearer {}'.format(token), 'Content-Type': 'application/json'}
-
- deploy_data = {
-     "sku": {"name": "standard", "capacity": 1},
-     "properties": {
-         "model": {
-             "format": "OpenAI",
-             "name": <"fine_tuned_model">, #retrieve this value from the previous call, it will look like gpt-35-turbo-0125.ft-b044a9d3cf9c4228b5d393567f693b83
-             "version": "1"
-         }
-     }
- }
- deploy_data = json.dumps(deploy_data)
-
- request_url = f'https://management.azure.com/subscriptions/{subscription}/resourceGroups/{resource_group}/providers/Microsoft.CognitiveServices/accounts/{resource_name}/deployments/{model_deployment_name}'
-
- print('Creating a new deployment...')
-
- r = requests.put(request_url, params=deploy_params, headers=deploy_headers, data=deploy_data)
-
- print(r)
- print(r.reason)
- print(r.json())
- ```
-
- Learn more about cross region deployment and use the deployed model [here](../how-to/fine-tuning-deploy.md#use-your-deployed-fine-tuned-model).
+ If you're deploying for further validation, consider deploying for [testing](../how-to/fine-tuning-test.md?tabs=python) using a Developer deployment.

+ If you're ready to deploy for production or have particular data residency needs, follow our [deployment guide](../how-to/fine-tuning-deploy.md?tabs=python).

## Continuous fine-tuning
articles/ai-services/openai/includes/fine-tuning-rest.md

Lines changed: 3 additions & 28 deletions
@@ -223,36 +223,11 @@ Look for your loss to decrease over time, and your accuracy to increase. If you

## Deploy a fine-tuned model

- [!INCLUDE [Fine-tuning deletion](fine-tune.md)]
+ Once you're satisfied with the metrics from your fine-tuning job, or you just want to move on to inference, you must deploy the model.

- The following example shows how to use the REST API to create a model deployment for your customized model. The REST API generates a name for the deployment of your customized model.
+ If you're deploying for further validation, consider deploying for [testing](../how-to/fine-tuning-test.md?tabs=rest) using a Developer deployment.

- |variable | Definition|
- |--------------|-----------|
- | token | There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the [Azure portal](https://portal.azure.com). Then run [`az account get-access-token`](/cli/azure/account#az-account-get-access-token()). You can use this token as your temporary authorization token for API testing. We recommend storing this in a new environment variable. |
- | subscription | The subscription ID for the associated Azure OpenAI resource. |
- | resource_group | The resource group name for your Azure OpenAI resource. |
- | resource_name | The Azure OpenAI resource name. |
- | model_deployment_name | The custom name for your new fine-tuned model deployment. This is the name that will be referenced in your code when making chat completion calls. |
- | fine_tuned_model | Retrieve this value from your fine-tuning job results in the previous step. It will look like `gpt-35-turbo-0125.ft-b044a9d3cf9c4228b5d393567f693b83`. You'll need to add that value to the deploy_data json. Alternatively you can also deploy a checkpoint, by passing the checkpoint ID which will appear in the format `ftchkpt-e559c011ecc04fc68eaa339d8227d02d` |
-
- ```bash
- curl -X POST "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>api-version=2024-10-21" \
-   -H "Authorization: Bearer <TOKEN>" \
-   -H "Content-Type: application/json" \
-   -d '{
-     "sku": {"name": "standard", "capacity": 1},
-     "properties": {
-         "model": {
-             "format": "OpenAI",
-             "name": "<FINE_TUNED_MODEL>",
-             "version": "1"
-         }
-     }
- }'
- ```
-
- Learn more about cross region deployment and use the deployed model [here](../how-to/fine-tuning-deploy.md#use-your-deployed-fine-tuned-model).
+ If you're ready to deploy for production or have particular data residency needs, follow our [deployment guide](../how-to/fine-tuning-deploy.md?tabs=rest).

## Continuous fine-tuning
