articles/machine-learning/how-to-deploy-models-phi-3-5-moe.md (18 additions & 19 deletions)
@@ -41,7 +41,7 @@ You can learn more about the models in their respective model card:
## Prerequisites
-To use Phi-3.5 MoE chat model with Azure AI Studio, you need the following prerequisites:
+To use Phi-3.5 MoE chat model with Azure Machine Learning, you need the following prerequisites:
### A model deployment
@@ -52,7 +52,7 @@ Phi-3.5 MoE chat model can be deployed to our self-hosted managed inference solu
For deployment to a self-hosted managed compute, you must have enough quota in your subscription. If you don't have enough quota available, you can use our temporary quota access by selecting the option **I want to use shared quota and I acknowledge that this endpoint will be deleted in 168 hours.**
> [!div class="nextstepaction"]
-> [Deploy the model to managed compute](../concepts/deployments-overview.md)
+> [Deploy the model to managed compute](concept-model-catalog.md#deploy-models-for-inference-with-managed-compute)
### The inference package installed
@@ -75,7 +75,7 @@ Read more about the [Azure AI inference package and reference](https://aka.ms/az
In this section, you use the [Azure AI model inference API](https://aka.ms/azureai/modelinference) with a chat completions model for chat.
> [!TIP]
-> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure AI Studio with the same code and structure, including Phi-3.5 MoE chat model.
+> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure Machine Learning studio with the same code and structure, including Phi-3.5 MoE chat model.
### Create a client to consume the model
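The hunks that follow assume a chat completions client has been created. A minimal sketch of that setup with the `azure-ai-inference` Python package — the endpoint URL and key are placeholders you'd copy from your own deployment's Consume tab:

```python
import os
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# Placeholder environment variables; set them to the endpoint URL and key
# from your own deployment.
client = ChatCompletionsClient(
    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
    credential=AzureKeyCredential(os.environ["AZUREAI_ENDPOINT_KEY"]),
)
```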
@@ -216,7 +216,7 @@ print_stream(result)
#### Explore more parameters supported by the inference client
-Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
+Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](reference-model-inference-api.md).
```python
from azure.ai.inference.models import ChatCompletionsResponseFormat
```
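As a hedged illustration of such parameters — the names follow the Azure AI model inference API, but verify them against the reference page linked above:

```python
from azure.ai.inference.models import SystemMessage, UserMessage

# A sketch only; `client` is the ChatCompletionsClient created earlier.
response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="How many languages are in the world?"),
    ],
    presence_penalty=0.1,
    frequency_penalty=0.8,
    max_tokens=2048,
    stop=["<|endoftext|>"],
    temperature=0,
    top_p=1,
)
print(response.choices[0].message.content)
```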
@@ -291,7 +291,7 @@ You can learn more about the models in their respective model card:
## Prerequisites
-To use Phi-3.5 MoE chat model with Azure AI Studio, you need the following prerequisites:
+To use Phi-3.5 MoE chat model with Azure Machine Learning studio, you need the following prerequisites:
### A model deployment
@@ -302,7 +302,7 @@ Phi-3.5 MoE chat model can be deployed to our self-hosted managed inference solu
For deployment to a self-hosted managed compute, you must have enough quota in your subscription. If you don't have enough quota available, you can use our temporary quota access by selecting the option **I want to use shared quota and I acknowledge that this endpoint will be deleted in 168 hours.**
> [!div class="nextstepaction"]
-> [Deploy the model to managed compute](../concepts/deployments-overview.md)
+> [Deploy the model to managed compute](concept-model-catalog.md#deploy-models-for-inference-with-managed-compute)
In this section, you use the [Azure AI model inference API](https://aka.ms/azureai/modelinference) with a chat completions model for chat.
> [!TIP]
-> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure AI Studio with the same code and structure, including Phi-3.5 MoE chat model.
+> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure Machine Learning studio with the same code and structure, including Phi-3.5 MoE chat model.
### Create a client to consume the model
@@ -476,7 +476,7 @@ for await (const event of sses) {
#### Explore more parameters supported by the inference client
-Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
+Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API](reference-model-inference-api.md).
```javascript
var messages = [
```
@@ -558,7 +558,7 @@ You can learn more about the models in their respective model card:
## Prerequisites
-To use Phi-3.5 MoE chat model with Azure AI Studio, you need the following prerequisites:
+To use Phi-3.5 MoE chat model with Azure Machine Learning studio, you need the following prerequisites:
### A model deployment
@@ -569,7 +569,7 @@ Phi-3.5 MoE chat model can be deployed to our self-hosted managed inference solu
For deployment to a self-hosted managed compute, you must have enough quota in your subscription. If you don't have enough quota available, you can use our temporary quota access by selecting the option **I want to use shared quota and I acknowledge that this endpoint will be deleted in 168 hours.**
> [!div class="nextstepaction"]
-> [Deploy the model to managed compute](../concepts/deployments-overview.md)
+> [Deploy the model to managed compute](concept-model-catalog.md#deploy-models-for-inference-with-managed-compute)
### The inference package installed
@@ -613,7 +613,7 @@ using System.Reflection;
In this section, you use the [Azure AI model inference API](https://aka.ms/azureai/modelinference) with a chat completions model for chat.
> [!TIP]
-> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure AI Studio with the same code and structure, including Phi-3.5 MoE chat model.
+> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure Machine Learning studio with the same code and structure, including Phi-3.5 MoE chat model.
#### Explore more parameters supported by the inference client
-Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
+Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API](reference-model-inference-api.md).
```csharp
requestOptions = new ChatCompletionsOptions()
```
@@ -837,7 +837,7 @@ You can learn more about the models in their respective model card:
## Prerequisites
-To use Phi-3.5 MoE chat model with Azure AI Studio, you need the following prerequisites:
+To use Phi-3.5 MoE chat model with Azure Machine Learning studio, you need the following prerequisites:
### A model deployment
@@ -848,7 +848,7 @@ Phi-3.5 MoE chat model can be deployed to our self-hosted managed inference solu
For deployment to a self-hosted managed compute, you must have enough quota in your subscription. If you don't have enough quota available, you can use our temporary quota access by selecting the option **I want to use shared quota and I acknowledge that this endpoint will be deleted in 168 hours.**
> [!div class="nextstepaction"]
-> [Deploy the model to managed compute](../concepts/deployments-overview.md)
+> [Deploy the model to managed compute](concept-model-catalog.md#deploy-models-for-inference-with-managed-compute)
### A REST client
@@ -862,7 +862,7 @@ Models deployed with the [Azure AI model inference API](https://aka.ms/azureai/m
In this section, you use the [Azure AI model inference API](https://aka.ms/azureai/modelinference) with a chat completions model for chat.
> [!TIP]
-> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure AI Studio with the same code and structure, including Phi-3.5 MoE chat model.
+> The [Azure AI model inference API](https://aka.ms/azureai/modelinference) allows you to talk with most models deployed in Azure Machine Learning studio with the same code and structure, including Phi-3.5 MoE chat model.
### Create a client to consume the model
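Models deployed to managed compute expose a REST endpoint you can call from any HTTP client. A sketch in Python with `requests` — the URL shape and key are placeholders, and the exact scoring URL depends on your deployment, so copy the real values from your endpoint's details page:

```python
import requests

# Placeholders; substitute the scoring URL and key from your own deployment.
url = "https://<your-endpoint>.<region>.inference.ml.azure.com/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your-endpoint-key>",
}
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How many languages are in the world?"},
    ],
    "max_tokens": 2048,
}

response = requests.post(url, headers=headers, json=payload)
print(response.json()["choices"][0]["message"]["content"])
```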
@@ -1023,7 +1023,7 @@ The last message in the stream has `finish_reason` set, indicating the reason fo
#### Explore more parameters supported by the inference client
-Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
+Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API](reference-model-inference-api.md).
```json
{
```
@@ -1145,9 +1145,8 @@ It is a good practice to start with a low number of instances and scale up as ne
## Related content
-
-* [Azure AI Model Inference API](../reference/reference-model-inference-api.md)
+* [Azure AI Model Inference API](reference-model-inference-api.md)
* [Deploy models as serverless APIs](deploy-models-serverless.md)
* [Consume serverless API endpoints from a different Azure AI Studio project or hub](deploy-models-serverless-connect.md)
* [Region availability for models in serverless API endpoints](deploy-models-serverless-availability.md)
-* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
+* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md (6 additions & 6 deletions)
@@ -297,7 +297,7 @@ import IPython.display as Disp
```python
Disp.Image(requests.get(image_url).content)
```
-:::image type="content" source="../media/how-to/sdks/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="../media/how-to/sdks/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image:
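A hedged sketch of such a request with the Python inference package — `client` is assumed to be a `ChatCompletionsClient` pointed at the vision deployment, and `image_url` is the chart downloaded above:

```python
from azure.ai.inference.models import (
    UserMessage,
    TextContentItem,
    ImageContentItem,
    ImageUrl,
)

# Sketch only; the question text is illustrative.
response = client.complete(
    messages=[
        UserMessage(content=[
            TextContentItem(text="Which conclusion can you draw from the chart?"),
            ImageContentItem(image_url=ImageUrl(url=image_url)),
        ]),
    ],
    max_tokens=2048,
)
print(response.choices[0].message.content)
```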
@@ -631,7 +631,7 @@ img.src = data_url;
```javascript
document.body.appendChild(img);
```
-:::image type="content" source="../media/how-to/sdks/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="../media/how-to/sdks/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image:
-:::image type="content" source="../media/how-to/sdks/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="../media/how-to/sdks/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image:
@@ -1225,7 +1225,7 @@ The last message in the stream has `finish_reason` set, indicating the reason fo
#### Explore more parameters supported by the inference client
-Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference).
+Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](reference-model-inference-api.md).
```json
{
```
@@ -1332,11 +1332,11 @@ Phi-3.5-vision-Instruct can reason across text and images and generate text comp
To see this capability, download an image and encode the information as a `base64` string. The resulting data should be inside a [data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs):
> [!TIP]
-> You will need to construct the data URL using a scripting or programming language. This tutorial uses [this sample image](../media/how-to/sdks/slms-chart-example.jpg) in JPEG format. A data URL has a format as follows: `data:image/jpg;base64,0xABCDFGHIJKLMNOPQRSTUVWXYZ...`.
+> You will need to construct the data URL using a scripting or programming language. This tutorial uses [this sample image](media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg) in JPEG format. A data URL has a format as follows: `data:image/jpg;base64,0xABCDFGHIJKLMNOPQRSTUVWXYZ...`.
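A minimal sketch of that construction in Python — the image location is a placeholder, and any JPEG works the same way:

```python
import base64
import requests

# Placeholder URL; substitute the sample chart or your own JPEG.
image_url = "https://example.com/slms-chart-example.jpg"
image_bytes = requests.get(image_url).content
encoded = base64.b64encode(image_bytes).decode("utf-8")
data_url = f"data:image/jpg;base64,{encoded}"
```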
Visualize the image:
-:::image type="content" source="../media/how-to/sdks/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="../media/how-to/sdks/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-5-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image:
articles/machine-learning/how-to-deploy-models-phi-3-vision.md (5 additions & 5 deletions)
@@ -296,7 +296,7 @@ import IPython.display as Disp
```python
Disp.Image(requests.get(image_url).content)
```
-:::image type="content" source="media/how-to-deploy-models-phi-3-visions/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image:
@@ -630,7 +630,7 @@ img.src = data_url;
```javascript
document.body.appendChild(img);
```
-:::image type="content" source="media/how-to-deploy-models-phi-3-visions/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image:
-:::image type="content" source="media/how-to-deploy-models-phi-3-visions/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image:
@@ -1331,11 +1331,11 @@ Phi-3-vision-128k-Instruct can reason across text and images and generate text c
To see this capability, download an image and encode the information as a `base64` string. The resulting data should be inside a [data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs):
> [!TIP]
-> You will need to construct the data URL using a scripting or programming language. This tutorial uses [this sample image](media/how-to-deploy-models-phi-3-visions/) in JPEG format. A data URL has a format as follows: `data:image/jpg;base64,0xABCDFGHIJKLMNOPQRSTUVWXYZ...`.
+> You will need to construct the data URL using a scripting or programming language. This tutorial uses [this sample image](media/how-to-deploy-models-phi-3-vision/) in JPEG format. A data URL has a format as follows: `data:image/jpg;base64,0xABCDFGHIJKLMNOPQRSTUVWXYZ...`.
Visualize the image:
-:::image type="content" source="media/how-to-deploy-models-phi-3-visions/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
+:::image type="content" source="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="media/how-to-deploy-models-phi-3-vision/slms-chart-example.jpg":::
Now, create a chat completion request with the image: