-> Jais doesn't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Jais models don't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
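Because these models won't honor `response_format = { "type": "json_object" }`, the workaround the note describes is to ask for JSON in the prompt and then validate the reply yourself. A minimal C# sketch of that pattern with the `Azure.AI.Inference` client follows; the endpoint, key, prompt wording, and JSON key are illustrative placeholders only:

```csharp
// Sketch only: request JSON via the prompt, then validate it, since json_object
// response formatting isn't supported by these models.
using System;
using System.Text.Json;
using Azure;
using Azure.AI.Inference;

var client = new ChatCompletionsClient(
    new Uri("https://your-host-name.your-azure-region.inference.ai.azure.com"),
    new AzureKeyCredential("<your-key>"));

var requestOptions = new ChatCompletionsOptions()
{
    Messages =
    {
        new ChatRequestSystemMessage("Reply with a single JSON object and nothing else."),
        new ChatRequestUserMessage("List three colors under the key \"colors\"."),
    },
};

Response<ChatCompletions> response = client.Complete(requestOptions);
// Some package versions expose the reply as response.Value.Choices[0].Message.Content instead.
string reply = response.Value.Content;

try
{
    using JsonDocument doc = JsonDocument.Parse(reply);
    Console.WriteLine(doc.RootElement.GetProperty("colors"));
}
catch (JsonException)
{
    // The model isn't guaranteed to emit valid JSON, so handle parse failures explicitly.
    Console.WriteLine($"Model output was not valid JSON: {reply}");
}
```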
@@ -482,7 +482,7 @@ var response = await client.path("/chat/completions").post({
```

> [!WARNING]
-> Jais doesn't support JSON output formatting (`response_format = { "type":"json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Jais models don't support JSON output formatting (`response_format = { "type":"json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
@@ -580,7 +580,7 @@ Deployment to a serverless API endpoint doesn't require quota from your subscrip

### The inference package installed

-You can consume predictions from this model by using the `Azure.AI.Inference` package from [Nuget](https://www.nuget.org/). To install this package, you need the following prerequisites:
+You can consume predictions from this model by using the `Azure.AI.Inference` package from [NuGet](https://www.nuget.org/). To install this package, you need the following prerequisites:

* The endpoint URL. To construct the client library, you need to pass in the endpoint URL. The endpoint URL has the form `https://your-host-name.your-azure-region.inference.ai.azure.com`, where `your-host-name` is your unique model deployment host name and `your-azure-region` is the Azure region where the model is deployed (for example, eastus2).
* Depending on your model deployment and authentication preference, you need either a key to authenticate against the service, or Microsoft Entra ID credentials. The key is a 32-character string.
@@ -606,7 +606,7 @@ using Azure.Identity;
using Azure.AI.Inference;
```

-This example also use the following namespaces but you may not always need them:
+This example also uses the following namespaces but you may not always need them:
-> Jais doesn't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Jais models don't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
@@ -1088,7 +1088,7 @@ Explore other parameters that you can specify in the inference client. For a ful
```

> [!WARNING]
-> Jais doesn't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Jais models don't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
@@ -1165,14 +1165,14 @@ The following example shows how to handle events when the model detects harmful

## More inference examples

-For more examples of how to use Jais, see the following examples and tutorials:
+For more examples of how to use Jais models, see the following examples and tutorials:

-## Cost and quota considerations for Jais family of models deployed as serverless API endpoints
+## Cost and quota considerations for Jais models deployed as serverless API endpoints

Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
@@ -1189,4 +1189,4 @@ For more information on how to track costs, see [Monitor costs for models offere
* [Deploy models as serverless APIs](deploy-models-serverless.md)
* [Consume serverless API endpoints from a different Azure AI Studio project or hub](deploy-models-serverless-connect.md)
* [Region availability for models in serverless API endpoints](deploy-models-serverless-availability.md)
-* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
+* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
-> Meta Llama doesn't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Meta Llama models don't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
@@ -610,7 +610,7 @@ var response = await client.path("/chat/completions").post({
```

> [!WARNING]
-> Meta Llama doesn't support JSON output formatting (`response_format = { "type":"json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Meta Llama models don't support JSON output formatting (`response_format = { "type":"json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
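To make the *extra parameters* mechanism concrete, here is a minimal C# sketch at the REST level: the `extra-parameters: pass-through` header asks the service to forward parameters it doesn't recognize to the underlying model. The endpoint, key, and the `logprobs` parameter are assumptions for illustration; check which extra parameters your target model actually accepts, and note that your deployment may also require an `api-version` query string.

```csharp
// Sketch only: forward a parameter outside the common list by setting the
// extra-parameters header; endpoint, key, and logprobs are placeholders.
using System;
using System.Net.Http;
using System.Net.Http.Json;

using var http = new HttpClient
{
    BaseAddress = new Uri("https://your-host-name.your-azure-region.inference.ai.azure.com")
};
http.DefaultRequestHeaders.Add("Authorization", "Bearer <your-key>");
// Ask the service to pass unknown parameters through to the underlying model
// instead of rejecting them.
http.DefaultRequestHeaders.Add("extra-parameters", "pass-through");

var payload = new
{
    messages = new[] { new { role = "user", content = "How many languages are in the world?" } },
    logprobs = true // not in the common parameter list; forwarded because of the header above
};

// Some deployments also expect an api-version query string on this route.
var response = await http.PostAsJsonAsync("/chat/completions", payload);
Console.WriteLine(await response.Content.ReadAsStringAsync());
```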
@@ -765,7 +765,7 @@ For deployment to a self-hosted managed compute, you must have enough quota in y

### The inference package installed

-You can consume predictions from this model by using the `Azure.AI.Inference` package from [Nuget](https://www.nuget.org/). To install this package, you need the following prerequisites:
+You can consume predictions from this model by using the `Azure.AI.Inference` package from [NuGet](https://www.nuget.org/). To install this package, you need the following prerequisites:

* The endpoint URL. To construct the client library, you need to pass in the endpoint URL. The endpoint URL has the form `https://your-host-name.your-azure-region.inference.ai.azure.com`, where `your-host-name` is your unique model deployment host name and `your-azure-region` is the Azure region where the model is deployed (for example, eastus2).
* Depending on your model deployment and authentication preference, you need either a key to authenticate against the service, or Microsoft Entra ID credentials. The key is a 32-character string.
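As a quick illustration of these prerequisites, here is a minimal C# sketch that builds the `Azure.AI.Inference` chat client from the endpoint URL and key; both values are placeholders. For Microsoft Entra ID authentication, a token credential (for example, `DefaultAzureCredential` from `Azure.Identity`) can typically be used in place of the key.

```csharp
// Sketch only: construct the chat client from the endpoint URL and the 32-character key.
using System;
using Azure;
using Azure.AI.Inference;

var endpoint = new Uri("https://your-host-name.your-azure-region.inference.ai.azure.com");
var credential = new AzureKeyCredential("<your-32-character-key>");

var client = new ChatCompletionsClient(endpoint, credential);

var response = client.Complete(new ChatCompletionsOptions()
{
    Messages =
    {
        new ChatRequestSystemMessage("You are a helpful assistant."),
        new ChatRequestUserMessage("How many languages are in the world?"),
    },
});

// Some package versions expose the reply as response.Value.Choices[0].Message.Content instead.
Console.WriteLine(response.Value.Content);
```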
@@ -791,7 +791,7 @@ using Azure.Identity;
using Azure.AI.Inference;
```

-This example also use the following namespaces but you may not always need them:
+This example also uses the following namespaces but you may not always need them:
-> Meta Llama doesn't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Meta Llama models don't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
@@ -1348,7 +1348,7 @@ Explore other parameters that you can specify in the inference client. For a ful
```

> [!WARNING]
-> Meta Llama doesn't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.
+> Meta Llama models don't support JSON output formatting (`response_format = { "type": "json_object" }`). You can always prompt the model to generate JSON outputs. However, such outputs are not guaranteed to be valid JSON.

If you want to pass a parameter that isn't in the list of supported parameters, you can pass it to the underlying model using *extra parameters*. See [Pass extra parameters to the model](#pass-extra-parameters-to-the-model).
@@ -1441,7 +1441,7 @@ The following example shows how to handle events when the model detects harmful

## More inference examples

-For more examples of how to use Meta Llama, see the following examples and tutorials:
+For more examples of how to use Meta Llama models, see the following examples and tutorials:

-## Cost and quota considerations for Meta Llama family of models deployed as serverless API endpoints
+## Cost and quota considerations for Meta Llama models deployed as serverless API endpoints

Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
@@ -1463,7 +1463,7 @@ Each time a project subscribes to a given offer from the Azure Marketplace, a ne

For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).

-## Cost and quota considerations for Meta Llama family of models deployed to managed compute
+## Cost and quota considerations for Meta Llama models deployed to managed compute

Meta Llama models deployed to managed compute are billed based on core hours of the associated compute instance. The cost of the compute instance is determined by the size of the instance, the number of instances running, and the run duration.
@@ -962,7 +962,7 @@ Deployment to a serverless API endpoint doesn't require quota from your subscrip

### The inference package installed

-You can consume predictions from this model by using the `Azure.AI.Inference` package from [Nuget](https://www.nuget.org/). To install this package, you need the following prerequisites:
+You can consume predictions from this model by using the `Azure.AI.Inference` package from [NuGet](https://www.nuget.org/). To install this package, you need the following prerequisites:

* The endpoint URL. To construct the client library, you need to pass in the endpoint URL. The endpoint URL has the form `https://your-host-name.your-azure-region.inference.ai.azure.com`, where `your-host-name` is your unique model deployment host name and `your-azure-region` is the Azure region where the model is deployed (for example, eastus2).
* Depending on your model deployment and authentication preference, you need either a key to authenticate against the service, or Microsoft Entra ID credentials. The key is a 32-character string.
@@ -988,7 +988,7 @@ using Azure.Identity;
using Azure.AI.Inference;
```

-This example also use the following namespaces but you may not always need them:
+This example also uses the following namespaces but you may not always need them:

```csharp
@@ -2010,7 +2010,7 @@ The following example shows how to handle events when the model detects harmful

## More inference examples

-For more examples of how to use Mistral, see the following examples and tutorials:
+For more examples of how to use Mistral models, see the following examples and tutorials:

-## Cost and quota considerations for Mistral family of models deployed as serverless API endpoints
+## Cost and quota considerations for Mistral models deployed as serverless API endpoints

Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
@@ -2041,4 +2041,4 @@ For more information on how to track costs, see [Monitor costs for models offere
* [Deploy models as serverless APIs](deploy-models-serverless.md)
* [Consume serverless API endpoints from a different Azure AI Studio project or hub](deploy-models-serverless-connect.md)
* [Region availability for models in serverless API endpoints](deploy-models-serverless-availability.md)
-* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
+* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)