articles/machine-learning/reference-model-inference-api.md
+48 -7 (48 additions & 7 deletions)
@@ -17,7 +17,7 @@ ms.custom:
# Azure AI Model Inference API | Azure Machine Learning
-The Azure AI Model Inference is an API that exposes a common set of capabilities for foundational models and that can be used by developers to consume predictions from a diverse set of models in a uniform and consistent way. Developers can talk with different models deployed in Azure Machine Learning without changing the underlying code they are using.
+The Azure AI Model Inference is an API that exposes a common set of capabilities for foundational models and that can be used by developers to consume predictions from a diverse set of models in a uniform and consistent way. Developers can talk with different models deployed in Azure AI Studio without changing the underlying code they are using.
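As a rough sketch of what this uniform consumption looks like in practice, the snippet below posts a chat completion request using the route, API version, and headers shown in the request example later in this article. The endpoint URL, credential handling, and prompt are placeholders, not prescribed values.

```python
import requests

# Placeholders: substitute the base URL and credential of your own deployment.
ENDPOINT = "https://<your-endpoint>"
API_VERSION = "2024-04-01-preview"  # version used in the request example later in this article
TOKEN = "<bearer-token>"

def chat(messages: list[dict]) -> dict:
    """Send a chat completion request; the same code targets any model behind the API."""
    response = requests.post(
        f"{ENDPOINT}/chat/completions",
        params={"api-version": API_VERSION},
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        json={"messages": messages},
    )
    response.raise_for_status()
    return response.json()

result = chat([
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "How many languages are in the world?"},
])
print(result)
```

Swapping the deployment behind `ENDPOINT` for a different model should not require changing this client code.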
## Benefits
@@ -31,7 +31,7 @@ While foundational models excel in specific domains, they lack a uniform set of
> * Use smaller models that can run faster on specific tasks.
> * Compose multiple models to develop intelligent experiences.

-Having a uniform way to consume foundational models allow developers to realize all those benefits without changing a single line of code on their applications.
+Having a uniform way to consume foundational models allows developers to realize all those benefits without sacrificing portability or changing the underlying code.
## Availability
@@ -42,11 +42,11 @@ Models deployed to [serverless API endpoints](how-to-deploy-models-serverless.md
> [!div class="checklist"]
> * [Cohere Embed V3](how-to-deploy-models-cohere-embed.md) family of models
> * [Cohere Command R](how-to-deploy-models-cohere-command.md) family of models
-> * [Llama2](how-to-deploy-models-llama.md) family of models
-> * [Llama3](how-to-deploy-models-llama.md) family of models
+> * [Meta Llama 2 chat](how-to-deploy-models-llama.md) family of models
+> * [Meta Llama 3 instruct](how-to-deploy-models-llama.md) family of models
> * [Phi-3](how-to-deploy-models-phi-3.md) family of models
The API is compatible with Azure OpenAI model deployments.
@@ -64,6 +64,7 @@ The API indicates how developers can consume predictions for the following modal
* [Chat completions](reference-model-inference-chat-completions.md): Creates a model response for the given chat conversation.
* [Image embeddings](reference-model-inference-images-embeddings.md): Creates an embedding vector representing the input text and image.

+
### Extensibility
The Azure AI Model Inference API specifies a set of modalities and parameters that models can subscribe to. However, some models may have further capabilities than the ones the API indicates. In those cases, the API allows the developer to pass them as extra parameters in the payload.
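By way of illustration, extra model-specific parameters might travel in the request body alongside the standard fields, roughly as sketched below. The `safe_mode` parameter and the `extra-parameters` header are hypothetical examples used here for the sketch, not values defined by this excerpt.

```python
import requests

# Hypothetical illustration: endpoint, header, and "safe_mode" are placeholders,
# not values defined by this article excerpt.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "How many languages are in the world?"},
    ],
    # Extra, model-specific parameter passed through in the payload.
    "safe_mode": True,
}

response = requests.post(
    "https://<your-endpoint>/chat/completions",
    params={"api-version": "2024-04-01-preview"},
    headers={
        "Authorization": "Bearer <bearer-token>",
        "Content-Type": "application/json",
        # Assumed opt-in header asking the service to forward unknown parameters to the model.
        "extra-parameters": "pass-through",
    },
    json=payload,
)
print(response.status_code, response.json())
```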
@@ -152,6 +153,48 @@ __Response__
> [!TIP]
> You can inspect the property `details.loc` to understand the location of the offending parameter and `details.input` to see the value that was passed in the request.
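For instance, a client could surface those fields when a request is rejected. This is a minimal sketch that assumes the error body is JSON with a `details` list carrying `loc` and `input` entries, as referenced in the tip above; the exact envelope may differ.

```python
import requests

# Minimal sketch: surface the offending parameter from a rejected request.
def explain_rejection(response: requests.Response) -> None:
    error = response.json()
    for detail in error.get("details", []):
        print("Offending parameter:", detail.get("loc"))
        print("Value passed in the request:", detail.get("input"))
```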
+
+## Content safety
+
+The Azure AI model inference API supports Azure AI Content Safety. When using deployments with Azure AI Content Safety on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions.
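By way of illustration only, a client might handle a filtered request roughly as follows. This sketch assumes the service rejects such requests with an HTTP error status whose JSON body describes the content-filter result; the exact status code and payload shape are not shown in this excerpt.

```python
import requests

# Illustrative sketch: how a client might react when content safety blocks a request.
# Assumes an HTTP error status with a JSON body describing the filter result.
def safe_chat(endpoint: str, token: str, messages: list[dict]) -> dict | None:
    response = requests.post(
        f"{endpoint}/chat/completions",
        params={"api-version": "2024-04-01-preview"},
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        json={"messages": messages},
    )
    if response.status_code >= 400:
        # Log the error and let the caller decide how to rephrase the prompt.
        print("Request was rejected:", response.json())
        return None
    return response.json()
```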
+
160
+
The following example shows the response for a chat completion request that has triggered content safety.
161
+
162
+
__Request__
163
+
164
+
```HTTP/1.1
165
+
POST /chat/completions?api-version=2024-04-01-preview
166
+
Authorization: Bearer <bearer-token>
167
+
Content-Type: application/json
168
+
```
169
+
170
+
```JSON
171
+
{
172
+
"messages": [
173
+
{
174
+
"role": "system",
175
+
"content": "You are a helpful assistant"
176
+
},
177
+
{
178
+
"role": "user",
179
+
"content": "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."