articles/machine-learning/reference-model-inference-api.md
+48 -7 (48 additions & 7 deletions)
@@ -17,7 +17,7 @@ ms.custom:
# Azure AI Model Inference API | Azure Machine Learning
-The Azure AI Model Inference is an API that exposes a common set of capabilities for foundational models and that can be used by developers to consume predictions from a diverse set of models in a uniform and consistent way. Developers can talk with different models deployed in Azure Machine Learning without changing the underlying code they are using.
+The Azure AI Model Inference is an API that exposes a common set of capabilities for foundational models and that can be used by developers to consume predictions from a diverse set of models in a uniform and consistent way. Developers can talk with different models deployed in Azure AI Studio without changing the underlying code they are using.
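As a rough sketch of what this uniform consumption looks like in practice, the snippet below posts a chat completion request using the route, API version, and headers shown in the request example later in this article. The endpoint URL, credential handling, and prompt are placeholders, not prescribed values.

```python
import requests

# Placeholders: substitute the base URL and credential of your own deployment.
ENDPOINT = "https://<your-endpoint>"
API_VERSION = "2024-04-01-preview"  # version used in the request example later in this article
TOKEN = "<bearer-token>"

def chat(messages: list[dict]) -> dict:
    """Send a chat completion request; the same code targets any model behind the API."""
    response = requests.post(
        f"{ENDPOINT}/chat/completions",
        params={"api-version": API_VERSION},
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        json={"messages": messages},
    )
    response.raise_for_status()
    return response.json()

result = chat([
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "How many languages are in the world?"},
])
print(result)
```

Swapping the deployment behind `ENDPOINT` for a different model should not require changing this client code.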
## Benefits
@@ -31,7 +31,7 @@ While foundational models excel in specific domains, they lack a uniform set of
> * Use smaller models that can run faster on specific tasks.
> * Compose multiple models to develop intelligent experiences.

-Having a uniform way to consume foundational models allow developers to realize all those benefits without changing a single line of code on their applications.
+Having a uniform way to consume foundational models allows developers to realize all those benefits without sacrificing portability or changing the underlying code.
## Availability
@@ -42,11 +42,11 @@ Models deployed to [serverless API endpoints](how-to-deploy-models-serverless.md
> [!div class="checklist"]
> * [Cohere Embed V3](how-to-deploy-models-cohere-embed.md) family of models
> * [Cohere Command R](how-to-deploy-models-cohere-command.md) family of models
-> * [Llama2](how-to-deploy-models-llama.md) family of models
-> * [Llama3](how-to-deploy-models-llama.md) family of models
+> * [Meta Llama 2 chat](how-to-deploy-models-llama.md) family of models
+> * [Meta Llama 3 instruct](how-to-deploy-models-llama.md) family of models
> * [Phi-3](how-to-deploy-models-phi-3.md) family of models
The API is compatible with Azure OpenAI model deployments.
@@ -64,6 +64,7 @@ The API indicates how developers can consume predictions for the following modal
* [Chat completions](reference-model-inference-chat-completions.md): Creates a model response for the given chat conversation.
* [Image embeddings](reference-model-inference-images-embeddings.md): Creates an embedding vector representing the input text and image.

+
### Extensibility
The Azure AI Model Inference API specifies a set of modalities and parameters that models can subscribe to. However, some models may have further capabilities than the ones the API indicates. In those cases, the API allows the developer to pass them as extra parameters in the payload.
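By way of illustration, extra model-specific parameters might travel in the request body alongside the standard fields, roughly as sketched below. The `safe_mode` parameter and the `extra-parameters` header are hypothetical examples used here for the sketch, not values defined by this excerpt.

```python
import requests

# Hypothetical illustration: endpoint, header, and "safe_mode" are placeholders,
# not values defined by this article excerpt.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "How many languages are in the world?"},
    ],
    # Extra, model-specific parameter passed through in the payload.
    "safe_mode": True,
}

response = requests.post(
    "https://<your-endpoint>/chat/completions",
    params={"api-version": "2024-04-01-preview"},
    headers={
        "Authorization": "Bearer <bearer-token>",
        "Content-Type": "application/json",
        # Assumed opt-in header asking the service to forward unknown parameters to the model.
        "extra-parameters": "pass-through",
    },
    json=payload,
)
print(response.status_code, response.json())
```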
@@ -152,6 +153,48 @@ __Response__
> [!TIP]
> You can inspect the property `details.loc` to understand the location of the offending parameter and `details.input` to see the value that was passed in the request.
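For instance, a client could surface those fields when a request is rejected. This is a minimal sketch that assumes the error body is JSON with a `details` list carrying `loc` and `input` entries, as referenced in the tip above; the exact envelope may differ.

```python
import requests

# Minimal sketch: surface the offending parameter from a rejected request.
def explain_rejection(response: requests.Response) -> None:
    error = response.json()
    for detail in error.get("details", []):
        print("Offending parameter:", detail.get("loc"))
        print("Value passed in the request:", detail.get("input"))
```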
+
+## Content safety
+
+The Azure AI model inference API supports Azure AI Content Safety. When using deployments with Azure AI Content Safety on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions.
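By way of illustration only, a client might handle a filtered request roughly as follows. This sketch assumes the service rejects such requests with an HTTP error status whose JSON body describes the content-filter result; the exact status code and payload shape are not shown in this excerpt.

```python
import requests

# Illustrative sketch: how a client might react when content safety blocks a request.
# Assumes an HTTP error status with a JSON body describing the filter result.
def safe_chat(endpoint: str, token: str, messages: list[dict]) -> dict | None:
    response = requests.post(
        f"{endpoint}/chat/completions",
        params={"api-version": "2024-04-01-preview"},
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        json={"messages": messages},
    )
    if response.status_code >= 400:
        # Log the error and let the caller decide how to rephrase the prompt.
        print("Request was rejected:", response.json())
        return None
    return response.json()
```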
+
160
+
The following example shows the response for a chat completion request that has triggered content safety.
161
+
162
+
__Request__
163
+
164
+
```HTTP/1.1
165
+
POST /chat/completions?api-version=2024-04-01-preview
166
+
Authorization: Bearer <bearer-token>
167
+
Content-Type: application/json
168
+
```
169
+
170
+
```JSON
171
+
{
172
+
"messages": [
173
+
{
174
+
"role": "system",
175
+
"content": "You are a helpful assistant"
176
+
},
177
+
{
178
+
"role": "user",
179
+
"content": "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."