
Commit 7d65eb0

Update deploy-jamba-models.md
1 parent 0ef9615 commit 7d65eb0

File tree

1 file changed: +3 −1 lines changed

articles/ai-studio/how-to/deploy-jamba-models.md

Lines changed: 3 additions & 1 deletion
@@ -107,7 +107,9 @@ For more information on using the APIs, see the [reference](#reference-for-jamba
 
 Since Jamba Instruct is fine-tuned for chat completion, we support the route `/chat/completions` as part of the [Azure AI Model Inference API](../reference/reference-model-inference-api.md) for multi-turn chat or single-turn question-answering. AI21's [Jamba Instruct model](https://docs.ai21.com/reference/jamba-instruct-api) can also be used. For more information about the REST endpoint being called, visit [AI21's REST documentation](https://docs.ai21.com/reference/jamba-instruct-api).
 
-The [Azure AI Model Inference API](reference-model-inference-api.md) schema can be found in the [reference for Chat Completions](../reference/reference-model-inference-chat-completions.md) article and an [OpenAPI specification can be obtained from the endpoint itself](../reference/reference-model-inference-api.md?tabs=rest#getting-started).
+### Azure AI Model Inference API
+
+The [Azure AI Model Inference API](../reference/reference-model-inference-api.md) schema can be found in the [reference for Chat Completions](../reference/reference-model-inference-chat-completions.md) article and an [OpenAPI specification can be obtained from the endpoint itself](../reference/reference-model-inference-api.md?tabs=rest#getting-started).
 
 Single- and multi-turn chat have the same request and response format, except that question answering (single-turn) involves only a single user message in the request, while multi-turn chat requires that you send the entire chat message history in each request. In a multi-turn chat, the message thread includes all messages from the user and the model, ordered oldest to newest, alternating between `user` and `assistant` role messages, optionally starting with a system
 message to provide context. For example, the message stack for the fourth call in a chat request that includes an initial system message would look like this in pseudocode:
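The context paragraph at the end of the hunk describes the multi-turn message stack only in prose, and the pseudocode it refers to sits outside this hunk. As a minimal sketch (not part of the committed file), the fourth call of a chat that opens with a system message could be sent to the `/chat/completions` route like this with Python's `requests`; the endpoint URL, API key, and `Bearer` authorization header are placeholders/assumptions to replace with your own deployment's values:

```python
import requests

# Placeholder values -- substitute the endpoint URL and key from your deployment.
ENDPOINT = "https://<your-serverless-endpoint>"
API_KEY = "<your-api-key>"

# Message stack for the fourth user turn: the optional system message first,
# then every prior user/assistant exchange ordered oldest to newest, ending
# with the new user message.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "First question"},
    {"role": "assistant", "content": "First answer"},
    {"role": "user", "content": "Second question"},
    {"role": "assistant", "content": "Second answer"},
    {"role": "user", "content": "Third question"},
    {"role": "assistant", "content": "Third answer"},
    {"role": "user", "content": "Fourth question"},
]

response = requests.post(
    f"{ENDPOINT}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json={"messages": messages, "max_tokens": 256},
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

A single-turn question-answering request uses the same shape, with only one `user` message (plus the optional system message) in `messages`.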
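The added line also states that an OpenAPI specification can be obtained from the endpoint itself; the exact route is documented in the linked reference article. A hedged sketch, assuming a hypothetical `/swagger.json` path (verify the real path in the reference before relying on it):

```python
import requests

# Hypothetical path -- check the Azure AI Model Inference API reference for the
# actual location of the OpenAPI document served by your endpoint.
spec = requests.get(
    "https://<your-serverless-endpoint>/swagger.json",
    headers={"Authorization": "Bearer <your-api-key>"},
).json()
print(spec["info"]["title"], spec["info"]["version"])
```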
