You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/api-management/openai-compatible-llm-api.md
+6-6Lines changed: 6 additions & 6 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,7 +5,7 @@ ms.service: azure-api-management
5
5
author: dlepow
6
6
ms.author: danlep
7
7
ms.topic: how-to
8
-
ms.date: 05/15/2025
8
+
ms.date: 06/04/2025
9
9
ms.collection: ce-skilling-ai-copilot
10
10
ms.custom: template-how-to
11
11
---
@@ -26,7 +26,7 @@ API Management supports two types of language model APIs for this scenario. Choo
26
26
27
27
***OpenAI-compatible** - Language model endpoints that are compatible with OpenAI's API. Examples include certain models exposed by inference providers such as [Hugging Face Text Generation Inference (TGI)](https://huggingface.co/docs/text-generation-inference/en/index) and [Google Gemini API](https://ai.google.dev/gemini-api/docs).
28
28
29
-
API Management configures an OpenAI-compatible chat completions endpoint.
29
+
For an OpenAI-compatible LLM, API Management configures a chat completions endpoint.
30
30
31
31
***Passthrough** - Other language model endpoints that aren't compatible with OpenAI's API. Examples include models deployed in [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) or other providers.
32
32
@@ -93,7 +93,7 @@ To ensure that your LLM API is working as expected, test it in the API Managemen
93
93
94
94
## Example: Google Gemini
95
95
96
-
You can import OpenAI-compatible models from Google Gemini such as `gemini-2.0-flash`. Azure API Management can manage an OpenAI-compatible chat completion endpoint for these models.
96
+
You can import an OpenAI-compatible Google Gemini API to access models such as `gemini-2.0-flash`. For these models, Azure API Management can manage an OpenAI-compatible chat completions endpoint.
97
97
98
98
To import an OpenAI-compatible Gemini model:
99
99
@@ -109,7 +109,7 @@ To import an OpenAI-compatible Gemini model:
109
109
1. Enter a **Display name** and optional **Description** for the API.
110
110
1. In **URL**, enter the following base URL that you copied previously: `https://generativelanguage.googleapis.com/v1beta/openai`
111
111
112
-
1. In **Path**, append a path that your API Management instance uses to access the Gemini API endpoints.
112
+
1. In **Path**, append a path that your API Management instance uses to route requests to the Gemini API endpoints.
113
113
1. In **Type**, select **Create OpenAI API**.
114
114
1. In **Access key**, enter the following:
115
115
1.**Header name**: *Authorization*.
@@ -119,12 +119,12 @@ To import an OpenAI-compatible Gemini model:
119
119
120
120
### Test Gemini model
121
121
122
-
After importing the API, you can test it using the test console in the Azure portal. Choose an OpenAI-compatible model and endpoint for the test.
122
+
After importing the API, you can test the chat completions endpoint for the API.
123
123
124
124
1. Select the API you created in the previous step.
125
125
1. Select the **Test** tab.
126
126
1. Select the `POST Creates a model response for the given chat conversation` operation, which is a `POST` request to the `/chat/completions` endpoint.
127
-
1. In the **Request body** section, enter the following JSON to specify the model and an example prompt. In this example, the OpenAI-compatible `gemini-2.0-flash` model is used.
127
+
1. In the **Request body** section, enter the following JSON to specify the model and an example prompt. In this example, the `gemini-2.0-flash` model is used.
0 commit comments