
Commit cdc52da

[APIM] Add Gemini LLM example
1 parent c24227f commit cdc52da

File tree: 3 files changed (+53 −1 lines changed)


articles/api-management/openai-compatible-llm-api.md

Lines changed: 53 additions & 1 deletion
@@ -24,14 +24,20 @@ Learn more about managing AI APIs in API Management:

  API Management supports two types of language model APIs for this scenario. Choose the option suitable for your model deployment. The option determines how clients call the API and how the API Management instance routes requests to the AI service.

- * **OpenAI-compatible** - Language model endpoints that are compatible with OpenAI's API. Examples include certain models exposed by inference providers such as [Hugging Face Text Generation Inference (TGI)](https://huggingface.co/docs/text-generation-inference/en/index).
+ * **OpenAI-compatible** - Language model endpoints that are compatible with OpenAI's API. Examples include certain models exposed by inference providers such as [Hugging Face Text Generation Inference (TGI)](https://huggingface.co/docs/text-generation-inference/en/index) and the [Google Gemini API](https://ai.google.dev/gemini-api/docs).

      API Management configures an OpenAI-compatible chat completions endpoint.

  * **Passthrough** - Other language model endpoints that aren't compatible with OpenAI's API. Examples include models deployed in [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) or other providers.

      API Management configures wildcard operations for common HTTP verbs. Clients can append paths to the wildcard operations, and API Management passes requests to the backend.

+ When you import the API, API Management automatically configures:
+
+ * A [backend](backends.md) resource and a [set-backend-service](set-backend-service-policy.md) policy that direct API requests to the LLM endpoint.
+ * (Optionally) Access to the LLM backend using an access key you provide. The key is protected as a secret [named value](api-management-howto-properties.md) in API Management.
+ * (Optionally) Policies to help you monitor and manage the LLM API.

  ## Prerequisites

  - An existing API Management instance. [Create one if you haven't already](get-started-create-service-instance.md).
@@ -70,6 +76,8 @@ To import a language model API to API Management:

  1. Select **Review**.
  1. After settings are validated, select **Create**.

+ API Management creates the API and configures operations for the LLM endpoints. By default, the API requires an API Management subscription.
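Once the API is created, a client sends its subscription key in the default `Ocp-Apim-Subscription-Key` header. The following sketch shows the shape of such a call for an OpenAI-compatible API; the gateway URL, API path, and model name are hypothetical placeholders, and it uses the third-party `requests` library to build the request without sending it:

```python
import json
import requests  # third-party HTTP library, used here for illustration

# Hypothetical values; substitute your gateway URL, API path, and key.
GATEWAY_URL = "https://contoso.azure-api.net"
API_PATH = "my-llm-api"
SUBSCRIPTION_KEY = "<your-subscription-key>"

body = {
    "model": "gemini-2.0-flash",  # example model name
    "messages": [{"role": "user", "content": "How are you?"}],
    "max_tokens": 50,
}

# Build (but don't send) the request, to show the shape of the call.
request = requests.Request(
    "POST",
    f"{GATEWAY_URL}/{API_PATH}/chat/completions",
    headers={
        "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        "Content-Type": "application/json",
    },
    data=json.dumps(body),
).prepare()

# response = requests.Session().send(request)  # uncomment to call the live API
print(request.url)
```

Uncommenting the last `send` line performs the live call; everything above it runs offline.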
  ## Test the LLM API

  To ensure that your LLM API is working as expected, test it in the API Management test console.
@@ -84,5 +92,49 @@ To ensure that your LLM API is working as expected, test it in the API Managemen

  When the test is successful, the backend responds with a successful HTTP response code and some data. Appended to the response is token usage data to help you monitor and manage your language model token consumption.
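A client can read that usage data programmatically. A minimal sketch, assuming the response follows the OpenAI chat completions format with a `usage` object (the sample payload below is invented for illustration, and real responses include more fields):

```python
import json

# A trimmed sample response; real responses include more fields, and the
# exact usage shape can vary by model and configuration.
sample_response = json.loads("""
{
  "choices": [
    {"message": {"role": "assistant", "content": "I'm doing well, thank you!"}}
  ],
  "usage": {"prompt_tokens": 19, "completion_tokens": 9, "total_tokens": 28}
}
""")

# Pull the token counts out of the usage object for monitoring.
usage = sample_response.get("usage", {})
print(f"prompt={usage.get('prompt_tokens')}, "
      f"completion={usage.get('completion_tokens')}, "
      f"total={usage.get('total_tokens')}")
```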

+ ## Example: Google Gemini
+
+ You can import OpenAI-compatible models from Google Gemini, such as `gemini-2.0-flash`. Azure API Management can manage an OpenAI-compatible chat completions endpoint for these models.
+
+ To import an OpenAI-compatible Gemini model:
+
+ 1. Create an API key for the Gemini API in [Google AI Studio](https://aistudio.google.com/apikey) and store it in a safe location.
+ 1. Note the following base URL from the [Gemini OpenAI compatibility documentation](https://ai.google.dev/gemini-api/docs/openai):
+
+     `https://generativelanguage.googleapis.com/v1beta/openai`
+
+ 1. In the [Azure portal](https://portal.azure.com), navigate to your API Management instance.
+ 1. In the left menu, under **APIs**, select **APIs** > **+ Add API**.
+ 1. Under **Define a new API**, select **Language Model API**.
+ 1. On the **Configure API** tab:
+     1. Enter a **Display name** and optional **Description** for the API.
+     1. In **URL**, enter the base URL that you copied previously: `https://generativelanguage.googleapis.com/v1beta/openai`
+     1. In **Path**, append a path that your API Management instance uses to access the Gemini API endpoints.
+     1. In **Type**, select **Create OpenAI API**.
+     1. In **Access key**, enter the following:
+         1. **Header name**: *Authorization*
+         1. **Header value (key)**: `Bearer` followed by the API key for the Gemini API that you created previously.
+ 1. On the remaining tabs, optionally configure policies to manage token consumption, semantic caching, and AI content safety.
+ 1. Select **Create**.
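As a sanity check outside the portal, the same base URL and `Authorization` header can be exercised directly against Gemini's OpenAI-compatible endpoint. This sketch uses the third-party `requests` library with a placeholder key and only builds the request; uncomment the last line to send it:

```python
import json
import requests  # third-party HTTP library

GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai"
GEMINI_API_KEY = "<your-gemini-api-key>"  # placeholder for the Google AI Studio key

request = requests.Request(
    "POST",
    f"{GEMINI_BASE_URL}/chat/completions",
    headers={
        # Same header value API Management forwards when you configure the access key.
        "Authorization": f"Bearer {GEMINI_API_KEY}",
        "Content-Type": "application/json",
    },
    data=json.dumps({
        "model": "gemini-2.0-flash",
        "messages": [{"role": "user", "content": "How are you?"}],
        "max_tokens": 50,
    }),
).prepare()

# response = requests.Session().send(request)  # uncomment to call Gemini directly
```

Calling through API Management instead of this direct endpoint adds the gateway policies (token limits, caching, content safety) configured above.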
+ ### Test the Gemini model
+
+ After importing the API, you can test it using the test console in the Azure portal. Choose an OpenAI-compatible model and endpoint for the test.
+
+ 1. Select the API you created in the previous step.
+ 1. Select the **Test** tab.
+ 1. Select the `POST Creates a model response for the given chat conversation` operation, which is a `POST` request to the `/chat/completions` endpoint.
+ 1. In the **Request body** section, enter the following JSON to specify the model and an example prompt. In this example, the OpenAI-compatible `gemini-2.0-flash` model is used.
+
+     ```json
+     {
+       "model": "gemini-2.0-flash",
+       "messages": [
+         {"role": "system", "content": "You are a helpful assistant"},
+         {"role": "user", "content": "How are you?"}
+       ],
+       "max_tokens": 50
+     }
+     ```
+
+ When the test is successful, the backend responds with a successful HTTP response code and some data. Appended to the response is token usage data to help you monitor and manage your language model token consumption.
+
+ :::image type="content" source="media/openai-compatible-llm-api/gemini-test-small.png" lightbox="media/openai-compatible-llm-api/gemini-test.png" alt-text="Screenshot of testing a Gemini LLM API in the portal.":::

  [!INCLUDE [api-management-define-api-topics.md](../../includes/api-management-define-api-topics.md)]
