---
title: Import an LLM API as REST API - Azure API Management
description: How to import an OpenAI-compatible LLM API or other AI model as a REST API in Azure API Management.
ms.service: azure-api-management
author: dlepow
ms.author: danlep
ms.topic: how-to
ms.date: 05/14/2025
ms.collection: ce-skilling-ai-copilot
ms.custom: template-how-to, build-2024
---

# Import an LLM API

[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]

In this article, you learn how to import an OpenAI-compatible language model API, or a passthrough API to an LLM endpoint, into your API Management instance. Managing the LLM API in API Management lets you apply policies that monitor token consumption, cache responses semantically, and enforce AI content safety.

Learn more about managing AI APIs in API Management:

* [Generative AI gateway capabilities in Azure API Management](genai-gateway-capabilities.md)

## Prerequisites

- An existing API Management instance. [Create one if you haven't already](get-started-create-service-instance.md).
- A self-hosted LLM with an API endpoint. You can use an OpenAI-compatible LLM that's exposed by an inference provider such as [Hugging Face Text Generation Inference (TGI)](https://huggingface.co/docs/text-generation-inference/en/index). Alternatively, you can access an LLM through a provider such as [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html).

    > [!NOTE]
    > API Management policies such as [llm-token-limit](llm-token-limit-policy.md) and [llm-emit-token-metric](llm-emit-token-metric-policy.md) are supported for APIs available through the [Azure AI Model Inference API](/azure/ai-studio/reference/reference-model-inference-api) or with OpenAI-compatible models served through third-party inference providers.

## Import LLM API using the portal

Use the following steps to import an LLM API directly to API Management.

[!INCLUDE [api-management-workspace-availability](../../includes/api-management-workspace-availability.md)]

Depending on the API type you select to import, API Management automatically configures different operations to call the API:

* **OpenAI-compatible API** - Operations for the LLM API's chat completion endpoint
* **Passthrough API** - Wildcard operations for standard verbs `GET`, `HEAD`, `OPTIONS`, and `TRACE`. When you call the API, append any required path or parameters to the API request to pass the request through to an LLM API endpoint.

For an OpenAI-compatible API, you can optionally configure policies to help you monitor and manage the API.

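To illustrate how a passthrough request is routed, the caller's URL is the API Management gateway base URL, followed by the **Path** you set during import, followed by whatever upstream path the wildcard operation should forward. A minimal sketch (the gateway hostname, API path, and upstream endpoint here are hypothetical placeholders, not values from your instance):

```python
from urllib.parse import urlencode

def build_passthrough_url(gateway_base, api_path, upstream_path, params=None):
    """Join the gateway base URL, the API's configured path, and the
    upstream LLM endpoint path that the wildcard operation forwards."""
    url = (
        gateway_base.rstrip("/")
        + "/" + api_path.strip("/")
        + "/" + upstream_path.lstrip("/")
    )
    if params:
        url += "?" + urlencode(params)
    return url

# Hypothetical instance and path values for illustration only
print(build_passthrough_url(
    "https://contoso.azure-api.net",  # gateway hostname (placeholder)
    "my-llm",                         # the Path you set during import
    "v1/chat/completions",            # upstream path appended by the caller
))
# → https://contoso.azure-api.net/my-llm/v1/chat/completions
```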
To import an LLM API to API Management:

1. In the [Azure portal](https://portal.azure.com), navigate to your API Management instance.
1. In the left menu, under **APIs**, select **APIs** > **+ Add API**.
1. Under **Define a new API**, select **OpenAI API**.

    :::image type="content" source="media/openai-compatible-llm-api/openai-api.png" alt-text="Screenshot of creating an OpenAI-compatible API in the portal." :::

1. On the **Configure API** tab:
    1. Enter a **Display name** and optional **Description** for the API.
    1. Enter the **URL** to the LLM API endpoint.
    1. Optionally select one or more **Products** to associate with the API.
    1. In **Path**, append a path that your API Management instance uses to access the LLM API endpoints.
    1. In **Type**, select either **Create OpenAI API** or **Create a passthrough API**.
    1. In **Access key**, optionally enter the authorization header name and API key used to access the LLM API.
    1. Select **Next**.
1. On the **Manage token consumption** tab, optionally enter settings or accept defaults that define the following policies to help monitor and manage the API:
    * [Manage token consumption](llm-token-limit-policy.md)
    * [Track token usage](llm-emit-token-metric-policy.md)
1. On the **Apply semantic caching** tab, optionally enter settings or accept defaults that define the policies to help optimize performance and reduce latency for the API:
    * [Enable semantic caching of responses](azure-openai-enable-semantic-caching.md)
1. On the **AI content safety** tab, optionally enter settings or accept defaults to configure [Azure AI Content Safety](llm-content-safety-policy.md) for the API.
1. Select **Review**.
1. After settings are validated, select **Create**.
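After the API is created, clients can call its chat completion operation through the API Management gateway. The following sketch builds such a request using only the Python standard library; the gateway hostname, API path, operation path, and subscription key are placeholders you would replace with your instance's values:

```python
import json
import urllib.request

# Placeholders: substitute your gateway hostname, API path, and subscription key.
GATEWAY = "https://contoso.azure-api.net"
API_PATH = "my-llm"
SUBSCRIPTION_KEY = "<your-subscription-key>"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request routed through the gateway."""
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode()
    return urllib.request.Request(
        url=f"{GATEWAY}/{API_PATH}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # API Management expects the subscription key in this header by default.
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        },
        method="POST",
    )

req = build_chat_request("Say hello.")
# To send the request: urllib.request.urlopen(req) — requires a reachable gateway.
```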

## Test the LLM API

To ensure that your LLM API is working as expected, test it in the API Management test console.

1. Select the API you created in the previous step.
1. Select the **Test** tab.
1. Select an operation that's compatible with the model in the LLM API.

    The page displays fields for parameters and headers.
1. Enter parameters and headers as needed. Depending on the operation, you might need to configure or update a **Request body**.

    > [!NOTE]
    > In the test console, API Management automatically populates an **Ocp-Apim-Subscription-Key** header and configures the subscription key of the built-in [all-access subscription](api-management-subscriptions.md#all-access-subscription). This key enables access to every API in the API Management instance. Optionally display the **Ocp-Apim-Subscription-Key** header by selecting the "eye" icon next to the **HTTP Request**.
1. Select **Send**.

    When the test is successful, the backend responds with a successful HTTP response code and some data. Appended to the response is token usage data to help you monitor and manage your LLM API token consumption.
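The token usage data appended to the response follows the OpenAI-style `usage` object of a chat completions response. A minimal sketch of reading those counts from a response body (the response shape shown is illustrative, not captured from a live gateway):

```python
import json

# Illustrative chat completions response body; real responses carry the model
# output plus an OpenAI-style "usage" object with token counts.
response_body = json.dumps({
    "choices": [{"message": {"role": "assistant", "content": "Hello!"}}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
})

def token_usage(body: str) -> dict:
    """Return prompt/completion/total token counts from a response body."""
    usage = json.loads(body).get("usage", {})
    return {
        "prompt": usage.get("prompt_tokens", 0),
        "completion": usage.get("completion_tokens", 0),
        "total": usage.get("total_tokens", 0),
    }

print(token_usage(response_body))
# → {'prompt': 12, 'completion': 3, 'total': 15}
```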

[!INCLUDE [api-management-define-api-topics.md](../../includes/api-management-define-api-topics.md)]