Commit 0edf089

Merge pull request #294435 from cephalin/aicontent
add SLM in sidecar tutorial
2 parents: 2e66e3e + a0d9a5c

File tree

5 files changed: +152 additions, −5 deletions


articles/app-service/toc.yml

Lines changed: 2 additions & 0 deletions

@@ -399,6 +399,8 @@
   items:
   - name: Deploy an application that uses OpenAI on App Service
     href: deploy-intelligent-apps.md
+  - name: Run an SLM in sidecar
+    href: tutorial-sidecar-local-small-language-model.md
   - name: Deploy a .NET app with Azure OpenAI and Azure SQL
     href: deploy-intelligent-apps-dotnet-to-azure-sql.md
   - name: WordPress
articles/app-service/tutorial-sidecar-local-small-language-model.md

Lines changed: 147 additions & 0 deletions

@@ -0,0 +1,147 @@
---
title: 'Tutorial: Run a local SLM in a sidecar container'
description: Learn how to run local SLM inferencing for your web app in a sidecar container on Azure App Service, and separate your web app and your AI model for operational efficiency.
ms.topic: tutorial
ms.date: 02/20/2025
ms.author: cephalin
author: cephalin
keywords: azure app service, linux, docker, sidecar, ai, chatbot, slm, small language model, local SLM, Azure tutorial
---

# Run a local SLM in a sidecar container in Azure App Service

In this tutorial, you learn how to run a small language model (SLM) as a sidecar container in Azure App Service and access it from your main Linux container. By the end of this tutorial, you'll have a fashion assistant chat application running in App Service and accessing a model locally.

:::image type="content" source="media/tutorial-sidecar-local-small-language-model/web-app-slm-sidecar.png" alt-text="A screenshot showing a fashion assistant chat app in Azure App Service.":::

Running an SLM locally is beneficial if you want to run a chatbot application without sending your business data over the internet to a cloud-based AI chatbot service. Running the model in App Service offers additional benefits:

- **High-performance pricing tiers**: App Service offers high-performance pricing tiers that help you run AI models at scale.
- **Separation of concerns**: Running an SLM in a sidecar lets you separate your AI logic from your application logic. You can maintain the discrete components separately, such as upgrading your model without affecting your application.
## Prerequisites

* An Azure account with an active subscription. If you don't have an Azure account, you [can create one for free](https://azure.microsoft.com/free/java/).
* A GitHub account. You can also [get one for free](https://github.com/join).
## Performance considerations

Because AI models consume considerable resources, choose a pricing tier that gives you sufficient vCPUs and memory to run your specific model. In practice, you should also choose a CPU-optimized model, because the App Service pricing tiers are CPU-only tiers.

This tutorial uses the [Phi-3 mini model with a 4K context length from Hugging Face](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx). It's designed to run with limited resources and provides strong math and logical reasoning for many common scenarios. It also comes with a CPU-optimized version. In App Service, we tested the model on all premium tiers and found it to perform well in the [P2mv3](https://azure.microsoft.com/pricing/details/app-service/linux/) tier. If your requirements allow, you can run it on a lower tier.
## 1. Inspect the sample in GitHub Codespaces

1. Sign in to your GitHub account and navigate to [https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/fork](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/fork).
1. Select **Create fork**.
1. Select **Code** > **Create codespace on main**. The codespace takes a few minutes to set up.

The sample repository has the following content:

| Content | Description |
|---------------------|-------------|
| *src/phi-3-sidecar* | Docker image code that runs a Python FastAPI endpoint for the Phi-3 mini model. See [How does the Phi-3 sidecar container work?](#how-does-the-phi-3-sidecar-container-work) |
| *src/webapp* | A front-end .NET Blazor application. See [How does the front-end app work?](#how-does-the-front-end-app-work) |
| *infra* | Infrastructure-as-code for deploying a .NET web app in Azure. See [Create Azure Developer CLI templates overview](/azure/developer/azure-developer-cli/make-azd-compatible). |
| *azure.yaml* | Azure Developer CLI configuration that deploys the Blazor application to App Service. See [Create Azure Developer CLI templates overview](/azure/developer/azure-developer-cli/make-azd-compatible). |

## 2. Deploy the front-end application

1. Sign in to your Azure account by using the `azd auth login` command and following the prompt:

    ```bash
    azd auth login
    ```

1. Create the App Service app and deploy the code by using the `azd up` command:

    ```bash
    azd up
    ```

    The `azd up` command might take a few minutes to complete. It uses the Bicep files in your project to create an App Service app in the **P2mv3** pricing tier, then deploys the .NET app in `src/webapp`.
## 3. Add the Phi-3 sidecar

Normally, you'd build your own Phi-3 Docker image and upload it to a container registry. For simplicity, this tutorial uses a preuploaded image in the Microsoft Container Registry. To build and upload the image yourself, see [How to build the Phi-3 Docker image locally](#how-to-build-the-phi-3-docker-image-locally).

1. In the [Azure portal](https://portal.azure.com), navigate to the app's management page.
1. From the left menu, select **Deployment Center**.
1. Select the banner **Interested in adding containers to run alongside your app? Click here to give it a try.**
1. When the page reloads, select the **Containers (new)** tab.
1. Select **Add** and configure the new container as follows:
    - **Name**: *phi-3*
    - **Image source**: **Other container registries**
    - **Image type**: **Public**
    - **Registry server URL**: *mcr.microsoft.com*
    - **Image and tag**: *appsvc/docs/sidecars/sample-experiment:phi3-python-1.0*
1. Select **Apply**.
## 4. Verify the running app

1. In the AZD output, find the URL of your app and navigate to it in the browser. The URL looks like this in the AZD output:

    <pre>
    Deploying services (azd deploy)

      (✓) Done: Deploying service web
      - Endpoint: https://&lt;app-name>.azurewebsites.net/
    </pre>

1. Select a product, ask any question you like about it, and select **Send**.

    :::image type="content" source="media/tutorial-sidecar-local-small-language-model/browse-app.png" alt-text="A screenshot showing an AI chat bot running within App Service.":::

## Frequently asked questions

- [How does the Phi-3 sidecar container work?](#how-does-the-phi-3-sidecar-container-work)
- [How does the front-end app work?](#how-does-the-front-end-app-work)
- [How to build the Phi-3 Docker image locally](#how-to-build-the-phi-3-docker-image-locally)
#### How does the Phi-3 sidecar container work?

The sidecar runs a FastAPI application that listens on port 8000, as specified in its [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/Dockerfile).

The application uses [ONNX Runtime](https://onnxruntime.ai/docs/) to load the Phi-3 model, forwards the HTTP POST data to the model, and streams the response from the model back to the client. For more information, see [model_api.py](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/model_api.py).
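The request flow through the sidecar can be sketched in a few lines. The following stub mimics the pattern of assembling a prompt from the POST data and streaming generated tokens back chunk by chunk; the prompt template, function names, and canned output are hypothetical stand-ins for the real ONNX Runtime generation loop in *model_api.py*:

```python
# Illustrative sketch of the sidecar's request flow (hypothetical names;
# not the actual model_api.py). A real implementation would replace
# fake_generate with an ONNX Runtime generation loop.

def build_prompt(product: str, description: str, message: str) -> str:
    # Combine the three fields from the HTTP POST into one prompt.
    # This chat template is a made-up example, not Phi-3's real format.
    return f"<|user|>Product: {product}\n{description}\n{message}<|end|><|assistant|>"

def fake_generate(prompt: str):
    # Stand-in for token-by-token model inference.
    for token in ["Sure", ",", " here's", " an", " answer", "."]:
        yield token

def stream_response(product: str, description: str, message: str):
    # Yield each generated chunk as it's produced, so the caller
    # can flush it to the HTTP response immediately.
    prompt = build_prompt(product, description, message)
    yield from fake_generate(prompt)

print("".join(stream_response("Hat", "A wool hat.", "Is it warm?")))
```

Streaming the generator output, rather than waiting for the full completion, is what lets the chat UI show partial answers while the model is still generating.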
#### How does the front-end app work?

The front-end app is a basic retrieval-augmented generation (RAG) application. It shows a Razor page that sends three pieces of information to the FastAPI endpoint (at `localhost:8000`) in `Send()`:

- The selected product
- The retrieved product description data
- The user-submitted message

It then outputs the streamed response to the page. For more information, see [Home.razor](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/webapp/Components/Pages/Home.razor).
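The same request/stream round-trip can be exercised end to end with only the Python standard library. In this sketch, a stub HTTP server stands in for the sidecar, and the client reads the chunked response incrementally the way the chat page renders partial output; the `/predict` path and JSON field names are illustrative, not the sample's actual contract:

```python
import http.client
import http.server
import json
import threading

# A stub "sidecar" that streams a canned answer as chunked HTTP,
# plus a client that consumes it incrementally (illustrative only).

class StubSidecar(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # required for chunked responses

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        json.loads(self.rfile.read(length))  # the three prompt fields
        self.send_response(200)
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for piece in [b"Yes, ", b"it packs ", b"small."]:
            # Manually frame each chunk: <hex length>\r\n<data>\r\n
            self.wfile.write(f"{len(piece):x}\r\n".encode() + piece + b"\r\n")
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk

    def log_message(self, *args):
        pass  # keep test output quiet

server = http.server.HTTPServer(("127.0.0.1", 0), StubSidecar)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request(
    "POST",
    "/predict",  # hypothetical path
    json.dumps({
        "product": "Raincoat",
        "description": "A lightweight waterproof raincoat.",
        "message": "Is it packable?",
    }),
    {"Content-Type": "application/json"},
)
resp = conn.getresponse()
parts = []
while True:
    part = resp.read(8)  # read incrementally, as a streaming UI would
    if not part:
        break
    parts.append(part)
answer = b"".join(parts).decode()
server.shutdown()
print(answer)
```

Because `http.client` decodes chunked transfer encoding transparently, each `read()` returns data as soon as a chunk arrives, which is the behavior a streaming chat UI relies on.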
#### How to build the Phi-3 Docker image locally

To build the sidecar image yourself, you need Docker Desktop installed on your local machine.

1. Clone the repository locally.

    ```bash
    git clone https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar
    cd ai-slm-in-app-service-sidecar
    ```

1. Change into the Phi-3 image's source directory and download the model locally.

    ```bash
    cd src/phi-3-sidecar/
    huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --local-dir ./Phi-3-mini-4k-instruct-onnx
    ```

    The [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/Dockerfile) is configured to copy the model from *./Phi-3-mini-4k-instruct-onnx*.

1. Build the Docker image. For example:

    ```bash
    docker build --tag phi-3 .
    ```

To upload the built image to Azure Container Registry, see [Push your first image to your Azure container registry using the Docker CLI](/azure/container-registry/container-registry-get-started-docker-cli).
## More resources

- [Try out sidecars in this guided lab](https://mslabs.cloudguides.com/guides/Sidecars%20in%20Azure%20App%20Service)

articles/app-service/tutorial-sidecar.md

Lines changed: 3 additions & 5 deletions

@@ -2,7 +2,7 @@
 title: 'Tutorial: Configure a sidecar container'
 description: Add sidecar containers to your Linux app in Azure App Service. Add or update services to your application without changing your application code.
 ms.topic: tutorial
-ms.date: 11/19/2024
+ms.date: 02/20/2025
 ms.author: cephalin
 author: cephalin
 keywords: azure app service, web app, linux, windows, docker, sidecar
@@ -78,16 +78,13 @@ After a few minutes, this .NET web application is deployed as MyFirstAzureWebApp

 ## 3. Add a sidecar container

-In this section, you add a sidecar container to your Linux app. The portal experience is still being rolled out. If it's not available to you yet, continue with the **Use ARM template** tab below.
+In this section, you add a sidecar container to your Linux app.

 ### [Use portal UI](#tab/portal)

 1. In the [Azure portal](https://portal.azure.com), navigate to the app's management page
 1. In the app's management page, from the left menu, select **Deployment Center**.
 1. Select the banner **Interested in adding containers to run alongside your app? Click here to give it a try.**
-
-   If you can't see the banner, then the portal UI isn't rolled out for your subscription yet. Select the **Use ARM template** tab here instead and continue.
-
 1. When the page reloads, select the **Containers (new)** tab.
 1. Select **Add** and configure the new container as follows:
    - **Name**: *otel-collector*
@@ -272,6 +269,7 @@ You can use a similar approach to instrument apps in other language stacks. For

 ## More resources

+- [Run a local SLM in a sidecar container in Azure App Service](tutorial-sidecar-local-small-language-model.md)
 - [Try out sidecars in this guided lab](https://mslabs.cloudguides.com/guides/Sidecars%20in%20Azure%20App%20Service)
 - [Deploy to App Service using GitHub Actions](deploy-github-actions.md)
 - [OpenTelemetry](https://opentelemetry.io/)

0 commit comments
