Commit 2e224e2

remove old slm doc
1 parent 5ca114b commit 2e224e2

9 files changed: +129 −238 lines
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

## Frequently asked questions

### How does the pricing tier affect the performance of the SLM sidecar?

Because AI models consume considerable resources, choose a pricing tier that gives you sufficient vCPUs and memory to run your specific model. For this reason, the built-in AI sidecar extensions appear only when the app is in a suitable pricing tier. If you build your own SLM sidecar container, you should also use a CPU-optimized model, because the App Service pricing tiers are CPU-only.

For example, the [Phi-3 mini model with a 4K context length from Hugging Face](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) is designed to run with limited resources and provides strong math and logical reasoning for many common scenarios. It also comes with a CPU-optimized version. In App Service, we tested the model on all premium tiers and found it to perform well in the [P2mv3](https://azure.microsoft.com/pricing/details/app-service/linux/) tier or higher. If your requirements allow, you can run it on a lower tier.

### How do I use my own SLM sidecar?

The sample repository contains a sample SLM container that you can use as a sidecar. It runs a FastAPI application that listens on port 8000, as specified in its [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/bring_your_own_slm/src/phi-3-sidecar/Dockerfile). The application uses [ONNX Runtime](https://onnxruntime.ai/docs/) to load the Phi-3 model, forwards the HTTP POST data to the model, and streams the model's response back to the client. For more information, see [model_api.py](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/model_api.py).
To build the sidecar image yourself, install Docker Desktop locally on your machine.

1. Clone the repository locally.

   ```bash
   git clone https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar
   cd ai-slm-in-app-service-sidecar
   ```

1. Change into the Phi-3 image's source directory and download the model locally by using the [Hugging Face CLI](https://huggingface.co/docs/huggingface_hub/guides/cli).

   ```bash
   cd bring_your_own_slm/src/phi-3-sidecar
   huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --local-dir ./Phi-3-mini-4k-instruct-onnx
   ```

   The [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/Dockerfile) is configured to copy the model from *./Phi-3-mini-4k-instruct-onnx*.

1. Build the Docker image. For example:

   ```bash
   docker build --tag phi-3 .
   ```

1. Upload the built image to Azure Container Registry by following [Push your first image to your Azure container registry using the Docker CLI](/azure/container-registry/container-registry-get-started-docker-cli).

1. In the **Deployment Center** > **Containers (new)** tab, select **Add** > **Custom container** and configure the new container as follows:

   - **Name**: *phi-3*
   - **Image source**: **Azure Container Registry**
   - **Registry**: your registry
   - **Image**: the uploaded image
   - **Tag**: the image tag you want
   - **Port**: *8000*

1. Select **Apply**.

See [bring_your_own_slm/src/webapp](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/bring_your_own_slm/src/webapp) for a sample application that interacts with this custom sidecar container.
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

## Add the Phi-3 sidecar extension

In this section, you add the Phi-3 sidecar extension to your application hosted on Azure App Service.

1. Navigate to the Azure portal and go to your app's management page.
2. In the left-hand menu, select **Deployment** > **Deployment Center**.
3. On the **Containers** tab, select **Add** > **Sidecar extension**.
4. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
5. Provide a name for the sidecar extension.
6. Select **Save** to apply the changes.
7. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.

This Phi-3 sidecar extension exposes an OpenAI-style [chat completion API](https://platform.openai.com/docs/api-reference/chat/create) that serves chat completion requests at `http://localhost:11434/v1/chat/completions`. For more information on how to interact with the API, see:

- [OpenAI documentation: Create chat completion](https://platform.openai.com/docs/api-reference/chat/create)
- [OpenAI documentation: Streaming](https://platform.openai.com/docs/api-reference/chat-streaming)

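Because the extension speaks the standard chat completion protocol, any HTTP client running in the app can call it. As a minimal sketch, assuming the OpenAI-style request, response, and SSE streaming shapes linked above (the helper names here are hypothetical; only the URL comes from this article):

```python
# Hedged sketch: calling the sidecar's OpenAI-style chat completion endpoint.
import json
import urllib.request
from typing import Optional

SLM_URL = "http://localhost:11434/v1/chat/completions"  # sidecar endpoint

def build_payload(system_message: str, user_prompt: str, stream: bool = False) -> dict:
    """Assemble a minimal OpenAI-style chat completion request body."""
    return {
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_prompt},
        ],
        "stream": stream,
    }

def ask_slm(payload: dict) -> str:
    """POST the payload to the sidecar and return the first choice's text."""
    request = urllib.request.Request(
        SLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]

def extract_stream_content(sse_line: str) -> Optional[str]:
    """Pull the delta text out of one 'data: {...}' line of a streamed reply."""
    if not sse_line.startswith("data: ") or sse_line.strip() == "data: [DONE]":
        return None
    chunk = json.loads(sse_line[len("data: "):])
    return chunk["choices"][0]["delta"].get("content")

# Example (requires the sidecar extension to be in the Running state):
# print(ask_slm(build_payload("You are a helpful assistant.", "Say hello.")))
```

When `stream` is `True`, read the response line by line and feed each line through `extract_stream_content` to render tokens as they arrive.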
## Test the chatbot

1. In your app's management page, in the left-hand menu, select **Overview**.
1. Under **Default domain**, select the URL to open your web app in a browser.
1. Verify that the chatbot application is running and responding to user inputs.

   :::image type="content" source="../media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="Screenshot showing the fashion assistant app running in the browser.":::

articles/app-service/toc.yml

Lines changed: 0 additions & 2 deletions
@@ -472,8 +472,6 @@ items:
 items:
 - name: Deploy an application that uses OpenAI on App Service
   href: deploy-intelligent-apps.md
-- name: Run an SLM in sidecar
-  href: tutorial-sidecar-local-small-language-model.md
 - name: Deploy a .NET app with Azure OpenAI and Azure SQL
   href: deploy-intelligent-apps-dotnet-to-azure-sql.md
 - name: Invoke OpenAPI app from Azure AI Agent

articles/app-service/tutorial-ai-slm-dotnet.md

Lines changed: 9 additions & 22 deletions
@@ -26,7 +26,7 @@ Hosting your own small language model (SLM) offers several advantages:

 ## Deploy the sample application

-1. In the browser, navigate to the [sample application repository](https://github.com/cephalin/sidecar-samples).
+1. In the browser, navigate to the [sample application repository](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar).
 2. Start a new Codespace from the repository.
 1. Log in with your Azure account:

@@ -37,37 +37,19 @@ Hosting your own small language model (SLM) offers several advantages:
 1. Open the terminal in the Codespace and run the following commands:

    ```azurecli
-   cd dotnetapp
+   cd use_sidecar_extension/dotnetapp
    az webapp up --sku P3MV3 --os-type linux
    ```

    This startup command is a common setup for deploying ASP.NET Core applications to Azure App Service. For more information, see [Quickstart: Deploy an ASP.NET web app](quickstart-dotnetcore.md).

-## Add the Phi-3 sidecar extension
-
-In this section, you add the Phi-3 sidecar extension to your ASP.NET Core application hosted on Azure App Service.
-
-1. Navigate to the Azure portal and go to your app's management page.
-2. In the left-hand menu, select **Deployment** > **Deployment Center**.
-3. On the **Containers** tab, select **Add** > **Sidecar extension**.
-4. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
-5. Provide a name for the sidecar extension.
-6. Select **Save** to apply the changes.
-7. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.
-
-## Test the chatbot
-
-1. In your app's management page, in the left-hand menu, select **Overview**.
-1. Under **Default domain**, select the URL to open your web app in a browser.
-1. Verify that the chatbot application is running and responding to user inputs.
-
-   :::image type="content" source="media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="screenshot showing the fashion assistant app running in the browser.":::
+[!INCLUDE [phi-3-extension-create-test](includes/tutorial-ai-slm/phi-3-extension-create-test.md)]

 ## How the sample application works

 The sample application demonstrates how to integrate a .NET service with the SLM sidecar extension. The `SLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.

-Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/dotnetapp/Services/SLMService.cs, you see that:
+Looking in [use_sidecar_extension/dotnetapp/Services/SLMService.cs](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/use_sidecar_extension/dotnetapp/Services/SLMService.cs), you see that:

 - The service reads the URL from `fashion.assistant.api.url`, which is set in *appsettings.json* and has the value of `http://localhost:11434/v1/chat/completions`.
@@ -78,6 +60,7 @@ Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/dotnetapp/
       _apiUrl = configuration["FashionAssistantAPI:Url"] ?? "http://localhost:11434";
   }
   ```
+
 - The POST payload includes the system message and the prompt that's built from the selected product and the user query.

   ```csharp
@@ -119,4 +102,8 @@ Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/dotnetapp/
   }
   ```

+[!INCLUDE [faq](includes/tutorial-ai-slm/faq.md)]
+
 ## Next steps
+
+[Tutorial: Configure a sidecar container for a Linux app in Azure App Service](tutorial-sidecar.md)

articles/app-service/tutorial-ai-slm-expressjs.md

Lines changed: 9 additions & 22 deletions
@@ -26,7 +26,7 @@ Hosting your own small language model (SLM) offers several advantages:

 ## Deploy the sample application

-1. In the browser, navigate to the [sample application repository](https://github.com/cephalin/sidecar-samples).
+1. In the browser, navigate to the [sample application repository](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar).
 2. Start a new Codespace from the repository.
 1. Log in with your Azure account:

@@ -37,43 +37,26 @@ Hosting your own small language model (SLM) offers several advantages:
 1. Open the terminal in the Codespace and run the following commands:

    ```azurecli
-   cd expressapp
+   cd use_sidecar_extension/expressapp
    az webapp up --sku P3MV3
    ```

    This startup command is a common setup for deploying Express.js applications to Azure App Service. For more information, see [Deploy a Node.js web app in Azure](quickstart-nodejs.md).

-## Add the Phi-3 sidecar extension
-
-In this section, you add the Phi-3 sidecar extension to your Express.js application hosted on Azure App Service.
-
-1. Navigate to the Azure portal and go to your app's management page.
-2. In the left-hand menu, select **Deployment** > **Deployment Center**.
-3. On the **Containers** tab, select **Add** > **Sidecar extension**.
-4. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
-5. Provide a name for the sidecar extension.
-6. Select **Save** to apply the changes.
-7. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.
-
-## Test the chatbot
-
-1. In your app's management page, in the left-hand menu, select **Overview**.
-1. Under **Default domain**, select the URL to open your web app in a browser.
-1. Verify that the chatbot application is running and responding to user inputs.
-
-   :::image type="content" source="media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="screenshot showing the fashion assistant app running in the browser.":::
+[!INCLUDE [phi-3-extension-create-test](includes/tutorial-ai-slm/phi-3-extension-create-test.md)]

 ## How the sample application works

 The sample application demonstrates how to integrate an Express.js-based service with the SLM sidecar extension. The `SLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.

-Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/expressapp/src/services/slm_service.js, you see that:
+Looking in [use_sidecar_extension/expressapp/src/services/slm_service.js](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/use_sidecar_extension/expressapp/src/services/slm_service.js), you see that:

 - The service sends a POST request to the SLM endpoint `http://127.0.0.1:11434/v1/chat/completions`.

   ```javascript
   this.apiUrl = 'http://127.0.0.1:11434/v1/chat/completions';
   ```
+
 - The POST payload includes the system message and the prompt that's built from the selected product and the user query.

   ```javascript
@@ -134,4 +117,8 @@ Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/expressapp
   });
   ```

+[!INCLUDE [faq](includes/tutorial-ai-slm/faq.md)]
+
 ## Next steps
+
+[Tutorial: Configure a sidecar container for a Linux app in Azure App Service](tutorial-sidecar.md)

articles/app-service/tutorial-ai-slm-fastapi.md

Lines changed: 8 additions & 22 deletions
@@ -25,7 +25,7 @@ Hosting your own small language model (SLM) offers several advantages:

 ## Deploy the sample application

-1. In the browser, navigate to the [sample application repository](https://github.com/cephalin/sidecar-samples).
+1. In the browser, navigate to the [sample application repository](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar).
 2. Start a new Codespace from the repository.
 1. Log in with your Azure account:

@@ -36,38 +36,20 @@ Hosting your own small language model (SLM) offers several advantages:
 1. Open the terminal in the Codespace and run the following commands:

    ```azurecli
-   cd fastapiapp
+   cd use_sidecar_extension/fastapiapp
    az webapp up --sku P3MV3
    az webapp config set --startup-file "gunicorn -w 4 -k uvicorn.workers.UvicornWorker app.main:app"
    ```

    This startup command is a common setup for deploying FastAPI applications to Azure App Service. For more information, see [Quickstart: Deploy a Python (Django, Flask, or FastAPI) web app to Azure App Service](quickstart-python.md).

-## Add the Phi-3 sidecar extension
-
-In this section, you add the Phi-3 sidecar extension to your FastAPI application hosted on Azure App Service.
-
-1. Navigate to the Azure portal and go to your app's management page.
-2. In the left-hand menu, select **Deployment** > **Deployment Center**.
-3. On the **Containers** tab, select **Add** > **Sidecar extension**.
-4. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
-5. Provide a name for the sidecar extension.
-6. Select **Save** to apply the changes.
-7. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.
-
-## Test the chatbot
-
-1. In your app's management page, in the left-hand menu, select **Overview**.
-1. Under **Default domain**, select the URL to open your web app in a browser.
-1. Verify that the chatbot application is running and responding to user inputs.
-
-   :::image type="content" source="media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="screenshot showing the fashion assistant app running in the browser.":::
+[!INCLUDE [phi-3-extension-create-test](includes/tutorial-ai-slm/phi-3-extension-create-test.md)]

 ## How the sample application works

 The sample application demonstrates how to integrate a FastAPI-based service with the SLM sidecar extension. The `SLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.

-Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/fastapiapp/app/services/slm_service.py, you see that:
+Looking in [use_sidecar_extension/fastapiapp/app/services/slm_service.py](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/use_sidecar_extension/fastapiapp/app/services/slm_service.py), you see that:

 - The service sends a POST request to the SLM endpoint `http://localhost:11434/v1/chat/completions`.
@@ -116,4 +98,8 @@ Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/fastapiapp
   yield content
   ```

+[!INCLUDE [faq](includes/tutorial-ai-slm/faq.md)]
+
 ## Next steps
+
+[Tutorial: Configure a sidecar container for a Linux app in Azure App Service](tutorial-sidecar.md)

articles/app-service/tutorial-ai-slm-spring-boot.md

Lines changed: 9 additions & 22 deletions
@@ -26,7 +26,7 @@ Hosting your own small language model (SLM) offers several advantages:

 ## Deploy the sample application

-1. In the browser, navigate to the [sample application repository](https://github.com/cephalin/sidecar-samples).
+1. In the browser, navigate to the [sample application repository](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar).
 2. Start a new Codespace from the repository.
 1. Log in with your Azure account:

@@ -37,36 +37,18 @@ Hosting your own small language model (SLM) offers several advantages:
 1. Open the terminal in the Codespace and run the following commands:

    ```azurecli
-   cd springapp
+   cd use_sidecar_extension/springapp
    ./mvnw clean package
    az webapp up --sku P3MV3 --runtime "JAVA:21-java21" --os-type linux
    ```

-## Add the Phi-3 sidecar extension
-
-In this section, you add the Phi-3 sidecar extension to your FastAPI application hosted on Azure App Service.
-
-1. Navigate to the Azure portal and go to your app's management page.
-2. In the left-hand menu, select **Deployment** > **Deployment Center**.
-3. On the **Containers** tab, select **Add** > **Sidecar extension**.
-4. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
-5. Provide a name for the sidecar extension.
-6. Select **Save** to apply the changes.
-7. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.
-
-## Test the chatbot
-
-1. In your app's management page, in the left-hand menu, select **Overview**.
-1. Under **Default domain**, select the URL to open your web app in a browser.
-1. Verify that the chatbot application is running and responding to user inputs.
-
-   :::image type="content" source="media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="screenshot showing the fashion assistant app running in the browser.":::
+[!INCLUDE [phi-3-extension-create-test](includes/tutorial-ai-slm/phi-3-extension-create-test.md)]

 ## How the sample application works

 The sample application demonstrates how to integrate a Java service with the SLM sidecar extension. The `ReactiveSLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.

-Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/springapp/src/main/java/com/example/springapp/service/ReactiveSLMService.java, you see that:
+Looking in [use_sidecar_extension/springapp/src/main/java/com/example/springapp/service/ReactiveSLMService.java](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/use_sidecar_extension/springapp/src/main/java/com/example/springapp/service/ReactiveSLMService.java), you see that:

 - The service reads the URL from `fashion.assistant.api.url`, which is set in *application.properties* and has the value of `http://localhost:11434/v1/chat/completions`.
@@ -77,6 +59,7 @@ Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/springapp/
       .build();
   }
   ```
+
 - The POST payload includes the system message and the prompt that's built from the selected product and the user query.

   ```java
@@ -116,4 +99,8 @@ Looking in https://github.com/cephalin/sidecar-samples/blob/webstacks/springapp/
   .map(content -> content.replace(" ", "\u00A0"));
   ```

+[!INCLUDE [faq](includes/tutorial-ai-slm/faq.md)]
+
 ## Next steps
+
+[Tutorial: Configure a sidecar container for a Linux app in Azure App Service](tutorial-sidecar.md)
