Commit d7af0a1

Merge branch 'main' into repo_sync_working_branch

2 parents f22db9c + 8a4ba1a

111 files changed: +4857 −1662 lines changed
Lines changed: 15 additions & 0 deletions

@@ -0,0 +1,15 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

Hosting your own small language model (SLM) offers several advantages:

- Full control over your data. Sensitive information isn't exposed to external services, which is critical for industries with strict compliance requirements.
- Self-hosted models can be fine-tuned to meet specific use cases or domain-specific requirements.
- Minimized network latency and faster response times for a better user experience.
- Full control over resource allocation, ensuring optimal performance for your application.
Lines changed: 56 additions & 0 deletions

@@ -0,0 +1,56 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

## Frequently asked questions

### How does the pricing tier affect the performance of the SLM sidecar?

Because AI models consume considerable resources, choose a pricing tier that gives you sufficient vCPUs and memory to run your specific model. For this reason, the built-in AI sidecar extensions appear only when the app is in a suitable pricing tier. If you build your own SLM sidecar container, you should also use a CPU-optimized model, because the App Service pricing tiers are CPU-only tiers.

For example, the [Phi-3 mini model with a 4K context length from Hugging Face](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) is designed to run with limited resources and provides strong math and logical reasoning for many common scenarios. It also comes with a CPU-optimized version. In App Service, we tested the model on all premium tiers and found it to perform well in the [P2mv3](https://azure.microsoft.com/pricing/details/app-service/linux/) tier or higher. If your requirements allow, you can run it on a lower tier.
### How do I use my own SLM sidecar?

The sample repository contains a sample SLM container that you can use as a sidecar. It runs a FastAPI application that listens on port 8000, as specified in its [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/bring_your_own_slm/src/phi-3-sidecar/Dockerfile). The application uses [ONNX Runtime](https://onnxruntime.ai/docs/) to load the Phi-3 model, forwards the HTTP POST data to the model, and streams the model's response back to the client. For more information, see [model_api.py](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/model_api.py).
To build the sidecar image yourself, install Docker Desktop locally on your machine.

1. Clone the repository locally.

   ```bash
   git clone https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar
   cd ai-slm-in-app-service-sidecar
   ```

1. Change into the Phi-3 image's source directory and download the model locally by using the [Hugging Face CLI](https://huggingface.co/docs/huggingface_hub/guides/cli).

   ```bash
   cd bring_your_own_slm/src/phi-3-sidecar
   huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --local-dir ./Phi-3-mini-4k-instruct-onnx
   ```

   The [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/Dockerfile) is configured to copy the model from *./Phi-3-mini-4k-instruct-onnx*.

1. Build the Docker image. For example:

   ```bash
   docker build --tag phi-3 .
   ```

1. Upload the built image to Azure Container Registry by following [Push your first image to your Azure container registry using the Docker CLI](/azure/container-registry/container-registry-get-started-docker-cli).

1. In the **Deployment Center** > **Containers (new)** tab, select **Add** > **Custom container** and configure the new container as follows:

   - **Name**: *phi-3*
   - **Image source**: **Azure Container Registry**
   - **Registry**: your registry
   - **Image**: the uploaded image
   - **Tag**: the image tag you want
   - **Port**: *8000*

1. Select **Apply**.

See [bring_your_own_slm/src/webapp](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/bring_your_own_slm/src/webapp) for a sample application that interacts with this custom sidecar container.
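As a rough illustration, the following C# sketch shows how a main-container app might call such a sidecar over the shared localhost network. The endpoint path (`/predict`) and payload shape here are assumptions for illustration only; check *model_api.py* in the sample repository for the actual contract.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Sketch: POST a prompt to the custom SLM sidecar on port 8000 and stream the reply.
// The /predict route and { text = ... } payload are hypothetical; match them to model_api.py.
var client = new HttpClient { BaseAddress = new Uri("http://localhost:8000") };

var payload = JsonSerializer.Serialize(new { text = "What is a sidecar container?" });
using var request = new HttpRequestMessage(HttpMethod.Post, "/predict")
{
    Content = new StringContent(payload, Encoding.UTF8, "application/json")
};

// ResponseHeadersRead lets us read the body as it streams instead of buffering it all.
using var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();

using var reader = new StreamReader(await response.Content.ReadAsStreamAsync());
while (!reader.EndOfStream)
{
    Console.Write(await reader.ReadLineAsync());
}
```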
Lines changed: 33 additions & 0 deletions

@@ -0,0 +1,33 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

## Add the Phi-3 sidecar extension

In this section, you add the Phi-3 sidecar extension to your ASP.NET Core application hosted on Azure App Service.

1. Navigate to the Azure portal and go to your app's management page.
1. In the left-hand menu, select **Deployment** > **Deployment Center**.
1. On the **Containers** tab, select **Add** > **Sidecar extension**.
1. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
1. Provide a name for the sidecar extension.
1. Select **Save** to apply the changes.
1. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.

This Phi-3 sidecar extension exposes a [chat completion API like OpenAI's](https://platform.openai.com/docs/api-reference/chat/create) that responds to chat completion requests at `http://localhost:11434/v1/chat/completions`. For more information on how to interact with the API, see the following references; a request sketch follows the list.

- [OpenAI documentation: Create chat completion](https://platform.openai.com/docs/api-reference/chat/create)
- [OpenAI documentation: Streaming](https://platform.openai.com/docs/api-reference/chat-streaming)
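For example, a minimal C# sketch of a non-streaming request might look like the following. The payload mirrors the OpenAI chat completion format; whether fields beyond `messages` are honored by the sidecar is an assumption based on the sample application shown later in this tutorial.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Sketch: send one chat completion request to the Phi-3 sidecar extension.
var client = new HttpClient();

var payload = JsonSerializer.Serialize(new
{
    messages = new[]
    {
        new { role = "system", content = "You are a helpful assistant." },
        new { role = "user", content = "Suggest an outfit for a summer wedding." }
    },
    stream = false // assumed; the sample app uses stream = true
});

var response = await client.PostAsync(
    "http://localhost:11434/v1/chat/completions",
    new StringContent(payload, Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();

// In the OpenAI format, the reply text is at choices[0].message.content.
var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
Console.WriteLine(json.RootElement
    .GetProperty("choices")[0]
    .GetProperty("message")
    .GetProperty("content")
    .GetString());
```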
## Test the chatbot

1. In your app's management page, in the left-hand menu, select **Overview**.
1. Under **Default domain**, select the URL to open your web app in a browser.
1. Verify that the chatbot application is running and responding to user inputs.

:::image type="content" source="../../media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="Screenshot showing the fashion assistant app running in the browser.":::
Lines changed: 20 additions & 0 deletions

@@ -0,0 +1,20 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/08/2025
ms.author: cephalin
---

### How do sidecar containers handle internal communication?

Sidecar containers share the same network host as the main container, so the main container (and other sidecar containers) can reach any port on the sidecar with `localhost:<port>`. The example *startup.sh* uses `localhost:4318` to access port 4318 on the **otel-collector** sidecar.

In the **Edit container** dialog, the **Port** box isn't currently used by App Service. You can use it as part of the sidecar metadata, such as to indicate which port the sidecar is listening on.
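To make the localhost pattern concrete, here's a hedged C# sketch of a main container probing a sidecar port. The port matches the otel-collector example above, but the `/health` path is hypothetical; use whatever route your sidecar actually serves.

```csharp
using System.Net.Http;

// Sketch: reach a sidecar over the shared network host with localhost:<port>.
var client = new HttpClient();
var reply = await client.GetAsync("http://localhost:4318/health"); // path is illustrative
Console.WriteLine(reply.StatusCode);
```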
14+
15+
### Can a sidecar container receive internet requests?
16+
17+
No. App Service routes internet requests only to the main container. For code-based Linux apps, the built-in Linux container is the main container, and any sidecar container ([sitecontainers](/azure/templates/microsoft.web/sites/sitecontainers)) should be added with `IsMain=false`. For custom containers, all but one of the [sitecontainers](/azure/templates/microsoft.web/sites/sitecontainers) should have `IsMain=false`.
18+
19+
For more information on configuring `IsMain`, see [Microsoft.Web sites/sitecontainers](/azure/templates/microsoft.web/sites/sitecontainers).
20+
Lines changed: 13 additions & 0 deletions

@@ -0,0 +1,13 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/08/2025
ms.author: cephalin
---

## What's a sidecar container?

In Azure App Service, you can add up to nine sidecar containers for each Linux app. Sidecar containers let you deploy extra services and features to your Linux apps without making them tightly coupled to the main container (built-in or custom). For example, you can add monitoring, logging, configuration, and networking services as sidecar containers. An OpenTelemetry collector sidecar is one such monitoring example.

The sidecar containers run alongside the main application container in the same App Service plan.
articles/app-service/toc.yml

Lines changed: 17 additions & 2 deletions

@@ -49,6 +49,11 @@ items:
        href: app-service-asp-net-migration.md
      - name: Migrate containerized .NET
        href: ../migrate/tutorial-app-containerization-aspnet-app-service.md?bc=/azure/bread/toc.json&toc=/azure/app-service/toc.json
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-dotnet.md
  - name: Java
    items:
      - name: Quickstart
@@ -87,6 +92,10 @@ items:
        href: /azure/developer/java/migration/migrate-weblogic-to-jboss-eap-on-azure-app-service?toc=/azure/app-service/toc.json&bc=/azure/bread/toc.json
      - name: WebSphere
        href: /azure/developer/java/migration/migrate-websphere-to-jboss-eap-on-azure-app-service?toc=/azure/app-service/toc.json&bc=/azure/bread/toc.json
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-spring-boot.md
  - name: Node.js
    items:
      - name: Quickstart
@@ -103,6 +112,10 @@ items:
        href: tutorial-connect-app-access-microsoft-graph-as-user-javascript.md
      - name: to other Azure services with managed identity
        href: tutorial-connect-app-access-storage-javascript.md
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-expressjs.md
  - name: Python
    items:
      - name: Quickstart
@@ -119,6 +132,10 @@ items:
        href: tutorial-python-postgresql-app-django.md
      - name: using FastAPI
        href: tutorial-python-postgresql-app-fastapi.md
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-fastapi.md
  - name: PHP
    items:
      - name: Quickstart
@@ -455,8 +472,6 @@ items:
    items:
      - name: Deploy an application that uses OpenAI on App Service
        href: deploy-intelligent-apps.md
-     - name: Run an SLM in sidecar
-       href: tutorial-sidecar-local-small-language-model.md
      - name: Deploy a .NET app with Azure OpenAI and Azure SQL
        href: deploy-intelligent-apps-dotnet-to-azure-sql.md
      - name: Invoke OpenAPI app from Azure AI Agent
Lines changed: 103 additions & 0 deletions

@@ -0,0 +1,103 @@
---
title: "Tutorial: ASP.NET Core chatbot with SLM extension"
description: "Learn how to deploy an ASP.NET Core application integrated with a Phi-3 sidecar extension on Azure App Service."
author: cephalin
ms.author: cephalin
ms.date: 05/07/2025
ms.topic: tutorial
---

# Tutorial: Run chatbot in App Service with a Phi-3 sidecar extension (ASP.NET Core)

This tutorial guides you through deploying an ASP.NET Core chatbot application integrated with the Phi-3 sidecar extension on Azure App Service. By following the steps, you learn how to set up a scalable web app, add an AI-powered sidecar for enhanced conversational capabilities, and test the chatbot's functionality.

[!INCLUDE [advantages](includes/tutorial-ai-slm/advantages.md)]

## Prerequisites

- An [Azure account](https://azure.microsoft.com/free/) with an active subscription.
- A [GitHub account](https://github.com/).

## Deploy the sample application

1. In the browser, navigate to the [sample application repository](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar).
1. Start a new Codespace from the repository.
1. Log in with your Azure account:

   ```azurecli
   az login
   ```

1. Open the terminal in the Codespace and run the following commands:

   ```azurecli
   cd use_sidecar_extension/dotnetapp
   az webapp up --sku P3MV3 --os-type linux
   ```

   This startup command is a common setup for deploying ASP.NET Core applications to Azure App Service. For more information, see [Quickstart: Deploy an ASP.NET web app](quickstart-dotnetcore.md).
39+
40+
[!INCLUDE [phi-3-extension-create-test](includes/tutorial-ai-slm/phi-3-extension-create-test.md)]
41+
42+
## How the sample application works
43+
44+
The sample application demonstrates how to integrate a .NET service with the SLM sidecar extension. The `SLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.
45+
46+
Looking in [use_sidecar_extension/dotnetapp/Services/SLMService.cs](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/use_sidecar_extension/dotnetapp/Services/SLMService.cs), you see that:
47+
48+
- The service reads the URL from `fashion.assistant.api.url`, which is set in *appsettings.json* and has the value of `http://localhost:11434/v1/chat/completions`.
49+
50+
```csharp
51+
public SLMService(HttpClient httpClient, IConfiguration configuration)
52+
{
53+
_httpClient = httpClient;
54+
_apiUrl = configuration["FashionAssistantAPI:Url"] ?? "httpL//localhost:11434";
55+
}
56+
```
- The POST payload includes the system message and the prompt that's built from the selected product and the user query.

  ```csharp
  var requestPayload = new
  {
      messages = new[]
      {
          new { role = "system", content = "You are a helpful assistant." },
          new { role = "user", content = prompt }
      },
      stream = true,
      cache_prompt = false,
      n_predict = 150
  };
  ```

- The POST request streams the response line by line. Each line is parsed to extract the generated content (or token).

  ```csharp
  var response = await _httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
  response.EnsureSuccessStatusCode();

  var stream = await response.Content.ReadAsStreamAsync();
  using var reader = new StreamReader(stream);

  while (!reader.EndOfStream)
  {
      var line = await reader.ReadLineAsync();
      // Server-sent events prefix each line with "data: "; strip it before parsing.
      line = line?.Replace("data: ", string.Empty).Trim();
      if (!string.IsNullOrEmpty(line) && line != "[DONE]")
      {
          var jsonObject = JsonNode.Parse(line);
          var responseContent = jsonObject?["choices"]?[0]?["delta"]?["content"]?.ToString();
          if (!string.IsNullOrEmpty(responseContent))
          {
              yield return responseContent;
          }
      }
  }
  ```
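The snippets above don't show how `SLMService` is registered or consumed. The following is a minimal sketch, assuming the streaming method returns `IAsyncEnumerable<string>`; the names `StreamResponseAsync` and `ChatRequest` are illustrative, not taken verbatim from the repository.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Register SLMService with a typed HttpClient so both constructor arguments are injected.
builder.Services.AddHttpClient<SLMService>();

var app = builder.Build();

// Stream tokens back to the browser as they arrive from the sidecar.
app.MapPost("/chat", async (HttpContext context, SLMService slm, ChatRequest request) =>
{
    context.Response.ContentType = "text/plain";
    await foreach (var token in slm.StreamResponseAsync(request.Product, request.Query))
    {
        await context.Response.WriteAsync(token);
        await context.Response.Body.FlushAsync(); // push each token immediately
    }
});

app.Run();

// Illustrative request DTO, not taken from the repository.
public record ChatRequest(string Product, string Query);
```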
[!INCLUDE [faq](includes/tutorial-ai-slm/faq.md)]

## Next steps

[Tutorial: Configure a sidecar container for a Linux app in Azure App Service](tutorial-sidecar.md)
