Commit 5063bb8

add slm extension tutorials

1 parent 7e3e5df commit 5063bb8

File tree

7 files changed: +523 −0 lines changed
articles/app-service/toc.yml

Lines changed: 17 additions & 0 deletions

```diff
@@ -49,6 +49,11 @@ items:
       href: app-service-asp-net-migration.md
     - name: Migrate containerized .NET
       href: ../migrate/tutorial-app-containerization-aspnet-app-service.md?bc=/azure/bread/toc.json&toc=/azure/app-service/toc.json
+    - name: AI
+      items:
+      - name: Local SLM with sidecar extension
+        href: tutorial-ai-slm-dotnet.md
   - name: Java
     items:
     - name: Quickstart
@@ -87,6 +92,10 @@ items:
       href: /azure/developer/java/migration/migrate-weblogic-to-jboss-eap-on-azure-app-service?toc=/azure/app-service/toc.json&bc=/azure/bread/toc.json
     - name: WebSphere
       href: /azure/developer/java/migration/migrate-websphere-to-jboss-eap-on-azure-app-service?toc=/azure/app-service/toc.json&bc=/azure/bread/toc.json
+    - name: AI
+      items:
+      - name: Local SLM with sidecar extension
+        href: tutorial-ai-slm-spring-boot.md
   - name: Node.js
     items:
     - name: Quickstart
@@ -103,6 +112,10 @@ items:
       href: tutorial-connect-app-access-microsoft-graph-as-user-javascript.md
     - name: to other Azure services with managed identity
       href: tutorial-connect-app-access-storage-javascript.md
+    - name: AI
+      items:
+      - name: Local SLM with sidecar extension
+        href: tutorial-ai-slm-expressjs.md
   - name: Python
     items:
     - name: Quickstart
@@ -119,6 +132,10 @@ items:
       href: tutorial-python-postgresql-app-django.md
     - name: using FastAPI
       href: tutorial-python-postgresql-app-fastapi.md
+    - name: AI
+      items:
+      - name: Local SLM with sidecar extension
+        href: tutorial-ai-slm-fastapi.md
   - name: PHP
     items:
     - name: Quickstart
```
articles/app-service/tutorial-ai-slm-dotnet.md

Lines changed: 124 additions & 0 deletions

---
title: "Tutorial: ASP.NET Core chatbot with SLM extension"
description: "Learn how to deploy an ASP.NET Core application integrated with a Phi-3 sidecar extension on Azure App Service."
author: "cephalin"
ms.author: "cephalin"
ms.date: "2025-05-06"
ms.topic: tutorial
ms.service: app-service
---

# Tutorial: Run chatbot in App Service with a Phi-3 sidecar extension (ASP.NET Core)

This tutorial guides you through deploying an ASP.NET Core chatbot application integrated with the Phi-3 sidecar extension on Azure App Service. By following the steps, you learn how to set up a scalable web app, add an AI-powered sidecar for enhanced conversational capabilities, and test the chatbot's functionality.

Hosting your own small language model (SLM) offers several advantages:

- By hosting the model yourself, you maintain full control over your data. Sensitive information isn't exposed to third-party services, which is critical for industries with strict compliance requirements.
- Self-hosted models can be fine-tuned to meet specific use cases or domain-specific requirements.
- Hosting the model close to your application or users minimizes network latency, resulting in faster response times and a better user experience.
- You can scale the deployment to your specific needs and keep full control over resource allocation, ensuring optimal performance for your application.
- Hosting your own model gives you greater flexibility to experiment with new features, architectures, or integrations without being constrained by third-party service limitations.

## Prerequisites

- An [Azure account](https://azure.microsoft.com/free/) with an active subscription.
- A [GitHub account](https://github.com/).

## Deploy the sample application

1. In the browser, navigate to the [sample application repository](https://github.com/cephalin/sidecar-samples).
1. Start a new Codespace from the repository.
1. Sign in with your Azure account:

    ```azurecli
    az login
    ```

1. Open the terminal in the Codespace and run the following commands:

    ```azurecli
    cd dotnetapp
    az webapp up --sku P3MV3 --os-type linux
    ```

    This startup command is a common setup for deploying ASP.NET Core applications to Azure App Service. For more information, see [Quickstart: Deploy an ASP.NET web app](quickstart-dotnetcore.md).
## Add the Phi-3 sidecar extension

In this section, you add the Phi-3 sidecar extension to your ASP.NET Core application hosted on Azure App Service.

1. Navigate to the Azure portal and go to your app's management page.
1. In the left-hand menu, select **Deployment** > **Deployment Center**.
1. On the **Containers** tab, select **Add** > **Sidecar extension**.
1. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
1. Provide a name for the sidecar extension.
1. Select **Save** to apply the changes.
1. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.

## Test the chatbot

1. In your app's management page, in the left-hand menu, select **Overview**.
1. Under **Default domain**, select the URL to open your web app in a browser.
1. Verify that the chatbot application is running and responding to user inputs.

:::image type="content" source="media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="Screenshot showing the fashion assistant app running in the browser.":::

## How the sample application works

The sample application demonstrates how to integrate a .NET service with the SLM sidecar extension. The `SLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.

In [SLMService.cs](https://github.com/cephalin/sidecar-samples/blob/webstacks/dotnetapp/Services/SLMService.cs), you can see that:

- The service reads the API URL from the `FashionAssistantAPI:Url` configuration setting, which is set in *appsettings.json* to `http://localhost:11434/v1/chat/completions`.

    ```csharp
    public SLMService(HttpClient httpClient, IConfiguration configuration)
    {
        _httpClient = httpClient;
        _apiUrl = configuration["FashionAssistantAPI:Url"] ?? "http://localhost:11434";
    }
    ```

- The POST payload includes the system message and the prompt that's built from the selected product and the user query.

    ```csharp
    var requestPayload = new
    {
        messages = new[]
        {
            new { role = "system", content = "You are a helpful assistant." },
            new { role = "user", content = prompt }
        },
        stream = true,
        cache_prompt = false,
        n_predict = 150
    };
    ```

- The POST request streams the response line by line. Each line is parsed to extract the generated content (or token).

    ```csharp
    var response = await _httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
    response.EnsureSuccessStatusCode();

    var stream = await response.Content.ReadAsStreamAsync();
    using var reader = new StreamReader(stream);

    while (!reader.EndOfStream)
    {
        var line = await reader.ReadLineAsync();
        line = line?.Replace("data: ", string.Empty).Trim();
        if (!string.IsNullOrEmpty(line) && line != "[DONE]")
        {
            var jsonObject = JsonNode.Parse(line);
            var responseContent = jsonObject?["choices"]?[0]?["delta"]?["content"]?.ToString();
            if (!string.IsNullOrEmpty(responseContent))
            {
                yield return responseContent;
            }
        }
    }
    ```
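
The lines the service parses follow the OpenAI-compatible chat completions streaming format that the sidecar emits. As a quick standalone sketch of that wire format (shown in JavaScript for brevity; the sample line below is a hypothetical illustration, not captured output), the extraction step looks like this:

```javascript
// Extract the generated token from one streamed line, mirroring the parsing
// logic above: strip the "data: " prefix, skip empty lines and the "[DONE]"
// marker, then read choices[0].delta.content from the JSON payload.
function extractToken(line) {
  const payload = line.replace('data: ', '').trim();
  if (!payload || payload === '[DONE]') return null;
  const obj = JSON.parse(payload);
  return obj?.choices?.[0]?.delta?.content ?? null;
}

// Hypothetical sample line in the shape the sidecar streams:
console.log(extractToken('data: {"choices":[{"delta":{"content":"Hello"}}]}')); // Hello
console.log(extractToken('data: [DONE]')); // null
```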
## Next steps
articles/app-service/tutorial-ai-slm-expressjs.md

Lines changed: 139 additions & 0 deletions
---
title: "Tutorial: Express.js chatbot with SLM extension"
description: "Learn how to deploy an Express.js application integrated with a Phi-3 sidecar extension on Azure App Service."
author: "cephalin"
ms.author: "cephalin"
ms.date: "2025-05-06"
ms.topic: tutorial
ms.service: app-service
---
# Tutorial: Run chatbot in App Service with a Phi-3 sidecar extension (Express.js)

This tutorial guides you through deploying an Express.js chatbot application integrated with the Phi-3 sidecar extension on Azure App Service. By following the steps, you learn how to set up a scalable web app, add an AI-powered sidecar for enhanced conversational capabilities, and test the chatbot's functionality.

Hosting your own small language model (SLM) offers several advantages:

- By hosting the model yourself, you maintain full control over your data. Sensitive information isn't exposed to third-party services, which is critical for industries with strict compliance requirements.
- Self-hosted models can be fine-tuned to meet specific use cases or domain-specific requirements.
- Hosting the model close to your application or users minimizes network latency, resulting in faster response times and a better user experience.
- You can scale the deployment to your specific needs and keep full control over resource allocation, ensuring optimal performance for your application.
- Hosting your own model gives you greater flexibility to experiment with new features, architectures, or integrations without being constrained by third-party service limitations.

## Prerequisites

- An [Azure account](https://azure.microsoft.com/free/) with an active subscription.
- A [GitHub account](https://github.com/).
## Deploy the sample application

1. In the browser, navigate to the [sample application repository](https://github.com/cephalin/sidecar-samples).
1. Start a new Codespace from the repository.
1. Sign in with your Azure account:

    ```azurecli
    az login
    ```

1. Open the terminal in the Codespace and run the following commands:

    ```azurecli
    cd expressapp
    az webapp up --sku P3MV3
    ```

    This startup command is a common setup for deploying Express.js applications to Azure App Service. For more information, see [Deploy a Node.js web app in Azure](quickstart-nodejs.md).
## Add the Phi-3 sidecar extension

In this section, you add the Phi-3 sidecar extension to your Express.js application hosted on Azure App Service.

1. Navigate to the Azure portal and go to your app's management page.
1. In the left-hand menu, select **Deployment** > **Deployment Center**.
1. On the **Containers** tab, select **Add** > **Sidecar extension**.
1. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
1. Provide a name for the sidecar extension.
1. Select **Save** to apply the changes.
1. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.

## Test the chatbot

1. In your app's management page, in the left-hand menu, select **Overview**.
1. Under **Default domain**, select the URL to open your web app in a browser.
1. Verify that the chatbot application is running and responding to user inputs.

:::image type="content" source="media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="Screenshot showing the fashion assistant app running in the browser.":::

## How the sample application works

The sample application demonstrates how to integrate an Express.js service with the SLM sidecar extension. The `SLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.
In [slm_service.js](https://github.com/cephalin/sidecar-samples/blob/webstacks/expressapp/src/services/slm_service.js), you can see that:

- The service sends a POST request to the SLM endpoint `http://127.0.0.1:11434/v1/chat/completions`.

    ```javascript
    this.apiUrl = 'http://127.0.0.1:11434/v1/chat/completions';
    ```
- The POST payload includes the system message and the prompt that's built from the selected product and the user query.

    ```javascript
    const requestPayload = {
        messages: [
            { role: 'system', content: 'You are a helpful assistant.' },
            { role: 'user', content: prompt }
        ],
        stream: true,
        cache_prompt: false,
        n_predict: 2048 // Increased token limit to allow longer responses
    };
    ```
- The POST request streams the response line by line. Each line is parsed to extract the generated content (or token), which is then re-emitted to the browser as a server-sent event.

    ```javascript
    // Set up Server-Sent Events headers
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');
    res.flushHeaders();

    const response = await axios.post(this.apiUrl, requestPayload, {
        headers: { 'Content-Type': 'application/json' },
        responseType: 'stream'
    });

    response.data.on('data', (chunk) => {
        const lines = chunk.toString().split('\n').filter(line => line.trim() !== '');

        for (const line of lines) {
            let parsedLine = line;
            if (line.startsWith('data: ')) {
                parsedLine = line.replace('data: ', '').trim();
            }

            if (parsedLine === '[DONE]') {
                return;
            }

            try {
                const jsonObj = JSON.parse(parsedLine);
                if (jsonObj.choices && jsonObj.choices.length > 0) {
                    const delta = jsonObj.choices[0].delta || {};
                    const content = delta.content;

                    if (content) {
                        // Use non-breaking space to preserve formatting
                        const formattedToken = content.replace(/ /g, '\u00A0');
                        res.write(`data: ${formattedToken}\n\n`);
                    }
                }
            } catch (parseError) {
                console.warn(`Failed to parse JSON from line: ${parsedLine}`);
            }
        }
    });
    ```
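
Because the route re-emits each token as a server-sent event with spaces encoded as non-breaking spaces, a client has to undo both the `data: ` framing and the space substitution. Here's a minimal decoding sketch (a hypothetical helper for illustration, not part of the sample):

```javascript
// Decode a chunk of the SSE stream written by the route above: split it into
// "data: ..." frames on the blank-line delimiter, strip the framing, and turn
// the non-breaking spaces (\u00A0) back into regular spaces.
function decodeSseChunk(chunk) {
  return chunk
    .split('\n\n')
    .filter(frame => frame.startsWith('data: '))
    .map(frame => frame.slice('data: '.length).replace(/\u00A0/g, ' '))
    .join('');
}

console.log(decodeSseChunk('data: Hello\u00A0world\n\ndata: !\n\n')); // Hello world!
```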
## Next steps
