Commit d7af0a1

Merge branch 'main' into repo_sync_working_branch

2 parents f22db9c + 8a4ba1a

111 files changed: +4857 −1662 lines changed
Lines changed: 15 additions & 0 deletions

@@ -0,0 +1,15 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

Hosting your own small language model (SLM) offers several advantages:

- Full control over your data. Sensitive information isn't exposed to external services, which is critical for industries with strict compliance requirements.
- Self-hosted models can be fine-tuned to meet specific use cases or domain-specific requirements.
- Minimized network latency and faster response times for a better user experience.
- Full control over resource allocation, ensuring optimal performance for your application.
Lines changed: 56 additions & 0 deletions

@@ -0,0 +1,56 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

## Frequently asked questions

### How does the pricing tier affect the performance of the SLM sidecar?

Because AI models consume considerable resources, choose a pricing tier that gives you sufficient vCPUs and memory to run your specific model. For this reason, the built-in AI sidecar extensions appear only when the app is in a suitable pricing tier. If you build your own SLM sidecar container, you should also use a CPU-optimized model, because the App Service pricing tiers are CPU-only tiers.

For example, the [Phi-3 mini model with a 4K context length from Hugging Face](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) is designed to run with limited resources and provides strong math and logical reasoning for many common scenarios. It also comes with a CPU-optimized version. In App Service, we tested the model on all premium tiers and found it to perform well in the [P2mv3](https://azure.microsoft.com/pricing/details/app-service/linux/) tier or higher. If your requirements allow, you can run it on a lower tier.
### How do I use my own SLM sidecar?

The sample repository contains a sample SLM container that you can use as a sidecar. It runs a FastAPI application that listens on port 8000, as specified in its [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/bring_your_own_slm/src/phi-3-sidecar/Dockerfile). The application uses [ONNX Runtime](https://onnxruntime.ai/docs/) to load the Phi-3 model, forwards the HTTP POST data to the model, and streams the model's response back to the client. For more information, see [model_api.py](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/model_api.py).
To build the sidecar image yourself, install Docker Desktop locally on your machine.

1. Clone the repository locally.

   ```bash
   git clone https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar
   cd ai-slm-in-app-service-sidecar
   ```

1. Change into the Phi-3 image's source directory and download the model locally by using the [Hugging Face CLI](https://huggingface.co/docs/huggingface_hub/guides/cli).

   ```bash
   cd bring_your_own_slm/src/phi-3-sidecar
   huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --local-dir ./Phi-3-mini-4k-instruct-onnx
   ```

   The [Dockerfile](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/src/phi-3-sidecar/Dockerfile) is configured to copy the model from *./Phi-3-mini-4k-instruct-onnx*.

1. Build the Docker image. For example:

   ```bash
   docker build --tag phi-3 .
   ```

1. Upload the built image to Azure Container Registry by following [Push your first image to your Azure container registry using the Docker CLI](/azure/container-registry/container-registry-get-started-docker-cli).

1. In the **Deployment Center** > **Containers (new)** tab, select **Add** > **Custom container** and configure the new container as follows:

   - **Name**: *phi-3*
   - **Image source**: **Azure Container Registry**
   - **Registry**: your registry
   - **Image**: the uploaded image
   - **Tag**: the image tag you want
   - **Port**: *8000*

1. Select **Apply**.

See [bring_your_own_slm/src/webapp](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/bring_your_own_slm/src/webapp) for a sample application that interacts with this custom sidecar container.
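As a rough illustration, the following C# sketch shows how a main-container app might call such a sidecar over the shared localhost network. The endpoint path (`/predict`) and payload shape here are assumptions for illustration only; check *model_api.py* in the sample repository for the actual contract.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Sketch: POST a prompt to the custom SLM sidecar on port 8000 and stream the reply.
// The /predict route and { text = ... } payload are hypothetical; match them to model_api.py.
var client = new HttpClient { BaseAddress = new Uri("http://localhost:8000") };

var payload = JsonSerializer.Serialize(new { text = "What is a sidecar container?" });
using var request = new HttpRequestMessage(HttpMethod.Post, "/predict")
{
    Content = new StringContent(payload, Encoding.UTF8, "application/json")
};

// ResponseHeadersRead lets us read the body as it streams instead of buffering it all.
using var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();

using var reader = new StreamReader(await response.Content.ReadAsStreamAsync());
while (!reader.EndOfStream)
{
    Console.Write(await reader.ReadLineAsync());
}
```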
Lines changed: 33 additions & 0 deletions

@@ -0,0 +1,33 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/07/2025
ms.author: cephalin
---

## Add the Phi-3 sidecar extension

In this section, you add the Phi-3 sidecar extension to your ASP.NET Core application hosted on Azure App Service.

1. Navigate to the Azure portal and go to your app's management page.
1. In the left-hand menu, select **Deployment** > **Deployment Center**.
1. On the **Containers** tab, select **Add** > **Sidecar extension**.
1. In the sidecar extension options, select **AI: phi-3-mini-4k-instruct-q4-gguf (Experimental)**.
1. Provide a name for the sidecar extension.
1. Select **Save** to apply the changes.
1. Wait a few minutes for the sidecar extension to deploy. Keep selecting **Refresh** until the **Status** column shows **Running**.

This Phi-3 sidecar extension exposes a [chat completion API like OpenAI's](https://platform.openai.com/docs/api-reference/chat/create) that responds to chat completion requests at `http://localhost:11434/v1/chat/completions`. For more information on how to interact with the API, see the following references; a request sketch follows the list.

- [OpenAI documentation: Create chat completion](https://platform.openai.com/docs/api-reference/chat/create)
- [OpenAI documentation: Streaming](https://platform.openai.com/docs/api-reference/chat-streaming)
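For example, a minimal C# sketch of a non-streaming request might look like the following. The payload mirrors the OpenAI chat completion format; whether fields beyond `messages` are honored by the sidecar is an assumption based on the sample application shown later in this tutorial.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Sketch: send one chat completion request to the Phi-3 sidecar extension.
var client = new HttpClient();

var payload = JsonSerializer.Serialize(new
{
    messages = new[]
    {
        new { role = "system", content = "You are a helpful assistant." },
        new { role = "user", content = "Suggest an outfit for a summer wedding." }
    },
    stream = false // assumed; the sample app uses stream = true
});

var response = await client.PostAsync(
    "http://localhost:11434/v1/chat/completions",
    new StringContent(payload, Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();

// In the OpenAI format, the reply text is at choices[0].message.content.
var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
Console.WriteLine(json.RootElement
    .GetProperty("choices")[0]
    .GetProperty("message")
    .GetProperty("content")
    .GetString());
```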
## Test the chatbot

1. In your app's management page, in the left-hand menu, select **Overview**.
1. Under **Default domain**, select the URL to open your web app in a browser.
1. Verify that the chatbot application is running and responding to user inputs.

:::image type="content" source="../../media/tutorial-ai-slm-dotnet/fashion-store-assistant-live.png" alt-text="Screenshot showing the fashion assistant app running in the browser.":::
Lines changed: 20 additions & 0 deletions

@@ -0,0 +1,20 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/08/2025
ms.author: cephalin
---

### How do sidecar containers handle internal communication?

Sidecar containers share the same network host as the main container, so the main container (and other sidecar containers) can reach any port on the sidecar with `localhost:<port>`. The example *startup.sh* uses `localhost:4318` to access port 4318 on the **otel-collector** sidecar.

In the **Edit container** dialog, the **Port** box isn't currently used by App Service. You can use it as part of the sidecar metadata, such as to indicate which port the sidecar is listening on.
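To make the localhost pattern concrete, here's a hedged C# sketch of a main container probing a sidecar port. The port matches the otel-collector example above, but the `/health` path is hypothetical; use whatever route your sidecar actually serves.

```csharp
using System.Net.Http;

// Sketch: reach a sidecar over the shared network host with localhost:<port>.
var client = new HttpClient();
var reply = await client.GetAsync("http://localhost:4318/health"); // path is illustrative
Console.WriteLine(reply.StatusCode);
```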
14+
15+
### Can a sidecar container receive internet requests?
16+
17+
No. App Service routes internet requests only to the main container. For code-based Linux apps, the built-in Linux container is the main container, and any sidecar container ([sitecontainers](/azure/templates/microsoft.web/sites/sitecontainers)) should be added with `IsMain=false`. For custom containers, all but one of the [sitecontainers](/azure/templates/microsoft.web/sites/sitecontainers) should have `IsMain=false`.
18+
19+
For more information on configuring `IsMain`, see [Microsoft.Web sites/sitecontainers](/azure/templates/microsoft.web/sites/sitecontainers).
20+
Lines changed: 13 additions & 0 deletions

@@ -0,0 +1,13 @@
---
author: cephalin
ms.service: azure-app-service
ms.topic: include
ms.date: 05/08/2025
ms.author: cephalin
---

## What's a sidecar container?

In Azure App Service, you can add up to nine sidecar containers for each Linux app. Sidecar containers let you deploy extra services and features to your Linux apps without making them tightly coupled to the main container (built-in or custom). For example, you can add monitoring, logging, configuration, and networking services as sidecar containers. An OpenTelemetry collector sidecar is one such monitoring example.

The sidecar containers run alongside the main application container in the same App Service plan.
articles/app-service/toc.yml

Lines changed: 17 additions & 2 deletions

@@ -49,6 +49,11 @@ items:
        href: app-service-asp-net-migration.md
      - name: Migrate containerized .NET
        href: ../migrate/tutorial-app-containerization-aspnet-app-service.md?bc=/azure/bread/toc.json&toc=/azure/app-service/toc.json
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-dotnet.md
  - name: Java
    items:
      - name: Quickstart
@@ -87,6 +92,10 @@ items:
        href: /azure/developer/java/migration/migrate-weblogic-to-jboss-eap-on-azure-app-service?toc=/azure/app-service/toc.json&bc=/azure/bread/toc.json
      - name: WebSphere
        href: /azure/developer/java/migration/migrate-websphere-to-jboss-eap-on-azure-app-service?toc=/azure/app-service/toc.json&bc=/azure/bread/toc.json
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-spring-boot.md
  - name: Node.js
    items:
      - name: Quickstart
@@ -103,6 +112,10 @@ items:
        href: tutorial-connect-app-access-microsoft-graph-as-user-javascript.md
      - name: to other Azure services with managed identity
        href: tutorial-connect-app-access-storage-javascript.md
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-expressjs.md
  - name: Python
    items:
      - name: Quickstart
@@ -119,6 +132,10 @@ items:
        href: tutorial-python-postgresql-app-django.md
      - name: using FastAPI
        href: tutorial-python-postgresql-app-fastapi.md
+     - name: AI
+       items:
+         - name: Local SLM with sidecar extension
+           href: tutorial-ai-slm-fastapi.md
  - name: PHP
    items:
      - name: Quickstart
@@ -455,8 +472,6 @@ items:
    items:
      - name: Deploy an application that uses OpenAI on App Service
        href: deploy-intelligent-apps.md
-     - name: Run an SLM in sidecar
-       href: tutorial-sidecar-local-small-language-model.md
      - name: Deploy a .NET app with Azure OpenAI and Azure SQL
        href: deploy-intelligent-apps-dotnet-to-azure-sql.md
      - name: Invoke OpenAPI app from Azure AI Agent
Lines changed: 103 additions & 0 deletions

@@ -0,0 +1,103 @@
---
title: "Tutorial: ASP.NET Core chatbot with SLM extension"
description: "Learn how to deploy an ASP.NET Core application integrated with a Phi-3 sidecar extension on Azure App Service."
author: cephalin
ms.author: cephalin
ms.date: 05/07/2025
ms.topic: tutorial
---

# Tutorial: Run chatbot in App Service with a Phi-3 sidecar extension (ASP.NET Core)

This tutorial guides you through deploying an ASP.NET Core chatbot application integrated with the Phi-3 sidecar extension on Azure App Service. By following the steps, you learn how to set up a scalable web app, add an AI-powered sidecar for enhanced conversational capabilities, and test the chatbot's functionality.

[!INCLUDE [advantages](includes/tutorial-ai-slm/advantages.md)]

## Prerequisites

- An [Azure account](https://azure.microsoft.com/free/) with an active subscription.
- A [GitHub account](https://github.com/).

## Deploy the sample application

1. In the browser, navigate to the [sample application repository](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar).
1. Start a new Codespace from the repository.
1. Log in with your Azure account:

   ```azurecli
   az login
   ```

1. Open the terminal in the Codespace and run the following commands:

   ```azurecli
   cd use_sidecar_extension/dotnetapp
   az webapp up --sku P3MV3 --os-type linux
   ```

   This startup command is a common setup for deploying ASP.NET Core applications to Azure App Service. For more information, see [Quickstart: Deploy an ASP.NET web app](quickstart-dotnetcore.md).
39+
40+
[!INCLUDE [phi-3-extension-create-test](includes/tutorial-ai-slm/phi-3-extension-create-test.md)]
41+
42+
## How the sample application works
43+
44+
The sample application demonstrates how to integrate a .NET service with the SLM sidecar extension. The `SLMService` class encapsulates the logic for sending requests to the SLM API and processing the streamed responses. This integration enables the application to generate conversational responses dynamically.
45+
46+
Looking in [use_sidecar_extension/dotnetapp/Services/SLMService.cs](https://github.com/Azure-Samples/ai-slm-in-app-service-sidecar/blob/main/use_sidecar_extension/dotnetapp/Services/SLMService.cs), you see that:
47+
48+
- The service reads the URL from `fashion.assistant.api.url`, which is set in *appsettings.json* and has the value of `http://localhost:11434/v1/chat/completions`.
49+
50+
```csharp
51+
public SLMService(HttpClient httpClient, IConfiguration configuration)
52+
{
53+
_httpClient = httpClient;
54+
_apiUrl = configuration["FashionAssistantAPI:Url"] ?? "httpL//localhost:11434";
55+
}
56+
```
- The POST payload includes the system message and the prompt that's built from the selected product and the user query.

  ```csharp
  var requestPayload = new
  {
      messages = new[]
      {
          new { role = "system", content = "You are a helpful assistant." },
          new { role = "user", content = prompt }
      },
      stream = true,
      cache_prompt = false,
      n_predict = 150
  };
  ```

- The POST request streams the response line by line. Each line is parsed to extract the generated content (or token).

  ```csharp
  var response = await _httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
  response.EnsureSuccessStatusCode();

  var stream = await response.Content.ReadAsStreamAsync();
  using var reader = new StreamReader(stream);

  while (!reader.EndOfStream)
  {
      var line = await reader.ReadLineAsync();
      // Server-sent events prefix each line with "data: "; strip it before parsing.
      line = line?.Replace("data: ", string.Empty).Trim();
      if (!string.IsNullOrEmpty(line) && line != "[DONE]")
      {
          var jsonObject = JsonNode.Parse(line);
          var responseContent = jsonObject?["choices"]?[0]?["delta"]?["content"]?.ToString();
          if (!string.IsNullOrEmpty(responseContent))
          {
              yield return responseContent;
          }
      }
  }
  ```
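The snippets above don't show how `SLMService` is registered or consumed. The following is a minimal sketch, assuming the streaming method returns `IAsyncEnumerable<string>`; the names `StreamResponseAsync` and `ChatRequest` are illustrative, not taken verbatim from the repository.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Register SLMService with a typed HttpClient so both constructor arguments are injected.
builder.Services.AddHttpClient<SLMService>();

var app = builder.Build();

// Stream tokens back to the browser as they arrive from the sidecar.
app.MapPost("/chat", async (HttpContext context, SLMService slm, ChatRequest request) =>
{
    context.Response.ContentType = "text/plain";
    await foreach (var token in slm.StreamResponseAsync(request.Product, request.Query))
    {
        await context.Response.WriteAsync(token);
        await context.Response.Body.FlushAsync(); // push each token immediately
    }
});

app.Run();

// Illustrative request DTO, not taken from the repository.
public record ChatRequest(string Product, string Query);
```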
[!INCLUDE [faq](includes/tutorial-ai-slm/faq.md)]

## Next steps

[Tutorial: Configure a sidecar container for a Linux app in Azure App Service](tutorial-sidecar.md)
