Commit 8cbed03

Merge pull request #298644 from msangapu-msft/agent
draft
2 parents 5dc9001 + 0646505 commit 8cbed03

4 files changed: +321 -1 lines changed


articles/app-service/toc.yml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -249,6 +249,8 @@ items:
249249
href: sre-agent-overview.md
250250
- name: Use an SRE agent
251251
href: sre-agent-usage.md
252+
- name: SRE Agent tutorial
253+
href: tutorial-sre-agent.md
252254
- name: Monitor App Service
253255
href: monitor-app-service.md
254256
- name: Monitoring data reference
@@ -628,4 +630,4 @@ items:
628630
- name: Outbound IP address
629631
href: ip-address-change-outbound.md
630632
- name: TLS/SSL address
631-
href: ip-address-change-ssl.md
633+
href: ip-address-change-ssl.md
Lines changed: 318 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,318 @@
1+
---
2+
title: 'Tutorial: Troubleshoot an app using Azure SRE Agent (preview) in Azure App Service'
3+
description: Learn how to use Azure SRE Agent and Azure App Service to identify and fix app issues with AI-assisted troubleshooting.
4+
author: msangapu-msft
5+
ms.author: msangapu
6+
ms.topic: tutorial
7+
ms.date: 05/18/2025
8+
---
9+
10+
# Troubleshoot an App Service app using Azure SRE Agent (preview)
11+
12+
> [!NOTE]
13+
> Azure SRE Agent is in preview. By using SRE Agent, you consent to the product-specific [Preview Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
14+
15+
Site Reliability Engineering (SRE) focuses on creating reliable, scalable systems through automation and proactive management. An SRE Agent brings these principles to your cloud environment by providing AI-powered monitoring, troubleshooting, and remediation capabilities. It automates routine operational tasks and provides reasoned insights to help you maintain application reliability while reducing manual intervention. The agent is available as a chatbot, so you can ask questions and give natural language commands to maintain your applications and services. To ensure accuracy and control, any action the agent takes on your behalf requires your approval.
16+
17+
The sample app in this tutorial demonstrates error detection by simulating HTTP 500 failures in a controlled way. You can safely test these scenarios using Azure App Service **deployment slots**, which let you run different app configurations side by side.
18+
19+
You enable error simulation by setting the `INJECT_ERROR` app setting to `1`. When enabled, the app throws an HTTP 500 error after several button clicks, allowing you to see how the SRE Agent responds to application failures.
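The behavior described above can be modeled in a few lines. This is a minimal Python sketch, not the sample's actual .NET code: the `Counter` class, the `InjectedError` exception, and the threshold of five successful clicks are all assumptions chosen so that the sixth click fails, which matches the step later in this tutorial.

```python
import os


class InjectedError(Exception):
    """Stands in for the unhandled exception that surfaces as HTTP 500."""


class Counter:
    """Hypothetical model of the sample app's counter."""

    def __init__(self, fail_after=5, env=None):
        env = os.environ if env is None else env
        # Error simulation is on only when INJECT_ERROR is exactly "1".
        self.inject_error = env.get("INJECT_ERROR") == "1"
        # Assumed threshold: clicks beyond this number fail.
        self.fail_after = fail_after
        self.value = 0

    def increment(self):
        if self.inject_error and self.value >= self.fail_after:
            raise InjectedError("simulated failure (HTTP 500)")
        self.value += 1
        return self.value


if __name__ == "__main__":
    broken = Counter(env={"INJECT_ERROR": "1"})
    for click in range(1, 7):
        try:
            print("click", click, "->", broken.increment())
        except InjectedError as exc:
            print("click", click, "->", exc)
```

Reading the flag from the environment is what lets the same code behave differently in each slot: only the configuration changes, not the deployed build.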
20+
21+
In this tutorial, you will:
22+
23+
> [!div class="checklist"]
24+
> * Create an App Service app using the Azure portal.
25+
> * Deploy a sample app from GitHub.
26+
> * Verify that the sample app runs.
27+
> * Create a deployment slot to simulate failure.
28+
> * Set up an Azure SRE Agent to monitor the app.
29+
> * Trigger a failure by swapping to the broken slot.
30+
> * Use AI-driven chat to diagnose and resolve the issue by rolling back the swap.
31+
32+
[!INCLUDE [quickstarts-free-trial-note](~/reusable-content/ce-skilling/azure/includes/quickstarts-free-trial-note.md)]
33+
34+
## Prerequisites
35+
36+
To complete this tutorial, you need:
37+
- An [Azure subscription](https://azure.microsoft.com/free/).
38+
- `Microsoft.Authorization/roleAssignments/write` permissions, which SRE Agent setup needs to create role assignments. The Role Based Access Control Administrator and User Access Administrator roles both include this permission.
39+
40+
## 1. Create an App Service app
41+
42+
Start by creating a web app that the SRE Agent can monitor.
43+
44+
1. Sign in to the [Azure portal](https://portal.azure.com).
45+
46+
1. In the top search bar, search for **App Services**, then select it from the results.
47+
48+
1. Select **+ Create** and choose **Web App**.
49+
50+
### Configure the Basics tab
51+
52+
In the *Basics* tab, provide the following details:
53+
54+
**Project details**
55+
56+
| Setting | Value |
57+
|-----------------|--------------------------------|
58+
| Subscription | Your Azure subscription |
59+
| Resource group | **Create new** > `my-app-service-group` |
60+
61+
**Instance details**
62+
63+
| Setting | Value |
64+
|-----------------|--------------------------------|
65+
| Name | `my-sre-app` |
66+
| Publish | **Code** |
67+
| Runtime stack | **.NET 9 (STS)** |
68+
| Operating System| **Windows** |
69+
| Region | A region near you |
70+
71+
72+
1. Select the **Deployment** tab.
73+
74+
1. Under *Authentication settings*, enable **Basic authentication**.
75+
76+
> [!NOTE]
77+
> Basic authentication is used later for a one-time deployment from GitHub. [Disable Basic Auth](configure-basic-auth-disable.md?tabs=portal) in production.
78+
>
79+
80+
1. Select **Review and create**, then **Create** when validation passes.
81+
82+
1. When deployment completes, the message *Your deployment is complete* appears.
83+
84+
## 2. Deploy the sample app
85+
86+
Now that your App Service app is created, deploy the sample application from GitHub.
87+
88+
1. In the Azure portal, navigate to your newly created App Service by selecting **Go to resource**.
89+
90+
1. In the left-hand menu, under the *Deployment* section, select **Deployment Center**.
91+
92+
1. In the *Settings* tab, configure:
93+
94+
| Property | Value |
95+
|------------|--------------------------------------------------------------|
96+
| Source | **External Git** |
97+
| Repository | `https://github.com/Azure-Samples/app-service-dotnet-agent-tutorial`|
98+
| Branch | `main` |
99+
100+
1. Select **Save** to apply the deployment settings.
101+
102+
## 3. Verify the sample app
103+
104+
After deployment, confirm that the sample app is running as expected.
105+
106+
1. In the left menu of your App Service, select **Overview**.
107+
108+
1. Select **Browse** to open the app in a new browser tab. (It might take a minute to load.)
109+
110+
1. The app displays a large counter and two buttons:
111+
112+
:::image type="content" source="media/tutorial-sre-agent/verify-sample-primary-slot.png" alt-text="Screenshot of the .NET sample in the primary slot." border="false":::
113+
114+
1. Select the **Increment** button several times to observe the counter increase.
115+
116+
## 4. Set up a deployment slot for failure simulation
117+
118+
To simulate an app failure scenario, add a secondary deployment slot.
119+
120+
1. In the left menu of your App Service, under the *Deployment* section, select **Deployment slots**.
121+
122+
1. Select **Add slot**.
123+
124+
1. Enter the following values:
125+
126+
| Property | Value | Remarks |
127+
|---------------------|--------------|------------------------------------------------------------------------------------------|
128+
| Name | `broken` | The error scenario is triggered in this slot. |
129+
| Clone settings from | `my-sre-app` | Copies configuration from the main app. |
130+
131+
1. Scroll to the bottom of the dialog window and select **Add**. Slot creation might take a minute to complete.
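If you prefer scripting to the portal, the slot creation above can be approximated with the Azure CLI. The following sketch only builds the command line in Python. It assumes the Azure CLI is installed and signed in, and that `az webapp deployment slot create` with `--configuration-source` set to the app name clones settings from the production slot; the helper name `create_slot_command` is hypothetical.

```python
import shlex


def create_slot_command(app, resource_group, slot):
    """Build the Azure CLI argv that adds a deployment slot to an app."""
    return [
        "az", "webapp", "deployment", "slot", "create",
        "--name", app,
        "--resource-group", resource_group,
        "--slot", slot,
        # Passing the app name clones configuration from the production slot.
        "--configuration-source", app,
    ]


if __name__ == "__main__":
    argv = create_slot_command("my-sre-app", "my-app-service-group", "broken")
    # Print the command for review instead of running it.
    print(shlex.join(argv))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; pass the list to `subprocess.run` only after checking it against your own resource names.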
132+
133+
### Deploy the sample app to the slot
134+
135+
1. Once the slot is created, select the **broken** slot from the list.
136+
137+
1. In the left menu, under the *Deployment* section, select **Deployment Center**.
138+
139+
1. In the *Settings* tab, configure:
140+
141+
| Property | Value |
142+
|------------|---------------------------------------------------------------|
143+
| Source | **External Git** |
144+
| Repository | `https://github.com/Azure-Samples/app-service-dotnet-agent-tutorial` |
145+
| Branch | `main` |
146+
147+
1. Select **Save** to apply the deployment settings.
148+
149+
### Add an app setting to enable error simulation
150+
151+
To control error simulation, configure an app setting your app checks at runtime.
152+
153+
1. In the left menu of your App Service, select **Environment variables** under the *Settings* section.
154+
155+
1. At the top of the page, make sure the **broken** slot is selected.
156+
157+
1. Under the **App settings** tab, select **+ Add**.
158+
159+
1. Enter the following values:
160+
161+
| Property | Value | Remarks |
162+
|------------|---------------|--------------------------------------------------------------|
163+
| Name | `INJECT_ERROR`| Must be exactly `INJECT_ERROR` (all caps, no spaces). |
164+
| Value | `1` | Enables error simulation in the app. |
165+
166+
1. Make sure the **Deployment slot setting** box is **not** checked, so that the setting moves to production along with the slot's content when you swap.
167+
168+
1. Select **Apply** to add the setting.
169+
170+
1. At the bottom of the *Environment variables* page, select **Apply** to apply the changes.
171+
172+
1. When prompted, select **Confirm** to confirm and restart the app in the selected slot.
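The reason the **Deployment slot setting** box matters is easier to see in a small model. The sketch below is a simplified illustration, not App Service's actual swap implementation: it assumes that settings marked as slot settings ("sticky") stay with their slot, while all other app settings trade places during a swap. The function name `swap_slots` is hypothetical.

```python
def swap_slots(source, target, sticky=frozenset()):
    """Return (new_source, new_target) app settings after a slot swap.

    Settings named in `sticky` stay with their slot; all others trade places.
    """
    def split(settings):
        stay = {k: v for k, v in settings.items() if k in sticky}
        move = {k: v for k, v in settings.items() if k not in sticky}
        return stay, move

    source_stay, source_move = split(source)
    target_stay, target_move = split(target)
    return {**source_stay, **target_move}, {**target_stay, **source_move}


if __name__ == "__main__":
    broken = {"INJECT_ERROR": "1"}
    production = {}

    # Box unchecked: the setting follows the swap into production.
    print(swap_slots(broken, production))

    # Box checked: the setting stays behind and production stays healthy.
    print(swap_slots(broken, production, sticky={"INJECT_ERROR"}))
```

With the box unchecked, `INJECT_ERROR` ends up in production after the swap, which is what makes the failure in this tutorial reproducible.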
173+
174+
## 5. Create an Azure SRE Agent
175+
176+
Now, create an Azure SRE Agent to monitor your App Service app.
177+
178+
1. In the Azure portal, search for and select **Azure SRE Agent**.
179+
180+
1. Select **+ Create**.
181+
182+
1. In the *Create agent* window, enter the following values:
183+
184+
| Property | Value | Remarks |
185+
|------------------|---------------------------|-------------------------------------------------------------------------|
186+
| Subscription | Your Azure subscription | |
187+
| Resource group | `my-sre-agent-group` | New group for the Azure SRE Agent |
188+
| Name | `my-sre-agent`| |
189+
| Region | **Sweden Central** | Required during preview; can monitor resources in any Azure region |
190+
| Choose role | **Contributor** | Grants the agent permission to take action on your behalf |
191+
192+
1. Select **Select resource groups**.
193+
194+
1. In the *Selected resource groups to monitor* window, search for and select `my-app-service-group`.
195+
196+
1. Select **Save**.
197+
198+
1. Back in the *Create agent* window, select **Create**. The agent creation process takes a few minutes to complete.
199+
200+
## 6. Chat with your agent
201+
202+
Once your SRE Agent is deployed and connected to your resource group, you can interact with it using natural language to monitor and troubleshoot your app.
203+
204+
1. In the Azure portal, search for and select **Azure SRE Agent**.
205+
206+
1. From the list of agents, select **my-sre-agent**.
207+
208+
1. Select **Chat with agent**.
209+
210+
1. In the chat box, enter the following command:
211+
212+
```text
213+
List my App Service apps
214+
```
215+
216+
1. The agent responds with a list of App Service apps deployed in the `my-app-service-group` resource group.
217+
218+
Now that the agent can see your app, you’re ready to simulate a failure and let the agent help you resolve it.
219+
220+
## 7. Break the app
221+
222+
Now simulate a failure scenario by swapping to the broken deployment slot.
223+
224+
1. In your App Service, go to the *Deployment* section in the left-hand menu and select **Deployment slots**.
225+
226+
1. Select **Swap**.
227+
228+
1. In the *Swap* dialog, configure:
229+
230+
| Property | Value | Remarks |
231+
|----------|---------------------|----------------------------------|
232+
| Source | `my-sre-app-broken` | The slot with the faulty version |
233+
| Target | `my-sre-app` | The production slot |
234+
235+
1. Scroll to the bottom and select **Start Swap**. The swap operation might take a minute to complete.
236+
237+
1. Once the swap is complete, browse to the app’s URL.
238+
239+
:::image type="content" source="media/tutorial-sre-agent/verify-sample-broken-slot.png" alt-text="Screenshot of the .NET sample in the broken slot." border="false":::
240+
241+
1. Select the **Increment** button six times.
242+
243+
1. You should see the app fail and return an HTTP 500 error.
244+
245+
1. Refresh the page several times (press F5, or Command+R on macOS) to generate additional HTTP 500 errors, which help the SRE Agent detect and diagnose the issue.
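Repeated refreshes can also be scripted. The following is a minimal sketch using only the Python standard library: it requests a URL several times and tallies the status codes. The helper names are hypothetical, you supply your own app URL on the command line, and whether a plain GET reproduces the error depends on how the sample tracks its state, which this tutorial doesn't specify.

```python
from collections import Counter
import sys
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def tally_statuses(url, attempts=10, fetch=fetch_status):
    """Request `url` `attempts` times and count each status code seen."""
    return Counter(fetch(url) for _ in range(attempts))


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Usage: python probe.py <your app URL>
        print(dict(tally_statuses(sys.argv[1])))
```

The `fetch` parameter exists so the tally logic can be exercised with a stand-in function, without sending traffic to a live app.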
246+
247+
## 8. Fix the app
248+
249+
Now that the app is experiencing failures, use the SRE Agent to diagnose and resolve the issue.
250+
251+
1. In the Azure portal, search for and select **Azure SRE Agent**.
252+
253+
1. From the list of agents, select **my-sre-agent**.
254+
255+
1. Select **Chat with agent**.
256+
257+
1. In the chat box, enter the following command:
258+
259+
```text
260+
What's wrong with my-sre-app?
261+
```
262+
263+
1. The agent begins to analyze the app’s health. You should see diagnostic messages related to availability, CPU and memory usage, and the recent slot swap.
264+
265+
> Each session may vary, but you should see a message similar to:
266+
>
267+
> *“I will now perform mitigation for my-sre-app by swapping the slots back to recover the application to a healthy state. Please note that swapping slots back may not always immediately restore health. I will keep you updated on the progress.”*
268+
269+
1. After a pause, the agent prompts you to approve the rollback:
270+
271+
> *Performing Slot Swap rollback to Restore Application Availability for my-sre-app*
272+
>
273+
> **[Approve]**   **[Deny]**
274+
275+
1. Select **Approve** to initiate the rollback.
276+
277+
1. Once the rollback is complete, the agent confirms:
278+
279+
> *The slot swap for my-sre-app has been completed successfully (timestamp). The production slot has been restored. I will now continue with post-mitigation steps:*
280+
>
281+
> *I will ask you for the correct GitHub repo URL to raise an issue for the swap-related downtime.*
282+
> *I will monitor the app and provide an availability update in 5 minutes.*
283+
>
284+
> *Please provide the GitHub repository URL where you want the issue to be raised.*
285+
286+
## 9. Verify the fix
287+
288+
After the SRE Agent rolls back the slot swap, confirm that your app is functioning correctly.
289+
290+
1. Open your App Service app in a browser by selecting **Browse** from the **Overview** page.
291+
292+
1. Notice that the text "ERROR INJECTION ENABLED" no longer appears, confirming the app has reverted to its original state.
293+
294+
1. Select the **Increment** button six times to confirm that no errors occur.
295+
296+
## Clean up resources
297+
298+
If you no longer need the app and agent created in this tutorial, you can delete the associated resource groups to avoid incurring charges.
299+
300+
Repeat the following steps for both of these resource groups:
301+
302+
- `my-app-service-group` (App Service resource group)
303+
- `my-sre-agent-group` (Azure SRE Agent resource group)
304+
305+
1. In the Azure portal, navigate to **Resource groups**.
306+
307+
1. Select the resource group you want to delete.
308+
309+
1. From the *Overview* tab, select **Delete resource group**.
310+
311+
1. In the confirmation dialog, enter the name of the resource group.
312+
313+
1. Select **Delete**. Deletion takes a few minutes to complete.
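The same cleanup can be scripted with the Azure CLI. This sketch assumes the CLI is installed and signed in, and uses `az group delete` with `--yes` (skip the confirmation prompt) and `--no-wait` (return before deletion finishes). The helper names are hypothetical, and the `run` parameter is injectable so that the commands can be inspected without deleting anything.

```python
import subprocess

# Resource group names used in this tutorial.
RESOURCE_GROUPS = ["my-app-service-group", "my-sre-agent-group"]


def delete_group_command(name):
    """Build the Azure CLI argv that deletes one resource group."""
    return ["az", "group", "delete", "--name", name, "--yes", "--no-wait"]


def delete_groups(names, run=subprocess.run):
    """Issue one delete command per resource group."""
    for name in names:
        # check=True raises if the CLI exits with a nonzero status.
        run(delete_group_command(name), check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    delete_groups(RESOURCE_GROUPS, run=lambda argv, check: print(" ".join(argv)))
```

Deleting a resource group removes everything in it, so review the printed commands before replacing the dry-run `run` argument with the default.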
314+
315+
## Next steps
316+
317+
* [Overview of Azure App Service](overview.md)
318+
* [Use Azure Developer CLI for modern app development](/azure/developer/azure-developer-cli/overview)
