Commit 6b5ff0f

Merge pull request #265622 from eric-urban/eur/monitor-deployed-prompt-flow
monitor deployed prompt flow
2 parents 880bcdd + 4b56591 commit 6b5ff0f

2 files changed

+64
-65
lines changed

articles/ai-studio/how-to/monitor-quality-safety.md

Lines changed: 63 additions & 64 deletions
Original file line numberDiff line numberDiff line change
@@ -1,33 +1,87 @@
11
---
2-
title: Monitor quality and safety of deployed applications
2+
title: Monitor quality and safety of deployed prompt flow applications
33
titleSuffix: Azure AI Studio
4-
description: Learn how to monitor quality and safety of deployed applications with Azure AI Studio.
4+
description: Learn how to monitor quality and safety of deployed prompt flow applications with Azure AI Studio.
55
manager: scottpolly
66
ms.service: azure-ai-studio
77
ms.custom:
88
- ignite-2023
99
ms.topic: how-to
10-
ms.date: 11/15/2023
10+
ms.date: 2/7/2024
1111
ms.reviewer: fasantia
1212
ms.author: mopeakande
1313
author: msakande
1414
---
1515

16-
# Monitor quality and safety of deployed applications
16+
# Monitor quality and safety of deployed prompt flow applications
1717

18-
Monitoring models that are deployed in production is an essential part of the generative AI application lifecycle. Changes in data and consumer behavior can influence your application over time, resulting in outdated systems that negatively affect business outcomes and expose organizations to compliance, economic, and reputational risks.
18+
Monitoring models that are deployed in production is an essential part of the generative AI application lifecycle. Changes in data and consumer behavior can influence your application over time, resulting in outdated systems that negatively affect business outcomes and expose organizations to compliance, economic, and reputation risks.
1919

2020
Azure AI model monitoring for generative AI applications makes it easier for you to monitor your applications in production for safety and quality on a recurring cadence to ensure they're delivering maximum business value.
2121

22-
Capabilities and integrations include:
23-
- Collect production data using Model data collector from a prompt flow deployment.
22+
Capabilities and integrations for monitoring a prompt flow deployment include:
23+
- Collect production data using the model data collector.
2424
- Apply Responsible AI evaluation metrics such as groundedness, coherence, fluency, relevance, and similarity, which are interoperable with prompt flow evaluation metrics.
2525
- Preconfigured alerts and defaults to run monitoring on a recurring basis.
2626
- Consume results and configure advanced behavior in Azure AI Studio.
2727

28+
## Set up monitoring for prompt flow
29+
30+
Follow these steps to set up monitoring for your prompt flow deployment:
31+
32+
1. Confirm your flow runs successfully, and that the required inputs and outputs are configured for the [metrics you want to assess](#evaluation-metrics). The minimum required parameters of collecting only inputs and outputs provide only two metrics: coherence and fluency. You must configure your flow according to the [flow and metric configuration requirements](#flow-and-metric-configuration-requirements).
33+
34+
:::image type="content" source="../media/deploy-monitor/monitor/user-experience.png" alt-text="Screenshot of prompt flow editor with deploy button." lightbox = "../media/deploy-monitor/monitor/user-experience.png":::
35+
36+
1. Deploy your flow. By default, both inferencing data collection and application insights are enabled automatically. These are required for the creation of your monitor.
37+
38+
:::image type="content" source="../media/deploy-monitor/monitor/basic-settings.png" alt-text="Screenshot of basic settings in the deployment wizard." lightbox = "../media/deploy-monitor/monitor/basic-settings.png":::
39+
40+
1. By default, all outputs of your deployment are collected using Azure AI's Model Data Collector. As an optional step, you can enter the advanced settings to confirm that your desired columns (for example, context or ground truth) are included in the endpoint response.
41+
42+
Your deployed flow needs to be configured in the following way:
43+
- Flow inputs & outputs: You need to name your flow outputs appropriately and remember these column names when creating your monitor. In this article, we use the following settings:
44+
- Inputs (required): "prompt"
45+
- Outputs (required): "completion"
46+
- Outputs (optional): "context" and/or "ground truth"
47+
48+
- Data collection: The **inferencing data collection** toggle must be enabled using Model Data Collector
49+
50+
- Outputs: In the prompt flow deployment wizard, confirm the required outputs are selected (such as completion, context, and ground_truth) that meet your metric configuration requirements.
51+
52+
1. Test your deployment in the deployment **Test** tab.
53+
54+
:::image type="content" source="../media/deploy-monitor/monitor/test-deploy.png" alt-text="Screenshot of the deployment test page." lightbox = "../media/deploy-monitor/monitor/test-deploy.png":::
55+
56+
> [!NOTE]
57+
> Monitoring requires the endpoint to be used at least 10 times to collect enough data to provide insights. If you'd like to test sooner, manually send about 50 rows in the 'test' tab before running the monitor.
58+
59+
1. Create your monitor by either enabling from the deployment details page, or the **Monitoring** tab.
60+
61+
:::image type="content" source="../media/deploy-monitor/monitor/enable-monitoring.png" alt-text="Screenshot of the button to enable monitoring." lightbox = "../media/deploy-monitor/monitor/enable-monitoring.png":::
62+
63+
1. Ensure your columns are mapped from your flow as defined in the previous requirements.
64+
65+
:::image type="content" source="../media/deploy-monitor/monitor/column-map.png" alt-text="Screenshot of columns mapped for monitoring metrics." lightbox = "../media/deploy-monitor/monitor/column-map.png":::
66+
67+
1. View your monitor in the **Monitor** tab.
68+
69+
:::image type="content" source="../media/deploy-monitor/monitor/monitor-metrics.png" alt-text="Screenshot of the monitoring result metrics." lightbox = "../media/deploy-monitor/monitor/monitor-metrics.png":::
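The column requirements from the steps above can be checked before you create a monitor. The following is a minimal sketch, not part of Azure AI Studio or any SDK: the function name `check_flow_columns` and the result keys are hypothetical, and the column names are the ones this article uses ("prompt", "completion", "context", "ground_truth"). It encodes only what the article states, namely that the required input and output yield coherence and fluency.

```python
# Hypothetical helper: verifies that a flow's column names satisfy the
# minimum monitoring requirements described in this article.
REQUIRED_COLUMNS = {"prompt", "completion"}
OPTIONAL_COLUMNS = {"context", "ground_truth"}


def check_flow_columns(columns):
    """Report whether the required columns exist and which optional ones do."""
    present = set(columns)
    missing = sorted(REQUIRED_COLUMNS - present)
    return {
        "ready": not missing,
        "missing_required": missing,
        "optional_present": sorted(OPTIONAL_COLUMNS & present),
        # The article states that inputs and outputs alone give these two metrics.
        "baseline_metrics": ["coherence", "fluency"] if not missing else [],
    }


if __name__ == "__main__":
    print(check_flow_columns(["prompt", "completion", "context"]))
```

A flow that is missing "completion" would report `ready` as `False`, which signals that the column mapping step cannot succeed until the flow outputs are renamed or added.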
70+
71+
By default, operational metrics such as requests per minute and request latency show up. The default safety and quality monitoring signals are configured with a 10% sample rate and run on your default workspace Azure OpenAI connection.
72+
73+
Your monitor is created with default settings:
74+
- 10% sample rate
75+
- 4/5 (thresholds / recurrence)
76+
- Weekly recurrence on Monday mornings
77+
- Alerts are delivered to the inbox of the person that triggered the monitor.
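To get a feel for what a 10% sample rate means in practice, the sketch below does the arithmetic for a given number of collected rows. It is an illustration only: the `MonitorDefaults` class and `expected_sample_size` function are hypothetical names, and the service's actual sampling method is not described in this article, so treat the result as a rough estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorDefaults:
    """Default monitor settings as listed in this article (hypothetical container)."""
    sample_percent: int = 10      # 10% sample rate
    recurrence: str = "weekly"    # weekly recurrence
    run_day: str = "Monday"       # Monday mornings


def expected_sample_size(collected_rows, defaults=MonitorDefaults()):
    """Estimate how many collected rows a run would evaluate at the sample rate."""
    if collected_rows < 0:
        raise ValueError("collected_rows must be non-negative")
    # Integer arithmetic avoids floating-point rounding surprises.
    return collected_rows * defaults.sample_percent // 100


if __name__ == "__main__":
    for rows in (10, 50, 1000):
        print(rows, "rows collected ->", expected_sample_size(rows), "rows sampled")
```

Under this estimate, a small number of collected rows leaves very few rows to evaluate, which is one reason to send more test rows before the first monitoring run.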
78+
79+
To view more details about your monitoring metrics, you can follow the link to navigate to monitoring in Azure Machine Learning studio, which is a separate studio that allows for more customizations.
80+
81+
2882
## Evaluation metrics
2983

30-
Metrics are generated by the following state-of-the-art GPT language models configured with specific evaluation instructions (prompt templates) which act as evaluator models for sequence-to-sequence tasks. This technique has shown strong empirical results and high correlation with human judgment when compared to standard generative AI evaluation metrics. For more information about prompt flow evaluation, see [Submit bulk test and evaluate a flow](./flow-bulk-test-evaluation.md) and [evaluation and monitoring metrics for generative AI](../concepts/evaluation-metrics-built-in.md).
84+
Metrics are generated by the following state-of-the-art GPT language models configured with specific evaluation instructions (prompt templates) which act as evaluator models for sequence-to-sequence tasks. This technique has strong empirical results and high correlation with human judgment when compared to standard generative AI evaluation metrics. For more information about prompt flow evaluation, see [Submit bulk test and evaluate a flow](./flow-bulk-test-evaluation.md) and [evaluation and monitoring metrics for generative AI](../concepts/evaluation-metrics-built-in.md).
3185

3286
These GPT models are supported with monitoring and configured as your Azure OpenAI resource:
3387

@@ -53,7 +107,7 @@ When creating your flow, you need to ensure your column names are mapped. The fo
53107
|------|------------|----------|
54108
| Prompt text | The original prompt given (also known as "inputs" or "question") | Required |
55109
| Completion text | The final completion from the API call that is returned (also known as "outputs" or "answer") | Required |
56-
| Context text | Any context data that is sent to the API call, together with original prompt. For example, if you hope to get search results only from certain certified information sources/website, you can define in the evaluation steps. This is an optional step that can be configured through prompt flow. | Optional |
110+
| Context text | Any context data that is sent to the API call, together with the original prompt. For example, if you want search results only from certain certified information sources or websites, you can define this in the evaluation steps. | Optional |
57111
| Ground truth text | The user-defined text as the "source of truth" | Optional |
58112

59113
The parameters configured in your data asset dictate which metrics you can produce, according to this table:
@@ -68,61 +122,6 @@ What parameters are configured in your data asset dictates what metrics you can
68122

69123
For more information, see [question answering metric requirements](evaluate-generative-ai-app.md#question-answering-metric-requirements).
70124

71-
## User Experience
72-
73-
Confirm your flow runs successfully, and that the required inputs and outputs are configured for the metrics you want to assess. The minimum required parameters of collecting only inputs and outputs provide only two metrics: coherence and fluency. You must configure your flow according to the [prior guidance](#flow-and-metric-configuration-requirements).
74-
75-
:::image type="content" source="../media/deploy-monitor/monitor/user-experience.png" alt-text="Screenshot of prompt flow editor with deploy button." lightbox = "../media/deploy-monitor/monitor/user-experience.png":::
76-
77-
Deploy your flow. By default, both inferencing data collection and application insights are enabled automatically. These are required for the creation of your monitor.
78-
79-
:::image type="content" source="../media/deploy-monitor/monitor/basic-settings.png" alt-text="Screenshot of basic settings in the deployment wizard." lightbox = "../media/deploy-monitor/monitor/basic-settings.png":::
80-
81-
By default, all outputs of your deployment are collected using Azure AI's Model Data Collector. As an optional step, you can enter the advanced settings to confirm that your desired columns (for example, context of ground truth) are included in the endpoint response.
82-
83-
In summary, your deployed flow needs to be configured in the following way:
84-
85-
- Flow inputs & outputs: You need to name your flow outputs appropriately and remember these column names when creating your monitor. In this article, we use the following:
86-
- Inputs (required): "prompt"
87-
- Outputs (required): "completion"
88-
- Outputs (optional): "context" and/or "ground truth"
89-
90-
- Data collection: in the "Deployment" (Step #2 of the prompt flow deployment wizard), the 'inference data collection' toggle must be enabled using Model Data Collector
91-
92-
- Outputs: In the Outputs (Step #3 of the prompt flow deployment wizard), confirm you have selected the required outputs listed above (for example, completion | context | ground_truth) that meet your metric configuration requirements.
93-
94-
Test your deployment in the deployment **Test** tab.
95-
96-
:::image type="content" source="../media/deploy-monitor/monitor/test-deploy.png" alt-text="Screenshot of the deployment test page." lightbox = "../media/deploy-monitor/monitor/test-deploy.png":::
97-
98-
99-
> [!NOTE]
100-
> Monitoring requires the endpoint to be used at least 10 times to collect enough data to provide insights. If you'd like to test sooner, manually send about 50 rows in the 'test' tab before running the monitor.
101-
102-
Create your monitor by either enabling from the deployment details page, or the **Monitoring** tab.
103-
104-
:::image type="content" source="../media/deploy-monitor/monitor/enable-monitoring.png" alt-text="Screenshot of the button to enable monitoring." lightbox = "../media/deploy-monitor/monitor/enable-monitoring.png":::
105-
106-
Ensure your columns are mapped from your flow as defined in the previous requirements.
107-
108-
:::image type="content" source="../media/deploy-monitor/monitor/column-map.png" alt-text="Screenshot of columns mapped for monitoring metrics." lightbox = "../media/deploy-monitor/monitor/column-map.png":::
109-
110-
111-
View your monitor in the **Monitor** tab.
112-
113-
:::image type="content" source="../media/deploy-monitor/monitor/monitor-metrics.png" alt-text="Screenshot of the monitoring result metrics." lightbox = "../media/deploy-monitor/monitor/monitor-metrics.png":::
114-
115-
By default, operational metrics such as requests per minute and request latency show up. The default safety and quality monitoring signal are configured with a 10% sample rate and will run on your default workspace Azure Open AI connection.
116-
117-
Your monitor is created with default settings:
118-
- 10% sample rate
119-
- 4/5 (thresholds / recurrence)
120-
- Weekly recurrence on Monday mornings
121-
- Alerts are delivered to the inbox of the person that triggered the monitor.
122-
123-
To view more details about your monitoring metrics, you can follow the link to navigate to monitoring in Azure Machine Learning studio, which is a separate studio that allows for more customizations.
124-
125-
126125
## Next steps
127126

128127
- Learn more about what you can do in [Azure AI Studio](../what-is-ai-studio.md)

articles/ai-studio/toc.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -168,7 +168,7 @@
168168
- name: Deploy open models
169169
href: how-to/deploy-models-open.md
170170
displayName: oss, open source
171-
- name: Monitor quality and safety of deployed applications
171+
- name: Monitor prompt flow deployments
172172
href: how-to/monitor-quality-safety.md
173173
- name: Troubleshoot deployments and monitoring
174174
href: how-to/troubleshoot-deploy-and-monitor.md
