Skip to content

Commit d7f7efd

Browse files
committed
fixes
1 parent 7fd561f commit d7f7efd

File tree

4 files changed

+5
-5
lines changed

4 files changed

+5
-5
lines changed

articles/ai-studio/concepts/evaluation-metrics-built-in.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -81,8 +81,8 @@ We support the following AI-Assisted metrics for the above task types:
8181

8282
| Task type | Question and Generated Answers Only (No context or ground truth needed) | Question and Generated Answers + Context | Question and Generated Answers + Context + Ground Truth |
8383
| --- | --- | --- | --- |
84-
| Question Answering | - Risk and safety metrics (all AI-Assisted): hateful and unfair content defect rate, sexual content defect rate, violent content defect rate, self-harm-related content defect rate, and jailbreak defect rate <br> - Generation quality metrics (all AI-Assisted): Coherence, Fluency |Previous Column Metrics <br> + <br> Generation quality metrics (all AI-Assisted): <br> - Groundedness <br> - Relevance |Previous Column Metrics <br> + <br> Generation quality metrics: <br> Similarity (AI-assisted) <br> F1-Score (traditional ML metric) |
85-
| Conversation | - Risk and safety metrics (all AI-Assisted): hateful and unfair content defect rate, sexual content defect rate, violent content defect rate, self-harm-related content defect rate, and jailbreak defect rate <br> - Generation quality metrics (all AI-Assisted): Coherence, Fluency | Previous Column Metrics <br> + <br> Generation quality metrics (all AI-Assisted): <br> - Groundedness <br> - Retrieval Score | N/A |
84+
| [Question Answering](#question-answering-single-turn) | - Risk and safety metrics (all AI-Assisted): hateful and unfair content defect rate, sexual content defect rate, violent content defect rate, self-harm-related content defect rate, and jailbreak defect rate <br> - Generation quality metrics (all AI-Assisted): Coherence, Fluency |Previous Column Metrics <br> + <br> Generation quality metrics (all AI-Assisted): <br> - Groundedness <br> - Relevance |Previous Column Metrics <br> + <br> Generation quality metrics: <br> Similarity (AI-assisted) <br> F1-Score (traditional ML metric) |
85+
| [Conversation](#conversation-single-turn-and-multi-turn) | - Risk and safety metrics (all AI-Assisted): hateful and unfair content defect rate, sexual content defect rate, violent content defect rate, self-harm-related content defect rate, and jailbreak defect rate <br> - Generation quality metrics (all AI-Assisted): Coherence, Fluency | Previous Column Metrics <br> + <br> Generation quality metrics (all AI-Assisted): <br> - Groundedness <br> - Retrieval Score | N/A |
8686

8787
> [!NOTE]
8888
> While we are providing you with a comprehensive set of built-in metrics that facilitate the easy and efficient evaluation of the quality and safety of your generative AI application, it is best practice to adapt and customize them to your specific task types. Furthermore, we empower you to introduce entirely new metrics, enabling you to measure your applications from fresh angles and ensuring alignment with your unique objectives.

articles/ai-studio/how-to/evaluate-flow-results.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -79,7 +79,7 @@ Evaluation results might have different meanings for different audiences. For ex
7979

8080
When understanding each content risk metric, you can easily view each metric definition and severity scale by selecting the metric name above the chart to see a detailed explanation in a pop-up.
8181

82-
:::image type="content" source="../media/evaluations/view-results/ risk-safety-metric-definition-popup.png" alt-text="Screenshot of risk and safety metrics detailed explanation pop-up." lightbox="../media/evaluations/view-results/ risk-safety-metric-definition-popup.png":::
82+
:::image type="content" source="../media/evaluations/view-results/risk-safety-metric-definition-popup" alt-text="Screenshot of risk and safety metrics detailed explanation pop-up." lightbox="../media/evaluations/view-results/risk-safety-metric-definition-popup.png":::
8383

8484
If there's something wrong with the run, you can also debug your evaluation run with the log and trace.
8585

articles/ai-studio/includes/evaluations/from-data/python.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@ When using AI-assisted performance and quality metrics, you must specify a GPT m
4040
When using AI-assisted risk and safety metrics, you do not need to provide a connection and deployment. The Azure AI Studio safety evaluations back-end service provisions a GPT-4 model that can generate content risk severity scores and reasoning to enable you to evaluate your application for content harms.
4141

4242
> [!NOTE]
43-
> Currently AI-assisted risk and safety metrics are only available in the following regions: East US 2, France Central, UK South, Sweden Central. Groundedness measurement leveraging Azure AI Content Safety Groundedness Detection is only supported following regions: East US 2 and Sweden Central. Read more about the [supported metrics](../../../concepts/evaluation-metrics-built-in.md#metrics-for-single-turn-question-answering-without-retrieval-non-rag) and when to use which metric.
43+
> Currently AI-assisted risk and safety metrics are only available in the following regions: East US 2, France Central, UK South, Sweden Central. Groundedness measurement leveraging Azure AI Content Safety Groundedness Detection is only supported following regions: East US 2 and Sweden Central. Read more about the [supported metrics](../../../concepts/evaluation-metrics-built-in.md) and when to use which metric.
4444
4545
### Supported input data format for question answering
4646

articles/ai-studio/includes/evaluations/from-data/studio.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -146,7 +146,7 @@ For guidance on the specific data mapping requirements for each metric, refer to
146146
| Violent content | Required: list |
147147
| Sexual content | Required: list |
148148

149-
Messages: message key that follows the chat protocol format defined by Azure Open AI for [conversations](../../../concepts/evaluation-metrics-built-in.md#Conversation-single-turn-and-multi-turn). For Groundedness, Relevance and Retrieval score, the citations key is required within your messages list.
149+
Messages: message key that follows the chat protocol format defined by Azure Open AI for [conversations](../../../concepts/evaluation-metrics-built-in.md#conversation-single-turn-and-multi-turn). For Groundedness, Relevance and Retrieval score, the citations key is required within your messages list.
150150

151151
#### Review and finish
152152

0 commit comments

Comments
 (0)