
Commit 752189d — TOC and fixes
1 parent ff0b486

File tree

2 files changed: +4 additions, −6 deletions

articles/ai-studio/concepts/evaluation-approach-gen-ai.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -54,7 +54,7 @@ Pre-production evaluation involves:
 
 The pre-production stage acts as a final quality check, reducing the risk of deploying an AI application that does not meet the desired performance or safety standards.
 
-- Bring your own data: You can evaluate your AI applications in pre-production using your own evaluation data with Azure AI Foundry or [Azure AI Evaluation SDK’s](../how-to/develop/evaluate-sdk.md) supported evaluators, including [generation quality, safety,](..evaluation-metrics-built-in) or [custom evaluators](../how-to/develop/evaluate-sdk.md#custom-evaluators), and [view results via the Azure AI Foundry portal](../how-to/evaluate-results.md).
+- Bring your own data: You can evaluate your AI applications in pre-production using your own evaluation data with Azure AI Foundry or [Azure AI Evaluation SDK’s](../how-to/develop/evaluate-sdk.md) supported evaluators, including [generation quality, safety,](./evaluation-metrics-built-in) or [custom evaluators](../how-to/develop/evaluate-sdk.md#custom-evaluators), and [view results via the Azure AI Foundry portal](../how-to/evaluate-results.md).
 - Simulators: If you don’t have evaluation data (test data), Azure AI [Evaluation SDK’s simulators](..//how-to/develop/simulator-interaction-data.md) can help by generating topic-related or adversarial queries. These simulators test the model’s response to situation-appropriate or attack-like queries (edge cases).
 - The [adversarial simulator](../how-to/develop/simulator-interaction-data.md#generate-adversarial-simulations-for-safety-evaluation) injects queries that mimic potential security threats or attempt jailbreaks, helping identify limitations and preparing the model for unexpected conditions.
 - [Context-appropriate simulators](../how-to/develop/simulator-interaction-data.md#generate-synthetic-data-and-simulate-non-adversarial-tasks) generate typical, relevant conversations you’d expect from users to test quality of responses.
```
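The changed bullet links to the SDK's custom evaluators; per that doc, a custom evaluator is simply a Python callable (or callable class) that returns a dictionary of scores. A minimal local sketch — the class name and word-count metric are hypothetical illustrations, not part of this commit:

```python
# Hypothetical custom evaluator sketch. Azure AI Evaluation SDK custom
# evaluators are plain callables that return a dict of metric values;
# the "answer length" metric here is purely illustrative.
class AnswerLengthEvaluator:
    """Scores a response by word count and flags whether it fits a limit."""

    def __init__(self, max_words: int = 100):
        self.max_words = max_words

    def __call__(self, *, response: str, **kwargs) -> dict:
        words = len(response.split())
        return {
            "answer_length": words,
            "within_limit": words <= self.max_words,
        }


evaluator = AnswerLengthEvaluator(max_words=5)
result = evaluator(response="Deployment passed the final quality check.")
```

Because the evaluator is an ordinary callable, it can be unit-tested locally before being passed alongside the built-in quality and safety evaluators.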
```diff
@@ -80,7 +80,7 @@ Cheat sheet:
 
 | Purpose | Process | Parameters |
 | -----| -----| ----|
-| What are you evaluating for? | Identify or build relevant evaluators | - [Quality and performance](./evaluation-metrics-built-in.md?tabs=warning#generation-quality-metrics) ( [Quality and performance sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py))<br> </br> - [Safety and Security](./evaluation-metrics-built-in.md?tabs=warning#risk-and-safety-metrics) ([Safety and Security sample notebook]((https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluatesafetyrisks.py))) <br> </br> - [Custom](../how-to/develop/evaluate-sdk.md#custom-evaluators) ([Custom sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py)) |
+| What are you evaluating for? | Identify or build relevant evaluators | - [Quality and performance](./evaluation-metrics-built-in.md?tabs=warning#generation-quality-metrics) ( [Quality and performance sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py))<br> </br> - [Safety and Security](./evaluation-metrics-built-in.md?tabs=warning#risk-and-safety-metrics) ([Safety and Security sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluatesafetyrisks.py)) <br> </br> - [Custom](../how-to/develop/evaluate-sdk.md#custom-evaluators) ([Custom sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py)) |
 | What data should you use? | Upload or generate relevant dataset | [Generic simulator for measuring Quality and Performance](./concept-synthetic-data.md) ([Generic simulator sample notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/finetune/Llama-notebooks/datagen/synthetic-data-generation.ipynb)) <br></br> - [Adversarial simulator for measuring Safety and Security](../how-to/develop/simulator-interaction-data.md) ([Adversarial simulator sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/simulate_and_evaluate_online_endpoint.ipynb))|
 | What resources should conduct the evaluation? | Run evaluation | - Local run <br> </br> - Remote cloud run |
 | How did my model/app perform? | Analyze results | [View aggregate scores, view details, score details, compare eval runs](..//how-to/evaluate-results.md) |
```
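The cheat sheet's "upload or generate relevant dataset" step assumes evaluation data in JSONL form, one record per line. A small local sketch of producing and reloading such a file — the query/response/context field names are an assumed convention here; match them to whatever columns your chosen evaluators expect:

```python
import json
import os
import tempfile

# Hypothetical "bring your own data" file: each JSONL line is one
# evaluation record. Field names follow the common query/response/context
# pattern and are illustrative, not mandated by this commit.
rows = [
    {
        "query": "What does pre-production evaluation check?",
        "response": "It acts as a final quality and safety gate.",
        "context": "The pre-production stage acts as a final quality check.",
    },
]

path = os.path.join(tempfile.mkdtemp(), "eval_data.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Read it back line by line, as a local evaluation run would.
with open(path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```

A file in this shape can then be passed to either a local run or a remote cloud run, per the "What resources should conduct the evaluation?" row above.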

articles/ai-studio/toc.yml

Lines changed: 2 additions & 4 deletions

```diff
@@ -312,20 +312,18 @@ items:
   items:
   - name: Evaluations concepts
     items:
-    - name: Approach to generative AI evaluations
+    - name: Evaluation of Generative AI Models and AI Applications
       href: concepts/evaluation-approach-gen-ai.md
     - name: Evaluation and monitoring metrics for generative AI
       href: concepts/evaluation-metrics-built-in.md
-    - name: Harms mitigation strategies with Azure AI
-      href: concepts/evaluation-improvement-strategies.md
     - name: Manually evaluate prompts in Azure AI Studio playground
       href: how-to/evaluate-prompts-playground.md
     - name: Generate synthetic and simulated data for evaluation
       href: how-to/develop/simulator-interaction-data.md
     - name: Evaluate with the Azure AI Evaluation SDK
       href: how-to/develop/evaluate-sdk.md
       displayName: code,accuracy,metrics
-    - name: Evaluate with Azure AI Studio
+    - name: Run evaluations from Azure AI Studio UI
       href: how-to/evaluate-generative-ai-app.md
     - name: View evaluation results in Azure AI Studio
       href: how-to/evaluate-results.md
```
