Skip to content

Commit f433403

Browse files
committed
fixes
1 parent 4ba0ad0 commit f433403

File tree

1 file changed

+4
-3
lines changed

1 file changed

+4
-3
lines changed

articles/ai-studio/concepts/evaluation-approach-gen-ai.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -80,10 +80,11 @@ Cheat sheet:
8080

8181
| Purpose | Process | Parameters |
8282
| -----| -----| ----|
83-
| What are you evaluating for? | Identify or build relevant evaluators | - [Quality and performance](./evaluation-metrics-built-in.md?tabs=warning#generation-quality-metrics) ( [Quality and performance sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py)) </br> - [Safety and Security](./evaluation-metrics-built-in.md?tabs=warning#risk-and-safety-metrics)) ([Safety and Security sample notebook]((https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluatesafetyrisks.py))) </br> [Custom](../how-to/develop/evaluate-sdk.md#custom-evaluators) ([Custom sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py))] |
84-
| What data should you use? | Upload or generate relevant dataset | [Generic simulator for measuring Quality and Performance](./concept-synthetic-data.md) ( Generic simulator sample notebook|(https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/finetune/Llama-notebooks/datagen/synthetic-data-generation.ipynb)] </br> - Adversarial simulator for measuring Safety and Security [[Adversarial simulator Docs](../how-to/develop/simulator-interaction-data.md), [Adversarial simulator sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/simulate_and_evaluate_online_endpoint.ipynb)] |
85-
| What resources should conduct the evaluation? | Run evaluation | - Local run <br> </br> - Remote cloud run |
83+
| What are you evaluating for? | Identify or build relevant evaluators | - [Quality and performance](./evaluation-metrics-built-in.md?tabs=warning#generation-quality-metrics) ( [Quality and performance sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py)) </br> - [Safety and Security](./evaluation-metrics-built-in.md?tabs=warning#risk-and-safety-metrics)) ([Safety and Security sample notebook]((https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluatesafetyrisks.py))) </br> [Custom](../how-to/develop/evaluate-sdk.md#custom-evaluators) ([Custom sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/evaluate.py))] |
84+
| What data should you use? | Upload or generate relevant dataset | [Generic simulator for measuring Quality and Performance](./concept-synthetic-data.md) ( [Generic simulator sample notebook|(https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/finetune/Llama-notebooks/datagen/synthetic-data-generation.ipynb)] </br> - Adversarial simulator for measuring Safety and Security [Adversarial simulator Docs](../how-to/develop/simulator-interaction-data.md) (Adversarial simulator sample notebook](https://github.com/Azure-Samples/rag-data-openai-python-promptflow/blob/main/src/evaluation/simulate_and_evaluate_online_endpoint.ipynb)] ) |
85+
| What resources should conduct the evaluation? | Run evaluation | - Local run </br> - Remote cloud run |
8686
| How did my model/app perform? | Analyze results | [View aggregate scores, view details, score details, compare eval runs](..//how-to/evaluate-results.md)] |
87+
| How can I improve? | Make changes to model, app, or evaluators - If evaluation results did not align to human feedback, adjust your evaluator. </br> - If evaluation results aligned to human feedback but did not meet quality/safety thresholds, apply targeted mitigations. |
8788

8889
## Related content
8990

0 commit comments

Comments
 (0)