articles/ai-studio/how-to/develop/evaluate-sdk.md (4 additions & 4 deletions)
@@ -329,7 +329,7 @@ For conversation outputs, per-turn results are stored in a list and the overall
> [!NOTE]
> We strongly recommend that users migrate their code to use the key without prefixes (for example, `groundedness.groundedness`) to allow your code to support more evaluator models.
-### Risk and safety evaluators
+### Risk and safety evaluators (preview)
When you use AI-assisted risk and safety metrics, a GPT model isn't required. Instead of `model_config`, provide your `azure_ai_project` information. This accesses the Azure AI project safety evaluations back-end service, which provisions a GPT model specific to harms evaluation that can generate content risk severity scores and reasoning to enable the safety evaluators.
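As a minimal, hedged sketch of this pattern (not part of the diff; the subscription, resource group, project name, and query/response strings are placeholder values):

```python
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ViolenceEvaluator

# Placeholder project details: replace with your own Azure AI project values.
azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

# Safety evaluators take azure_ai_project plus a credential instead of model_config.
violence_eval = ViolenceEvaluator(
    credential=DefaultAzureCredential(),
    azure_ai_project=azure_ai_project,
)

# Returns a severity label, score, and reasoning for the violence metric.
result = violence_eval(
    query="What is the capital of France?",
    response="Paris is the capital of France.",
)
print(result)
```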
@@ -738,13 +738,13 @@ result = evaluate(
```
-## Cloud evaluation on test datasets
+## Cloud evaluation (preview) on test datasets
After local evaluations of your generative AI applications, you might want to run evaluations in the cloud for pre-deployment testing, and [continuously evaluate](https://aka.ms/GenAIMonitoringDoc) your applications for post-deployment monitoring. The Azure AI Projects SDK offers such capabilities via a Python API and supports almost all of the features available in local evaluations. Follow the steps below to submit your evaluation to the cloud on your data using built-in or custom evaluators.
### Prerequisites
-- Azure AI project in the same [regions](#region-support) as risk and safety evaluators. If you don't have an existing project, follow the guide [How to create Azure AI project](../create-projects.md?tabs=ai-studio) to create one.
+- Azure AI project in the same [regions](#region-support) as risk and safety evaluators (preview). If you don't have an existing project, follow the guide [How to create Azure AI project](../create-projects.md?tabs=ai-studio) to create one.
> [!NOTE]
> Cloud evaluations do not support `ContentSafetyEvaluator` and `QAEvaluator`.
After logging your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under the **Evaluation** tab of your Azure AI project.
-### Cloud evaluation with Azure AI Projects SDK
+### Cloud evaluation (preview) with Azure AI Projects SDK
You can submit a cloud evaluation with the Azure AI Projects SDK via a Python API. See the following example to submit a cloud evaluation of your dataset using an NLP evaluator (F1 score), an AI-assisted quality evaluator (Relevance), a safety evaluator (Violence), and a custom evaluator. Putting it all together:
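As a rough, version-dependent sketch of the general shape of such a submission (the connection string, dataset ID, and evaluator IDs below are placeholders; class and parameter names follow the `azure-ai-projects` preview SDK and may differ across releases):

```python
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import Evaluation, Dataset, EvaluatorConfiguration

# Placeholder connection string for your Azure AI project.
project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str="<region>.api.azureml.ms;<subscription-id>;<resource-group>;<project-name>",
)

evaluation = Evaluation(
    display_name="Cloud evaluation",
    description="Evaluation of a test dataset",
    # ID of a dataset previously uploaded to the project.
    data=Dataset(id="<dataset-id>"),
    evaluators={
        # Evaluator IDs are placeholders; built-in evaluators are referenced by
        # their registry IDs (for example, via the evaluator classes' `id` attribute).
        "f1_score": EvaluatorConfiguration(id="<f1-score-evaluator-id>"),
        "violence": EvaluatorConfiguration(
            id="<violence-evaluator-id>",
            init_params={"azure_ai_project": "<project-scope>"},
        ),
    },
)

# Submit the evaluation to run in the cloud, then poll its status.
evaluation_response = project_client.evaluations.create(evaluation=evaluation)
print(project_client.evaluations.get(evaluation_response.id).status)
```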
@@ -28,15 +28,15 @@ In this article, you'll learn how to holistically generate high-quality datasets
## Getting started
-First install and import the simulator package from the Azure AI Evaluation SDK:
+First install and import the simulator package (preview) from the Azure AI Evaluation SDK:
```python
pip install azure-ai-evaluation
```
## Generate synthetic data and simulate non-adversarial tasks
-Azure AI Evaluation SDK's `Simulator` provides an end-to-end synthetic data generation capability to help developers test their application's response to typical user queries in the absence of production data. AI developers can use an index or text-based query generator and fully customizable simulator to create robust test datasets around non-adversarial tasks specific to their application. The `Simulator` class is a powerful tool designed to generate synthetic conversations and simulate task-based interactions. This capability is useful for:
+Azure AI Evaluation SDK's `Simulator` (preview) provides an end-to-end synthetic data generation capability to help developers test their application's response to typical user queries in the absence of production data. AI developers can use an index or text-based query generator and fully customizable simulator to create robust test datasets around non-adversarial tasks specific to their application. The `Simulator` class is a powerful tool designed to generate synthetic conversations and simulate task-based interactions. This capability is useful for:
- **Testing Conversational Applications**: Ensure your chatbots and virtual assistants respond accurately under various scenarios.
- **Training AI Models**: Generate diverse datasets to train and fine-tune machine learning models.
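To make the non-adversarial `Simulator` flow above concrete, here is a heavily hedged sketch: the model configuration values are placeholders, the callback is a stub standing in for your application, and parameter names follow the SDK's documented pattern for this preview but may vary across versions.

```python
import asyncio
from azure.ai.evaluation.simulator import Simulator

# Placeholder Azure OpenAI configuration used by the query generator.
model_config = {
    "azure_endpoint": "<your-endpoint>",
    "azure_deployment": "<your-deployment>",
}

# Stub target: in practice, call your application here and append its reply.
async def callback(messages, stream=False, session_state=None, context=None):
    latest_query = messages["messages"][-1]["content"]
    reply = {"role": "assistant", "content": f"(application response to: {latest_query})"}
    messages["messages"].append(reply)
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }

simulator = Simulator(model_config=model_config)

# Generate queries grounded in the supplied text, then simulate short conversations.
outputs = asyncio.run(
    simulator(
        target=callback,
        text="Contoso Outdoors sells tents, backpacks, and hiking boots.",
        num_queries=2,
        max_conversation_turns=2,
    )
)
print(outputs)
```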