
Commit 3b49b4d

minor update

1 parent 4a26ea6 commit 3b49b4d

1 file changed: +3 −3 lines changed

articles/ai-studio/how-to/develop/evaluate-sdk.md

Lines changed: 3 additions & 3 deletions
@@ -129,7 +129,7 @@ Our evaluators will understand that the first turn of the conversation provides
 
 ### Performance and quality evaluators
 
-When using AI-assisted performance and quality metrics,
+You can use our built-in AI-assisted and NLP quality evaluators to assess the performance and quality of your generative AI application.
 
 #### Set up
 
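The replacement line introduces both evaluator families. A minimal sketch of the corresponding set-up, assuming the `azure-ai-evaluation` package (the endpoint, key, and deployment values are placeholders):

```python
# Minimal sketch: one AI-assisted and one NLP quality evaluator.
# Assumes the azure-ai-evaluation package; all config values are placeholders.
from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator

model_config = {
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-deployment>",
}

# AI-assisted evaluator: scores with a judge model.
relevance_eval = RelevanceEvaluator(model_config)
quality = relevance_eval(
    query="What is the capital of France?",
    response="Paris is the capital of France.",
)

# NLP evaluator: pure text comparison, no judge model needed.
f1_eval = F1ScoreEvaluator()
overlap = f1_eval(
    response="Paris is the capital of France.",
    ground_truth="The capital of France is Paris.",
)
```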

@@ -211,7 +211,7 @@ For
 The result of the AI-assisted quality evaluators for a query and response pair is a dictionary containing:
 - `{metric_name}` provides a numerical score.
 - `{metric_name}_label` provides a binary label.
-- `{metric_name}_reason` has a text reasoning for why a certain score or label was given for each data point.
+- `{metric_name}_reason` explains why a certain score or label was given for each data point.
 
 For NLP evaluators, only a score is given in the `{metric_name}` key.
 
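For concreteness, a result shaped like the three keys described above might look as follows for a hypothetical relevance check (illustrative values only; exact label strings depend on the SDK version):

```python
# Illustrative shape of an AI-assisted quality result (hypothetical values).
result = {
    "relevance": 4.0,           # {metric_name}: numerical score
    "relevance_label": "pass",  # {metric_name}_label: binary label
    "relevance_reason": "The response directly answers the query.",
}

# An NLP evaluator such as F1 returns only the score key:
nlp_result = {"f1_score": 0.89}
```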
@@ -314,7 +314,7 @@ The result of the risk and safety evaluators for a query and response pair is a
 
 - `{metric_name}` provides a severity label for that content risk ranging from Very low, Low, Medium, and High. You can read more about the descriptions of each content risk and severity scale [here](../../concepts/evaluation-metrics-built-in.md).
 - `{metric_name}_score` has a range between 0 and 7 severity level that maps to a severity label given in `{metric_name}`.
-- `{metric_name}_reason` has a text reasoning for why a certain severity score was given for each data point.
+- `{metric_name}_reason` explains why a certain severity score was given for each data point.
 
 
 For conversation outputs, per-turn results are stored in a list and the overall conversation score `'violence_score': 0.0` is averaged over the turns:
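A sketch of that conversation-level shape, assuming the per-turn lists are grouped under an `evaluation_per_turn` key (hypothetical values):

```python
# Sketch of a conversation-mode safety result (hypothetical values).
# The overall 'violence_score' is the mean of the per-turn scores.
conversation_result = {
    "violence_score": 0.0,  # averaged over the turns
    "evaluation_per_turn": {
        "violence": ["Very low", "Very low"],  # severity label per turn
        "violence_score": [0, 0],              # 0-7 severity per turn
        "violence_reason": [
            "The response contains no violent content.",
            "The response contains no violent content.",
        ],
    },
}
```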
