Commit dc46a31

ahibrahimm, w-javed, and Copilot authored
Update changelog with Evaluator changes (#43718)
* update

* Apply suggestion from @Copilot

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Waqas Javed <[email protected]>
Co-authored-by: Copilot <[email protected]>
1 parent 59a24b3 commit dc46a31

1 file changed: +9 -0 lines changed

sdk/evaluation/azure-ai-evaluation/CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -6,6 +6,15 @@

 - Updated `IndirectAttack` risk category for RedTeam to `IndirectJailbreak` to better reflect its purpose. This change allows users to apply cross-domain prompt injection (XPIA) attack strategies across all risk categories, enabling more comprehensive security testing of AI systems against indirect prompt injection attacks during red teaming.
 - Added `TaskAdherence`, `SensitiveDataLeakage`, and `ProhibitedActions` as cloud-only agent safety risk categories for red teaming.
+- Updated all evaluators' output to be of the following schema:
+  - `gpt_{evaluator_name}`, `{evaluator_name}`: float score,
+  - `{evaluator_name}_result`: pass/fail based on threshold,
+  - `{evaluator_name}_reason`, `{evaluator_name}_threshold`
+  - `{evaluator_name}_prompt_tokens`, `{evaluator_name}_completion_tokens`, `{evaluator_name}_total_tokens`, `{evaluator_name}_finish_reason`
+  - `{evaluator_name}_model`: model used for evaluation
+  - `{evaluator_name}_sample_input`, `{evaluator_name}_sample_output`: input and output used for evaluation
+
+This change standardizes the output format across all evaluators and follows OTel convention.

 ### Bugs Fixed

0 commit comments
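
The output schema listed in the CHANGELOG diff above can be sketched as a small key check. This is only an illustrative sketch, not part of the SDK: the helpers `expected_output_keys` and `missing_keys`, the evaluator name `coherence`, and every value in `sample` are hypothetical and chosen for demonstration. It assumes that each schema entry expands to a flat dictionary key by substituting the evaluator name for `{evaluator_name}`.

```python
from typing import Any, Dict, List

# Suffixes taken from the changelog entry; each becomes "{evaluator_name}_{suffix}".
SUFFIXES = [
    "result", "reason", "threshold",
    "prompt_tokens", "completion_tokens", "total_tokens", "finish_reason",
    "model", "sample_input", "sample_output",
]


def expected_output_keys(evaluator_name: str) -> List[str]:
    """Hypothetical helper: list the keys the changelog describes for one evaluator."""
    score_keys = [f"gpt_{evaluator_name}", evaluator_name]
    return score_keys + [f"{evaluator_name}_{suffix}" for suffix in SUFFIXES]


def missing_keys(output: Dict[str, Any], evaluator_name: str) -> List[str]:
    """Hypothetical helper: return the expected keys that are absent from an output dict."""
    return [key for key in expected_output_keys(evaluator_name) if key not in output]


# Made-up partial output, used only to exercise the check.
sample = {
    "coherence": 4.0,
    "gpt_coherence": 4.0,
    "coherence_result": "pass",
    "coherence_threshold": 3,
    "coherence_reason": "Made-up reason text.",
}

print(missing_keys(sample, "coherence"))
```

A check like this might be useful when migrating downstream code that parses evaluator results, since it shows which of the newly standardized fields a given result row lacks. The changelog does not state how the threshold comparison is made, so the sketch deliberately does not compute pass/fail.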