Update changelog with Evaluator changes (#43718)

ahibrahimm · w-javed · Copilot · web-flow · commit dc46a31ddee9 · 2025-10-30T19:59:18.000-07:00
* update

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Waqas Javed <7674577+w-javed@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
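
The changelog entry in this commit describes a per-evaluator output schema keyed by `{evaluator_name}`. As a rough sketch of how a consumer might check a result row against those key names: the helpers `expected_output_keys` and `missing_keys` below are hypothetical and not part of the SDK, and the evaluator name `coherence` and the sample values are illustrative assumptions only.

```python
from typing import Any, Dict, List


def expected_output_keys(evaluator_name: str) -> List[str]:
    """Hypothetical helper: key names listed in the changelog entry."""
    n = evaluator_name
    return [
        f"gpt_{n}",               # float score (legacy-style key)
        n,                        # float score
        f"{n}_result",            # pass/fail based on the threshold
        f"{n}_reason",
        f"{n}_threshold",
        f"{n}_prompt_tokens",
        f"{n}_completion_tokens",
        f"{n}_total_tokens",
        f"{n}_finish_reason",
        f"{n}_model",             # model used for evaluation
        f"{n}_sample_input",      # input used for evaluation
        f"{n}_sample_output",     # output used for evaluation
    ]


def missing_keys(row: Dict[str, Any], evaluator_name: str) -> List[str]:
    """Return the schema keys that are absent from a result row."""
    return [k for k in expected_output_keys(evaluator_name) if k not in row]


if __name__ == "__main__":
    # Made-up partial row, only to show the check; not real evaluator output.
    row = {"coherence": 4.0, "coherence_result": "pass"}
    print(missing_keys(row, "coherence"))
```

A check like this only validates key names; it assumes nothing about value types beyond what the changelog states (a float score and a pass/fail result).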
diff --git a/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md b/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md
@@ -6,6 +6,15 @@
 
 - Updated `IndirectAttack` risk category for RedTeam to `IndirectJailbreak` to better reflect its purpose. This change allows users to apply cross-domain prompt injection (XPIA) attack strategies across all risk categories, enabling more comprehensive security testing of AI systems against indirect prompt injection attacks during red teaming.
 - Added `TaskAdherence`, `SensitiveDataLeakage`, and `ProhibitedActions` as cloud-only agent safety risk categories for red teaming. 
+- Updated the output of all evaluators to follow this schema:
+  - `gpt_{evaluator_name}`, `{evaluator_name}`: float score
+  - `{evaluator_name}_result`: pass/fail based on the threshold
+  - `{evaluator_name}_reason`, `{evaluator_name}_threshold`
+  - `{evaluator_name}_prompt_tokens`, `{evaluator_name}_completion_tokens`, `{evaluator_name}_total_tokens`, `{evaluator_name}_finish_reason`
+  - `{evaluator_name}_model`: model used for evaluation
+  - `{evaluator_name}_sample_input`, `{evaluator_name}_sample_output`: input and output used for evaluation
+  
+  This change standardizes the output format across all evaluators and follows OpenTelemetry (OTel) conventions.
 
 ### Bugs Fixed