[EvaluationResult Convert]Counts only for primary metrics when multiple metrics and exclude errored counts for passed/failed #43878

YoYoJa · 2025-11-07T20:03:47Z

Description

All SDK Contribution checklist:

The pull request does not introduce breaking changes.
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR.

Testing Guidelines

Pull request includes test coverage for the included changes.

Copilot

Pull Request Overview

This PR introduces filtering logic to skip non-primary metrics when calculating AOAI evaluation summaries. The primary metric is defined as the first metric in the list for evaluators that produce multiple metrics.

Added a new _is_primary_metric function to determine if a metric is a primary metric
Modified _calculate_aoai_evaluation_summary to skip counting non-primary metrics
Reordered the rouge_score metrics list to make rouge_f1_score the primary metric instead of rouge_precision
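
The filtering rule described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the mapping contents, the `is_primary_metric` signature, the `count_results` helper, and the shape of the result rows are all assumptions made for the example. Only the rule itself (the first metric in an evaluator's list is the primary one, and `rouge_f1_score` now comes first for `rouge_score`) is taken from the overview.

```python
from typing import Dict, List

# Illustrative mapping only (assumption). The real `_EvaluatorMetricMapping`
# lives in _constants.py and its full contents are not shown on this page.
EVALUATOR_METRICS: Dict[str, List[str]] = {
    "rouge_score": ["rouge_f1_score", "rouge_precision", "rouge_recall"],
    "coherence": ["coherence"],
}


def is_primary_metric(evaluator_name: str, metric_name: str) -> bool:
    """Hypothetical helper: True when metric_name is the first metric listed."""
    metrics = EVALUATOR_METRICS.get(evaluator_name)
    if not metrics or len(metrics) == 1:
        # Unknown or single-metric evaluators: every metric counts (assumption).
        return True
    return metric_name == metrics[0]


def count_results(results: List[dict]) -> Dict[str, int]:
    """Hypothetical summary: tally passed/failed for primary metrics only."""
    counts = {"passed": 0, "failed": 0}
    for row in results:
        if not is_primary_metric(row["evaluator"], row["metric"]):
            continue  # skip non-primary metrics so an evaluator is counted once
        counts["passed" if row["passed"] else "failed"] += 1
    return counts


sample = [
    {"evaluator": "rouge_score", "metric": "rouge_f1_score", "passed": True},
    {"evaluator": "rouge_score", "metric": "rouge_precision", "passed": False},
    {"evaluator": "rouge_score", "metric": "rouge_recall", "passed": False},
    {"evaluator": "coherence", "metric": "coherence", "passed": False},
]
print(count_results(sample))  # -> {'passed': 1, 'failed': 1}
```

Without the filter, the three `rouge_score` rows would each contribute to the totals, so a single evaluator run would be counted three times.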

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py	Added `_is_primary_metric` function and integrated primary metric filtering into `_calculate_aoai_evaluation_summary`
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_constants.py	Updated documentation for `_EvaluatorMetricMapping` and reordered `rouge_score` metrics

YoYoJa added 19 commits October 27, 2025 14:30

update

a860086

rename

3241966

run black

34acf12

merge main

587fdf5

merge main

9008ee3

fix result counts

e9e4832

update

a9b65fe

merge main

9ef58fc

merge main

de9b2ee

Fix bug

1fb7383

run black

ca96c3e

fix bug

a6f398d

merge main

f3a9d45

Add UT

552a446

fix bug: handle null value for summary counts

a0e875e

merge main and resolve conflicts

c678080

address comments

5e42a24

merge main

b778e1a

Update counts to ignore non-primary metrics when multiple metrics

47a0d8d

YoYoJa requested a review from a team as a code owner November 7, 2025 20:03

Copilot AI review requested due to automatic review settings November 7, 2025 20:03

github-actions bot added the Evaluation label (Issues related to the client library for Azure AI Evaluation) Nov 7, 2025

Copilot AI reviewed Nov 7, 2025

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py (resolved)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py (outdated, resolved)

YoYoJa added 5 commits November 7, 2025 12:58

update primary sequence

4366b44

update to get eval name then metrics

96393e4

update doc string and address comments

ec40c6a

update result count to exclude errored counts for passed/failed counts

89a2cc1

add the renamed evaluator into mappings

7ccbbae

YoYoJa changed the title ~~[EvaluationResult Contert]Counts only for primary metrics when multiple metrics~~ [EvaluationResult Convert]Counts only for primary metrics when multiple metrics Nov 8, 2025

run black

c1bd8f7

YoYoJa changed the title ~~[EvaluationResult Convert]Counts only for primary metrics when multiple metrics~~ [EvaluationResult Convert]Counts only for primary metrics when multiple metrics and exclude errored counts for passed/failed Nov 8, 2025

YoYoJa added 2 commits November 7, 2025 21:32

merge main

45c24a1

handle empty string for token counts

07e5751

posaninagendra approved these changes Nov 8, 2025

YoYoJa merged commit 8ac8b3e into main Nov 8, 2025
20 checks passed

YoYoJa deleted the jessli/UpdateMappings branch November 8, 2025 22:45

3 participants
