Bugfix/aoai row mismatch #42466
base: main
Conversation
Add pyrit and not remove the other one
…row misalignment; add missing-rows unit test
Pull Request Overview
This pull request fixes a critical bug in the Azure AI evaluation system where evaluation results were not properly aligned with input data, leading to incorrect metrics being reported. The issue occurred because AOAI (Azure OpenAI) evaluation runs could return results in a different order than the original dataset or drop some rows entirely.
Key changes:
- Added row alignment validation by tracking expected row count
- Enhanced result processing to handle missing rows with proper padding (a sketch of the alignment logic follows this list)
- Updated test cases to validate ordering preservation and missing row handling
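The alignment logic can be illustrated with a minimal sketch. This is not the PR's exact code: it assumes pandas and that each AOAI result carries a `datasource_item_id` matching its position (`0..expected-1`) in the input dataset, which is also what the snippet quoted further down in this thread assumes. The `align_results` helper name is hypothetical, for illustration only.

```python
import pandas as pd


def align_results(results: list[dict], expected: int) -> pd.DataFrame:
    """Align AOAI results with the original dataset rows, padding any gaps (illustrative sketch)."""
    output_df = pd.DataFrame(results)
    # Restore the original row order, regardless of the order AOAI returned.
    output_df = output_df.set_index("datasource_item_id").sort_index()
    # Pad rows that AOAI dropped so the results line up 1:1 with the input;
    # the padded rows contain only NaN values.
    output_df = output_df.reindex(range(expected))
    return output_df.reset_index(drop=True)
```

With this shape, row i of the returned frame always corresponds to row i of the input dataset, even when the AOAI run reorders or drops rows.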
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
test_aoai_evaluation_pagination.py | Updated test cases to include the expected_rows parameter and fixed test data to properly validate row ordering |
test_aoai_alignment_missing_rows.py | New test file to validate proper handling of unordered AOAI results and missing row detection (see the sketch after this table) |
_evaluate_aoai.py | Core fix implementing row alignment validation, missing-row padding, and proper result ordering |
CHANGELOG.md | Documentation of the bug fix |
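For context, the missing-rows scenario can be exercised with a test along these lines. This is a hypothetical sketch against the `align_results` helper above, not the actual contents of `test_aoai_alignment_missing_rows.py`:

```python
import pandas as pd


def test_unordered_results_with_missing_row_are_aligned():
    # Results arrive out of order and row 1 is missing entirely.
    results = [
        {"datasource_item_id": 2, "score": 0.9},
        {"datasource_item_id": 0, "score": 0.1},
    ]
    df = align_results(results, expected=3)

    assert len(df) == 3                 # padded back to the input length
    assert df.loc[0, "score"] == 0.1    # original order restored
    assert df.loc[2, "score"] == 0.9
    assert pd.isna(df.loc[1, "score"])  # dropped row is padded with NaN
```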
…te/_evaluate_aoai.py Co-authored-by: Copilot <[email protected]>
…te/_evaluate_aoai.py Co-authored-by: Copilot <[email protected]>
if expected is not None:
    pre_len = len(output_df)
    # Assumes original datasource_item_id space is 0..expected-1
    output_df = output_df.reindex(range(expected))
How do we know which rows we did not get a result for?
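One way to surface this (a sketch of one possible approach, not necessarily what the PR does, assuming a module-level `logger`): after the `reindex`, the padded rows contain only NaN values, so their index positions identify which `datasource_item_id`s never came back.

```python
# Rows introduced by reindex carry no result values at all, so an all-NaN
# row marks a datasource_item_id for which the AOAI run returned nothing.
missing_ids = output_df.index[output_df.isna().all(axis=1)].tolist()
if missing_ids:
    logger.warning(
        "AOAI run returned no results for %d of %d rows: %s",
        len(missing_ids), expected, missing_ids,
    )
```

A legitimate result whose every field happens to be NaN would also be flagged this way, so tracking the returned ids directly would be the more robust option.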
Description
Please add an informative description that covers the changes made by the pull request and link all relevant issues.
If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines