
Reasoning support for evaluators #42482


Open · wants to merge 79 commits into main

Conversation

nagkumar91
Member

@nagkumar91 nagkumar91 commented Aug 12, 2025

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce breaking changes.
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@Copilot Copilot AI review requested due to automatic review settings August 12, 2025 15:40
@nagkumar91 nagkumar91 requested a review from a team as a code owner August 12, 2025 15:40
@github-actions github-actions bot added the Evaluation label (Issues related to the client library for Azure AI Evaluation) Aug 12, 2025
Copilot

This comment was marked as outdated.


github-actions bot commented Aug 12, 2025

API Change Check

APIView identified API-level changes in this PR and created the following API reviews:

azure-ai-evaluation

Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds support for reasoning models to evaluators by introducing an is_reasoning_model keyword parameter. When set, this parameter updates the evaluator configuration appropriately for reasoning models, enabling better integration with Azure OpenAI's reasoning capabilities (see the usage sketch after the key changes below).

Key Changes:

  • Added is_reasoning_model parameter to all evaluators' constructors
  • Updated QAEvaluator to propagate this parameter to child evaluators
  • Added defensive parameter checking in GroundednessEvaluator for backward compatibility
  • Updated documentation across evaluators to describe the new parameter
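
By way of illustration, a minimal usage sketch, assuming the flag is exposed on the public evaluator constructors as described; the model_config values and the deployment name are placeholders, not from this PR:

# Minimal usage sketch for the new flag; all config values are placeholders.
from azure.ai.evaluation import GroundednessEvaluator

model_config = {
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "azure_deployment": "<reasoning-model-deployment>",
    "api_key": "<your-api-key>",
}

# Per this PR, is_reasoning_model=True adjusts the evaluator's prompty
# configuration for reasoning models.
evaluator = GroundednessEvaluator(model_config, is_reasoning_model=True)

result = evaluator(
    query="What is the capital of France?",
    response="Paris is the capital of France.",
    context="France's capital city is Paris.",
)
print(result)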

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Summary per file:

_similarity/_similarity.py: Added is_reasoning_model parameter and updated docstrings
_retrieval/_retrieval.py: Added is_reasoning_model parameter support
_response_completeness/_response_completeness.py: Added is_reasoning_model parameter and improved formatting
_relevance/_relevance.py: Added is_reasoning_model parameter support
_qa/_qa.py: Updated to propagate is_reasoning_model to child evaluators
_groundedness/_groundedness.py: Added parameter support with backward-compatibility checks
_fluency/_fluency.py: Added is_reasoning_model parameter and updated docstrings
_base_prompty_eval.py: Updated to pass is_reasoning_model to AsyncPrompty.load
_base_multi_eval.py: Minor import formatting improvement
_coherence/_coherence.py: Added is_reasoning_model parameter and updated docstrings
CHANGELOG.md: Documented the new feature and bug fix
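
As a rough sketch of the _base_prompty_eval.py change noted above, the flag plausibly gates which model parameters are passed to AsyncPrompty.load. This is a hypothetical illustration, not the PR's actual code; the parameter names (max_completion_tokens, temperature, max_tokens) are assumptions based on how reasoning models differ from standard chat models:

# Hypothetical sketch; not the PR's actual implementation.
from promptflow.core import AsyncPrompty

def _load_prompty(prompty_path: str, model_config: dict, is_reasoning_model: bool = False):
    # Assumption: reasoning models reject sampling knobs such as temperature
    # and take max_completion_tokens instead of max_tokens.
    parameters = (
        {"max_completion_tokens": 800}
        if is_reasoning_model
        else {"temperature": 0.0, "max_tokens": 800}
    )
    return AsyncPrompty.load(
        source=prompty_path,
        model={"configuration": model_config, "parameters": parameters},
    )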


@@ -282,4 +304,4 @@ def _get_context_from_agent_response(self, response, tool_definitions):
             logger.debug(f"Error extracting context from agent response : {str(ex)}")
             context = ""

-        return context if context else None
+        return context

Copilot AI Aug 12, 2025


The function _get_context_from_agent_response should return None when context is empty, not an empty string. The original code returned context if context else None, which properly handles the case where no context is found. Returning an empty string may cause issues in downstream processing that expects None for missing context.

Suggested change:
-        return context
+        return context if context else None
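
To make the suggestion concrete, a small hypothetical downstream caller (not from this PR) shows why code that expects None for missing context behaves differently when it receives an empty string:

# Hypothetical downstream caller; illustrates the reviewer's point.
def build_eval_payload(query: str, response: str, context) -> dict:
    payload = {"query": query, "response": response}
    # Downstream code that expects None for "no context" typically checks:
    if context is not None:
        # With the PR's `return context`, a missing context arrives as ""
        # and an empty context field is sent; with the suggested
        # `return context if context else None`, this branch is skipped.
        payload["context"] = context
    return payload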


"""

_PROMPTY_FILE = "coherence.prompty"
_RESULT_KEY = "coherence"

id = "azureai://built-in/evaluators/coherence"
"""Evaluator identifier, experimental and to be used only with evaluation in cloud."""
"""Evaluator identifier, experimental to be used only with cloud evaluation"""

Copilot AI Aug 12, 2025


The docstring is missing a comma. It should read 'Evaluator identifier, experimental, to be used only with cloud evaluation' or 'Evaluator identifier (experimental) to be used only with cloud evaluation'.

Suggested change:
-    """Evaluator identifier, experimental to be used only with cloud evaluation"""
+    """Evaluator identifier, experimental, to be used only with cloud evaluation"""


Labels
Evaluation: Issues related to the client library for Azure AI Evaluation

3 participants