Reasoning support for evaluators #42482
base: main
Conversation
Commits:
- Add pyrit and not remove the other one
- …ators; propagate via QA; ensure groundedness reload passes flag
- …ning_model propagation; exclude unintended sample/log/docs files
API Change Check: APIView identified API-level changes in this PR and created the following API reviews.
Three review threads on sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_coherence/_coherence.py are outdated and resolved.
Commit: …tors/_coherence/_coherence.py
Co-authored-by: Ankit Singhal <[email protected]>
Pull Request Overview

This PR adds support for reasoning models to evaluators by introducing an `is_reasoning_model` keyword parameter. When set, this parameter updates the evaluator configuration appropriately for reasoning models, enabling better integration with Azure OpenAI's reasoning capabilities.
Key Changes:
- Added `is_reasoning_model` parameter to all evaluators' constructors
- Updated `QAEvaluator` to propagate this parameter to child evaluators
- Added defensive parameter checking in `GroundednessEvaluator` for backward compatibility
- Updated documentation across evaluators to describe the new parameter
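For illustration, a minimal usage sketch of the new parameter (the `is_reasoning_model` flag is the one this PR introduces; the endpoint and deployment values are placeholders):

```python
from azure.ai.evaluation import CoherenceEvaluator, QAEvaluator

# Placeholder Azure OpenAI configuration; substitute real deployment values.
model_config = {
    "azure_endpoint": "https://<resource>.openai.azure.com",
    "azure_deployment": "<deployment-name>",
    "api_key": "<api-key>",
}

# Per this PR, evaluators accept is_reasoning_model so their prompty
# configuration is adjusted for reasoning models.
coherence = CoherenceEvaluator(model_config=model_config, is_reasoning_model=True)

# QAEvaluator propagates the flag to each of its child evaluators.
qa = QAEvaluator(model_config=model_config, is_reasoning_model=True)
```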
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `_similarity/_similarity.py` | Added `is_reasoning_model` parameter and updated docstrings |
| `_retrieval/_retrieval.py` | Added `is_reasoning_model` parameter support |
| `_response_completeness/_response_completeness.py` | Added `is_reasoning_model` parameter and improved formatting |
| `_relevance/_relevance.py` | Added `is_reasoning_model` parameter support |
| `_qa/_qa.py` | Updated to propagate `is_reasoning_model` to child evaluators |
| `_groundedness/_groundedness.py` | Added parameter support with backward-compatibility checks |
| `_fluency/_fluency.py` | Added `is_reasoning_model` parameter and updated docstrings |
| `_base_prompty_eval.py` | Updated to pass `is_reasoning_model` to `AsyncPrompty.load` |
| `_base_multi_eval.py` | Minor import formatting improvement |
| `_coherence/_coherence.py` | Added `is_reasoning_model` parameter and updated docstrings |
| `CHANGELOG.md` | Documented the new feature and bug fix |
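To make the propagation path in the table concrete, here is a hedged sketch; `QALikeEvaluator` is a hypothetical stand-in, not the SDK's actual internals, and only the `is_reasoning_model` flag and the fan-out behavior come from this PR:

```python
from azure.ai.evaluation import (
    CoherenceEvaluator,
    FluencyEvaluator,
    GroundednessEvaluator,
    RelevanceEvaluator,
)

class QALikeEvaluator:
    """Hypothetical, simplified stand-in for the composite QAEvaluator in _qa/_qa.py."""

    def __init__(self, model_config, *, is_reasoning_model: bool = False):
        # Per this PR, the flag fans out to every child evaluator; each child
        # then forwards it to AsyncPrompty.load via _base_prompty_eval.py.
        self._children = [
            GroundednessEvaluator(model_config, is_reasoning_model=is_reasoning_model),
            RelevanceEvaluator(model_config, is_reasoning_model=is_reasoning_model),
            CoherenceEvaluator(model_config, is_reasoning_model=is_reasoning_model),
            FluencyEvaluator(model_config, is_reasoning_model=is_reasoning_model),
        ]
```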
```diff
@@ -282,4 +304,4 @@ def _get_context_from_agent_response(self, response, tool_definitions):
             logger.debug(f"Error extracting context from agent response : {str(ex)}")
             context = ""

-        return context if context else None
+        return context
```
The function `_get_context_from_agent_response` should return `None` when context is empty, not an empty string. The original code returned `context if context else None`, which properly handles the case where no context is found. Returning an empty string may cause issues in downstream processing that expects `None` for missing context.
Suggested change:
```diff
-        return context
+        return context if context else None
```
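For illustration, a quick standalone check of the behavioral difference the reviewer describes (plain Python, no SDK dependencies):

```python
context = ""  # simulate the "no context extracted" case

# Original expression: any falsy context (the empty string here)
# collapses to None, which downstream code can test explicitly.
assert (context if context else None) is None

# With the bare `return context`, the empty string leaks through, so a
# downstream `if context is None:` guard would no longer trigger.
assert context == ""
```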
""" | ||
|
||
_PROMPTY_FILE = "coherence.prompty" | ||
_RESULT_KEY = "coherence" | ||
|
||
id = "azureai://built-in/evaluators/coherence" | ||
"""Evaluator identifier, experimental and to be used only with evaluation in cloud.""" | ||
"""Evaluator identifier, experimental to be used only with cloud evaluation""" |
The docstring is missing a comma. It should read 'Evaluator identifier, experimental, to be used only with cloud evaluation' or 'Evaluator identifier (experimental) to be used only with cloud evaluation'.
"""Evaluator identifier, experimental to be used only with cloud evaluation""" | |
"""Evaluator identifier, experimental, to be used only with cloud evaluation""" |
Description
Please add an informative description that covers the changes made by the pull request and link all relevant issues.
If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines