Commit 1100a73

abhahn and Abby Hartman authored
Added DocumentRetrievalEvaluator to Azure AI Evaluation to support evaluation of document search (Azure#39929)
* Added new evaluator code for Azure AI Evaluation
* Added TypedDict for input validation and created json schema specs for input and output schemas
* Added a temporary hack to make the example runnable; updated schema
* Implementation improvements to align with applied science recommendations
* Added docstrings and cleaned up input schema file
* Updates based on in-person feedback
* Addressed comments from the PR and SDK review
* small fix for threshold dict update
* Updates to support complex object inputs in DocumentRetrievalEvaluator
* Silence cspell errors for metric names
* Updates to cspell.json
* Some updates for style enforcement; removed json schema files
* Reformatted with black
* Added tests, addressed a few comments and handled some edge cases
* Updates to tests and a few code fixes
* Docstring updates and added samples
* PR comments
* A few small test updates

---------

Co-authored-by: Abby Hartman <[email protected]>
1 parent 9712d82 commit 1100a73

9 files changed: +846 -1 lines changed

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_common/_base_eval.py

Lines changed: 6 additions & 1 deletion
@@ -495,7 +495,8 @@ def __init__(self, real_call): # DO NOT ADD TYPEHINT PROMPT FLOW WILL SCREAM AT
     # are known to throw at this, mash them into kwargs, and then pass them into the real call.
     async def __call__(
         self, *, query=None, response=None, context=None, conversation=None, ground_truth=None,
-        tool_call=None, tool_definitions=None, messages=None, **kwargs
+        tool_call=None, tool_definitions=None, messages=None, retrieval_ground_truth=None,
+        retrieved_documents=None,**kwargs
     ):
         if conversation is not None:
             kwargs["conversation"] = conversation
@@ -515,5 +516,9 @@ async def __call__(
             kwargs["tool_definitions"] = tool_definitions
         if messages is not None:
             kwargs["messages"] = messages
+        if retrieval_ground_truth is not None:
+            kwargs["retrieval_ground_truth"] = retrieval_ground_truth
+        if retrieved_documents is not None:
+            kwargs["retrieved_documents"] = retrieved_documents
 
         return await self._real_call(**kwargs)
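
The hunks above extend a forwarding pattern: the wrapper names each known argument explicitly, copies the ones that are not None back into kwargs, and hands everything to the wrapped coroutine. The following is a minimal standalone sketch of that pattern, not the SDK's code. The class name `AsyncCallWrapper`, the helper `echo_kwargs`, and the `document_id` field are hypothetical names chosen for illustration, and only three of the named parameters are reproduced.

```python
import asyncio


class AsyncCallWrapper:
    """Hypothetical stand-in for the wrapper class touched by this diff."""

    def __init__(self, real_call):
        self._real_call = real_call

    async def __call__(self, *, query=None, retrieval_ground_truth=None,
                       retrieved_documents=None, **kwargs):
        # Fold each explicitly named argument back into kwargs,
        # skipping those the caller left as None.
        named = {
            "query": query,
            "retrieval_ground_truth": retrieval_ground_truth,
            "retrieved_documents": retrieved_documents,
        }
        kwargs.update({k: v for k, v in named.items() if v is not None})
        return await self._real_call(**kwargs)


async def echo_kwargs(**kwargs):
    # Hypothetical "real call" that reports which argument names arrived.
    return sorted(kwargs)


received = asyncio.run(
    AsyncCallWrapper(echo_kwargs)(
        retrieved_documents=[{"document_id": "doc-1"}],  # illustrative payload
        extra_flag=True,
    )
)
print(received)  # ['extra_flag', 'retrieved_documents']
```

Under this sketch, an argument left as None never reaches the wrapped call, which appears to be why the commit adds one `is not None` guard per new parameter rather than passing the values unconditionally.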
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# ---------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# ---------------------------------------------------------
+
+from ._document_retrieval import DocumentRetrievalEvaluator, RetrievalGroundTruthDocument, RetrievedDocument
+
+__all__ = [
+    "DocumentRetrievalEvaluator",
+    "RetrievalGroundTruthDocument",
+    "RetrievedDocument"
+]