Commit 0ac6497

Merge pull request #3923 from lgayhardt/patch-39
Update agent-evaluate-sdk.md
2 parents dbc5cce + ed062f1 commit 0ac6497

1 file changed: +6 -8 lines changed

articles/ai-foundry/how-to/develop/agent-evaluate-sdk.md

Lines changed: 6 additions & 8 deletions
@@ -73,33 +73,31 @@ import os
 from azure.ai.evaluation import AzureOpenAIModelConfiguration
 from azure.identity import DefaultAzureCredential
 from azure.ai.evaluation import IntentResolutionEvaluator, ResponseCompletenessEvaluator
-
-
+
 model_config = AzureOpenAIModelConfiguration(
     azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
     api_key=os.environ["AZURE_OPENAI_API_KEY"],
     api_version=os.environ["AZURE_OPENAI_API_VERSION"],
     azure_deployment=os.environ["MODEL_DEPLOYMENT_NAME"],
 )
-
+
 intent_resolution_evaluator = IntentResolutionEvaluator(model_config)
-completeness_evaluator = CompletenessEvaluator(model_config=model_config)
-
+response_completeness_evaluator = ResponseCompletenessEvaluator(model_config=model_config)
+
 # Evaluating query and response as strings
 # A positive example. Intent is identified and understood and the response correctly resolves user intent
 result = intent_resolution_evaluator(
     query="What are the opening hours of the Eiffel Tower?",
     response="Opening hours of the Eiffel Tower are 9:00 AM to 11:00 PM.",
 )
 print(result)
-
+
 # A negative example. Only half of the statements in the response were complete according to the ground truth
-result = completeness_evaluator(
+result = response_completeness_evaluator(
     response="Itinery: Day 1 take a train to visit Disneyland outside of the city; Day 2 rests in hotel.",
     ground_truth="Itinery: Day 1 take a train to visit the downtown area for city sightseeing; Day 2 rests in hotel."
 )
 print(result)
-
 ```
 
 Examples of `tool_calls` and `tool_definitions` for `ToolCallAccuracyEvaluator`:
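
Note on the last context line: the hunk ends before the `tool_calls` and `tool_definitions` examples it introduces, so they are not visible in this commit. As a rough sketch only, the two inputs might be plain lists of dictionaries along the following lines. Every field name and value here is an assumption for illustration and is not taken from this diff, and `undefined_tool_names` is a hypothetical helper, not part of `azure.ai.evaluation`.

```python
# Assumed shapes only: the field names below are illustrative and are not
# confirmed by this commit. Check the SDK reference before relying on them.
tool_definitions = [
    {
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location to fetch weather for.",
                }
            },
        },
    }
]

tool_calls = [
    {
        "type": "tool_call",
        "tool_call_id": "call_1",  # placeholder identifier
        "name": "fetch_weather",
        "arguments": {"location": "Seattle"},
    }
]


def undefined_tool_names(calls, definitions):
    """Hypothetical helper: names used in calls that have no matching definition."""
    defined = {definition["name"] for definition in definitions}
    called = {call["name"] for call in calls}
    return sorted(called - defined)


# Sanity check before handing the data to an evaluator.
print(undefined_tool_names(tool_calls, tool_definitions))
```

A check like this catches a tool call whose name has no definition before any evaluator runs, which is cheaper to debug than a failed evaluation.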
