Tool Call Accuracy OpenAPI Tools #42494
Pull Request Overview
This PR enhances the Tool Call Accuracy evaluator to support OpenAPI tools in addition to regular function tools. OpenAPI tools contain multiple function definitions within a single tool definition, requiring special handling to properly validate tool calls and expand the tool definitions for evaluation.
- Adds support for OpenAPI tool definitions with embedded function collections
- Expands OpenAPI tool definitions to individual functions for validation
- Updates converters to handle OpenAPI tool extraction and model definitions
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| test_tool_call_accuracy_evaluator.py | Adds comprehensive test case for OpenAPI tool evaluation with currency lookup example |
| _tool_call_accuracy.py | Implements OpenAPI tool expansion logic using itertools.chain to flatten function collections |
| _models.py | Defines OpenAPIToolDefinition class and updates type annotations for tool definitions |
| _ai_services.py | Adds OpenAPI tool extraction logic from thread runs with proper function mapping |
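
The expansion step described in the overview can be sketched roughly as follows. This is a minimal illustration, not the code from `_tool_call_accuracy.py`: the helper name `expand_tool_definitions`, the dict-based tool shape, and the sample tool names are all assumptions made for the example. Only the use of `itertools.chain` to flatten embedded function collections comes from the PR summary.

```python
from itertools import chain
from typing import Any, Dict, List


def expand_tool_definitions(tool_definitions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten OpenAPI tools into their embedded functions (hypothetical helper).

    Tools whose type is "openapi" are replaced by the entries of their
    "functions" list; every other tool is kept unchanged.
    """
    return list(
        chain.from_iterable(
            tool.get("functions", []) if tool.get("type") == "openapi" else [tool]
            for tool in tool_definitions
        )
    )


# Illustrative input: one regular function tool and one OpenAPI tool.
tools = [
    {"type": "function", "name": "get_weather", "parameters": {}},
    {
        "type": "openapi",
        "name": "currency_api",
        "functions": [
            {"type": "function", "name": "get_rate", "parameters": {}},
            {"type": "function", "name": "list_currencies", "parameters": {}},
        ],
    },
]

expanded = expand_tool_definitions(tools)
print([tool["name"] for tool in expanded])
```

After expansion, a tool call can be validated against a flat list of function definitions regardless of whether the function was declared directly or inside an OpenAPI tool.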
assert result[f"{key}_result"] == "pass"

def test_evaluate_open_api_with_tool_defintion(self, mock_model_config):
There is a spelling error in the function name. 'defintion' should be 'definition'.
Suggested change:
- def test_evaluate_open_api_with_tool_defintion(self, mock_model_config):
+ def test_evaluate_open_api_with_tool_definition(self, mock_model_config):
Copilot uses AI. Check for mistakes.
"name": tool_call.details.function.name,
"arguments": safe_loads(tool_call.details.function.arguments),
"name": tool_call.details.get(_FUNCTION).get("name") if tool_call.details.get(_FUNCTION) else None,
"arguments": safe_loads(tool_call.details.get(_FUNCTION).get("arguments") if tool_call.details.get(_FUNCTION) else None) ,
The line has duplicated logic for checking `tool_call.details.get(_FUNCTION)` and contains a trailing space before the comma. Consider extracting the function details to a variable for better readability.
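
One way to apply this suggestion is to look up the function details once and reuse the result. The sketch below is illustrative only: `function_call_fields` is a hypothetical helper, `_FUNCTION` is assumed to be the string `"function"`, and `safe_loads` is a stand-in written for this example, since the real implementation is not shown in the diff.

```python
import json
from typing import Any, Dict

_FUNCTION = "function"  # assumed value of the constant used in the diff


def safe_loads(raw: Any) -> Any:
    """Stand-in for the project's safe_loads: parse JSON, fall back to the raw value."""
    if raw is None:
        return None
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        return raw


def function_call_fields(details: Dict[str, Any]) -> Dict[str, Any]:
    """Extract name and arguments with a single lookup of the function details."""
    function = details.get(_FUNCTION) or {}
    return {
        "name": function.get("name"),
        "arguments": safe_loads(function.get("arguments")),
    }
```

With the lookup held in `function`, the missing-function case is handled in one place, and both fields fall back to `None` under the stand-in `safe_loads`.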
@@ -121,6 +121,24 @@ def _extract_function_tool_definitions(thread_run: object) -> List[ToolDefinitio
parameters=parameters,
)
)
elif tool.type == _OPENAPI:
openapi_tool = tool.openapi
tool_defintion = OpenAPIToolDefinition(
There is a spelling error in the variable name. 'tool_defintion' should be 'tool_definition'.
Suggested change:
- tool_defintion = OpenAPIToolDefinition(
+ tool_definition = OpenAPIToolDefinition(
parameters=func.get("parameters"),
type="function",
)
for func in openapi_tool.get("functions")]
The code is calling `.get("functions")` on the `openapi_tool` object, but `openapi_tool` appears to be an object with attributes, not a dictionary. This should likely be `openapi_tool.functions` instead.
Suggested change:
- for func in openapi_tool.get("functions")]
+ for func in openapi_tool.functions]
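
Whether `openapi_tool` is a mapping or an attribute-style object is not settled by the diff alone. If both shapes can occur, a small accessor that tolerates either is one option. The helper `get_field` below is hypothetical and not part of the PR or the SDK; it is only a sketch of the idea.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any


def get_field(obj: Any, name: str, default: Any = None) -> Any:
    """Read `name` from a mapping key or an object attribute (hypothetical helper)."""
    if isinstance(obj, Mapping):
        return obj.get(name, default)
    return getattr(obj, name, default)


# Both shapes yield the same list of functions.
as_dict = {"functions": [{"name": "get_rate"}]}
as_object = SimpleNamespace(functions=[{"name": "get_rate"}])
print(get_field(as_dict, "functions") == get_field(as_object, "functions"))
```

If the SDK type is known to be one shape only, using that access style directly, as suggested above, is simpler.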
)
for func in openapi_tool.get("functions")]
)
final_tools.append(tool_defintion)
Consistent with the spelling error above, 'tool_defintion' should be 'tool_definition'.
Suggested change:
- final_tools.append(tool_defintion)
+ final_tools.append(tool_definition)
spec: object
auth: object
default_params: Optional[list[str]] = None
functions: list[ToolDefinition]
Use `List[ToolDefinition]` instead of `list[ToolDefinition]` for consistency with other type annotations in the codebase and better compatibility with older Python versions.
Suggested change:
- functions: list[ToolDefinition]
+ functions: List[ToolDefinition]
description: Optional[str] = None
spec: object
auth: object
default_params: Optional[list[str]] = None
Use `Optional[List[str]]` instead of `Optional[list[str]]` for consistency with other type annotations in the codebase.
Suggested change:
- default_params: Optional[list[str]] = None
+ default_params: Optional[List[str]] = None
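
For context on the two typing suggestions: subscripting the built-in `list` in an annotation that is evaluated at runtime requires Python 3.9 or later (PEP 585), while `typing.List` also works on earlier versions. The sketch below shows the annotated fields with `typing.List`. It assumes a plain dataclass purely for illustration, so the field order differs from the diff (required fields first); the real base class in `_models.py` is not shown, and both class names here are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToolDefinitionSketch:
    """Hypothetical stand-in for ToolDefinition."""
    name: str
    type: str = "function"


@dataclass
class OpenAPIToolDefinitionSketch:
    """Hypothetical stand-in using typing.List instead of built-in list[...]."""
    spec: object
    auth: object
    functions: List[ToolDefinitionSketch]
    description: Optional[str] = None
    default_params: Optional[List[str]] = None


tool = OpenAPIToolDefinitionSketch(
    spec={},
    auth={},
    functions=[ToolDefinitionSketch(name="get_rate")],
)
print(tool.functions[0].name)
```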
Description
Please add an informative description that covers the changes made by the pull request and link all relevant issues.
If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines