Add deep-research routing and GPT-5.2 model support to OpenAI client #550
Conversation
Pull request overview
This pull request adds support for OpenAI's GPT-5.2 model and implements deep-research routing capabilities, enabling the client to automatically switch to specialized deep-research models (o3-deep-research and o4-mini-deep-research) when the deep_research flag is enabled. The changes also introduce a background execution flag for long-running research tasks and enhance the code interpreter tool with configurable memory limits.
Changes:
- Added GPT-5.2 model enum and marked it as structured-output compatible
- Implemented `_resolve_deep_research_model` helper to route requests to the appropriate deep-research model based on the source model
- Updated both `ask` and `ask_stream` methods to honor the `deep_research` flag with automatic model switching and specialized tool configuration
- Added background flag propagation to Responses API payloads for async research execution
- Configured code interpreter tools with 4GB memory limits when used in deep-research mode
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| parrot/models/openai.py | Added GPT5_2 enum value for the "gpt-5.2" model |
| parrot/clients/gpt.py | Added GPT-5.2 to structured output compatible models, implemented deep-research model resolution logic, updated tool configurations with memory limits, and added background flag support to Responses API |
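Based on the change summary above, the additions to `parrot/models/openai.py` presumably look like the following. This is a minimal sketch: the other enum members and the full contents of `STRUCTURED_OUTPUT_COMPATIBLE_MODELS` are assumptions, with only the values referenced in this PR shown.

```python
from enum import Enum


class OpenAIModel(Enum):
    # Existing members elided; only the values referenced in this PR are shown.
    O4_MINI = "o4-mini"
    O3_DEEP_RESEARCH = "o3-deep-research"
    O4_MINI_DEEP_RESEARCH = "o4-mini-deep-research"
    GPT5_2 = "gpt-5.2"  # new in this PR


# Models allowed to use JSON/structured outputs.
# Membership beyond GPT5_2 is assumed here.
STRUCTURED_OUTPUT_COMPATIBLE_MODELS = {
    OpenAIModel.GPT5_2.value,
}
```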
```python
@staticmethod
def _resolve_deep_research_model(model_str: str) -> str:
    """Resolve the deep research model based on the requested model."""
    normalized = (model_str or "").strip()
    if normalized in {
        OpenAIModel.O4_MINI.value,
        OpenAIModel.O4_MINI_DEEP_RESEARCH.value,
    }:
        return OpenAIModel.O4_MINI_DEEP_RESEARCH.value
    return OpenAIModel.O3_DEEP_RESEARCH.value
```
The new GPT-5.2 model support and updated deep-research routing logic lack test coverage. The codebase has comprehensive test coverage for OpenAI client functionality (see tests/test_openai_client.py), including existing deep-research tests. New tests should verify:
1. GPT-5.2 is properly added to STRUCTURED_OUTPUT_COMPATIBLE_MODELS and structured outputs work with it.
2. The _resolve_deep_research_model method correctly routes o4-mini to o4-mini-deep-research and other models to o3-deep-research.
3. The background flag is properly propagated through the Responses API.
4. The memory_limit configuration is correctly added to the code_interpreter tool when deep_research is enabled.
@copilot open a new pull request to apply changes based on this feedback
```diff
 if enable_code_interpreter:
     research_tools.append({
         "type": "code_interpreter",
-        "container": {"type": "auto"}
+        "container": {"type": "auto", "memory_limit": "4g"}
```
The code_interpreter tool configuration includes a "container" field with "memory_limit": "4g". This syntax should be verified against OpenAI's actual API documentation for the Responses API and deep research models. While the format appears consistent with Docker memory limits used elsewhere in the codebase (see parrot/tools/codeinterpreter/executor.py), OpenAI's API may use different syntax or may not support the container.memory_limit parameter at all. If this parameter is not supported, the API call may fail with an "unknown_parameter" error.
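One way to hedge against this risk is to make the memory limit opt-in, so it can be dropped without touching call sites if the API rejects `container.memory_limit`. A minimal sketch (the helper name `build_code_interpreter_tool` is hypothetical, not part of this PR):

```python
def build_code_interpreter_tool(memory_limit="4g"):
    """Build a code_interpreter tool entry for the Responses API payload.

    memory_limit is optional: pass None to omit it entirely, since
    container.memory_limit is unverified against OpenAI's API and may
    trigger an unknown_parameter error.
    """
    container = {"type": "auto"}
    if memory_limit is not None:
        container["memory_limit"] = memory_limit
    return {"type": "code_interpreter", "container": container}
```

If OpenAI's API turns out not to accept the field, callers can switch to `build_code_interpreter_tool(None)` in one place instead of editing both `ask` and `ask_stream`.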
@copilot open a new pull request to apply changes based on this feedback
```diff
 if enable_code_interpreter:
     research_tools.append({
         "type": "code_interpreter",
-        "container": {"type": "auto"}
+        "container": {"type": "auto", "memory_limit": "4g"}
```
The code_interpreter tool configuration includes a "container" field with "memory_limit": "4g". This syntax should be verified against OpenAI's actual API documentation for the Responses API and deep research models. While the format appears consistent with Docker memory limits used elsewhere in the codebase (see parrot/tools/codeinterpreter/executor.py), OpenAI's API may use different syntax or may not support the container.memory_limit parameter at all. If this parameter is not supported, the API call may fail with an "unknown_parameter" error.
@copilot open a new pull request to apply changes based on this feedback
@phenobarbital I've opened a new pull request, #555, to work on those changes. Once the pull request is ready, I'll request review from you.
@phenobarbital I've opened a new pull request, #556, to work on those changes. Once the pull request is ready, I'll request review from you.
@phenobarbital I've opened a new pull request, #557, to work on those changes. Once the pull request is ready, I'll request review from you.
[WIP] Update deep-research routing and GPT-5.2 model support
[WIP] Update routing for deep-research and GPT-5.2 model support
[WIP] Update deep-research routing and GPT-5.2 model support
Motivation
Support the `gpt-5.2` model so JSON/structured outputs work as expected.

Description
- Add `GPT5_2 = "gpt-5.2"` to the `OpenAIModel` enum and include `OpenAIModel.GPT5_2.value` in `STRUCTURED_OUTPUT_COMPATIBLE_MODELS` so structured output is allowed with that model.
- Add `OpenAIClient._resolve_deep_research_model` to pick between `o3-deep-research` and `o4-mini-deep-research` and use it when `deep_research` is requested.
- Update `ask` and `ask_stream` routing to honor the `deep_research` flag by switching models when appropriate, merging research tools into the request, and always including a `code_interpreter` tool container with `"memory_limit": "4g"` when `enable_code_interpreter` is set.
- Ensure the `background` flag is propagated into Responses API payloads by adding it to `_prepare_responses_args` and setting `args['background'] = True` when `deep_research` and `background` are both true.

Testing
Codex Task