GPT5 OpenAI Fix #6864
Conversation
- Added `max_output_tokens` parameter to GPT-5 request body using `model.maxTokens`
- This prevents GPT-5 from defaulting to very large token limits (e.g., 120k)
- Updated tests to expect `max_output_tokens` in GPT-5 request bodies
- Fixed test for handling unhandled stream events by properly mocking SDK fallback
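To make the change concrete, here is a minimal sketch of the idea, assuming a hypothetical `buildGpt5Body` helper and `ModelConfig` shape (neither is the PR's actual code):

```typescript
// Hypothetical sketch: cap GPT-5 output tokens from the model configuration.
interface ModelConfig {
  id: string
  maxTokens?: number
}

function buildGpt5Body(model: ModelConfig, input: string): Record<string, unknown> {
  const body: Record<string, unknown> = { model: model.id, input, stream: true }
  // Without an explicit cap, the API may default to a very large limit (e.g., 120k).
  if (model.maxTokens && model.maxTokens > 0) {
    body.max_output_tokens = model.maxTokens
  }
  return body
}
```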
Pull Request Overview
This PR adds a new minimal reasoning effort level for GPT-5 models and updates all internationalization files to include translations for this new option. The minimal effort provides the fastest time-to-first-token response from GPT-5 models.
- Adds `minimal` reasoning effort support for GPT-5 models with internationalization
- Updates UI components to conditionally show the minimal option for GPT-5 models (sketched below)
- Modifies reasoning transformation logic to handle the new minimal effort level
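As a rough illustration of the conditional UI logic described above (the function and type names here are assumptions, not the PR's identifiers):

```typescript
// Hypothetical sketch: surface "minimal" only for GPT-5 model ids.
type Effort = "minimal" | "low" | "medium" | "high"

function effortOptions(modelId: string): Effort[] {
  const base: Effort[] = ["low", "medium", "high"]
  // GPT-5 models additionally expose the fastest, minimal-reasoning mode.
  return modelId.startsWith("gpt-5") ? ["minimal", ...base] : base
}
```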
Reviewed Changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| webview-ui/src/i18n/locales/*/settings.json | Adds "minimal (fastest)" translation entries across 16 locales |
| webview-ui/src/components/settings/ThinkingBudget.tsx | Adds GPT-5 model detection and conditional minimal reasoning effort option |
| webview-ui/src/components/settings/ApiOptions.tsx | Clears reasoning effort when switching models and shows verbosity only for GPT-5 |
| src/shared/api.ts | Adds enableGpt5ReasoningSummary option to ApiHandlerOptions |
| src/core/task/Task.ts | Implements GPT-5 conversation continuity with previous_response_id metadata |
| src/api/transform/reasoning.ts | Updates reasoning transformations to exclude minimal effort from API calls |
| src/api/transform/model-params.ts | Updates type definitions for ReasoningEffortWithMinimal |
| src/api/providers/requesty.ts | Filters out minimal effort from API requests |
| src/api/providers/openai.ts | Updates type casting for reasoning effort parameters |
| src/api/providers/openai-native.ts | Major refactor for GPT-5 Responses API support with conversation continuity |
| src/api/providers/__tests__/openai-native.spec.ts | Comprehensive test updates for GPT-5 Responses API behavior |
| src/api/index.ts | Adds previousResponseId to metadata interface |
| packages/types/src/providers/openai.ts | Adds GPT-5 model default reasoning effort and temperature |
| packages/types/src/provider-settings.ts | Defines ReasoningEffortWithMinimal type and schema |
| packages/types/src/message.ts | Adds GPT-5 metadata schema for conversation continuity |
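Judging by the provider-settings row above, the new type and schema plausibly look like the following zod sketch (assuming the package uses zod, as the word "schema" suggests; the real definition may differ):

```typescript
import { z } from "zod"

// Sketch only: the PR's actual schema and naming may differ.
export const reasoningEffortWithMinimalSchema = z.enum(["minimal", "low", "medium", "high"])
export type ReasoningEffortWithMinimal = z.infer<typeof reasoningEffortWithMinimalSchema>
```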
Thank you for your contribution! I've reviewed the changes and found that the PR successfully addresses the stated problem of adding explicit max_output_tokens for GPT-5 requests. The implementation goes beyond the initial scope by also adding minimal reasoning effort support and conversation continuity features.
Positive observations:
- Excellent test coverage with 40+ tests including comprehensive GPT-5 functionality
- Good separation of concerns with dedicated methods for GPT-5 handling
- Proper implementation of the max_output_tokens parameter as intended
- Comprehensive handling of various GPT-5 response formats
I've left some suggestions inline that could improve code maintainability and reduce duplication.
…an and Dutch locales
- Renamed metadata field from 'previous_response_id' to 'response_id' for clarity
- Fixed logic to correctly use the response_id from the previous message as previous_response_id for the next request
- This resolves the 'Previous response with id not found' errors that occurred after multiple turns in the same session
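A simplified sketch of that chaining rule, with illustrative names (`StoredMessage` and `nextRequestContinuity` are not from the PR):

```typescript
// Hypothetical sketch of turn chaining: each response's id is stored on the
// message that produced it; the next request reads it back as previous_response_id.
interface StoredMessage {
  role: "assistant" | "user"
  response_id?: string // persisted when the assistant turn completes
}

function nextRequestContinuity(history: StoredMessage[]): { previous_response_id?: string } {
  // Walk backwards to the most recent assistant turn that carries an id.
  for (let i = history.length - 1; i >= 0; i--) {
    const id = history[i].response_id
    if (history[i].role === "assistant" && id) {
      return { previous_response_id: id }
    }
  }
  return {}
}
```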
- Automatically retry without previous_response_id when it's not found (400 error)
- Clear stored lastResponseId to prevent reusing stale IDs
- Handle errors in both SDK and SSE fallback paths
- Log warnings when retrying to help with debugging
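The retry path might look roughly like this; `requestWithContinuity` is hypothetical, and the real change also covers the SSE fallback path:

```typescript
// Hypothetical sketch: drop a stale previous_response_id and retry once on 400.
async function requestWithContinuity(
  send: (body: Record<string, unknown>) => Promise<Response>,
  body: Record<string, unknown>,
): Promise<Response> {
  const res = await send(body)
  if (res.status === 400 && body.previous_response_id) {
    console.warn("previous_response_id not found; retrying without it")
    const { previous_response_id: _dropped, ...rest } = body
    return send(rest) // the real code also clears the stored lastResponseId here
  }
  return res
}
```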
- Add promise-based synchronization for response ID persistence
- Wait for pending response ID from previous request before using it
- Resolve promise when response ID is received or cleared
- Add 100ms timeout to avoid blocking too long on ID resolution
- Properly clean up resolver on errors to prevent memory leaks

This fixes the race condition where fast nano model responses could cause the next request to be initiated before the response ID was fully persisted.
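The synchronization could be sketched as follows; `ResponseIdGate` is an illustrative name, while the 100ms timeout comes from the commit message:

```typescript
// Hypothetical sketch: a resolver that lets the next request wait briefly
// for the previous response id to finish persisting.
class ResponseIdGate {
  private pending?: Promise<string | undefined>
  private resolve?: (id: string | undefined) => void

  begin(): void {
    this.pending = new Promise((res) => (this.resolve = res))
  }

  settle(id: string | undefined): void {
    this.resolve?.(id)
    this.resolve = undefined // clean up the resolver to avoid leaks
  }

  async waitFor(timeoutMs = 100): Promise<string | undefined> {
    if (!this.pending) return undefined
    // Race the pending id against a short timeout so we never block long.
    return Promise.race([
      this.pending,
      new Promise<undefined>((res) => setTimeout(() => res(undefined), timeoutMs)),
    ])
  }
}
```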
Good Job ROO!
- Extract usage normalization helper to reduce duplication (see the sketch below)
- Suppress conversation continuity for first message (but respect explicit metadata)
- Deduplicate response ID resolver logic
- Remove dead enableGpt5ReasoningSummary option references
- DRY up GPT-5 event/usage handling with normalizeGpt5Usage helper
- Centralize default GPT-5 reasoning effort using model info
- Fix Indonesian locale minimal string misplacement
- Add clarifying comments for Developer prefix usage
- Add TODO for future verbosity UI capability gating
- Fix failing test in reasoning.spec.ts
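In spirit, the usage helper might look like this; the wire field names are assumptions based on common Responses/Chat Completions usage shapes:

```typescript
// Hypothetical sketch: collapse the variations of GPT-5 usage payloads into
// one internal shape so every event path reports usage identically.
interface NormalizedUsage {
  inputTokens: number
  outputTokens: number
}

function normalizeGpt5Usage(raw: any): NormalizedUsage | undefined {
  if (!raw) return undefined
  return {
    inputTokens: raw.input_tokens ?? raw.prompt_tokens ?? 0,
    outputTokens: raw.output_tokens ?? raw.completion_tokens ?? 0,
  }
}
```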
…ndard GPT-5 SSE event types to shared processor to reduce duplication
- Add JSDoc for response ID accessors
- Standardize key error messages for GPT-5 Responses API fallback
- Extract persistGpt5Metadata() in Task to simplify metadata writes
- Add malformed JSON SSE parsing test
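A tolerant SSE line parser in the spirit of the malformed-JSON test might read like this (a sketch, not the PR's shared processor):

```typescript
// Hypothetical sketch: parse "data: {...}" SSE lines, skipping malformed JSON
// chunks instead of aborting the whole stream.
function parseSseData(line: string): unknown | undefined {
  if (!line.startsWith("data: ")) return undefined
  const payload = line.slice("data: ".length).trim()
  if (payload === "[DONE]") return undefined
  try {
    return JSON.parse(payload)
  } catch {
    console.warn("Skipping malformed SSE JSON chunk")
    return undefined
  }
}
```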
…tOpenAI incl. cache); enforce 'skip once' continuity via suppressPreviousResponseId
- dedupe responseId resolver on SSE 400
- feat: gate reasoning.summary by enableGpt5ReasoningSummary
- centralize default reasoning effort
- types/ui: add ModelInfo.supportsVerbosity and gate Verbosity UI by capability
- refactor: avoid duplicate usage emission in SSE done/completed
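The two capability gates could reduce to something like this sketch; only `enableGpt5ReasoningSummary` and `ModelInfo.supportsVerbosity` are names from the commit text:

```typescript
// Hypothetical sketch: only attach reasoning.summary when the user opted in,
// and only render the verbosity control when the model supports it.
interface ModelInfo {
  supportsVerbosity?: boolean
}

function reasoningBlock(effort: string, enableGpt5ReasoningSummary: boolean) {
  return enableGpt5ReasoningSummary ? { effort, summary: "auto" as const } : { effort }
}

function showVerbosityControl(info: ModelInfo): boolean {
  return info.supportsVerbosity === true
}
```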
…and expected behavior
…d align enableGpt5ReasoningSummary default docs
daniel-lxs left a comment
LGTM
This reverts commit cda67a8.
- Fix manual condensing bug by setting skipPrevResponseIdOnce flag in Task.ts (sketched below)
- Revert to string-based format for full conversations after condensing
- Add proper image support with structured format when using previous_response_id
- Update test suite to match new implementation where all models use Responses API
- Handle 400 errors gracefully when previous_response_id is not found

Fixes issues introduced in PR #6864
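The "skip once" behavior could be sketched as follows; only the `skipPrevResponseIdOnce` flag name comes from the commit text:

```typescript
// Hypothetical sketch: after condensing, send the full conversation once
// instead of chaining to a response id that no longer reflects the context.
let skipPrevResponseIdOnce = false

function continuityFor(lastResponseId: string | undefined) {
  if (skipPrevResponseIdOnce) {
    skipPrevResponseIdOnce = false // one-shot: normal chaining resumes next turn
    return {}
  }
  return lastResponseId ? { previous_response_id: lastResponseId } : {}
}
```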
Summary
This PR implements comprehensive GPT-5 support including the Responses API, conversation continuity, and proper token management.
Major Changes
1. Complete GPT-5 Responses API Implementation
   - Uses `previous_response_id` for efficient multi-turn conversations
   - Reasoning summaries via `summary: "auto"`
2. Token Management
   - `max_output_tokens` parameter to prevent defaulting to excessive limits (e.g., 120k)
   - Cap taken from `model.maxTokens`
3. Task Persistence Layer
   - Persists `previous_response_id` for conversation continuity
4. UI and Settings Updates
5. Comprehensive Test Coverage
   - Verifies `max_output_tokens` inclusion
6. Internationalization
Files Changed
- `src/api/providers/openai-native.ts` (1000+ lines)
- `src/api/providers/__tests__/openai-native.spec.ts` (950+ lines)
- `src/core/task/Task.ts`
- `packages/types/src/` files
- `webview-ui/src/components/settings/`
Testing
✅ All 40 tests in openai-native.spec.ts passing
✅ Verified GPT-5 streaming with all event types
✅ Confirmed `max_output_tokens` properly included
✅ Tested conversation continuity with `previous_response_id`
✅ Validated fallback from SDK to SSE
Impact
This is a significant feature addition that enables full GPT-5 support with proper token management, conversation continuity, and comprehensive error handling.