
Commit cda67a8

GPT5 OpenAI Fix (#6864)
* fix: add explicit max_output_tokens for GPT-5 Responses API
  - Added max_output_tokens parameter to GPT-5 request body using model.maxTokens
  - This prevents GPT-5 from defaulting to very large token limits (e.g., 120k)
  - Updated tests to expect max_output_tokens in GPT-5 request bodies
  - Fixed test for handling unhandled stream events by properly mocking SDK fallback
* fix: add missing translations for reasoningEffort.minimal in Indonesian and Dutch locales
* fix: correct GPT-5 response ID persistence and usage
  - Renamed metadata field from 'previous_response_id' to 'response_id' for clarity
  - Fixed logic to correctly use the response_id from the previous message as previous_response_id for the next request
  - This resolves the 'Previous response with id not found' errors that occurred after multiple turns in the same session
* feat: add robust error handling for GPT-5 previous_response_id failures
  - Automatically retry without previous_response_id when it's not found (400 error)
  - Clear stored lastResponseId to prevent reusing stale IDs
  - Handle errors in both SDK and SSE fallback paths
  - Log warnings when retrying to help with debugging
* fix: handle GPT-5 response ID race condition with nano model
  - Add promise-based synchronization for response ID persistence
  - Wait for pending response ID from previous request before using it
  - Resolve promise when response ID is received or cleared
  - Add 100ms timeout to avoid blocking too long on ID resolution
  - Properly clean up resolver on errors to prevent memory leaks
  This fixes the race condition where fast nano model responses could cause the next request to be initiated before the response ID was fully persisted.
* fix: address PR review comments for GPT-5 implementation
  - Extract usage normalization helper to reduce duplication
  - Suppress conversation continuity for first message (but respect explicit metadata)
  - Deduplicate response ID resolver logic
  - Remove dead enableGpt5ReasoningSummary option references
  - DRY up GPT-5 event/usage handling with normalizeGpt5Usage helper
  - Centralize default GPT-5 reasoning effort using model info
  - Fix Indonesian locale minimal string misplacement
  - Add clarifying comments for Developer prefix usage
  - Add TODO for future verbosity UI capability gating
  - Fix failing test in reasoning.spec.ts
* fix(openai-native): address Roomote inline feedback
  - Delegate standard GPT-5 SSE event types to shared processor to reduce duplication
  - Add JSDoc for response ID accessors
  - Standardize key error messages for GPT-5 Responses API fallback
  - Extract persistGpt5Metadata() in Task to simplify metadata writes
  - Add malformed JSON SSE parsing test
* fix(openai-native,gpt5): correct usage cost calc (use calculateApiCostOpenAI incl. cache); enforce 'skip once' continuity via suppressPreviousResponseId; dedupe responseId resolver on SSE 400; feat: gate reasoning.summary by enableGpt5ReasoningSummary; centralize default reasoning effort; types/ui: add ModelInfo.supportsVerbosity and gate Verbosity UI by capability; refactor: avoid duplicate usage emission in SSE done/completed
* fix(gpt5): default enableGpt5ReasoningSummary=true to preserve tests and expected behavior
* fix(gpt5): canonicalize GPT-5 metadata key to previous_response_id and align enableGpt5ReasoningSummary default docs
* fix(openai-native): remove review artifact comments and guard GPT-5 in completePrompt
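A minimal sketch of the retry behavior described in the error-handling bullets above: on a 400 "previous response not found" error, clear the stored ID so it is not reused, then retry once without previous_response_id. The function and parameter names here are illustrative, not the actual implementation.

```typescript
type Gpt5Body = { input: string; previous_response_id?: string }

async function sendWithContinuityFallback(
	body: Gpt5Body,
	send: (b: Gpt5Body) => Promise<string>,
	clearStoredResponseId: () => void,
): Promise<string> {
	try {
		return await send(body)
	} catch (err) {
		const message = err instanceof Error ? err.message : String(err)
		if (body.previous_response_id && message.includes("not found")) {
			// The stored continuity ID is stale; drop it so later turns do not
			// reuse it, then retry this single turn without continuity.
			clearStoredResponseId()
			const { previous_response_id: _omitted, ...rest } = body
			return await send(rest)
		}
		throw err
	}
}
```

The same guard would run in both the SDK and SSE fallback paths, which is why the commit handles the error in both places.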
1 parent cdc31f7 commit cda67a8
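The promise-based response-ID handoff with a 100 ms cap, described in the race-condition fix above, can be sketched roughly as follows; the class and method names are illustrative only.

```typescript
class ResponseIdGate {
	private lastResponseId?: string
	private pending?: Promise<string | undefined>
	private resolvePending?: (id: string | undefined) => void

	// Call when a request starts streaming, before its response ID is known.
	expectId(): void {
		this.pending = new Promise((resolve) => (this.resolvePending = resolve))
	}

	// Call when the response ID arrives, or with undefined to clear it on error.
	settle(id: string | undefined): void {
		this.lastResponseId = id
		this.resolvePending?.(id)
		this.resolvePending = undefined
		this.pending = undefined
	}

	// The next request waits briefly for a pending ID instead of racing it;
	// the timeout keeps a slow stream from blocking the follow-up request.
	async getId(timeoutMs = 100): Promise<string | undefined> {
		if (!this.pending) return this.lastResponseId
		const timeout = new Promise<undefined>((resolve) => setTimeout(() => resolve(undefined), timeoutMs))
		return Promise.race([this.pending, timeout])
	}
}
```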

File tree

33 files changed: +2176, −245 lines


packages/types/src/message.ts

Lines changed: 11 additions & 0 deletions
@@ -176,6 +176,17 @@ export const clineMessageSchema = z.object({
 	contextCondense: contextCondenseSchema.optional(),
 	isProtected: z.boolean().optional(),
 	apiProtocol: z.union([z.literal("openai"), z.literal("anthropic")]).optional(),
+	metadata: z
+		.object({
+			gpt5: z
+				.object({
+					previous_response_id: z.string().optional(),
+					instructions: z.string().optional(),
+					reasoning_summary: z.string().optional(),
+				})
+				.optional(),
+		})
+		.optional(),
 })

 export type ClineMessage = z.infer<typeof clineMessageSchema>
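The added metadata object is optional at every level, so reads must tolerate its absence entirely. A plain-TypeScript sketch of the resulting shape and a tolerant accessor (zod omitted for brevity; the accessor is illustrative):

```typescript
// Shape produced by the schema addition above, written as a plain interface.
interface Gpt5MessageMetadata {
	gpt5?: {
		previous_response_id?: string
		instructions?: string
		reasoning_summary?: string
	}
}

// Optional chaining handles a missing metadata object, gpt5 key, or ID.
function getPreviousResponseId(metadata?: Gpt5MessageMetadata): string | undefined {
	return metadata?.gpt5?.previous_response_id
}
```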

packages/types/src/model.ts

Lines changed: 2 additions & 0 deletions
@@ -44,6 +44,8 @@ export const modelInfoSchema = z.object({
 	supportsImages: z.boolean().optional(),
 	supportsComputerUse: z.boolean().optional(),
 	supportsPromptCache: z.boolean(),
+	// Capability flag to indicate whether the model supports an output verbosity parameter
+	supportsVerbosity: z.boolean().optional(),
 	supportsReasoningBudget: z.boolean().optional(),
 	requiredReasoningBudget: z.boolean().optional(),
 	supportsReasoningEffort: z.boolean().optional(),
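Because the flag is optional, consumers should treat an absent value as "unsupported" rather than sending a parameter the API may reject. A hedged sketch of capability-gating on the new flag (the builder function is illustrative, not from the codebase):

```typescript
interface ModelInfoLike {
	supportsVerbosity?: boolean
}

// Only include a verbosity value when the model declares support for it;
// undefined (models predating the capability) is treated as unsupported.
function buildVerbosityParam(info: ModelInfoLike, verbosity?: "low" | "medium" | "high"): { verbosity?: string } {
	return info.supportsVerbosity && verbosity ? { verbosity } : {}
}
```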

packages/types/src/provider-settings.ts

Lines changed: 6 additions & 1 deletion
@@ -3,6 +3,11 @@ import { z } from "zod"
 import { reasoningEffortsSchema, verbosityLevelsSchema, modelInfoSchema } from "./model.js"
 import { codebaseIndexProviderSchema } from "./codebase-index.js"

+// Extended schema that includes "minimal" for GPT-5 models
+export const extendedReasoningEffortsSchema = z.union([reasoningEffortsSchema, z.literal("minimal")])
+
+export type ReasoningEffortWithMinimal = z.infer<typeof extendedReasoningEffortsSchema>
+
 /**
  * ProviderName
  */
@@ -76,7 +81,7 @@ const baseProviderSettingsSchema = z.object({

 	// Model reasoning.
 	enableReasoningEffort: z.boolean().optional(),
-	reasoningEffort: reasoningEffortsSchema.optional(),
+	reasoningEffort: extendedReasoningEffortsSchema.optional(),
 	modelMaxTokens: z.number().optional(),
 	modelMaxThinkingTokens: z.number().optional(),
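A plain-TypeScript mirror of the union above, assuming the base schema enumerates "low" | "medium" | "high" (the actual values live in reasoningEffortsSchema; zod is omitted here):

```typescript
// Assumed base set; the real values come from reasoningEffortsSchema.
const baseReasoningEfforts = ["low", "medium", "high"] as const
type ReasoningEffort = (typeof baseReasoningEfforts)[number]
type ReasoningEffortWithMinimal = ReasoningEffort | "minimal"

// Runtime guard corresponding to extendedReasoningEffortsSchema: accepts the
// base efforts plus the GPT-5-only "minimal" level.
function isReasoningEffortWithMinimal(v: string): v is ReasoningEffortWithMinimal {
	return v === "minimal" || (baseReasoningEfforts as readonly string[]).includes(v)
}
```

Extending via a union rather than editing the base schema keeps "minimal" out of settings for providers that do not accept it.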

packages/types/src/providers/openai.ts

Lines changed: 8 additions & 0 deletions
@@ -12,32 +12,39 @@ export const openAiNativeModels = {
 		supportsImages: true,
 		supportsPromptCache: true,
 		supportsReasoningEffort: true,
+		reasoningEffort: "medium",
 		inputPrice: 1.25,
 		outputPrice: 10.0,
 		cacheReadsPrice: 0.13,
 		description: "GPT-5: The best model for coding and agentic tasks across domains",
+		// supportsVerbosity is a new capability; ensure ModelInfo includes it
+		supportsVerbosity: true,
 	},
 	"gpt-5-mini-2025-08-07": {
 		maxTokens: 128000,
 		contextWindow: 400000,
 		supportsImages: true,
 		supportsPromptCache: true,
 		supportsReasoningEffort: true,
+		reasoningEffort: "medium",
 		inputPrice: 0.25,
 		outputPrice: 2.0,
 		cacheReadsPrice: 0.03,
 		description: "GPT-5 Mini: A faster, more cost-efficient version of GPT-5 for well-defined tasks",
+		supportsVerbosity: true,
 	},
 	"gpt-5-nano-2025-08-07": {
 		maxTokens: 128000,
 		contextWindow: 400000,
 		supportsImages: true,
 		supportsPromptCache: true,
 		supportsReasoningEffort: true,
+		reasoningEffort: "medium",
 		inputPrice: 0.05,
 		outputPrice: 0.4,
 		cacheReadsPrice: 0.01,
 		description: "GPT-5 Nano: Fastest, most cost-efficient version of GPT-5",
+		supportsVerbosity: true,
 	},
 	"gpt-4.1": {
 		maxTokens: 32_768,
@@ -229,5 +236,6 @@ export const openAiModelInfoSaneDefaults: ModelInfo = {
 export const azureOpenAiDefaultApiVersion = "2024-08-01-preview"

 export const OPENAI_NATIVE_DEFAULT_TEMPERATURE = 0
+export const GPT5_DEFAULT_TEMPERATURE = 1.0

 export const OPENAI_AZURE_AI_INFERENCE_PATH = "/models/chat/completions"
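The two temperature constants imply a per-family default. A hedged sketch of how a handler might choose between them; the selection function and its prefix check are illustrative, not the actual implementation.

```typescript
const OPENAI_NATIVE_DEFAULT_TEMPERATURE = 0
const GPT5_DEFAULT_TEMPERATURE = 1.0

// Illustrative: GPT-5 family models get their own default temperature; all
// other OpenAI-native models keep the provider-wide default of 0.
function defaultTemperatureFor(modelId: string): number {
	return modelId.startsWith("gpt-5") ? GPT5_DEFAULT_TEMPERATURE : OPENAI_NATIVE_DEFAULT_TEMPERATURE
}
```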

src/api/index.ts

Lines changed: 7 additions & 0 deletions
@@ -44,6 +44,13 @@ export interface SingleCompletionHandler {
 export interface ApiHandlerCreateMessageMetadata {
 	mode?: string
 	taskId: string
+	previousResponseId?: string
+	/**
+	 * When true, the provider must NOT fall back to internal continuity state
+	 * (e.g., lastResponseId) if previousResponseId is absent.
+	 * Used to enforce "skip once" after a condense operation.
+	 */
+	suppressPreviousResponseId?: boolean
 }

 export interface ApiHandler {
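The continuity rule the JSDoc above describes can be sketched as a small resolver: an explicit previousResponseId always wins; otherwise the provider's internal state is used unless suppression is requested (the "skip once" after condense). The function name is illustrative.

```typescript
interface CreateMessageMetadataLike {
	previousResponseId?: string
	suppressPreviousResponseId?: boolean
}

function effectivePreviousResponseId(
	meta: CreateMessageMetadataLike | undefined,
	lastResponseId: string | undefined,
): string | undefined {
	// Explicit metadata always takes precedence.
	if (meta?.previousResponseId) return meta.previousResponseId
	// "Skip once": do not fall back to internal continuity state.
	if (meta?.suppressPreviousResponseId) return undefined
	return lastResponseId
}
```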
