-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Prevent Codex duplicate completions #8125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent Codex duplicate completions #8125
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution! I've reviewed the changes and have some suggestions for improvement. The implementation looks solid overall - the duplicate detection mechanism using sharedPrefixLength is elegant and the tests provide good coverage of the main scenarios.
| session && | ||
| priorAssistantEntry && | ||
| priorAssistantEntry[0] >= previousProcessed - 1 && | ||
| /task completed/i.test(this.extractMessageText(priorAssistantEntry[1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional that the session reset uses a case-insensitive regex /task completed/i? This could match unintended phrases like "subtask completed" or "task completed partially". Could we consider using a more specific pattern or exact match to avoid false positives?
| if (holdingChunks) { | ||
| holdingChunks = false | ||
| pendingChunks.push(sanitizedChunk) | ||
| for (const pending of pendingChunks.splice(0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When holdingChunks is true and chunks are accumulated, the array is cleared with pendingChunks.splice(0). Could we consider using pendingChunks.length = 0 for better performance? It's a minor optimization but cleaner.
| if (chunk.type === "text") { | ||
| let text = chunk.text | ||
| if (hasPreviousAssistant && previousCursor < previousAssistantText.length && text) { | ||
| const expected = previousAssistantText.slice(previousCursor, previousCursor + text.length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The duplicate detection logic is complex and crucial for this fix. Would it be helpful to add debug logging when duplicates are detected and trimmed? This could aid troubleshooting in production. For example, adding console.debug('Trimmed duplicate prefix of length:', overlap) when overlap > 0.
| const usageChunk = chunks.find((chunk) => chunk.type === "usage") | ||
| expect(usageChunk).toBeTruthy() | ||
| expect(fallbackSpy).not.toHaveBeenCalled() | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great test coverage! Could we add an additional test case for partial duplicates? For example, when the Codex CLI returns only part of the previous message repeated. This would ensure the sharedPrefixLength logic handles all edge cases.
| taskId: "task-123", | ||
| } | ||
|
|
||
| const makeSession = (chunks: ApiStreamChunk[]) => ({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The makeSession helper is useful. If other Codex-related tests need similar mocking in the future, would it be beneficial to extract this to a shared test utility file like __tests__/helpers/codex-mocks.ts?
Summary
Testing