@Buzzwavemed

Summary

  • trim duplicate Codex CLI prefixes so we only stream new assistant text
  • reset the Codex session when the CLI replays its prior "task completed" answer
  • cover the new behavior with targeted Codex handler unit tests

Testing

  • pnpm --filter roo-cline exec vitest run api/providers/__tests__/codex.spec.ts

Contributor

@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the changes and have some suggestions for improvement. The implementation looks solid overall - the duplicate detection mechanism using sharedPrefixLength is elegant and the tests provide good coverage of the main scenarios.

session &&
priorAssistantEntry &&
priorAssistantEntry[0] >= previousProcessed - 1 &&
/task completed/i.test(this.extractMessageText(priorAssistantEntry[1]))

Is it intentional that the session reset uses the unanchored, case-insensitive regex /task completed/i? It matches any text that contains the phrase, including unintended cases such as "subtask completed" or "task completed partially". Could we use a more specific pattern or an exact match to avoid false positives?
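
To make the concern concrete, here is a minimal sketch comparing the quoted pattern with a stricter one. The `strict` pattern and the `isCompletionReplay` helper are hypothetical illustrations, not code from this PR, and the exact text the CLI replays is an assumption.

```typescript
// Pattern quoted in the review: unanchored and case-insensitive.
const loose = /task completed/i

// Hypothetical stricter alternative (an assumption, not the PR's code):
// the whole trimmed message must be the phrase, optionally ending in "." or "!".
const strict = /^task completed[.!]?$/i

// Hypothetical helper that applies a pattern to the trimmed message text.
function isCompletionReplay(text: string, pattern: RegExp): boolean {
	return pattern.test(text.trim())
}

const samples = ["Task completed.", "Subtask completed", "task completed partially"]
for (const sample of samples) {
	console.log(sample, isCompletionReplay(sample, loose), isCompletionReplay(sample, strict))
}
```

A word boundary (`\btask completed\b`) would reject "Subtask completed" but still accept "task completed partially", so anchoring is the safer choice if the replayed message is known to be short.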

if (holdingChunks) {
	holdingChunks = false
	pendingChunks.push(sanitizedChunk)
	for (const pending of pendingChunks.splice(0)) {

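When holdingChunks is true, chunks accumulate in pendingChunks, and pendingChunks.splice(0) then empties the array while returning the removed chunks for the loop to iterate. Could we consider iterating pendingChunks directly and setting pendingChunks.length = 0 afterwards? That avoids allocating a temporary array. It's a minor optimization, but arguably cleaner.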
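
The two approaches are not drop-in replacements for each other, which the following sketch tries to show. Both function names and the `emit` callback are hypothetical; the real handler's loop body is not visible in the quoted hunk.

```typescript
// Drain with splice(0): removes every element and returns them in a new array,
// so the loop iterates a snapshot while `pending` is already empty.
function drainWithSplice<T>(pending: T[], emit: (item: T) => void): void {
	for (const item of pending.splice(0)) {
		emit(item)
	}
}

// Drain in place, then truncate: no temporary array is allocated, but anything
// pushed onto `pending` during the loop is also visited before the array is cleared.
function drainInPlace<T>(pending: T[], emit: (item: T) => void): void {
	for (const item of pending) {
		emit(item)
	}
	pending.length = 0
}
```

Setting `pending.length = 0` on its own, without iterating first, would discard the held chunks instead of flushing them.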

if (chunk.type === "text") {
	let text = chunk.text
	if (hasPreviousAssistant && previousCursor < previousAssistantText.length && text) {
		const expected = previousAssistantText.slice(previousCursor, previousCursor + text.length)

The duplicate detection logic is complex and central to this fix. Would it help to add debug logging whenever a duplicate is detected and trimmed? This could aid troubleshooting in production. For example, call console.debug('Trimmed duplicate prefix of length:', overlap) when overlap > 0.
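
As a rough sketch of where such a log line could sit: the body of `sharedPrefixLength` below is an assumed implementation (the PR's version is not shown in this thread), and `trimReplayedPrefix` with its injectable `debug` parameter is a hypothetical wrapper.

```typescript
// Assumed implementation: length of the common leading run of two strings,
// compared by UTF-16 code unit.
function sharedPrefixLength(a: string, b: string): number {
	const limit = Math.min(a.length, b.length)
	let i = 0
	while (i < limit && a[i] === b[i]) {
		i++
	}
	return i
}

// Hypothetical wrapper: drops the part of `text` that repeats `expected`
// and reports the trimmed length through an injectable logger.
function trimReplayedPrefix(
	expected: string,
	text: string,
	debug: (message: string, length: number) => void = console.debug,
): string {
	const overlap = sharedPrefixLength(expected, text)
	if (overlap > 0) {
		debug("Trimmed duplicate prefix of length:", overlap)
	}
	return text.slice(overlap)
}
```

Passing the logger as a parameter keeps the trimming logic easy to test and lets production code route the message to whatever logging facility the project already uses.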

	const usageChunk = chunks.find((chunk) => chunk.type === "usage")
	expect(usageChunk).toBeTruthy()
	expect(fallbackSpy).not.toHaveBeenCalled()
})

Great test coverage! Could we add an additional test case for partial duplicates? For example, when the Codex CLI returns only part of the previous message repeated. This would ensure the sharedPrefixLength logic handles all edge cases.
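
The partial-duplicate cases could look roughly like the sketch below. `createReplayTrimmer` is a hypothetical stand-in for the handler's trimming state, with variable names borrowed from the quoted diff; its behavior on divergence (stop comparing once the stream departs from the previous answer) is an assumption, not the confirmed implementation.

```typescript
// Hypothetical stand-in for the handler's cursor-based trimming.
function createReplayTrimmer(previousAssistantText: string): (text: string) => string {
	let previousCursor = 0
	return (text: string): string => {
		if (previousCursor >= previousAssistantText.length || !text) {
			return text
		}
		const expected = previousAssistantText.slice(previousCursor, previousCursor + text.length)
		let overlap = 0
		while (overlap < expected.length && expected[overlap] === text[overlap]) {
			overlap++
		}
		// Full match of the expected slice: keep tracking the replay.
		// Divergence: assume the replay is over and stop trimming later chunks.
		previousCursor =
			overlap === expected.length ? previousCursor + overlap : previousAssistantText.length
		return text.slice(overlap)
	}
}

// Partial replay: only "Hello " repeats, then the stream diverges.
const trimPartial = createReplayTrimmer("Hello world")
console.log(JSON.stringify(trimPartial("Hello there")))
console.log(JSON.stringify(trimPartial("world")))

// A single chunk that spans the end of the previous answer.
const trimSpanning = createReplayTrimmer("abc")
console.log(JSON.stringify(trimSpanning("abcdef")))
```

In a real spec these calls would become `expect(...)` assertions against the handler's streamed chunks.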

taskId: "task-123",
}

const makeSession = (chunks: ApiStreamChunk[]) => ({

The makeSession helper is useful. If other Codex-related tests need similar mocking in the future, would it be beneficial to extract this to a shared test utility file like __tests__/helpers/codex-mocks.ts?
