@roomote roomote bot commented Sep 18, 2025

Description

This PR fixes an issue where square brackets and other special characters were being stripped from code when using Google Gemini models. The problem affected multiple programming languages including Python, as reported in issue #8107.

Problem

When using Gemini API with Roo Code to generate or modify code containing arrays or indexing operations, the square brackets were missing from the output. For example:

  • Expected: populated_variable[0]
  • Actual: populated_variable

Root Cause

Gemini API returns square brackets as HTML entities (&#91;, &#93;, &lsqb;, &rsqb;) in its responses. While PR #7577 fixed this for C# by updating the unescapeHtmlEntities function, the fix wasn't being applied to the streamed text from the Gemini provider itself.
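
For readers unfamiliar with the helper, here is a minimal sketch of what an unescapeHtmlEntities function along these lines might look like. The body is an assumption for illustration, not the project's actual source: the entity list and ordering are inferred from the description above and from the review notes further down.

```typescript
// Illustrative sketch only; the real unescapeHtmlEntities in the repository may differ.
function unescapeHtmlEntities(text: string): string {
  // Leave empty input untouched (assumed behaviour).
  if (!text) return text

  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&apos;/g, "'")
    // Square brackets, numeric and named forms.
    .replace(/&#91;/g, "[")
    .replace(/&#93;/g, "]")
    .replace(/&lsqb;/g, "[")
    .replace(/&rsqb;/g, "]")
    // "&amp;" goes last so that "&amp;lt;" becomes "&lt;" rather than "<".
    .replace(/&amp;/g, "&")
}
```

With this sketch, `unescapeHtmlEntities("populated_variable&#91;0&#93;")` returns `populated_variable[0]`.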

Solution

Applied the existing unescapeHtmlEntities function to all text yielded from the Gemini API:

  • In the createMessage method for streaming responses (both regular content and thinking/reasoning parts)
  • In the completePrompt method for non-streaming responses
  • In the fallback text property handler
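
The streaming part of this change can be pictured roughly as follows. This is a sketch under stated assumptions: `StreamChunk` and `unescapeStream` are hypothetical names, the chunk shape is simplified, and the compact unescape helper is a stand-in for the project's real function.

```typescript
// Compact stand-in for the project's unescapeHtmlEntities (assumed behaviour).
function unescapeHtmlEntities(text: string): string {
  return text
    .replace(/&#91;|&lsqb;/g, "[")
    .replace(/&#93;|&rsqb;/g, "]")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&")
}

// Hypothetical chunk shape covering regular content and thinking/reasoning parts.
interface StreamChunk {
  type: "text" | "reasoning"
  text: string
}

// Wraps a chunk stream and unescapes each chunk's text before yielding it.
// Limitation of this sketch: an entity split across two chunks
// (for example "&#9" followed by "1;") is not rejoined.
async function* unescapeStream(source: AsyncIterable<StreamChunk>): AsyncGenerator<StreamChunk> {
  for await (const chunk of source) {
    yield { type: chunk.type, text: unescapeHtmlEntities(chunk.text) }
  }
}
```

Wrapping at the point where chunks are yielded means both regular and reasoning text go through the same path, which is the consistency the list above is aiming for.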

Changes

  • src/api/providers/gemini.ts: Added unescapeHtmlEntities import and applied it to all text outputs
  • src/api/providers/tests/gemini.spec.ts: Added comprehensive tests for HTML entity unescaping

Testing

  • ✅ All existing tests pass
  • ✅ Added 5 new test cases covering:
    • Basic HTML entity unescaping
    • Python code with square brackets (exact issue scenario)
    • Named square bracket entities
    • Thinking/reasoning parts with entities
    • Complete prompt responses with entities
  • ✅ Linting and type checking pass

Verification

The fix has been validated to correctly handle:

  • Numeric HTML entities: &#91;, &#93;
  • Named HTML entities: &lsqb;, &rsqb;
  • Other common entities: &lt;, &gt;, &amp;, etc.

Fixes #8107


Important

Fixes HTML entity unescaping in Gemini provider responses, ensuring correct handling of square brackets and other entities in gemini.ts.

  • Behavior:
    • Fixes issue where square brackets and other HTML entities were stripped from code in Gemini API responses.
    • Applies unescapeHtmlEntities to all text outputs in createMessage and completePrompt methods in gemini.ts.
  • Testing:
    • Adds tests in gemini.spec.ts to verify HTML entity unescaping in text messages, Python code, named entities, reasoning parts, and complete prompt responses.
    • Tests cover scenarios with numeric and named HTML entities, including square brackets and other common entities.

This description was created by Ellipsis for d671376.

- Apply unescapeHtmlEntities to all text yielded from Gemini API
- Fixes issue where square brackets were missing in Python/other code
- Add comprehensive tests for HTML entity unescaping
- Ensures consistency with other providers

Fixes #8107
@roomote roomote bot requested review from cte, jr and mrubens as code owners September 18, 2025 00:08
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels Sep 18, 2025

@roomote roomote bot left a comment

Reviewing my own code because apparently I trust no one, not even myself.

Review Summary

The fix correctly addresses the HTML entity issue in Gemini provider responses. The implementation is clean and comprehensive.

Suggestions for consideration:

  1. Performance optimization: The unescapeHtmlEntities function uses multiple sequential .replace() calls. For high-frequency calls with large text blocks, consider using a single regex with a replacement map for better performance.

  2. Documentation: Consider adding a comment in unescapeHtmlEntities explaining that &amp; is replaced last intentionally to avoid double-unescaping (e.g., &amp;lt; becomes &lt;, not <).

  3. Test coverage: While the tests are comprehensive for the main issue, consider adding edge cases:

    • Empty strings and null/undefined inputs
    • Mixed entities in a single string
    • Double-encoded entities
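
To make suggestions 1 and 3 concrete, a single-pass variant might look like the sketch below. The names `ENTITY_MAP`, `ENTITY_PATTERN`, and `unescapeHtmlEntitiesSinglePass` are hypothetical, and returning an empty string for null/undefined is an assumption rather than the existing behaviour.

```typescript
// Hypothetical entity table; extend as needed.
const ENTITY_MAP: Record<string, string> = {
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
  "&apos;": "'",
  "&#91;": "[",
  "&#93;": "]",
  "&lsqb;": "[",
  "&rsqb;": "]",
  "&amp;": "&",
}

// One alternation covering every key in ENTITY_MAP.
const ENTITY_PATTERN = /&(?:lt|gt|quot|apos|lsqb|rsqb|amp|#39|#91|#93);/g

function unescapeHtmlEntitiesSinglePass(text: string | null | undefined): string {
  // Assumption: treat null/undefined/empty input as an empty string.
  if (!text) return ""
  // Each match is replaced exactly once and the output is never rescanned,
  // so "&amp;lt;" yields "&lt;" without relying on replacement order.
  return text.replace(ENTITY_PATTERN, (entity) => ENTITY_MAP[entity] ?? entity)
}
```

Because the text is scanned once, the ordering concern from suggestion 2 goes away in this variant; whether it is measurably faster than chained .replace() calls would need benchmarking.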

Positive observations:

  • Fix is applied consistently across all text output paths
  • Comprehensive test coverage for the reported issue
  • Tests cover both numeric and named entity formats
  • Clean implementation that reuses existing utility function

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Sep 18, 2025
@64MM4-KN1F3

FYI - I tested this PR and it didn't fix the reported issue.

@daniel-lxs
Member

#8107 (comment)

@daniel-lxs daniel-lxs closed this Sep 22, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 22, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Sep 22, 2025

Development

Successfully merging this pull request may close these issues.

[BUG] Bug with array square brackets with Gemini (various models) - Related/same issue to 7576

5 participants