
@roo-code-preview roo-code-preview bot commented Sep 9, 2025

This PR attempts to address Issue #5621 by implementing support for Google's latest embedding model, gemini-embedding-exp-03-07, with flexible output dimensions.

Changes Made

  • Added new model definitions: Added gemini-embedding-exp-03-07 variants with dimensions 3072, 1536, and 768 to embeddingModels.ts
  • Updated GeminiEmbedder: Added support for the outputDimension parameter in the constructor and API calls
  • Updated OpenAICompatibleEmbedder: Enhanced to handle the outputDimension parameter and pass it to the Gemini API
  • Set new default: Changed the default Gemini model to gemini-embedding-exp-03-07-3072
  • Maintained backward compatibility: Preserved support for the existing text-embedding-004 model
  • Updated tests: Enhanced test coverage to include new model and functionality

Acceptance Criteria Verification

Given a user wants to configure codebase indexing with the latest Google model
When they navigate to the "Roo-Code: Codebase Indexing Settings"
And select "Gemini" as the Embedder Provider
Then the "Select a model" dropdown should list gemini-embedding-exp-03-07 with options for 3072, 1536, and 768 dimensions.
And the user can select any of these options and save the settings without errors.
But the existing text-embedding-004 model must remain available and fully functional.

Technical Implementation

  • Uses the spread operator to conditionally include the outputDimension parameter in API calls
  • Maintains proper error handling and telemetry integration
  • Follows existing code patterns and TypeScript conventions
  • All tests pass and TypeScript compilation is successful
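
The conditional-spread pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `EmbeddingRequest` type, the `buildEmbeddingRequest` helper, and the `dimensions` field name (borrowed from OpenAI-style embeddings requests) are all assumptions made for the example.

```typescript
// Hypothetical request shape. The field name `dimensions` is an assumption
// borrowed from OpenAI-style embeddings APIs; this PR does not confirm it.
interface EmbeddingRequest {
	model: string
	input: string[]
	dimensions?: number
}

// Hypothetical helper showing the conditional spread.
function buildEmbeddingRequest(model: string, input: string[], outputDimension?: number): EmbeddingRequest {
	return {
		model,
		input,
		// Spread an object holding the key only when a dimension was supplied;
		// spreading an empty object adds no keys at all.
		...(outputDimension !== undefined ? { dimensions: outputDimension } : {}),
	}
}

const withDimension = buildEmbeddingRequest("gemini-embedding-exp-03-07-768", ["hello"], 768)
const withoutDimension = buildEmbeddingRequest("text-embedding-004", ["hello"])
console.log(withDimension, withoutDimension)
```

The point of the spread over a plain `dimensions: outputDimension` assignment is that the key is absent from the request entirely when no dimension is given, rather than present with an `undefined` value.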

Files Modified

    • embeddingModels.ts: Added new model definitions
    • gemini.ts: Added outputDimension support
    • OpenAICompatibleEmbedder: Enhanced API call handling
    • gemini.spec.ts: Updated test coverage

Feedback and guidance are welcome!


Important

Adds support for Google Gemini model gemini-embedding-exp-03-07 with flexible dimensions and updates related classes and tests.

  • Behavior:
    • Adds support for Google Gemini model gemini-embedding-exp-03-07 with dimensions 3072, 1536, and 768 in gemini.ts.
    • Updates GeminiEmbedder constructor to accept outputDimension parameter.
    • Updates OpenAICompatibleEmbedder to handle outputDimension in API calls.
    • Changes default Gemini model to gemini-embedding-exp-03-07-3072.
  • Tests:
    • Updates tests in gemini.spec.ts to cover new model and outputDimension functionality.
  • Models:
    • Adds new model profiles to embeddingModels.ts for gemini-embedding-exp-03-07-3072, gemini-embedding-exp-03-07-1536, and gemini-embedding-exp-03-07-768.

This description was created by Ellipsis for c0f88e0.

…exp-03-07

- Add gemini-embedding-exp-03-07 model variants with flexible dimensions (3072, 1536, 768)
- Update GeminiEmbedder to support outputDimension parameter
- Update OpenAICompatibleEmbedder to handle outputDimension in API calls
- Set gemini-embedding-exp-03-07-3072 as new default model for Gemini provider
- Preserve backward compatibility with existing text-embedding-004 model
- Update tests to cover new model and outputDimension functionality

Fixes #5621
@dosubot dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.) and enhancement (New feature or request) labels Sep 9, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Sep 9, 2025

@roomote roomote bot left a comment

Reviewing my own code is like debugging in production - technically possible but morally questionable.

* Supported models:
* - text-embedding-004 (dimension: 768)
* - gemini-embedding-001 (dimension: 2048)
* - gemini-embedding-001 (dimension: 3072)

Is this intentional? The comment says gemini-embedding-001 (dimension: 3072) but on line 44 we pass GEMINI_MAX_ITEM_TOKENS (2048) to the constructor. These seem to represent different concepts (embedding dimension vs max tokens), but it might be worth clarifying in a comment to avoid confusion.

* @param outputDimension Optional output dimension for flexible models
*/
- constructor(apiKey: string, modelId?: string) {
+ constructor(apiKey: string, modelId?: string, outputDimension?: number) {

Should we validate that the outputDimension parameter matches one of the supported dimensions for the selected model? For example, if someone uses gemini-embedding-exp-03-07-3072 with outputDimension: 768, this could lead to unexpected behavior.

Suggested change
- constructor(apiKey: string, modelId?: string, outputDimension?: number) {
+ constructor(apiKey: string, modelId?: string, outputDimension?: number) {
+     if (!apiKey) {
+         throw new Error(t("embeddings:validation.apiKeyRequired"))
+     }
+     // Use provided model or default
+     this.modelId = modelId || GeminiEmbedder.DEFAULT_MODEL
+     // Validate outputDimension if provided
+     if (outputDimension && this.modelId.includes('gemini-embedding-exp-03-07')) {
+         const validDimensions = [3072, 1536, 768]
+         if (!validDimensions.includes(outputDimension)) {
+             throw new Error(`Invalid outputDimension ${outputDimension}. Must be one of: ${validDimensions.join(', ')}`)
+         }
+     }

)
})

it("should create an instance with specified model and outputDimension", () => {

Could we add a test case for dimension mismatch scenarios? For example, what happens when someone specifies gemini-embedding-exp-03-07-3072 as the model but provides outputDimension: 768? This would help ensure the behavior is well-defined.
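
One way such a mismatch case could be pinned down is sketched below. It is only an illustration under stated assumptions: `SketchEmbedder` is a hypothetical stand-in, not the real GeminiEmbedder, and throwing on mismatch is one possible behavior rather than what the PR implements. The dimensions in `GEMINI_PROFILES` are copied from the model profiles quoted in this review; the table name and the `throwsOnConstruct` helper are invented for the example.

```typescript
// Dimensions copied from the model profiles quoted in this PR.
// The table name itself is hypothetical.
const GEMINI_PROFILES: Record<string, { dimension: number }> = {
	"text-embedding-004": { dimension: 768 },
	"gemini-embedding-001": { dimension: 3072 },
	"gemini-embedding-exp-03-07-3072": { dimension: 3072 },
}

// Hypothetical stand-in for the embedder; NOT the real GeminiEmbedder.
class SketchEmbedder {
	readonly modelId: string
	readonly outputDimension: number | undefined

	constructor(modelId: string, outputDimension?: number) {
		const profile = GEMINI_PROFILES[modelId]
		// Assumed policy: reject an explicit dimension that contradicts the profile.
		if (outputDimension !== undefined && profile !== undefined && profile.dimension !== outputDimension) {
			throw new Error(
				`outputDimension ${outputDimension} does not match ${modelId} (expected ${profile.dimension})`,
			)
		}
		this.modelId = modelId
		this.outputDimension = outputDimension
	}
}

// Small helper so each scenario reads as a single boolean check.
function throwsOnConstruct(modelId: string, outputDimension?: number): boolean {
	try {
		new SketchEmbedder(modelId, outputDimension)
		return false
	} catch {
		return true
	}
}

// The mismatch scenario from the comment above, plus two non-error cases.
console.log(throwsOnConstruct("gemini-embedding-exp-03-07-3072", 768))
console.log(throwsOnConstruct("gemini-embedding-exp-03-07-3072", 3072))
console.log(throwsOnConstruct("gemini-embedding-exp-03-07-3072"))
```

In the project's own spec file the same three scenarios would presumably become `it(...)` cases; the plain boolean helper is used here only to keep the sketch free of test-framework assumptions.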

gemini: {
"text-embedding-004": { dimension: 768 },
"gemini-embedding-001": { dimension: 3072, scoreThreshold: 0.4 },
"gemini-embedding-exp-03-07-3072": { dimension: 3072, scoreThreshold: 0.4 },

The new models follow a pattern where the dimension is part of the model name (e.g., gemini-embedding-exp-03-07-3072). Should we consider extracting and validating this dimension from the model name to ensure consistency with any provided outputDimension parameter?
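
A rough sketch of what that extraction might look like, assuming the `-<dimension>` suffix convention holds for the gemini-embedding-exp-03-07 variants. The names `FLEXIBLE_MODEL_PATTERN`, `dimensionFromModelId`, and `isDimensionConsistent` are hypothetical. The pattern is anchored on the full model prefix because a bare trailing-digits match would also pick up the `004` in text-embedding-004 and the `001` in gemini-embedding-001, which are not dimensions.

```typescript
// Matches only the flexible-dimension variants named in this PR, e.g.
// "gemini-embedding-exp-03-07-1536". Anchored on the full prefix so that
// version-like suffixes ("-004", "-001") are never read as dimensions.
const FLEXIBLE_MODEL_PATTERN = /^gemini-embedding-exp-03-07-(\d+)$/

// Hypothetical helper: returns the dimension encoded in the model ID, if any.
function dimensionFromModelId(modelId: string): number | undefined {
	const match = FLEXIBLE_MODEL_PATTERN.exec(modelId)
	return match ? Number(match[1]) : undefined
}

// Hypothetical helper: consistent unless both values exist and disagree.
function isDimensionConsistent(modelId: string, outputDimension?: number): boolean {
	const fromName = dimensionFromModelId(modelId)
	return fromName === undefined || outputDimension === undefined || fromName === outputDimension
}

console.log(dimensionFromModelId("gemini-embedding-exp-03-07-1536"))
console.log(dimensionFromModelId("text-embedding-004"))
console.log(isDimensionConsistent("gemini-embedding-exp-03-07-3072", 768))
```

A trade-off worth noting: parsing the name avoids a second source of truth, but it couples validation to a naming convention, so a lookup against the model profile table may be the more robust choice if future model IDs do not follow the pattern.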

@daniel-lxs daniel-lxs closed this Sep 9, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Sep 9, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 9, 2025


4 participants