Contributor

@roomote roomote bot commented Sep 17, 2025

This PR attempts to address Issue #8102. Feedback and guidance are welcome.

Problem

When a user set an embedding dimension (e.g., 1536), the system still produced vectors of a different size (e.g., 1024), while the vector database created its collection with the configured size. This mismatch caused a vector dimension error and blocked both indexing and search.
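To make the failure mode concrete, here is a minimal sketch of the kind of size check a vector store applies on insert. `FakeCollection` and its error message are hypothetical stand-ins for illustration; they are not the real vector database client or its actual error text.

```typescript
// Hypothetical in-memory stand-in for a vector collection with a fixed size.
class FakeCollection {
  readonly vectorSize: number
  private readonly points: number[][] = []

  constructor(vectorSize: number) {
    this.vectorSize = vectorSize
  }

  // Rejects any vector whose length differs from the collection's size.
  upsert(vector: number[]): void {
    if (vector.length !== this.vectorSize) {
      throw new Error(`Vector dimension error: expected ${this.vectorSize}, got ${vector.length}`)
    }
    this.points.push(vector)
  }

  get count(): number {
    return this.points.length
  }
}

// The collection is created with the user-configured size (1536)...
const collection = new FakeCollection(1536)
// ...while the embedder still produces vectors of the model default size (1024).
const embedding: number[] = new Array<number>(1024).fill(0)

let mismatchMessage = ""
try {
  collection.upsert(embedding)
} catch (err) {
  mismatchMessage = (err as Error).message
}
console.log(mismatchMessage)
```

Because every insert fails the size check, nothing is indexed and there is nothing to search, which matches the symptom described above.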

Solution

Updated the dimension priority logic to respect user-configured dimensions as the single source of truth:

  • config-manager.ts: Now checks the user-configured dimension before falling back to model defaults
  • service-factory.ts: Uses the same priority logic for consistency
  • Updated tests to reflect the new behavior and added a specific test case for the reported scenario
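The priority rule described in this list can be sketched roughly as follows. This is an illustration under assumptions, not the code in config-manager.ts or service-factory.ts: `EmbedderSettings`, `DimensionLookup`, and `resolveVectorSize` are hypothetical names, and only the `modelDimension > 0` check mirrors the condition visible in the diff.

```typescript
// Hypothetical types for illustration; the real config shape may differ.
type DimensionLookup = (provider: string, modelId: string) => number | undefined

interface EmbedderSettings {
  provider: string
  modelId: string
  modelDimension?: number // user-configured value, if any
}

// Returns the user-configured dimension when it is a positive number,
// otherwise falls back to the model's built-in default.
function resolveVectorSize(settings: EmbedderSettings, lookup: DimensionLookup): number | undefined {
  if (settings.modelDimension && settings.modelDimension > 0) {
    return settings.modelDimension
  }
  return lookup(settings.provider, settings.modelId)
}

// Stub lookup that pretends every model defaults to 1024.
const stubLookup: DimensionLookup = () => 1024

const base = { provider: "openai", modelId: "text-embedding-3-small" }
const userSet = resolveVectorSize({ ...base, modelDimension: 1536 }, stubLookup)
const unset = resolveVectorSize(base, stubLookup)
const zero = resolveVectorSize({ ...base, modelDimension: 0 }, stubLookup)
console.log(userSet, unset, zero)
```

Calling the same helper from both places is one way to keep the embedder and the collection size in agreement. Note that a value of 0 is treated as "not set" in this sketch and falls back to the model default.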

Testing

Fixes #8102


Important

Prioritize user-configured embedding dimension over model default to prevent dimension mismatch errors.

  • Behavior:
    • Prioritize user-configured embedding dimension over model default in config-manager.ts and service-factory.ts.
    • If user dimension is set, model dimension lookup is skipped.
  • Testing:
    • Updated tests in config-manager.spec.ts to reflect new priority logic.
    • Added test case for scenario where user sets 1536 and model default is 1024.
    • Verified that custom dimension bypasses model dimension lookup.
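The "lookup is skipped" behavior in this summary can be checked with a hand-rolled spy, sketched below without any test framework. `spyLookup` and `pickDimension` are hypothetical names, and the 1024 default is the value from the reported scenario rather than a claim about any particular model.

```typescript
// Hypothetical spy: counts how many times the model lookup runs.
let lookupCalls = 0
function spyLookup(_provider: string, _modelId: string): number | undefined {
  lookupCalls += 1
  return 1024 // pretend model default
}

// Compact form of the priority rule (hypothetical helper).
function pickDimension(userDimension: number | undefined, provider: string, modelId: string): number | undefined {
  if (userDimension !== undefined && userDimension > 0) {
    return userDimension // the lookup is never reached on this path
  }
  return spyLookup(provider, modelId)
}

// User sets 1536: the lookup should not run at all.
const withUserValue = pickDimension(1536, "openai", "text-embedding-3-small")
const callsAfterUserValue = lookupCalls

// No user value: the lookup runs exactly once and supplies the default.
const withoutUserValue = pickDimension(undefined, "openai", "text-embedding-3-small")
const callsAfterFallback = lookupCalls

console.log({ withUserValue, callsAfterUserValue, withoutUserValue, callsAfterFallback })
```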

This description was created by Ellipsis for 0c35f22.

- Updated config-manager.ts to check user-configured dimension first
- Updated service-factory.ts to use the same dimension priority logic
- Fixed tests to reflect the new behavior
- Added specific test for issue #8102 scenario (1536 vs 1024)

Fixes #8102
@roomote roomote bot requested review from cte, jr and mrubens as code owners September 17, 2025 23:51
@dosubot dosubot bot added the size:M and bug labels Sep 17, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage label Sep 18, 2025
Contributor Author

@roomote roomote bot left a comment

Reviewing my own code feels like debugging in a mirror - everything's backwards but the bugs are still mine.

  /**
   * Gets the current model dimension being used for embeddings.
-  * Returns the model's built-in dimension if available, otherwise falls back to custom dimension.
+  * Returns the user-configured custom dimension if set, otherwise falls back to model's built-in dimension.
Contributor Author

Nice fix! The logic correctly prioritizes user-configured dimensions now. Though I wonder if the JSDoc comment could be even clearer by explicitly stating "User-configured dimensions override any model defaults" to make the priority crystal clear?

- // Only use manual dimension if model doesn't have a built-in dimension
- if (!vectorSize && config.modelDimension && config.modelDimension > 0) {
+ // Prioritize user-configured custom dimension if explicitly set
+ if (config.modelDimension && config.modelDimension > 0) {
Contributor Author

Good consistency with the config-manager implementation. The if-else structure makes the priority clear.

})

- it("should return model's built-in dimension when available", async () => {
+ it("should prioritize user-configured dimension over model's built-in dimension", async () => {
Contributor Author

Excellent test coverage! This test name could be even more descriptive though. How about "should always use user-configured dimension when set, ignoring model defaults" to really emphasize the override behavior?

expect(configManager.currentModelDimension).toBe(1536)
expect(mockedGetModelDimension).toHaveBeenCalledWith("openai", "text-embedding-3-small")
})

Contributor Author

Great addition! This test specifically addresses the issue #8102 scenario where user sets 1536 but model default is 1024. This ensures the bug won't regress.

@daniel-lxs daniel-lxs closed this Sep 22, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Sep 22, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 22, 2025
Labels

  • bug (Something isn't working)
  • Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.)
  • size:M (This PR changes 30-99 lines, ignoring generated files.)

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[BUG] File index dimension setting not consistently applied (1536 vs 1024)

4 participants