Description
Problem (one or two sentences)
Users need more affordable embedding options for codebase indexing. Current providers cost $0.13–$0.18 per million tokens, which can be expensive for large codebases.
Context (who is affected and when)
Affects all users who enable codebase indexing, especially those with large codebases or frequent re-indexing needs. Cost becomes significant during initial indexing and ongoing maintenance.
Desired behavior (conceptual, not technical)
Add Nebius AI as a provider option offering Qwen/Qwen3-Embedding-8B at $0.01 per million tokens, significantly cheaper than existing options while maintaining high quality (ranked #2 on the MTEB Leaderboard, just below gemini-embedding-001).
Constraints / preferences (optional)
- Provider: Nebius AI
- API Endpoint: https://api.studio.nebius.com/v1/
- Model: Qwen/Qwen3-Embedding-8B
- Embedding dimensions: 4,096
- Rate limits: 600,000 TPM (tokens per minute), 10,000 RPM (requests per minute)
- Model specs: 7B parameters, 32K max tokens
- Pricing: $0.01 per 1M tokens (vs. $0.13–$0.18 per 1M for current providers)
- MTEB score: 70.58 (ranked #2, vs gemini-embedding-001 at 68.37)
- Should integrate similarly to existing providers (Gemini, OpenAI, Mistral, etc.)
- Maintain consistent user experience with current provider selection flow
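The cost difference above can be sketched with simple arithmetic (prices from the list above; `estimateCostUSD` is an illustrative helper, not part of the codebase):

```typescript
// Estimate embedding cost in USD for a given token count and
// per-million-token price. Hypothetical helper for illustration only.
function estimateCostUSD(tokens: number, pricePerMillionTokens: number): number {
  return (tokens / 1_000_000) * pricePerMillionTokens;
}

// Indexing a 50M-token codebase:
const nebiusCost = estimateCostUSD(50_000_000, 0.01); // ≈ $0.50
const currentLowEnd = estimateCostUSD(50_000_000, 0.13); // ≈ $6.50
console.log(`Nebius: $${nebiusCost}, current low end: $${currentLowEnd}`);
```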
Request checklist
- I've searched existing Issues and Discussions for duplicates
- This describes a specific problem with clear context and impact
Roo Code Task Links (optional)
N/A
Acceptance criteria (optional)
Given a user wants to configure codebase indexing
When they open the provider selection
Then Nebius AI should appear as an available option
And selecting it should allow configuration with API key
And indexing should work with Qwen/Qwen3-Embedding-8B model at endpoint https://api.studio.nebius.com/v1/
And embeddings should have 4,096 dimensions
And cost should be $0.01/1M tokens as advertised
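The criteria above imply a configuration shape roughly like the following sketch. The field names are hypothetical (the actual Roo Code settings schema may differ); the values are taken from this request:

```typescript
// Hypothetical shape for a codebase-indexing embedder configuration entry.
// Field names are illustrative, not the real Roo Code settings schema.
interface EmbedderConfig {
  provider: string;
  baseUrl: string;
  modelId: string;
  dimensions: number;
  apiKey: string;
}

const nebiusConfig: EmbedderConfig = {
  provider: "nebius",
  baseUrl: "https://api.studio.nebius.com/v1/",
  modelId: "Qwen/Qwen3-Embedding-8B",
  dimensions: 4096,
  // Supplied by the user during provider configuration.
  apiKey: process.env.NEBIUS_API_KEY ?? "",
};
```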
Proposed approach (optional)
Follow the existing embedder pattern in src/services/code-index/embedders/ - similar to how Gemini, Mistral, and OpenAI are implemented. The provider should support the standard embedding interface with proper rate limiting and error handling. Configure with base URL https://api.studio.nebius.com/v1/ and model identifier Qwen/Qwen3-Embedding-8B.
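A minimal sketch of what such an embedder call could look like, assuming Nebius exposes a standard OpenAI-style `/embeddings` endpoint (function names here are illustrative and do not reflect the actual interface in src/services/code-index/embedders/):

```typescript
// Illustrative sketch only; the real embedder interface in
// src/services/code-index/embedders/ may differ.
type EmbeddingRequest = {
  model: string;
  input: string[];
};

// Pure helper: build the OpenAI-style request body.
function buildEmbeddingRequest(texts: string[]): EmbeddingRequest {
  return { model: "Qwen/Qwen3-Embedding-8B", input: texts };
}

// Network call, assuming an OpenAI-compatible /embeddings endpoint.
async function embed(texts: string[], apiKey: string): Promise<number[][]> {
  const res = await fetch("https://api.studio.nebius.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildEmbeddingRequest(texts)),
  });
  if (!res.ok) throw new Error(`Nebius embeddings request failed: ${res.status}`);
  const data = await res.json();
  // OpenAI-style response: { data: [{ embedding: number[] }, ...] }
  return data.data.map((d: { embedding: number[] }) => d.embedding);
}
```

Rate limiting (600K TPM / 10K RPM) and retry logic would wrap the `embed` call, as with the existing providers.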
Trade-offs / risks (optional)
- New provider means additional maintenance overhead
- Need to verify Nebius AI API stability and reliability
- Should validate that actual performance matches MTEB leaderboard claims
- Confirm API compatibility with OpenAI-style endpoints
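One way to check the OpenAI-style compatibility noted above is a runtime shape check on a sample response (a hypothetical guard, not existing code):

```typescript
// Hypothetical runtime guard for the OpenAI-style embeddings response shape:
// { data: [{ embedding: number[] }, ...] }. Useful when first validating
// that Nebius responses parse like the existing providers' responses.
function isOpenAIEmbeddingResponse(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const data = (value as { data?: unknown }).data;
  return (
    Array.isArray(data) &&
    data.every(
      (item) =>
        typeof item === "object" &&
        item !== null &&
        Array.isArray((item as { embedding?: unknown }).embedding) &&
        (item as { embedding: unknown[] }).embedding.every(
          (n) => typeof n === "number"
        )
    )
  );
}
```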