@zadzanl zadzanl commented Nov 1, 2025

Feature: Deepinfra-compatible Embeddings Provider

Related GitHub Issue

Closes: #7199

Description

Fixes #7199.
Adds DeepInfra embedding support via provider URL detection.
The incompatibility arises because DeepInfra returns embeddings as float arrays,
whereas Roo Code expects base64-encoded strings.

What changed?

  • Auto-detects DeepInfra URLs and switches to float encoding automatically
  • Fixed Qdrant bug - missing 'type' field index was causing "Bad Request" errors
  • Added 7 DeepInfra models (Qwen3 series, multilingual-e5, embeddinggemma, bge)
  • Handles both formats - float arrays for DeepInfra, base64 for all others
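For reference, the seven models and their dimensions (taken from the commit message in this PR) could be captured as a simple lookup. This is a minimal sketch: the constant name, the record shape, and the `dimensionFor` helper are hypothetical and are not the actual structure of `EMBEDDING_MODEL_PROFILES`.

```typescript
// Dimensions as listed in this PR's commit message.
// The record shape and names are assumptions made for illustration only.
const DEEPINFRA_EMBEDDING_DIMENSIONS: Record<string, number> = {
  "Qwen/Qwen3-Embedding-0.6B": 1024,
  "Qwen/Qwen3-Embedding-4B": 2560,
  "Qwen/Qwen3-Embedding-8B": 4096,
  "intfloat/multilingual-e5-large-instruct": 1024,
  "google/embeddinggemma-300m": 768,
  "BAAI/bge-m3": 1024,
  "BAAI/bge-large-en-v1.5": 1024,
}

// Hypothetical helper: returns the vector size for a model, or undefined if unknown.
function dimensionFor(modelId: string): number | undefined {
  return DEEPINFRA_EMBEDDING_DIMENSIONS[modelId]
}
```

A lookup like this lets the vector store be created with the right size before the first embedding request is made.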

Why automatic detection?
The original issue asked for a manual toggle, but URL auto-detection is much simpler
for users (and I'm bad at UIs 😅): no configuration is needed, and it works out of the box. Also,
to my knowledge, only DeepInfra uses "encoding_format": "float".

Test Procedure

Acceptance Criteria Addressed (from #7199):

Given DeepInfra provider with float array responses:

  • System automatically detects DeepInfra URLs and sends encoding_format: "float"
  • Embeddings are processed directly as number arrays without base64 decoding
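The detection step might look roughly like the sketch below. The function names are hypothetical, and matching on the parsed hostname is a choice made for this example; the PR only states that URLs containing deepinfra.com are detected, so the actual check in `openai-compatible.ts` may differ.

```typescript
type EncodingFormat = "float" | "base64"

// Hypothetical helper: true when the base URL's host is deepinfra.com or a subdomain of it.
function isDeepInfraUrl(baseUrl: string): boolean {
  try {
    const host = new URL(baseUrl).hostname.toLowerCase()
    return host === "deepinfra.com" || host.endsWith(".deepinfra.com")
  } catch {
    // Unparseable URLs are treated as "not DeepInfra" so the default format applies.
    return false
  }
}

// Hypothetical helper: picks the encoding_format value to send with the embeddings request.
function encodingFormatFor(baseUrl: string): EncodingFormat {
  return isDeepInfraUrl(baseUrl) ? "float" : "base64"
}
```

Matching the hostname rather than the raw string avoids a false positive when "deepinfra.com" only appears in a path or query parameter of another provider's URL.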

Error Handling:

  • Non-array responses trigger warnings and return empty embeddings as fallback
  • Error messages logged for debugging
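A minimal sketch of the response handling described above, assuming a Node.js runtime. The function names are hypothetical, and rejecting the whole vector when it contains NaN or Infinity is an assumption; the commit message only says that such values are checked.

```typescript
// Decodes a base64 string of little-endian float32 values into a number array.
function decodeBase64Floats(b64: string): number[] {
  const buf = Buffer.from(b64, "base64")
  // DataView reads are alignment-safe, unlike constructing a Float32Array on a pooled Buffer.
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength)
  const out: number[] = []
  for (let i = 0; i + 4 <= buf.byteLength; i += 4) {
    out.push(view.getFloat32(i, true))
  }
  return out
}

// Hypothetical normalizer: accepts a float array or a base64 string.
// Anything else logs a warning and yields an empty embedding as the fallback.
function toVector(embedding: unknown): number[] {
  if (typeof embedding === "string") {
    return decodeBase64Floats(embedding)
  }
  if (!Array.isArray(embedding)) {
    console.warn("Unexpected embedding format; returning empty embedding")
    return []
  }
  // Assumption: a vector containing NaN or Infinity is discarded entirely.
  if (!embedding.every((v) => typeof v === "number" && Number.isFinite(v))) {
    console.warn("Embedding contains non-finite values; returning empty embedding")
    return []
  }
  return embedding as number[]
}
```

Dispatching on the value's type rather than on the provider keeps the decoding path correct even if a provider ignores the requested `encoding_format`.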

Backward Compatibility

  • Existing providers continue using encoding_format: "base64"
  • Only DeepInfra is switched to encoding_format: "float"

Unit Tests Added:

  • Provider validation tests for DeepInfra URL detection (deepinfra.com)
  • Encoding format tests to verify correct format is sent per provider type
  • Float array and base64 response handling tests with validation
  • Configuration validation tests for new model profiles

Manual Testing:

  1. DeepInfra Integration: Tested with google/embeddinggemma-300m,
    Qwen/Qwen3-Embedding-0.6B, and multilingual-e5-large
  2. Backward Compatibility: Existing OpenAI and Gemini providers work unchanged
  3. Qdrant Integration: Confirmed 'type' field indexing resolves "Bad Request"
    errors during code search
  4. Error Scenarios: Tested invalid URLs, malformed responses, and network failures
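The Qdrant fix in item 3 amounts to creating a keyword payload index on the `type` field. The sketch below uses a minimal structural interface so that it stays self-contained; the `createPayloadIndex` call shape follows the Qdrant JavaScript REST client as far as I know, but treat the exact arguments, the `ensureTypeIndex` name, and the warn-and-continue error handling as assumptions rather than the code in `qdrant-client.ts`.

```typescript
// Minimal structural type for the single client method used here (assumed shape).
interface PayloadIndexClient {
  createPayloadIndex(
    collectionName: string,
    args: { field_name: string; field_schema: "keyword"; wait?: boolean },
  ): Promise<unknown>
}

// Hypothetical helper: creates a keyword index on "type" so metadata filters on it are accepted.
// Returns false instead of throwing, so indexing can proceed if the index cannot be created.
async function ensureTypeIndex(client: PayloadIndexClient, collectionName: string): Promise<boolean> {
  try {
    await client.createPayloadIndex(collectionName, {
      field_name: "type",
      field_schema: "keyword",
      wait: true,
    })
    return true
  } catch (error) {
    console.warn(`Could not create payload index for 'type' on ${collectionName}:`, error)
    return false
  }
}
```

Without such an index, a filtered search on `type` can be rejected with the "Index required but not found" error quoted in the commit message.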

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue
    (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes
    (if applicable).
  • Documentation Impact: I have considered if my changes require documentation
    updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the
    Contributor Guidelines.

Screenshots / Videos

N/A - This PR contains backend infrastructure changes with no UI modifications.
The changes are internal to the embedding system and vector DB integration.

Documentation Updates

Possibly. Likely minor changes to Step 3 of the Roo Code Codebase Indexing docs, adding notes on using DeepInfra.

Additional Notes

Future Work:

  • Refactor provider detection into a pattern that can be extended to additional embedding providers

Get in Touch

Discord: badgambit


Important

Adds automatic DeepInfra URL detection for embedding format handling and new DeepInfra models, ensuring compatibility with existing providers.

  • Behavior:
    • Auto-detects DeepInfra URLs in openai-compatible.ts and switches to float encoding.
    • Handles float arrays for DeepInfra and base64 for others in createEmbeddings() and _embedBatchWithRetries().
  • Models:
    • Adds DeepInfra models to EMBEDDING_MODEL_PROFILES in embeddingModels.ts.
  • Tests:
    • Adds tests for DeepInfra URL detection and encoding in openai-compatible.spec.ts.
    • Validates handling of float and base64 responses.
  • Misc:
    • Updates QdrantVectorStore in qdrant-client.ts to handle named vectors and improve error handling.

This description was created by Ellipsis for 0f21a6a.

CommitGambit and others added 5 commits October 31, 2025 19:38
Update branch to latest - 31 Oct 2025
…ld index

Added support for DeepInfra-hosted embedding models and fixed a critical bug where
the 'type' field index was missing in Qdrant, causing "Bad Request" errors
during code search operations.

Changes:
- Added DeepInfra provider detection in OpenAICompatibleEmbedder
  * Detect DeepInfra URLs (deepinfra.com)
  * Use 'float' encoding format for DeepInfra, 'base64' for other standard
    providers
  * Handle both float array and base64 string embedding responses
  * Added validation for embedding values (NaN/Infinity checking)

- Fixed missing Qdrant payload index for 'type' field
  * A missing `type` field index causes "Bad Request" during `codebase_search`
    tool invocation
  * Create keyword index for 'type' field to support metadata filtering
  * Resolves "Index required but not found for 'type' field" error

- Added 7 DeepInfra embedding model profiles:
  * Qwen/Qwen3-Embedding-0.6B (1024 dims)
  * Qwen/Qwen3-Embedding-4B (2560 dims)
  * Qwen/Qwen3-Embedding-8B (4096 dims)
  * intfloat/multilingual-e5-large-instruct (1024 dims)
  * google/embeddinggemma-300m (768 dims)
  * BAAI/bge-m3 (1024 dims)
  * BAAI/bge-large-en-v1.5 (1024 dims)

- Added some test coverage for DeepInfra
  * Provider validation
  * Encoding format tests
  * Float array and base64 response handling tests
  * Configuration validation tests

Tested with: embeddinggemma-300m, text-embedding-004, multilingual-e5-large
@zadzanl zadzanl requested review from cte, jr and mrubens as code owners November 1, 2025 20:17
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Nov 1, 2025
roomote bot commented Nov 1, 2025

See this task on Roo Code Cloud

Review complete - no new issues found. This merge from main is clean and maintains the DeepInfra embedding fix functionality.

Mention @roomote in a comment to trigger your PR Fixer agent and make changes to this pull request.

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Nov 1, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Nov 3, 2025
@hannesrudolph hannesrudolph added PR - Needs Preliminary Review and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Nov 3, 2025
Labels

enhancement New feature or request PR - Needs Preliminary Review size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Status: PR [Needs Prelim Review]

Development

Successfully merging this pull request may close these issues.

Add Support for Float Encoding in OpenAI-Compatible Embeddings

3 participants