@roomote
Collaborator

@roomote roomote commented Jun 17, 2025

Summary

This PR fixes the "Bad Request" error that occurs during codebase indexing by adding comprehensive validation and enhanced error handling.

Changes Made

🔧 Core Fixes

  • Vector Dimension Validation: Added validation in QdrantVectorStore.upsertPoints() to catch mismatched vector sizes before sending to Qdrant
  • Enhanced Error Logging: Added detailed error context including vector dimensions, collection info, and sample data to help diagnose issues
  • Batch Processing Improvements: Enhanced error handling in DirectoryScanner with better retry logging and error context
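A minimal sketch of the kind of dimension check described above. It is not the PR's actual code: the `Point` shape, the `findDimensionMismatch` helper, and the message wording are all assumptions made for illustration.

```typescript
// Hypothetical point shape; the real type used by QdrantVectorStore may differ.
interface Point {
	id: string
	vector: number[]
	payload: Record<string, unknown>
}

// Returns a description of the first vector whose length differs from the
// collection's expected size, or null when every vector matches.
function findDimensionMismatch(points: Point[], expectedSize: number): string | null {
	for (let i = 0; i < points.length; i++) {
		const actual = points[i].vector.length
		if (actual !== expectedSize) {
			return `Point ${i} (id=${points[i].id}) has ${actual} dimensions, expected ${expectedSize}`
		}
	}
	return null
}

// Build one vector of the expected size and one that is too short.
const good: Point = { id: "a", vector: [], payload: {} }
for (let i = 0; i < 1536; i++) {
	good.vector.push(0.1)
}
const bad: Point = { id: "b", vector: [0.1, 0.2, 0.3], payload: {} }

console.log(findDimensionMismatch([good, bad], 1536))
// prints "Point 1 (id=b) has 3 dimensions, expected 1536"
```

Running a check like this before the upsert call means the caller sees which point is wrong and by how much, rather than a bare "Bad Request" from the server.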

🧪 Test Updates

  • Updated tests to use correct 1536-dimensional vectors matching the expected collection configuration
  • Fixed test expectations to match the new validation behavior

🛡️ Validation Improvements

  • Added validation for point structure (id, vector, payload) before processing
  • Added checks for invalid vector values (NaN, Infinity) that could cause Bad Request errors
  • Early return for empty point arrays to avoid unnecessary API calls
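The three checks above could look roughly like the following. This is a sketch under assumptions: `findInvalidPoint` and its messages are hypothetical names, not taken from the PR's diff.

```typescript
// Returns a description of the first structural problem, or null if all points pass.
// An empty array yields null, so the caller can return early without an API call.
function findInvalidPoint(points: unknown[]): string | null {
	for (let i = 0; i < points.length; i++) {
		const point = points[i]
		if (typeof point !== "object" || point === null) {
			return `Point ${i} is not an object`
		}
		const p = point as Record<string, unknown>
		if (p.id === undefined || p.id === null) {
			return `Point ${i} is missing an id`
		}
		const vector = p.vector
		if (!Array.isArray(vector)) {
			return `Point ${i} has no vector array`
		}
		if (typeof p.payload !== "object" || p.payload === null) {
			return `Point ${i} is missing a payload`
		}
		// Reject NaN, Infinity, -Infinity, and non-number entries.
		for (let j = 0; j < vector.length; j++) {
			const value: unknown = vector[j]
			if (typeof value !== "number" || !isFinite(value)) {
				return `Point ${i} has a non-finite value at vector index ${j}`
			}
		}
	}
	return null
}

console.log(findInvalidPoint([{ id: "a", vector: [0.1, NaN], payload: {} }]))
// prints "Point 0 has a non-finite value at vector index 1"
```

Returning a reason string instead of throwing keeps the helper easy to test; the calling method can decide whether to throw, log, or skip the batch.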

Root Cause Analysis

The "Bad Request" error was likely caused by:

  1. Vector dimension mismatches - vectors not matching the collection's expected dimensions
  2. Invalid vector data - NaN or Infinity values in embeddings
  3. Malformed point structures - missing required fields in the upsert payload

Testing

  • ✅ All existing tests pass
  • ✅ New validation logic tested with edge cases
  • ✅ Error handling paths verified

Impact

This fix should resolve the indexing failures reported in issue #4779 by:

  • Providing clear error messages when validation fails
  • Preventing malformed requests from reaching the Qdrant server
  • Improving debugging capabilities with detailed error context

Closes #4779


Important

Adds vector dimension validation and enhanced error handling for codebase indexing in QdrantVectorStore and DirectoryScanner.

  • Behavior:
    • Adds vector dimension validation in QdrantVectorStore.upsertPoints() to catch mismatched vector sizes.
    • Enhances error logging in QdrantVectorStore.upsertPoints() with detailed context.
    • Improves batch processing error handling in DirectoryScanner.processBatch() with retry logging.
  • Validation:
    • Validates point structure and checks for invalid vector values (NaN, Infinity) in QdrantVectorStore.upsertPoints().
    • Early return for empty point arrays in QdrantVectorStore.upsertPoints().
  • Testing:
    • Updates tests in qdrant-client.spec.ts to use correct 1536-dimensional vectors.
    • Adjusts test expectations to align with new validation logic.
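To illustrate what retry logging around a batch might involve, here is a sketch. The retry count, the exponential backoff, and every name in it (`MAX_RETRIES`, `retryDelayMs`, `describeFailure`, `withRetries`) are assumptions; the PR description does not show how DirectoryScanner.processBatch() actually retries.

```typescript
// Assumed settings; the real constants are not shown in this PR description.
const MAX_RETRIES = 3
const INITIAL_DELAY_MS = 500

// Exponential backoff: 500 ms, 1000 ms, 2000 ms for attempts 1, 2, 3.
function retryDelayMs(attempt: number, initialDelayMs: number = INITIAL_DELAY_MS): number {
	return initialDelayMs * Math.pow(2, attempt - 1)
}

// Builds one log line carrying the attempt number, batch size, and cause.
function describeFailure(attempt: number, maxRetries: number, batchSize: number, error: unknown): string {
	const message = error instanceof Error ? error.message : String(error)
	return `Batch of ${batchSize} failed (attempt ${attempt}/${maxRetries}): ${message}`
}

// Runs `run` up to MAX_RETRIES times, logging each failure before backing off.
async function withRetries<T>(
	run: () => Promise<T>,
	batchSize: number,
	log: (line: string) => void = console.warn,
	sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
	let lastError: unknown
	for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
		try {
			return await run()
		} catch (error) {
			lastError = error
			log(describeFailure(attempt, MAX_RETRIES, batchSize, error))
			if (attempt < MAX_RETRIES) {
				await sleep(retryDelayMs(attempt))
			}
		}
	}
	// All attempts failed; surface the last error to the caller.
	throw lastError
}
```

Passing `log` and `sleep` as parameters is a design choice for the sketch: it lets a test capture the log lines and skip the real delays.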

This description was created by Ellipsis for 73b11b1.

…base indexing

- Add vector dimension validation in QdrantVectorStore.upsertPoints() to catch mismatched vector sizes before sending to Qdrant
- Add enhanced error logging with detailed context including vector dimensions, collection info, and sample data
- Improve batch processing error handling in DirectoryScanner with better retry logging and error context
- Update tests to use correct 1536-dimensional vectors matching the expected collection configuration
- Add validation for point structure (id, vector, payload) before processing
- Add checks for invalid vector values (NaN, Infinity) that could cause Bad Request errors

This should resolve the 'Error processing batch: Error:Bad Request' issue by catching and providing clear error messages for common causes like dimension mismatches and invalid vector data.
@roomote roomote requested review from cte, jr and mrubens as code owners June 17, 2025 13:17
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and bug (Something isn't working) labels Jun 17, 2025

await expect(vectorStore.upsertPoints(mockPoints)).rejects.toThrow(upsertError)
await expect(vectorStore.upsertPoints(mockPoints)).rejects.toThrow(
"Failed to upsert 1 points to collection ws-a1b2c3d4e5f6g7h8: Upsert failed",
Contributor

Typographical note: The error message says "1 points" which seems grammatically incorrect. Consider changing it to "1 point" for singular.

Suggested change
- "Failed to upsert 1 points to collection ws-a1b2c3d4e5f6g7h8: Upsert failed",
+ "Failed to upsert 1 point to collection ws-a1b2c3d4e5f6g7h8: Upsert failed",
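One way to get the singular and plural right for any count is to choose the noun when the message is built. The `formatUpsertError` helper below is hypothetical and only mirrors the message format quoted in the test expectation.

```typescript
// Hypothetical helper: picks "point" or "points" based on the count.
function formatUpsertError(count: number, collection: string, cause: string): string {
	const noun = count === 1 ? "point" : "points"
	return `Failed to upsert ${count} ${noun} to collection ${collection}: ${cause}`
}

console.log(formatUpsertError(1, "ws-a1b2c3d4e5f6g7h8", "Upsert failed"))
// prints "Failed to upsert 1 point to collection ws-a1b2c3d4e5f6g7h8: Upsert failed"
```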

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Jun 17, 2025
@daniel-lxs
Member

Closing, see #4779 (comment)

Also, this is handled in #4432.

@daniel-lxs daniel-lxs closed this Jun 17, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Jun 17, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jun 17, 2025
@roomote roomote deleted the fix-4779 branch June 19, 2025 15:42

Development

Successfully merging this pull request may close these issues.

Codebase indexing: Error processing batch: Error:Bad Requst