Skip to content

feat: Add comprehensive retry mechanism and batch fallback for Gemini embeddingsΒ #36

@amondnet

Description

@amondnet

🎯 Overview

Implement comprehensive retry mechanisms for the Gemini embedding provider with intelligent batch processing and fallback strategies to improve production reliability.

Upstream Reference: zilliztech/claude-context#159

πŸ“‹ Problem Statement

The current Gemini embedding implementation lacks:

  • Retry logic for transient API failures (rate limits, timeouts, network errors)
  • Intelligent batch processing with fallback strategies
  • Error classification between retryable and non-retryable errors
  • Production-grade reliability improvements

✨ Features to Implement

1. Intelligent Retry Mechanism

  • Exponential backoff with configurable max retries (1s β†’ 2s β†’ 4s β†’ 8s, max 10s)
  • Configurable retry parameters: maxRetries (default: 3), baseDelay (default: 1000ms)
  • Zero breaking changes to existing API

2. Error Classification

  • Retryable errors:
    • Network errors: ECONNREFUSED, ETIMEDOUT, ENOTFOUND, EAI_AGAIN
    • HTTP status codes: 429, 500, 502, 503, 504
    • Error patterns: rate limit, quota exceeded, service unavailable, timeout, connection
  • Non-retryable errors: 400, 401, 403, invalid API key, malformed requests

3. Smart Batch Fallback

  • Automatic degradation from batch to individual requests when batch processing fails
  • Maintains retry logic for each individual request
  • Preserves order in results
  • Handles mixed success/failure scenarios

4. Configuration Interface

export interface GeminiEmbeddingConfig {
    model: string;
    apiKey: string;
    baseURL?: string;
    outputDimensionality?: number;
    maxRetries?: number;        // Default: 3
    baseDelay?: number;         // Default: 1000ms  
}

πŸ“ Implementation Tasks

  • Update GeminiEmbeddingConfig interface with retry parameters
  • Implement sleep(ms: number) helper method
  • Implement isRetryableError(error: any) error classification
  • Implement executeWithRetry<T>() with exponential backoff
  • Update embed() method to use retry mechanism
  • Update embedBatch() method with fallback strategy
  • Add getter methods: getRetryConfig(), setMaxRetries(), setBaseDelay()
  • Update base-embedding.ts preprocessText() with null/undefined handling
  • Add Jest configuration (jest.config.json) if not exists
  • Write comprehensive test suite (34+ test cases)
  • Update README.md with Gemini retry usage examples

πŸ§ͺ Testing Requirements

Test Coverage Areas (34+ tests):

  1. Constructor & Configuration (4 tests) - Default/custom configurations
  2. Basic Functionality (4 tests) - Single/batch embedding, empty input
  3. Error Classification (4 tests) - Network errors, HTTP status codes, message patterns
  4. Retry Mechanism (4 tests) - Exponential backoff, delay cap, maxRetries respect
  5. Batch Processing (3 tests) - Fallback to individual, order preservation
  6. Configuration Methods (4 tests) - Model updates, dimensionality changes
  7. Model Support (3 tests) - Supported models, dimension validation
  8. Edge Cases (6 tests) - Invalid responses, concurrent requests, null safety
  9. Performance (2 tests) - Response time validation, large batch handling

πŸ“Š Expected Impact

Reliability Improvements

  • Systematic retry handling for transient API failures
  • Intelligent error classification (only retry appropriate errors)
  • Graceful degradation during API limitations
  • Reduced failure rates through proper error handling

Performance Optimization

  • Efficient bulk embedding operations with fallback safety
  • Smart delays prevent API rate limit violations
  • Memory efficient constant usage during fallback
  • Promise-based concurrent safe implementation

πŸ“„ Files to Modify

Core Implementation

  • packages/core/src/embedding/gemini-embedding.ts - Add retry logic and fallback
  • packages/core/src/embedding/base-embedding.ts - Enhance input validation

Testing & Documentation

  • packages/core/src/embedding/gemini-embedding.test.ts - New comprehensive test suite
  • packages/core/jest.config.json - Jest configuration (if not exists)
  • packages/core/README.md - Usage examples for retry configuration

πŸ” Technical Details

Retry Logic

private async executeWithRetry<T>(
    operation: () => Promise<T>, 
    context: string
): Promise<T> {
    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
        try {
            return await operation();
        } catch (error) {
            if (!this.isRetryableError(error) || attempt === this.maxRetries) {
                throw error;
            }
            
            const delay = Math.min(this.baseDelay * Math.pow(2, attempt), 10000);
            await this.sleep(delay);
        }
    }
}

Fallback Strategy

When batch processing fails after all retries:

  1. Switch to individual request processing
  2. Maintain retry logic for each individual request
  3. Preserve order in results
  4. Provide comprehensive error handling

⚠️ Important Notes

  • Zero Breaking Changes: Full backward compatibility maintained
  • Type Safety: Full TypeScript coverage required
  • Error Handling: Clear separation between retryable and fatal errors
  • Testing: All 34+ test cases must pass before merge
  • Documentation: Update README with clear usage examples

πŸ“š References

βœ… Acceptance Criteria

  • All retry mechanism features implemented
  • Error classification working correctly
  • Batch fallback strategy implemented
  • All 34+ test cases passing
  • Zero breaking changes to existing API
  • Documentation updated with examples
  • TypeScript types fully defined
  • Code follows project standards (STANDARDS.md)
  • Commit messages follow conventional format

Labels: feat, core, enhancement, good-first-issue
Priority: P2
Type: Feature

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions