Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 39 additions & 0 deletions packages/core/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -238,6 +238,45 @@ const context = new Context({
})
```

### Using Gemini Embeddings with Retry Configuration

```typescript
import { Context, MilvusVectorDatabase, GeminiEmbedding } from '@pleaseai/context-please-core'

// Initialize with Gemini embedding provider
const embedding = new GeminiEmbedding({
apiKey: process.env.GEMINI_API_KEY || 'your-gemini-api-key',
model: 'gemini-embedding-001',
outputDimensionality: 768, // Optional: Matryoshka Representation Learning support (256, 768, 1536, 3072)
maxRetries: 3, // Optional: Maximum retry attempts (default: 3)
baseDelay: 1000 // Optional: Base delay in ms for exponential backoff (default: 1000ms)
})

const vectorDatabase = new MilvusVectorDatabase({
address: process.env.MILVUS_ADDRESS || 'localhost:19530',
token: process.env.MILVUS_TOKEN || ''
})

const context = new Context({
embedding,
vectorDatabase
})

// The retry mechanism automatically handles:
// - Rate limit errors (429)
// - Server errors (500, 502, 503, 504)
Comment on lines +250 to +267

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟒 Excellent documentation additions

The documentation clearly explains:

  • Configuration options with defaults
  • Matryoshka Representation Learning support
  • Runtime configuration methods
  • Example usage

Minor improvement: Consider adding a section about when to adjust retry settings (e.g., "Increase maxRetries for unreliable networks, decrease for faster failure detection").

// - Network errors (ECONNREFUSED, ETIMEDOUT, ENOTFOUND, EAI_AGAIN)
// - Transient API failures with exponential backoff (1s β†’ 2s β†’ 4s β†’ 8s, capped at 10s)

// Update retry configuration at runtime
embedding.setMaxRetries(5)
embedding.setBaseDelay(2000)

// Check current retry configuration
const retryConfig = embedding.getRetryConfig()
console.log(`Max retries: ${retryConfig.maxRetries}, Base delay: ${retryConfig.baseDelay}ms`)
```

### Custom File Filtering

```typescript
Expand Down
1 change: 1 addition & 0 deletions packages/core/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
"@qdrant/js-client-grpc": "^1.15.1",
"@qdrant/js-client-rest": "^1.15.1",
"@zilliz/milvus2-sdk-node": "^2.5.10",
"es-toolkit": "^1.41.0",
"faiss-node": "^0.5.1",
"fs-extra": "^11.0.0",
"glob": "^10.0.0",
Expand Down
5 changes: 5 additions & 0 deletions packages/core/src/embedding/base-embedding.ts
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,11 @@ export abstract class Embedding {
* @returns Processed text
*/
protected preprocessText(text: string): string {
// Handle null/undefined by converting to single space
if (text == null) {
return ' '
}

// Replace empty string with single space
if (text === '') {
return ' '
Expand Down
204 changes: 179 additions & 25 deletions packages/core/src/embedding/gemini-embedding.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,17 +8,23 @@ export interface GeminiEmbeddingConfig {
apiKey: string
baseURL?: string // Optional custom API endpoint URL
outputDimensionality?: number // Optional dimension override
maxRetries?: number // Maximum number of retry attempts (default: 3)
baseDelay?: number // Base delay in milliseconds for exponential backoff (default: 1000ms)
}

export class GeminiEmbedding extends Embedding {
private client: GoogleGenAI
private config: GeminiEmbeddingConfig
private dimension: number = 3072 // Default dimension for gemini-embedding-001
protected maxTokens: number = 2048 // Maximum tokens for Gemini embedding models
private maxRetries: number
private baseDelay: number

constructor(config: GeminiEmbeddingConfig) {
super()
this.config = config
this.maxRetries = config.maxRetries ?? 3
this.baseDelay = config.baseDelay ?? 1000
this.client = new GoogleGenAI({
apiKey: config.apiKey,
...(config.baseURL && {
Expand Down Expand Up @@ -52,6 +58,97 @@ export class GeminiEmbedding extends Embedding {
}
}

/**
* Determine if an error is retryable
* @param error Error object to check
* @returns True if error is retryable
*/
private isRetryableError(error: unknown): boolean {
if (typeof error !== 'object' || error === null) {
return false
}

// Network errors
const networkErrorCodes = ['ECONNREFUSED', 'ETIMEDOUT', 'ENOTFOUND', 'EAI_AGAIN']
if ('code' in error && typeof error.code === 'string' && networkErrorCodes.includes(error.code)) {
return true
}

// HTTP status codes
const retryableStatusCodes = [429, 500, 502, 503, 504]
if ('status' in error && typeof error.status === 'number' && retryableStatusCodes.includes(error.status)) {
return true
}

// Error message patterns
const errorMessage = ('message' in error && typeof error.message === 'string')
? error.message.toLowerCase()
: ''
const retryablePatterns = [
'rate limit',
'quota exceeded',
'service unavailable',
'timeout',
'connection',
]

return retryablePatterns.some(pattern => errorMessage.includes(pattern))
}
Comment on lines +66 to +96

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟒 Well-designed error classification logic

The isRetryableError method effectively categorizes errors into retryable and non-retryable categories. Good coverage of network errors, HTTP status codes, and error message patterns.

Minor suggestion: Consider making the retryable patterns configurable for advanced users who may need custom error handling.

Also, the error message matching is case-insensitive which is good, but consider using more specific patterns to avoid false positives (e.g., "timeout" might match unrelated messages).


/**
* Sleep for specified milliseconds
* @param ms Milliseconds to sleep
*/
private async sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms))
}

/**
* Execute operation with retry logic
* Only retries on retryable errors (network errors, rate limits, server errors)
* @param operation Operation to execute
* @param context Context string for error messages
* @returns Operation result
*/
private async executeWithRetry<T>(
operation: () => Promise<T>,
context: string,
): Promise<T> {
let lastError: unknown

for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
try {
return await operation()
}
catch (error) {
lastError = error

// If error is not retryable, fail immediately
if (!this.isRetryableError(error)) {
const err = new Error(`${context}: ${error instanceof Error ? error.message : 'Unknown error'}`)
;(err as any).cause = error
throw err
}

// If we've exhausted all retries, throw the error
if (attempt === this.maxRetries) {
const err = new Error(`${context}: ${error instanceof Error ? error.message : 'Unknown error'}`)
;(err as any).cause = error
throw err
}

// Calculate delay with exponential backoff (capped at 10s)
const delay = Math.min(this.baseDelay * Math.pow(2, attempt), 10000)
await this.sleep(delay)
}
}

// This should never be reached, but TypeScript needs it
const err = new Error(`${context}: ${lastError instanceof Error ? lastError.message : 'Unknown error'}`)
;(err as any).cause = lastError
throw err
}

async detectDimension(): Promise<number> {
// Gemini doesn't need dynamic detection, return configured dimension
return this.dimension
Expand All @@ -61,7 +158,7 @@ export class GeminiEmbedding extends Embedding {
const processedText = this.preprocessText(text)
const model = this.config.model || 'gemini-embedding-001'

try {
return this.executeWithRetry(async () => {
const response = await this.client.models.embedContent({
model,
contents: processedText,
Expand All @@ -78,41 +175,65 @@ export class GeminiEmbedding extends Embedding {
vector: response.embeddings[0].values,
dimension: response.embeddings[0].values.length,
}
}
catch (error) {
throw new Error(`Gemini embedding failed: ${error instanceof Error ? error.message : 'Unknown error'}`)
}
}, 'Gemini embedding failed')
}

async embedBatch(texts: string[]): Promise<EmbeddingVector[]> {
const processedTexts = this.preprocessTexts(texts)
const model = this.config.model || 'gemini-embedding-001'

// Try batch processing with retry logic
try {
const response = await this.client.models.embedContent({
model,
contents: processedTexts,
config: {
outputDimensionality: this.config.outputDimensionality || this.dimension,
},
})
return await this.executeWithRetry(async () => {
const response = await this.client.models.embedContent({
model,
contents: processedTexts,
config: {
outputDimensionality: this.config.outputDimensionality || this.dimension,
},
})

if (!response.embeddings) {
throw new Error('Gemini API returned invalid response')
}
if (!response.embeddings) {
throw new Error('Gemini API returned invalid response')
}

return response.embeddings.map((embedding: ContentEmbedding) => {
if (!embedding.values) {
throw new Error('Gemini API returned invalid embedding data')
}
return {
vector: embedding.values,
dimension: embedding.values.length,
}
})
}, 'Gemini batch embedding failed')
}
catch (batchError) {
// Fallback: Process individually if batch fails after all retries
// Add delay between requests to avoid rate limiting
const results: EmbeddingVector[] = []
const FALLBACK_DELAY_MS = 100 // Delay between individual requests

for (let i = 0; i < processedTexts.length; i++) {
const text = processedTexts[i]
try {
// Add delay between requests (except for first)
if (i > 0) {
await new Promise(resolve => setTimeout(resolve, FALLBACK_DELAY_MS))
}

return response.embeddings.map((embedding: ContentEmbedding) => {
if (!embedding.values) {
throw new Error('Gemini API returned invalid embedding data')
const result = await this.embed(text)
results.push(result)
}
return {
vector: embedding.values,
dimension: embedding.values.length,
catch (individualError) {
// If individual request also fails, re-throw the error with cause
const error = new Error('Gemini batch embedding failed (both batch and individual attempts failed)')
;(error as any).cause = individualError
throw error
}
})
}
catch (error) {
throw new Error(`Gemini batch embedding failed: ${error instanceof Error ? error.message : 'Unknown error'}`)
}

return results
}
}

Expand Down Expand Up @@ -178,4 +299,37 @@ export class GeminiEmbedding extends Embedding {
const supportedDimensions = this.getSupportedDimensions()
return supportedDimensions.includes(dimension)
}

/**
* Get current retry configuration
* @returns Object containing maxRetries and baseDelay
*/
getRetryConfig(): { maxRetries: number, baseDelay: number } {
return {
maxRetries: this.maxRetries,
baseDelay: this.baseDelay,
}
}

/**
* Set maximum number of retry attempts
* @param maxRetries Maximum retry attempts
*/
setMaxRetries(maxRetries: number): void {
if (maxRetries < 0) {
throw new Error('maxRetries must be non-negative')
}
this.maxRetries = maxRetries
}
Comment on lines +318 to +323

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟑 Missing validation for zero retries edge case

While the validation checks for negative values, it allows maxRetries = 0 which disables retries entirely. This may be intentional, but it's not clear from the documentation or interface whether this is a supported use case.

Recommendation: Either explicitly document that 0 is allowed (to disable retries), or validate against it if it's not intended.

Suggested change
setMaxRetries(maxRetries: number): void {
if (maxRetries < 0) {
throw new Error('maxRetries must be non-negative')
}
this.maxRetries = maxRetries
}
setMaxRetries(maxRetries: number): void {
if (maxRetries < 0) {
throw new Error('maxRetries must be non-negative')
}
// Note: maxRetries = 0 is allowed to disable retry mechanism
this.maxRetries = maxRetries
}


/**
* Set base delay for exponential backoff
* @param baseDelay Base delay in milliseconds
*/
setBaseDelay(baseDelay: number): void {
if (baseDelay <= 0) {
throw new Error('baseDelay must be positive')
}
this.baseDelay = baseDelay
}
}
Loading
Loading