-
Notifications
You must be signed in to change notification settings - Fork 2.6k
fix: improve Ollama embeddings error handling and add retry logic #6528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add detailed error logging to capture actual Ollama API responses - Implement retry logic with exponential backoff for transient failures - Add better error detection for model not found errors - Improve error messages to be more descriptive - Add validation for embeddings response format Fixes #6526
|
|
||
| // Don't retry for certain errors | ||
| if ( | ||
| error.message?.includes(t("embeddings:ollama.modelNotFound", { modelId: "" }).split(":")[0]) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the translated error message (via t()) and splitting it with split(":") to check for a model-not-found condition is brittle. Consider using custom error types or dedicated error codes for more robust error detection.
|
Closing, issue is not scoped |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Add timeout to prevent indefinite hanging | ||
| const controller = new AbortController() | ||
| const timeoutId = setTimeout(() => controller.abort(), OLLAMA_EMBEDDING_TIMEOUT_MS) | ||
| for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR adds significant retry logic and error handling improvements but doesn't include any unit tests. Could we add tests to verify:
- Retry behavior with exponential backoff
- Proper handling of different error types
- Maximum retry limit enforcement
- Error message validation
This is especially important for ensuring the retry logic works as expected.
| // Check if we should retry | ||
| if (attempt < MAX_RETRIES) { | ||
| // Check if it's a transient error that we should retry | ||
| if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retry logic currently retries for many error types. Is it intentional to retry for all of these?
ENOTFOUNDmight indicate a configuration issue (wrong host) rather than a transient failureAbortErrorhappens on timeout - retrying might just hit the same timeout again
Consider limiting retries to truly transient errors like ECONNRESET and ETIMEDOUT?
| const OLLAMA_VALIDATION_TIMEOUT_MS = 30000 // 30 seconds for validation requests | ||
|
|
||
| // Retry configuration | ||
| const MAX_RETRIES = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making these retry parameters configurable through the ApiHandlerOptions:
| const MAX_RETRIES = 3 | |
| // Retry configuration | |
| const MAX_RETRIES = options.ollamaMaxRetries ?? 3 | |
| const INITIAL_RETRY_DELAY_MS = options.ollamaInitialRetryDelay ?? 1000 // 1 second | |
| const MAX_RETRY_DELAY_MS = options.ollamaMaxRetryDelay ?? 10000 // 10 seconds |
This would allow users to adjust retry behavior based on their specific setup.
|
|
||
| // Don't retry for certain errors | ||
| if ( | ||
| error.message?.includes(t("embeddings:ollama.modelNotFound", { modelId: "" }).split(":")[0]) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using string matching on localized error messages could be fragile. If the translation changes, this check might break. Consider using a more robust approach, perhaps by checking the error type or adding a specific error code property?
| // Handle specific error types with better messages | ||
| if (lastError.name === "AbortError") { | ||
| throw new Error(t("embeddings:validation.connectionFailed")) | ||
| } else if (lastError.message?.includes("fetch failed") || (lastError as any).code === "ECONNREFUSED") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's inconsistency in error checking - sometimes using error.code (line 174) and sometimes error.message?.includes() (line 208). Consider standardizing to one approach for better maintainability?
| error.code === "ETIMEDOUT" || | ||
| error.code === "ECONNRESET" | ||
| ) { | ||
| console.log(`Ollama embedding attempt ${attempt} failed, retrying in ${retryDelay}ms...`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding telemetry for retry attempts to help monitor retry patterns:
| console.log(`Ollama embedding attempt ${attempt} failed, retrying in ${retryDelay}ms...`) | |
| console.log(`Ollama embedding attempt ${attempt} failed, retrying in ${retryDelay}ms...`) | |
| TelemetryService.instance.captureEvent(TelemetryEventName.CODE_INDEX_RETRY, { | |
| attempt, | |
| retryDelay, | |
| errorCode: error.code, | |
| location: "OllamaEmbedder:createEmbeddings" | |
| }) |
Summary
This PR improves the error handling for Ollama embeddings to address the issue where embedding generation fails and stops the indexing process.
Changes
Testing
Related Issue
Fixes #6526
Notes for Reviewers
The main improvements are:
This should help users who are experiencing embedding generation failures with local Ollama setups.
Important
Improves error handling and adds retry logic with exponential backoff for Ollama embeddings in
ollama.ts.createEmbeddings()to capture actual response from Ollama API.createEmbeddings()for transient failures (up to 3 attempts).createEmbeddings()to ensure response is an array of arrays.createEmbeddings()andvalidateConfiguration().This description was created by
for b9c9d21. You can customize this summary. It will automatically update as commits are pushed.