fix: prevent Ollama indexing from freezing at 69% #6850
Conversation
- Add dynamic timeout for batch embedding operations based on batch size
- Improve error handling and recovery in batch processing
- Ensure errors are properly thrown to prevent silent failures
- Add immediate error state updates when batch processing fails
- Add better logging for debugging timeout and retry scenarios

Fixes #6849
```ts
texts.length > 1
	? t("embeddings:ollama.batchTimeoutError", {
			count: texts.length,
			timeout: Math.round(Math.max(OLLAMA_EMBEDDING_TIMEOUT_MS, texts.length * 2000) / 1000),
```
Suggestion: Instead of recalculating the batch timeout for the error message (using Math.max again), reuse the 'batchTimeout' variable calculated earlier to ensure consistency.
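A sketch of the suggested reuse, assuming `batchTimeout` is still in scope at the error site (as it is in the hunk further down):

```ts
t("embeddings:ollama.batchTimeoutError", {
	count: texts.length,
	// Reuse the batchTimeout computed earlier instead of repeating Math.max(...)
	timeout: Math.round(batchTimeout / 1000),
})
```

This keeps the reported timeout in sync if the scaling formula ever changes.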
Reviewed my own code and found it acceptable, but only because the alternative was infinite recursion.
```ts
// Add timeout to prevent indefinite hanging
// Use a longer timeout for batch operations as they can take more time
const batchTimeout = Math.max(OLLAMA_EMBEDDING_TIMEOUT_MS, texts.length * 2000) // At least 2 seconds per text
```
Is the 2-second-per-text multiplier intentional? This seems like a reasonable heuristic, but could we make this configurable or at least document why 2 seconds was chosen? Some models or systems might need different timeout scaling.
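One hedged way to make the scaling overridable; the constant and environment variable names here are hypothetical, not part of the PR:

```ts
// Hypothetical override knob; the PR hard-codes 2000ms per text.
const PER_TEXT_TIMEOUT_MS = Number(process.env.OLLAMA_PER_TEXT_TIMEOUT_MS ?? 2000)

// Same shape as the PR's calculation, with the multiplier made configurable.
const batchTimeout = Math.max(OLLAMA_EMBEDDING_TIMEOUT_MS, texts.length * PER_TEXT_TIMEOUT_MS)
```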
| "modelNotEmbeddingCapable": "Ollama model is not embedding capable: {{modelId}}", | ||
| "hostNotFound": "Ollama host not found: {{baseUrl}}" | ||
| "hostNotFound": "Ollama host not found: {{baseUrl}}", | ||
| "batchTimeoutError": "Ollama embedding timed out after {{timeout}} seconds while processing {{count}} texts. Consider reducing batch size or increasing timeout." |
The new 'batchTimeoutError' translation key is only added to the English locale. Should we add placeholder translations to all other locale files (fr, de, es, ca, hi, id, it, ja, ko, nl, pl, pt-BR, ru, tr, vi) to prevent missing translation errors?
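For illustration, each locale file would gain the same key, with the English string as a temporary placeholder until translated:

```json
"batchTimeoutError": "Ollama embedding timed out after {{timeout}} seconds while processing {{count}} texts. Consider reducing batch size or increasing timeout."
```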
```ts
if (embeddingError.message?.includes("timed out") || embeddingError.message?.includes("timeout")) {
	throw new Error(
		`Embedding timeout for batch of ${batchTexts.length} texts. This may indicate the Ollama service is overloaded or the model is too slow. ${embeddingError.message}`,
		{ cause: embeddingError },
```
When wrapping errors with the 'cause' option, could we lose stack traces in some environments? Consider preserving the original error more explicitly, perhaps by including the original stack in the message or using a custom error class that better preserves debugging information.
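A sketch of one such custom error class (the class name is hypothetical); it keeps the ES2022 `cause` option but also appends the original stack so it survives plain `error.stack` logging:

```ts
class EmbeddingTimeoutError extends Error {
	constructor(message: string, cause: Error) {
		super(message, { cause })
		this.name = "EmbeddingTimeoutError"
		// Append the original stack so it is visible even when the
		// environment or logger does not surface Error.cause.
		if (cause.stack) {
			this.stack = `${this.stack}\nCaused by: ${cause.stack}`
		}
	}
}
```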
```ts
texts.length > 1
	? t("embeddings:ollama.batchTimeoutError", {
			count: texts.length,
			timeout: Math.round(Math.max(OLLAMA_EMBEDDING_TIMEOUT_MS, texts.length * 2000) / 1000),
```
The timeout calculation logic appears both here (line 135) and earlier (line 73). Could we extract this to a helper function like 'calculateBatchTimeout(textCount: number)' for better maintainability?
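A possible shape for that helper, using only values already present in the diff:

```ts
// Single source of truth for the batch timeout, shared by the request
// path and the error-message path.
function calculateBatchTimeout(textCount: number): number {
	return Math.max(OLLAMA_EMBEDDING_TIMEOUT_MS, textCount * 2000) // at least 2s per text
}
```

Both call sites would then become `calculateBatchTimeout(texts.length)`.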
```ts
if (attempts < MAX_BATCH_RETRIES) {
	const delay = INITIAL_RETRY_DELAY_MS * Math.pow(2, attempts - 1)
	console.log(`[DirectoryScanner] Retrying batch processing in ${delay}ms...`)
```
For consistency with other error messages in this file, should this retry log message include the workspace context? Something like: `[DirectoryScanner] Retrying batch processing in ${delay}ms for workspace ...`
No repro steps or scope
Summary
This PR fixes an issue where codebase indexing would freeze at 69% when using Ollama with the mxbai-embed-large model.
Problem
The indexing process was hanging indefinitely due to:
- Embedding requests to Ollama having no timeout that scaled with batch size, so a single slow or stalled batch could block the scan forever
- Errors in batch processing being swallowed instead of thrown, so failures never surfaced and progress stalled silently
Solution
Apply a dynamic timeout to batch embedding operations (at least 2 seconds per text, with the global Ollama timeout as a floor) and propagate batch errors immediately so the orchestrator can report failure instead of hanging.
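As a minimal sketch of that pattern, with illustrative names (`embedWithTimeout`, `embedBatch`) rather than the PR's actual API; only `OLLAMA_EMBEDDING_TIMEOUT_MS` and the 2-seconds-per-text scaling come from the diff:

```ts
const OLLAMA_EMBEDDING_TIMEOUT_MS = 60_000 // assumed default; the real constant lives in the embedder

async function embedWithTimeout(
	texts: string[],
	embedBatch: (texts: string[]) => Promise<number[][]>,
): Promise<number[][]> {
	// Scale the timeout with batch size: at least 2 seconds per text.
	const batchTimeout = Math.max(OLLAMA_EMBEDDING_TIMEOUT_MS, texts.length * 2000)

	let timer: ReturnType<typeof setTimeout> | undefined
	const timeout = new Promise<never>((_, reject) => {
		timer = setTimeout(
			() => reject(new Error(`Embedding timed out after ${batchTimeout}ms for ${texts.length} texts`)),
			batchTimeout,
		)
	})

	try {
		// Whichever settles first wins, so a stalled request can no longer hang the scan.
		return await Promise.race([embedBatch(texts), timeout])
	} finally {
		clearTimeout(timer)
	}
}
```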
Changes
- Ollama Embedder (`src/services/code-index/embedders/ollama.ts`): adds a dynamic, batch-size-scaled timeout to embedding requests and a descriptive timeout error.
- Scanner (`src/services/code-index/processors/scanner.ts`): improves error handling and recovery in batch processing, including retry logging.
- Orchestrator (`src/services/code-index/orchestrator.ts`): updates the error state immediately when batch processing fails.
- Localization (`src/i18n/locales/en/embeddings.json`): adds the `batchTimeoutError` message.

Testing
Fixes #6849
Important
Fixes indexing freeze at 69% by adding dynamic timeout handling and improving error propagation in Ollama embedder.
- Adds dynamic timeout handling in `ollama.ts`.
- Improves error handling and propagation in `scanner.ts` and `orchestrator.ts`.
- Adds `batchTimeoutError` message to `embeddings.json`.
- Adds retry logging in `scanner.ts` and `orchestrator.ts`.

This description was created by
for 2eb26fc.