[BUG] OOM kills process: RAG with xlsx #936

@Balots

🐛 Bug Description

When indexing xlsx documents, the RAG server's memory usage grows without bound until the process crashes.

🔄 Steps to Reproduce

  1. Go to 'Create Index'
  2. Click on 'Upload Files'
  3. Select any Excel (.xlsx) file
  4. The RAG server crashes after some time

✅ Expected Behavior

The server should handle the error gracefully. With GPU inference it already does, and indexing works as expected.

❌ Actual Behavior

With CPU inference, however, memory usage grows uncontrolled until the kernel's OOM killer terminates the process.
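One way to turn a silent OOM kill into an error the server can handle is to cap the process's address space, so that an oversized allocation fails inside Python instead of reaching the kernel's OOM killer. The sketch below is only an illustration under assumptions: `cap_address_space`, `index_with_guard`, and `IndexingMemoryError` are hypothetical names, not part of localGPT, and `index_fn` stands in for whatever function actually indexes a file. Native libraries may abort rather than raise `MemoryError`, so this is a mitigation to try, not a confirmed fix.

```python
import resource  # Unix-only standard library module


class IndexingMemoryError(RuntimeError):
    """Hypothetical error type: indexing hit the configured memory cap."""


def cap_address_space(max_bytes: int) -> None:
    """Lower the soft RLIMIT_AS so huge allocations fail with MemoryError."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # An unprivileged process cannot set the soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))


def index_with_guard(index_fn, path: str):
    """Run index_fn(path) and convert MemoryError into a reportable error."""
    try:
        return index_fn(path)
    except MemoryError:
        raise IndexingMemoryError(
            f"Indexing {path!r} exceeded the memory limit"
        ) from None
```

A server could call `cap_address_space(8 * 1024**3)` once at startup (the 8 GiB figure is arbitrary) and wrap each indexing job in `index_with_guard`, so the failure shows up in the logs and in the API response.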

📸 Screenshots

[Image: screenshot, not reproduced here]

🖥️ Environment Information

Desktop/Server:

  • OS: Ubuntu 24.04
  • Python Version: 3.12
  • Node.js Version: 24.11
  • Ollama Version: 0.15.6

📋 System Health Check

~/localGPT.bak (localgpt-v2) $ python3 system_health_check.py
 RAG System Health Check
==================================================
 Testing basic imports...
✅ Basic imports successful
 Checking configurations...
 External Models: {'embedding_model': 'Qwen/Qwen3-Embedding-0.6B', 'reranker_model': 'answerdotai/answerai-colbert-small-v1', 'vision_model': 'Qwen/Qwen-VL-Chat', 'fallback_reranker': 'BAAI/bge-reranker-base'}
 Ollama Config: {'host': 'http://localhost:11434', 'generation_model': 'qwen3:8b', 'enrichment_model': 'qwen3:0.6b'}
 Pipeline Configs: {'default': {'description': 'Production-ready pipeline with hybrid search, AI reranking, and verification', 'storage': {'lancedb_uri': './lancedb', 'text_table_name': 'text_pages_v3', 'image_table_name': 'image_pages_v3', 'bm25_path': './index_store/bm25', 'graph_path': './index_store/graph/knowledge_graph.gml'}, 'retrieval': {'retriever': 'multivector', 'search_type': 'hybrid', 'late_chunking': {'enabled': True, 'table_suffix': '_lc_v3'}, 'dense': {'enabled': True, 'weight': 0.7}, 'bm25': {'enabled': True, 'index_name': 'rag_bm25_index'}, 'graph': {'enabled': False, 'graph_path': './index_store/graph/knowledge_graph.gml'}}, 'embedding_model_name': 'Qwen/Qwen3-Embedding-0.6B', 'vision_model_name': 'Qwen/Qwen-VL-Chat', 'reranker': {'enabled': True, 'type': 'ai', 'strategy': 'rerankers-lib', 'model_name': 'answerdotai/answerai-colbert-small-v1', 'top_k': 10}, 'query_decomposition': {'enabled': True, 'max_sub_queries': 3, 'compose_from_sub_answers': True}, 'verification': {'enabled': True}, 'retrieval_k': 20, 'context_window_size': 0, 'semantic_cache_threshold': 0.98, 'cache_scope': 'global', 'contextual_enricher': {'enabled': True, 'window_size': 1}, 'indexing': {'embedding_batch_size': 50, 'enrichment_batch_size': 10, 'enable_progress_tracking': True}}, 'fast': {'description': 'Speed-optimized pipeline with minimal overhead', 'storage': {'lancedb_uri': './lancedb', 'text_table_name': 'text_pages_v3', 'image_table_name': 'image_pages_v3', 'bm25_path': './index_store/bm25'}, 'retrieval': {'retriever': 'multivector', 'search_type': 'vector_only', 'late_chunking': {'enabled': False}, 'dense': {'enabled': True}}, 'embedding_model_name': 'Qwen/Qwen3-Embedding-0.6B', 'reranker': {'enabled': False}, 'query_decomposition': {'enabled': False}, 'verification': {'enabled': False}, 'retrieval_k': 10, 'context_window_size': 0, 'contextual_enricher': {'enabled': False, 'window_size': 1}, 'indexing': {'embedding_batch_size': 100, 'enrichment_batch_size': 50, 
'enable_progress_tracking': False}}, 'bm25': {'enabled': True, 'index_name': 'rag_bm25_index'}, 'graph_rag': {'enabled': False}}
 Embedding model: Qwen/Qwen3-Embedding-0.6B (1024 dims) - Check data compatibility!
✅ Configuration check completed
 Testing database access...
/home/user/localGPT.bak/system_health_check.py:99: DeprecationWarning: table_names() is deprecated, use list_tables() instead
  tables = db.table_names()
✅ LanceDB connected - 6 tables available
 Available tables:
   - text_pages_1da5082e-7f64-4814-8c00-06e0606e6002
   - text_pages_1da5082e-7f64-4814-8c00-06e0606e6002_lc
   - text_pages_ce810632-44ff-4e68-b503-05a32d309d2f
   - text_pages_ce810632-44ff-4e68-b503-05a32d309d2f_lc
   - text_pages_e2041646-e2e8-4125-bce6-65c2fb402336
   ... and 1 more
 Testing agent initialization...
Initialized Verifier with Ollama model 'qwen3:8b'.
Agent initialized (GraphRAG disabled).
✅ Agent initialization successful
 Testing embedding model...
Initializing HF Embedder with model 'Qwen/Qwen3-Embedding-0.6B' on device 'cpu'. (first load)
QwenEmbedder weights loaded and cached for Qwen/Qwen3-Embedding-0.6B.
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
✅ Embedding model: Qwen/Qwen3-Embedding-0.6B
✅ Vector dimension: 1024
 Using 1024-dim embeddings (Qwen3 compatible) - Ensure data compatibility!
 Testing sample query...
/home/user/localGPT.bak/system_health_check.py:122: DeprecationWarning: table_names() is deprecated, use list_tables() instead
  tables = db.table_names()
 Testing query on table: text_pages_1da5082e-7f64-4814-8c00-06e0606e6002
 ROUTING DEBUG: Starting triage for query: 'what is this document about?...'
 ROUTING DEBUG: Attempting overview-based routing...
 ROUTING DEBUG: No document overviews available, returning None
❌ ROUTING DEBUG: Overview routing returned None, falling back to LLM triage
 ROUTING DEBUG: No history, using LLM fallback triage...
 ROUTING DEBUG: LLM fallback triage decided: 'rag_query'
 ROUTING DEBUG: Final triage decision: 'rag_query'
Agent Triage Decision: 'rag_query'
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
✅ ROUTING DEBUG: Executing RAG_QUERY path (query_type='rag_query')

--- Query Decomposition Enabled ---
Query Decomposition Reasoning: Single information need; no context to resolve.
Original query: 'what is this document about?' (Contextual: 'what is this document about?')
Decomposed into 1 sub-queries: ['what is this document about?']
--- Only one sub-query after decomposition; using direct retrieval path ---
LanceDB connection established at: ./lancedb

--- Performing Retrieval for query: 'what is this document about?' on table 'text_pages_1da5082e-7f64-4814-8c00-06e0606e6002' ---
Generating 1 embeddings with Qwen/Qwen3-Embedding-0.6B model...
2026-02-19 03:54:27,346 | INFO     | rag-system | Top 19 results:
2026-02-19 03:54:27,346 | INFO     | rag-system | chunk_id       score   preview
2026-02-19 03:54:27,346 | INFO     | rag-system | ------------------------------
2026-02-19 03:54:27,347 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The image shows an OCR-processor workflow,…
2026-02-19 03:54:27,347 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The context summary is: The image shows a bat…
2026-02-19 03:54:27,347 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The specific chunk discusses PyTorch documentation…
2026-02-19 03:54:27,347 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The OCR module's result shows successful document…
2026-02-19 03:54:27,347 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The image shows a document being processed using…
2026-02-19 03:54:27,347 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The local context highlights the practice section…
2026-02-19 03:54:27,347 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The context discusses how Tesseract, a model for…
2026-02-19 03:54:27,348 | INFO     | rag-system | 6fbdff51-813 0.000   Context: The specific chunk highlights how the author…
2026-02-19 03:54:27,348 | INFO     | rag-system | 6fbdff51-813 12881.766 Context: The image depicts a CI/CD workflow, highlighting…
2026-02-19 03:54:27,348 | INFO     | rag-system | 6fbdff51-813 13214.093 Context: Рекомендуется переобучить модель для адаптации к…
2026-02-19 03:54:27,348 | INFO     | rag-system | 6fbdff51-813 14085.947 Context: The context summarizes Jenkins CI/CD code…
2026-02-19 03:54:27,348 | INFO     | rag-system | 6fbdff51-813 14226.038 Context: This chunk describes how JSON data is processed…
2026-02-19 03:54:27,348 | INFO     | rag-system | 6fbdff51-813 15197.564 Context: The image shows example code for inference model…
2026-02-19 03:54:27,349 | INFO     | rag-system | 6fbdff51-813 15289.271 Context: Создание модели для корректировки внутренних…
2026-02-19 03:54:27,349 | INFO     | rag-system | 6fbdff51-813 16512.316 Context: The "Организационный" section outlines the…
2026-02-19 03:54:27,349 | INFO     | rag-system | 6fbdff51-813 19344.285 Context: This chunk summarizes the CI/CD workflow using…
2026-02-19 03:54:27,349 | INFO     | rag-system | 6fbdff51-813 24795.383 Context: The chunk discusses technologies like Jupyter Hub…
2026-02-19 03:54:27,350 | INFO     | rag-system | 6fbdff51-813 26259.051 Context: Разработка OCR-системы с использованием моделей…
2026-02-19 03:54:27,350 | INFO     | rag-system | 6fbdff51-813 28496.629 Context: The chunk discusses how WMIC and PsExec were used…
Retrieved 19 documents.
 Initialising Answer.AI ColBERT reranker (answerdotai/answerai-colbert-small-v1) via rerankers lib…
Loading ColBERTRanker model answerdotai/answerai-colbert-small-v1 (this message can be suppressed by setting verbose=0)
No device set
Using device cuda
No dtype set
Using dtype torch.float32
Loading model answerdotai/answerai-colbert-small-v1, this might take a while...
Linear Dim set to: 96 for downcasting
✅ AI reranker initialized successfully.

--- Reranking top 19 docs with AI model... ---
✅ Reranking completed in 0.23s. Refined to 10 docs.

==========================RAG Context================================

2026-02-19 03:54:35,112 | INFO     | httpx | HTTP Request: POST http://localhost:11434/api/generate "HTTP/1.1 200 OK"
 Total query processing time: 15.26s
✅ Sample query successful
 Answer preview: Answer:
The document describes a university project focused on **developing an OCR (Optical Characte...
 Found 10 source documents

==================================================
 Health Check Complete: 6/6 checks passed
✅ System is healthy! 

📝 Error Logs

No relevant logs, because the process was killed by the system.
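Even when the application writes nothing, the kernel normally records an OOM kill in its own log, which can be read with `dmesg` or `journalctl -k` (possibly with `sudo`). As a rough sketch, the helper below scans such log text for the kernel's "Out of memory: Killed process" message. `find_oom_kills` is a hypothetical helper name, and the sample line is illustrative only; it was not taken from the reporter's machine.

```python
import re

# Matches the kernel's OOM-killer message and captures the PID and process name.
OOM_LINE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_kills(kernel_log: str) -> list[tuple[int, str]]:
    """Return (pid, process_name) for every OOM kill found in the log text."""
    return [(int(m["pid"]), m["name"]) for m in OOM_LINE.finditer(kernel_log)]


if __name__ == "__main__":
    # Illustrative sample line, not real output from this issue.
    sample = "[12345.678901] Out of memory: Killed process 4242 (python3) total-vm:1kB"
    print(find_oom_kills(sample))
```

Attaching the matching kernel log lines to the issue would confirm that the crash is an OOM kill and show how much memory the process held at the time.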

🔧 Configuration

  • Deployment method: [Docker / Direct Python]
  • Models used: qwen3:0.6b
  • Document types: [Excel]

🤔 Possible Solution

Raise an exception if the user tries to upload an Excel document.
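A minimal sketch of that idea, assuming uploads can be checked by filename before indexing starts. The names `reject_excel`, `UnsupportedDocumentError`, and `EXCEL_SUFFIXES` are hypothetical and do not come from the localGPT codebase; where the check would be wired in is also an assumption.

```python
from pathlib import Path

# Assumed set of extensions to block; adjust to whatever the indexer cannot handle.
EXCEL_SUFFIXES = {".xlsx", ".xlsm", ".xls"}


class UnsupportedDocumentError(ValueError):
    """Hypothetical error raised for document types the indexer rejects."""


def reject_excel(filename: str) -> str:
    """Return filename unchanged, or raise if it looks like an Excel file."""
    suffix = Path(filename).suffix.lower()  # case-insensitive extension check
    if suffix in EXCEL_SUFFIXES:
        raise UnsupportedDocumentError(
            f"Excel files are not supported for indexing: {filename!r}"
        )
    return filename
```

Checking the extension is cheap and fails fast, but it does not catch renamed files. It only avoids the crash until Excel parsing itself is made memory-safe.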
