Description
Bug Summary
RAG Chat returns invalid count (merely a symptom - but we start here)
Description
Some of what I am experiencing may be user error.
I am waiting until all the files are loaded and processed before I investigate why paperless says 24K files have the 'Image-File' tag. I may need to force some action once all the files are loaded (many are still in the queue).
However, one thing stands out as confusing, and may be a bug.
RAG Chat gave me a count of 4 documents when the log showed 40.
See below
RAG Prompt:
Welcome to Paperless-AI RAG Chat! Ask a question about your documents.
How many documents have Image-File tag
RAG Response:
Based on the provided documents, there are 4 documents that have the "Image-File" tag.
Then it listed 5 sources...
Paperless-ai Logs:
paperless-ai | 2026-02-24 17:31:16,197 - RAGZ - INFO - Context request: How many documents have Image-File tag
paperless-ai | 2026-02-24 17:31:16,197 - RAGZ - INFO - Validating search engine state
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - ChromaDB collection contains 73987 documents
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - BM25 index contains 73988 documents
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - Search engine validation successful
paperless-ai | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing search for: 'How many documents have Image-File tag'
paperless-ai | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing hybrid search for query: 'How many documents have Image-File tag'
paperless-ai | 2026-02-24 17:31:16,418 - RAGZ - INFO - Keyword search found 40 results
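To make the numbers in these log lines easier to compare, here is a minimal sketch that pulls the counts out of the log text. The function name `extract_counts` and the dictionary keys are hypothetical names chosen for illustration; only the log messages themselves come from the output above.

```python
import re

# Three of the log lines quoted above, copied verbatim.
LOG_LINES = [
    "paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - ChromaDB collection contains 73987 documents",
    "paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - BM25 index contains 73988 documents",
    "paperless-ai | 2026-02-24 17:31:16,418 - RAGZ - INFO - Keyword search found 40 results",
]

# Hypothetical key names; the patterns match the message text seen in the log.
PATTERNS = {
    "chroma_docs": re.compile(r"ChromaDB collection contains (\d+) documents"),
    "bm25_docs": re.compile(r"BM25 index contains (\d+) documents"),
    "keyword_hits": re.compile(r"Keyword search found (\d+) results"),
}


def extract_counts(lines):
    """Return the numeric counts found in the given log lines."""
    counts = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] = int(match.group(1))
    return counts


counts = extract_counts(LOG_LINES)
print(counts)
print("index size difference:", counts["bm25_docs"] - counts["chroma_docs"])
```

Besides the 40 keyword hits, this also shows that the two indexes differ by one document (73987 vs 73988), which may or may not be related.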
Indexing had completed.
The system was rebooted, and I waited over an hour for paperless-ai and paperless-ngx to sync (and then 'ai' started processing more documents).
Steps to Reproduce
- Load documents with a tag into 'ngx'
- Allow 'ai' to process them
- Allow 'ai' to index (I forced and waited overnight)
- Perform RAG Chat with count by the tag used
- Wait for response
- Look into log
Expected Behavior
The count given in the Chat response should match the count shown in the log.
Actual Behavior
The Chat response reported a count of 4 and displayed 5 sources, while the log showed 40 results found.
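One possible explanation, not confirmed against the Paperless-AI source, is that only the top few search hits are passed to the language model as context, so the model can only count what it was shown. The sketch below illustrates that idea only; `build_context`, the `top_k` parameter, and the `doc-N` identifiers are hypothetical, with 40 and 5 taken from the log and the chat response.

```python
def build_context(ranked_doc_ids, top_k):
    """Keep only the top_k highest-ranked hits as context for the model."""
    return ranked_doc_ids[:top_k]


# 40 keyword hits, as reported in the log (identifiers are made up).
keyword_hits = [f"doc-{i}" for i in range(1, 41)]

# Assumption: 5 sources reach the model, matching the 5 sources listed in the chat.
context = build_context(keyword_hits, top_k=5)

print("hits found by search:", len(keyword_hits))
print("documents visible to the model:", len(context))
```

If the pipeline works this way, any count the model reports is bounded by the context size rather than by the number of matching documents, which would account for an answer far below 40.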
Paperless-AI Version
3.0.9
Docker Logs
paperless-ai | 2026-02-24 17:31:16,197 - RAGZ - INFO - Context request: How many documents have Image-File tag
paperless-ai | 2026-02-24 17:31:16,197 - RAGZ - INFO - Validating search engine state
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - ChromaDB collection contains 73987 documents
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - BM25 index contains 73988 documents
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - Search engine validation successful
paperless-ai | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing search for: 'How many documents have Image-File tag'
paperless-ai | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing hybrid search for query: 'How many documents have Image-File tag'
paperless-ai | 2026-02-24 17:31:16,418 - RAGZ - INFO - Keyword search found **40** results
Paperless-ngx Logs
No ngx log entries seem relevant to the prompt (tried twice)
Screenshots of your settings page
No response
Desktop Environment
Linux
OS Version
'ai' and 'ngx' are both on 6.18.9+deb13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.18.9-1~bpo13+1 (2026-02-13) x86_64 GNU/Linux (OMV)
Browser
None
Browser Version
Windows Chrome browser Version 135.0.7049.96 (Official Build) (64-bit)
Mobile Browser
No response
Additional Information
- I have checked existing issues and this is not a duplicate
- I have tried debugging this issue on my own
- I can provide a fix and submit a PR
- I am sure that this problem is affecting everyone, not only me
- I have provided all required information above
Extra Notes
Docker version 29.2.1, build a5c7197