[Bug]: #866

@fmotta

🔍 Bug Summary

RAG Chat returns invalid count (merely a symptom - but we start here)

📖 Description

Some of what I am experiencing may be user error.
I am waiting until all the files are loaded and processed before I check why Paperless reports 24K files with the 'Image-File' tag. I may need to force some action after all the files are loaded (many are still in the queue).

However, one thing stands out as confusing, and may be a bug.
RAG Chat gave me a count of 4 documents when the log showed 40.

See below

RAG Prompt:
Welcome to Paperless-AI RAG Chat! Ask a question about your documents.
How many documents have Image-File tag
RAG Response:
Based on the provided documents, there are 4 documents that have the "Image-File" tag.
Then it listed 5 sources...

Paperless-ai Logs:

paperless-ai | 2026-02-24 17:31:16,197 - RAGZ - INFO - Context request: How many documents have Image-File tag
paperless-ai | 2026-02-24 17:31:16,197 - RAGZ - INFO - Validating search engine state
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - ChromaDB collection contains 73987 documents
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - BM25 index contains 73988 documents
paperless-ai | 2026-02-24 17:31:16,205 - RAGZ - INFO - Search engine validation successful
paperless-ai | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing search for: 'How many documents have Image-File tag'
paperless-ai | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing hybrid search for query: 'How many documents have Image-File tag'
paperless-ai | 2026-02-24 17:31:16,418 - RAGZ - INFO - Keyword search found 40 results

Indexing had completed.
The system was rebooted, and I waited over an hour for paperless-ai and paperless-ngx to sync (after which 'ai' started processing more documents).
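
One possible explanation, offered as an assumption and not as a confirmed reading of the Paperless-AI code: the hybrid search may match 40 documents but hand only the top few to the language model as context, so the model can only count what it was shown. The sketch below simulates that with made-up data. `retrieve`, `count_tagged`, `top_k`, and the document dicts are hypothetical and are not taken from Paperless-AI.

```python
from typing import Dict, List


def retrieve(hits: List[Dict], top_k: int) -> List[Dict]:
    """Return the top_k highest-scoring hits, as a typical RAG retriever might."""
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
    return ranked[:top_k]


def count_tagged(docs: List[Dict], tag: str) -> int:
    """Count the documents that carry the given tag."""
    return sum(1 for d in docs if tag in d["tags"])


# 40 hypothetical keyword hits. All are tagged "Image-File" except one
# high-scoring document that (we assume) only mentions the tag name in its text.
hits = [{"id": i, "score": 1.0 / (i + 1), "tags": ["Image-File"]} for i in range(40)]
hits[2]["tags"] = []

context = retrieve(hits, top_k=5)                      # the "sources" shown in chat
matched_total = count_tagged(hits, "Image-File")       # what the index matched
visible_to_llm = count_tagged(context, "Image-File")   # what the model can count
```

If the pipeline behaves like this, a chat answer of 4 with 5 listed sources would be consistent with 40 keyword hits, and a count question cannot be answered reliably from retrieved context alone.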

🔄 Steps to Reproduce

  1. Load documents with a tag into 'ngx'
  2. Allow 'ai' to process them
  3. Allow 'ai' to index (I forced and waited overnight)
  4. Ask RAG Chat for a count of documents with that tag
  5. Wait for the response
  6. Check the log

✅ Expected Behavior

Expected the count given in the Chat response to match the count shown in the log.

❌ Actual Behavior

The Chat response reported a count of 4 and displayed 5 sources, while the log showed 40 results found.
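
For comparison, an authoritative count could be read from Paperless-ngx itself instead of from the chat. The sketch below assumes the Paperless-ngx REST API accepts a `tags__id__all` filter on `/api/documents/`, uses token authentication, and returns a paginated JSON body with a `count` field; please verify these against the API documentation for your version. The base URL, token, and tag ID are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_count_url(base_url: str, tag_id: int) -> str:
    """Build a documents query filtered by tag ID (assumed filter name)."""
    # page_size=1 keeps the response small; only the "count" field is needed.
    query = urllib.parse.urlencode({"tags__id__all": tag_id, "page_size": 1})
    return f"{base_url.rstrip('/')}/api/documents/?{query}"


def extract_count(payload: dict) -> int:
    """Read the total from a paginated response body."""
    return int(payload["count"])


def fetch_tag_count(base_url: str, token: str, tag_id: int) -> int:
    """Ask the server how many documents carry the tag (performs a network call)."""
    request = urllib.request.Request(
        build_count_url(base_url, tag_id),
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_count(json.load(response))
```

Calling `fetch_tag_count("http://localhost:8000", "<token>", <tag id>)` would then give a number to compare with both the chat answer and the log.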

🏷️ Paperless-AI Version

3.0.9

📜 Docker Logs

paperless-ai  | 2026-02-24 17:31:16,197 - RAGZ - INFO - Context request: How many documents have Image-File tag
paperless-ai  | 2026-02-24 17:31:16,197 - RAGZ - INFO - Validating search engine state
paperless-ai  | 2026-02-24 17:31:16,205 - RAGZ - INFO - ChromaDB collection contains 73987 documents
paperless-ai  | 2026-02-24 17:31:16,205 - RAGZ - INFO - BM25 index contains 73988 documents
paperless-ai  | 2026-02-24 17:31:16,205 - RAGZ - INFO - Search engine validation successful
paperless-ai  | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing search for: 'How many documents have Image-File tag'
paperless-ai  | 2026-02-24 17:31:16,206 - RAGZ - INFO - Performing hybrid search for query: 'How many documents have Image-File tag'
paperless-ai  | 2026-02-24 17:31:16,418 - RAGZ - INFO - Keyword search found **40** results
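
To compare these numbers without reading the log by eye, a small parser can pull them out. This is a minimal sketch that assumes the message wording stays exactly as in the lines above; `PATTERNS` and `parse_counts` are names introduced here for illustration.

```python
import re
from typing import Dict

# Three of the log lines quoted above, copied verbatim.
LOG = """\
paperless-ai  | 2026-02-24 17:31:16,205 - RAGZ - INFO - ChromaDB collection contains 73987 documents
paperless-ai  | 2026-02-24 17:31:16,205 - RAGZ - INFO - BM25 index contains 73988 documents
paperless-ai  | 2026-02-24 17:31:16,418 - RAGZ - INFO - Keyword search found **40** results
"""

PATTERNS = {
    "chroma_docs": re.compile(r"ChromaDB collection contains (\d+) documents"),
    "bm25_docs": re.compile(r"BM25 index contains (\d+) documents"),
    # \** tolerates the bold markers that were added for emphasis.
    "keyword_hits": re.compile(r"Keyword search found \**(\d+)\** results"),
}


def parse_counts(log_text: str) -> Dict[str, int]:
    """Extract the first occurrence of each known counter from the log text."""
    counts = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(log_text)
        if match:
            counts[name] = int(match.group(1))
    return counts


counts = parse_counts(LOG)
index_gap = counts["bm25_docs"] - counts["chroma_docs"]
```

Besides the 40 keyword hits, this also surfaces that the BM25 index holds one more document than the ChromaDB collection (73988 vs 73987), which may or may not be related.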

📜 Paperless-ngx Logs

No ngx log entries seem relevant to the prompt (tried twice)

🖼️ Screenshots of your settings page

No response

🖥️ Desktop Environment

Linux

💻 OS Version

'ai' and 'ngx' are on 6.18.9+deb13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.18.9-1~bpo13+1 (2026-02-13) x86_64 GNU/Linux (OMV)

🌐 Browser

None

🔢 Browser Version

Windows Chrome browser Version 135.0.7049.96 (Official Build) (64-bit)

🌐 Mobile Browser

No response

📝 Additional Information

  • I have checked existing issues and this is not a duplicate
  • I have tried debugging this issue on my own
  • I can provide a fix and submit a PR
  • I am sure that this problem is affecting everyone, not only me
  • I have provided all required information above

📌 Extra Notes

Docker version 29.2.1, build a5c7197

Labels

bug (Something isn't working)