[Bug]: Manual scan shows no documents, while RAG indexing works (Paperless-NGX API OK, 16 docs indexed) #858

@funky51

๐Ÿ” Bug Summary

Manual scan shows no documents, while RAG indexing works (Paperless-NGX API OK, 16 docs indexed)

📖 Description

Paperless-AI can successfully connect to Paperless-NGX and the RAG engine works correctly (documents are indexed, RAG chat returns relevant results).

However:

  • the Manual tab shows an empty "Choose a document…" dropdown (no documents listed),

  • the automatic scanner never processes any documents,

  • logs repeatedly show:

    • Failed to get own user ID. Abort scanning.
    • [DEBUG] Invalid API response on page 1
    • [DEBUG] Finished fetching. Found 0 documents.

So RAG is fully functional, but the "classic" manual/auto scan pipeline sees 0 documents.
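
The "Invalid API response on page 1" message suggests the scanner rejected the shape of the response body, even though the HTTP status may have been 200. Below is a minimal Python sketch of such a shape check. It assumes the standard Paperless-NGX paginated list format (`count`, `next`, `previous`, `results`); the function name and the diagnosis labels are hypothetical and are not taken from the Paperless-AI code.

```python
import json
from typing import Any


def classify_list_response(body: str) -> str:
    """Return a short diagnosis for a raw document-list response body.

    Hypothetical helper: the labels are illustrative, not Paperless-AI's own.
    """
    try:
        data: Any = json.loads(body)
    except json.JSONDecodeError:
        # An HTTP 200 can still carry HTML, e.g. a login page or the web UI.
        return "not-json"
    if not isinstance(data, dict):
        return "not-an-object"
    if not isinstance(data.get("results"), list):
        # Typical for error payloads such as {"detail": "..."}.
        return "missing-results"
    return "ok"
```

Running this against the raw body returned to the scanner would show whether the 0-document result comes from a non-JSON body, an error payload, or something else.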

🔄 Steps to Reproduce

  1. Run Paperless-NGX behind http://PAPERLESS_HOST:8899 (Docker, Synology NAS).

  2. Deploy Paperless-AI via Docker/Portainer stack with:

    • PAPERLESS_API_URL=http://PAPERLESS_HOST:8899
    • PAPERLESS_API_TOKEN=<redacted> (from Paperless-NGX user profile)
    • PAPERLESS_USERNAME=<my-paperless-user>
    • AI_PROVIDER=ollama
    • OLLAMA_API_URL=http://OLLAMA_HOST:11441
    • OLLAMA_MODEL=llama3.1:latest
  3. Open Paperless-AI Web UI.

  4. Go to RAG Chat and click "Start indexing".

    • → RAG indexing completes successfully (N documents indexed).
    • → RAG chat can answer questions based on these documents.
  5. Go to the Manual tab.

    • The dropdown "Select document → Choose a document…" is always empty.
  6. Click "Scan now" in the Manual/Scan section.

    • No entries appear in History.
    • No Paperless-NGX documents get updated tags or metadata.
  7. Check the Docker logs.

    • You will see Failed to get own user ID. Abort scanning. and Invalid API response on page 1 coming from the scanner/Node part, while the RAG part reports a successful index with N documents.
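
To narrow this down outside Paperless-AI, the same token can be tested directly against the Paperless-NGX API. The sketch below builds token-authenticated requests from the values in step 2. It assumes `PAPERLESS_API_URL` is the instance root, as configured above, and appends `api/...` itself; if the setting is expected to already end in `/api`, the path prefix has to be dropped. The helper names are hypothetical, and whether the scanner queries `api/users/` at all is an assumption.

```python
import urllib.error
import urllib.request


def build_probe(base_url: str, token: str, path: str) -> urllib.request.Request:
    """Build a GET request using Paperless-NGX token authentication."""
    # Join base and path with exactly one slash between them.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Token {token}",
            "Accept": "application/json",
        },
    )


def probe(req: urllib.request.Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send the request and return (status, content type). Needs network access."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type", "")
    except urllib.error.HTTPError as err:
        # 401/403/404 responses are useful diagnostics here, not fatal errors.
        return err.code, err.headers.get("Content-Type", "")
```

Calling `probe(build_probe(url, token, "api/documents/"))` and then the same with `"api/users/"` shows whether both endpoints return JSON with status 200 for this token. A 403 on the second call, or an HTML content type on either, would point at permissions or at the base URL rather than at the documents themselves.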

✅ Expected Behavior

  • The Manual tab should list Paperless-NGX documents so that I can select one and run "Analyze with AI".
  • The automatic scanner should be able to fetch and process documents (according to the configured tag rules) and show entries in the History tab.
  • The scanner/Node side should parse the Paperless-NGX API response in the same way as the RAG backend:
    • If RAG can see N documents, the manual/auto scan should also see N documents, not 0.
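
As an illustration of the expected parsing, here is a minimal sketch of a paginated fetch that follows the `next` links of the Paperless-NGX list response until they run out. It is not the actual code of either the RAG backend or the scanner. The `fetch_page` callable is a hypothetical stand-in for whatever HTTP client is used, so the loop can be tested without a server.

```python
from typing import Any, Callable, Optional

Page = dict[str, Any]


def fetch_all_documents(
    first_url: str, fetch_page: Callable[[str], Page]
) -> list[dict[str, Any]]:
    """Collect `results` from every page, following `next` until it is empty."""
    documents: list[dict[str, Any]] = []
    seen: set[str] = set()
    url: Optional[str] = first_url
    while url and url not in seen:  # guard against a looping `next` link
        seen.add(url)
        page = fetch_page(url)
        results = page.get("results") if isinstance(page, dict) else None
        if not isinstance(results, list):
            # Fail loudly instead of silently reporting 0 documents.
            raise ValueError(f"unexpected response shape from {url}")
        documents.extend(results)
        url = page.get("next")
    return documents
```

Raising on an unexpected shape, rather than returning an empty list, would make a mismatch between the two code paths visible in the logs.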

โŒ Actual Behavior

  • RAG: works

    • RAG indexing completes successfully:
      • The logs show a full fetch from the Paperless-NGX API, with HTTP 200.
      • ChromaDB and BM25 indices are built with N documents.
    • RAG chat returns relevant answers based on those documents.
  • Manual / Auto scan: does not work

    • Manual tab:
      • "Select document → Choose a document…" is always empty (0 documents).
    • Automatic scan:
      • "Scan now" does not process any document.
      • The History tab remains empty.
      • No documents in Paperless-NGX receive "AI processed" tags or updated metadata.
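
The "Failed to get own user ID. Abort scanning." log line suggests that a user lookup fails before any document is fetched. A minimal sketch of such a lookup follows, assuming the scanner matches `PAPERLESS_USERNAME` against a Paperless-NGX user list response whose entries carry `id` and `username`. That the scanner works this way is an assumption, and the function name is hypothetical.

```python
from typing import Any, Optional


def find_user_id(users_page: dict[str, Any], username: str) -> Optional[int]:
    """Return the ID of the user with this exact username, or None."""
    # Error payloads such as {"detail": "..."} have no "results" key.
    for user in users_page.get("results", []):
        if user.get("username") == username:
            return user.get("id")
    return None
```

If the lookup returns `None` for the configured username, possible causes under this assumption include a username mismatch (the comparison is case-sensitive) or a token whose user may not be allowed to list users.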

๐Ÿท๏ธ Paperless-AI Version

3.0.9

📜 Docker Logs

_Excerpt (redacted and shortened):_


Fetching documents from Paperless-NGX API: http://PAPERLESS_HOST:8899
Response status code: 200

Processing documents: 16/16
Saved 16 documents to ./data/documents.json
ChromaDB collection contains 16 documents
BM25 index contains 16 documents
Search engine validation successful

Context request: Which documents are about credit?
Semantic search found 16 results
Hybrid search found 16 results
Reranked 16 results
Returning 16 search results
INFO:     127.0.0.1:xxxxx - "POST /ragz/chat HTTP/1.1" 200 OK

Failed to get own user ID. Abort scanning.
[DEBUG] Invalid API response on page 1
[DEBUG] Finished fetching. Found 0 documents.
[DEBUG] Invalid API response on page 1

📜 Paperless-ngx Logs

๐Ÿ–ผ๏ธ Screenshots of your settings page

No response

๐Ÿ–ฅ๏ธ Desktop Environment

Windows

💻 OS Version

Windows 11

๐ŸŒ Browser

Other

🔢 Browser Version

Opera

๐ŸŒ Mobile Browser

No response

๐Ÿ“ Additional Information

  • I have checked existing issues and this is not a duplicate
  • I have tried debugging this issue on my own
  • I can provide a fix and submit a PR
  • I am sure that this problem is affecting everyone, not only me
  • I have provided all required information above

📌 Extra Notes

No response

Labels: bug (Something isn't working)