File upload and RAG Changes and Improvements #3367
Replies: 2 comments
-
+1, I would definitely like this control: to choose whether my text, docx, and PDF files are included directly in the context (OCR'd first where needed) or just sent to RAG. I would also like to see this on every message going forward (e.g. whether we're still pulling from RAG for this thread), with the ability to toggle it. Control, options, visibility.
-
yo @NDolensek — you’re spot on. what you’re describing is basically a form of “involuntary vector ingestion” — you drop a file in, expecting smart context… but instead get meaningless chunk injection, or worse, broken reasoning. i ran into the same issue and ended up building an open-source patch that:
MIT license, and stable in real workloads. cheers and big +1 for this kind of granular control request.
-
Currently, file upload behaves very differently from what users have learned to expect from the web chat interfaces. LibreChat enforces local RAG even for text files that would easily fit in the chat context, so model behavior after a file upload is often sub-par and unexpected. This leads to many user complaints and forces users either to go back to 2022 ChatGPT-era copying and pasting of text into the chat, or to abandon the LibreChat interface altogether and settle for the standard web interfaces.
I understand that e.g. the OpenAI API doesn't make it easy to support file uploads, but I suggest putting some effort into aligning LibreChat's behavior with the web chat interfaces it is (at least stylistically) replicating. At a minimum, I suggest making local RAG toggleable. I believe a text extractor could alleviate many of these issues in a relatively simple manner:
For a text file attachment that is significantly shorter than the model's context length, the entire text should be placed in the chat context. RAG is completely unnecessary here and will only significantly decrease response quality. For a PDF file, a standard extractor could be used to extract the text (or convert it to XML) and pass it directly to the chat API. This should happen in the background: the user's chat window shouldn't be filled with the extracted text, but the text should be passed to the API as-is. Effectively, the extracted text is appended to the user's message in-context. Possibly, the user could inspect the extracted text by clicking, but this is minor.
If a file is very large, say >50% or even >80% of the context, warn the user, refuse, offer RAG, or offer the Assistants API for an OpenAI request. The refusal option would replicate the standard web chat interface behavior, which is what most users would expect.
Some models (e.g. the Gemini API, with others presumably reaching parity soon) now support direct file upload for text, images, and videos. Leverage direct file upload when possible. This option would probably require some work if multiple files are uploaded, since the files have to be referred to directly in the chat requests. Possibly, only a single file could be supported initially.
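To make the three suggestions above concrete, here is a rough sketch of how the routing decision could look. This is not LibreChat code: the names `Route`, `Attachment`, `choose_route`, and `build_api_message` are all made up for illustration, the 4-characters-per-token estimate is a crude assumption, and the 50% default threshold is just the lower of the two figures mentioned above.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    INLINE = "inline"          # append the extracted text to the user's message
    DIRECT_UPLOAD = "direct"   # hand the file to the provider's own file API
    OFFER_RAG = "offer_rag"    # file is too large: warn and offer RAG instead
    REFUSE = "refuse"          # file is too large and RAG is toggled off


@dataclass
class Attachment:
    name: str
    text: str  # text already extracted from the file (e.g. by a PDF extractor)


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. A real tokenizer is more accurate.
    return max(1, len(text) // 4)


def choose_route(att: Attachment, context_tokens: int,
                 supports_direct_upload: bool = False,
                 rag_enabled: bool = True,
                 warn_fraction: float = 0.5) -> Route:
    """Pick how an attachment reaches the model (hypothetical policy)."""
    if supports_direct_upload:
        return Route.DIRECT_UPLOAD
    share = estimate_tokens(att.text) / context_tokens
    if share < warn_fraction:
        return Route.INLINE
    return Route.OFFER_RAG if rag_enabled else Route.REFUSE


def build_api_message(user_text: str, att: Attachment) -> str:
    # The chat window keeps showing only user_text; this string goes to the API.
    return f"{user_text}\n\n[Attached file: {att.name}]\n{att.text}"


if __name__ == "__main__":
    notes = Attachment("notes.txt", "hello " * 100)
    print(choose_route(notes, context_tokens=8000))
```

With `rag_enabled` exposed as a per-thread or per-message setting, the same function would also cover the toggle requested in the comments: turning RAG off simply changes the large-file outcome from an offer to a refusal.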