Using GPT-4o with RAG results in error "type": "INPUT_LENGTH", "info": "NaN / 127500" #7708
Replies: 7 comments · 2 replies
-
Correction for "Steps to Reproduce": setting up these conditions is a long process that involves numerous edits to config files.
-
@paulfields I'm not able to reproduce this issue. Do you mind sharing the file?
-
Update - I deployed GPT-35-turbo just to see if there was anything model-specific going on and got the same issue.
-
Interesting, still no issue here. Can you try updating to make sure everything is up to date? It shouldn't be the issue, but I will look into what could possibly be making it "NaN", as that is the only outlier. Updating instructions (Docker): https://www.librechat.ai/docs/local/docker#update-librechat
-
Hi Danny - I followed the instructions in https://www.librechat.ai/docs/local/docker#update-librechat and rebuilt everything, except for the modified config files (e.g. yml, env, yaml) that were not changed by the git pull, but I'm still seeing the error (see attached). I wonder if it could be something with how my ./rag-api/app/main.py is making entries to postgres and mongodb. I'm assuming you are using a main.py to create the embeddings for RAG, so if you think that's a plausible cause, would you mind sharing the insert statements from your main.py script? Alternatively, is there something in my config files I should be checking for that could contribute to such an error being thrown?
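For reference, my own write path looks roughly like this (a simplified sketch, not the exact code; the table, column, and collection names come from my custom schema and are not necessarily what the stock rag_api uses):

```python
# Simplified sketch of how my custom main.py stores one chunk (schema names are my own).
import os

import psycopg2
from pymongo import MongoClient


def store_chunk(file_id: str, chunk_id: str, text: str, vector: list[float], filename: str) -> None:
    """Write the chunk embedding to postgres (pgvector) and the file metadata to MongoDB."""
    with psycopg2.connect(os.environ["POSTGRES_DSN"]) as conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO embeddings (id, file_id, document, embedding) "
            "VALUES (%s, %s, %s, %s::vector)",
            (chunk_id, file_id, text, str(vector)),  # str(vector) -> '[0.1, ...]' casts to vector
        )
    MongoClient(os.environ["MONGO_URI"]).librechat.files.update_one(
        {"file_id": file_id},
        {"$set": {"filename": filename, "embedded": True}},
        upsert=True,
    )
```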
-
I was able to get past the NaN error by modifying BaseClient.js and OpenAIClient.js. Maybe I'm not thinking correctly, but I didn't see a solution that fit my desired use case in the RAG setup instructions. Now that I can upload and create embeddings through a customized ./rag-api/app/main.py, and the document shows up in the documents set in the right-side tools menu in the UI, I'm not able to get the chat to recognize it. See the following image. Any idea what might be going wrong?
-
What happened?
I've enabled RAG in my local deployment using a FastAPI service in a custom main.py script under rag-api/app that implements an /embed endpoint using my own deployment of Azure OpenAI's text-embedding-ada-002. It inserts embeddings from a short document I've created directly into postgres, and the matching metadata into MongoDB. I can see the indexes as available resources on the right-hand side of LibreChat's UI.
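The endpoint is roughly along these lines (a heavily simplified sketch rather than the exact code; the table, collection, and environment-variable names are placeholders):

```python
# Simplified sketch of my custom /embed endpoint (placeholder names, not the exact code).
import os
import uuid

import psycopg2
from fastapi import FastAPI, File, UploadFile
from openai import AzureOpenAI
from pymongo import MongoClient

app = FastAPI()

# My Azure OpenAI deployment of text-embedding-ada-002
azure = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)


@app.post("/embed")
async def embed(file: UploadFile = File(...)):
    text = (await file.read()).decode("utf-8")

    # One embedding per (short) document; the real script chunks longer files first.
    vector = azure.embeddings.create(
        model="text-embedding-ada-002",  # Azure deployment name
        input=text,
    ).data[0].embedding

    file_id = str(uuid.uuid4())

    # Embedding goes into postgres (pgvector column), file metadata into MongoDB.
    with psycopg2.connect(os.environ["POSTGRES_DSN"]) as conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO embeddings (id, document, embedding) VALUES (%s, %s, %s::vector)",
            (file_id, text, str(vector)),  # str(vector) -> '[0.1, ...]' casts to vector
        )

    MongoClient(os.environ["MONGO_URI"]).librechat.files.insert_one(
        {"file_id": file_id, "filename": file.filename, "embedded": True}
    )
    return {"file_id": file_id, "filename": file.filename}
```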
I have also enabled my own deployment of Azure OpenAI GPT-4o, which I'm able to use for chat, BUT when I drag one of my indexes into the chat prompt and execute a (short) prompt I get the following error in the UI:
"The latest message token count is too long, exceeding the token limit, or your token limit parameters are misconfigured, adversely affecting the context window. More info: NaN 127500. . .
In the log output I get a corresponding error (see the Relevant log output section below).
As noted, both my embedded content and my prompt messages are very short, and I've not found a way to specify the input_length or content_length params in a way that resolves this error.
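For what it's worth, a quick local count (a rough check with tiktoken, which may not match LibreChat's own counter exactly) shows a prompt like mine is only a handful of tokens, nowhere near 127,500:

```python
# Rough sanity check of prompt size with tiktoken (not LibreChat's internal counter).
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")  # resolves to the o200k_base encoding
prompt = "Summarize the attached document in two sentences."  # stand-in for my short prompt
print(len(enc.encode(prompt)))  # tiny count, nowhere near 127500
```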
Version Information
I'm running LibreChat in docker using a version I downloaded from ghcr.io/danny-avila/librechat-dev:latest.
newlibrechat-rag_api latest 944580fb3e4c 17 hours ago 465MB
ghcr.io/danny-avila/librechat-dev latest d27883cb774d 4 days ago 1.77GB
Steps to Reproduce
Setting up these conditions is a long process that involves numerous edits to config files, deploying models externally to LibreChat, etc. Please let me know if you have specific questions on the setup and I'm happy to answer.
Please also refer to the screenshot which helps describe the issue.
What browsers are you seeing the problem on?
Chrome, Microsoft Edge
Relevant log output
Screenshots
Code of Conduct