Replies: 1 comment
-
yo @jvmarone — your LM Studio + LibreChat setup sounds like a fun (and painful) experiment. the error from LM Studio ('input' field must be a string or an array of strings) usually hits when the upstream component sends a tokenized or otherwise malformed embedding body instead of raw text. looks like your POST to /v1/embeddings carries arrays of token IDs, so LM Studio thinks it's being asked to embed integers directly. classic API expectation drift. i eventually got tired of patching these one by one and built a semantic rescue engine that stabilizes the entire pipeline.
i haven't tested LM Studio specifically with WFGY yet, but the engine's modular enough to plug in external embedders or file servers. if you're still stuck and want a clean end-to-end RAG layer, happy to walk through setup. could save you hours of diffing JSON bodies. cheers and good luck with the stack juggling!
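for anyone skimming, the shape mismatch is easy to see side by side. a quick sketch (field names follow the standard OpenAI embeddings request; the token IDs are lifted from the log below):

```python
# what LM Studio's /v1/embeddings accepts: a string, or a list of strings
ok_body = {
    "model": "text-embedding-nomic-embed-text-v1.5",
    "input": ["first chunk of text", "second chunk of text"],
}

# what it actually received: a list of token-ID lists. OpenAI's hosted
# endpoint tolerates this form; LM Studio rejects it outright.
bad_body = {
    "model": "text-embedding-nomic-embed-text-v1.5",
    "input": [[2485, 1129, 11921, 870, 392, 19164]],
}
```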
-
Am I missing something and this should be super simple? Or is it just not possible?
I run LibreChat locally inside Docker, I run local models in LM Studio, and I want to use LM Studio's nomic (or other) embedding capabilities instead of an external provider like OpenAI.
(a) I want to do it for free and privately, locally, and (b) especially because when I tested file upload directly within LM Studio [using its RAG] it correctly pulled all the text from the PDF, whereas the OpenAI cloud embedding missed a huge chunk of it. Perhaps the API version uses a text-only embedding model without vision, since the ChatGPT version correctly answered questions about content that isn't in the PDF's extractable text.
I read through this documentation and set the basics up in .env:
https://www.librechat.ai/docs/configuration/rag_api
I'm trying to cobble this together from the docs and the examples for local usage.
The following settings get data through to LM Studio, but LM Studio reports an error because the body it receives looks like an array of numbers rather than text.
LM Studio receives data looking like this:
2025-06-03 19:10:55 [DEBUG]
Received request: POST to /v1/embeddings with body {
"input": [
[
2485,
1129,
11921,
870,
392,
19164,
916,
74698,
11880,
7682,
392,
19164,
61039,
30004,
11880,
18251,
18142,
2442,
79717,
48511,
30,
41320,
......... goes on and on
The error in LM Studio:
2025-06-03 19:05:08 [ERROR] 'input' field must be a string or an array of strings
The error in LibreChat:
Every time, I also get the red "error processing file" popup message in LibreChat.
The .env settings:
RAG_API_URL=http://host.docker.internal:1234
RAG_OPENAI_BASEURL=http://host.docker.internal:1234/v1
RAG_OPENAI_API_KEY=lm-studio
RAG_USE_FULL_CONTEXT=True
EMBEDDINGS_PROVIDER=openai
EMBEDDINGS_MODEL=text-embedding-nomic-embed-text-v1.5
(I also tried another embeddings model: EMBEDDINGS_MODEL=text-embedding-nomic-embed-text-v2-moe.)
(I also tried EMBEDDINGS_PROVIDER=ollama, which does correctly send plain text to LM Studio.)
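A quick way to isolate the failure: post a plain string to LM Studio's embeddings endpoint yourself. If this succeeds while the rag_api call fails, the problem is the request body rag_api builds, not LM Studio. A minimal sketch, assuming LM Studio is serving on its default port 1234 with the nomic model loaded:

```python
import requests

# Post a raw string directly to LM Studio. If this returns an embedding,
# the server is fine and the fault lies in what rag_api sends.
resp = requests.post(
    "http://localhost:1234/v1/embeddings",
    headers={"Authorization": "Bearer lm-studio"},  # LM Studio doesn't check the key
    json={
        "model": "text-embedding-nomic-embed-text-v1.5",
        "input": "hello from LibreChat",  # a plain string, not token IDs
    },
    timeout=30,
)
resp.raise_for_status()
print(len(resp.json()["data"][0]["embedding"]))  # embedding dimension
```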
Also changed these lines in docker-compose-override:
rag_api:
  image: ghcr.io/danny-avila/librechat-rag-api-dev:latest
Troubleshooting: I checked the following discussions:
#3046
@dcolley is my issue similar to yours?
danny-avila/rag_api#115
I also took a look at this: https://github.com/danny-avila/rag_api/
I didn't want to get into changing the LibreChat core files if there's just a setting...
Troubleshooting (I ran all of this through Gemini out of curiosity, and here's what it said):
This log output is the smoking gun!
This means that LibreChat's RAG API is sending an input field that is an array containing another array of numbers.
LM Studio is absolutely correct in throwing the error...
What you're seeing in the input field ([2485, 1129, 11921, ...]) are token IDs, not raw text strings.
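That explanation checks out: the numbers in the log are exactly what a tokenizer produces, since the openai provider path tokenizes text client-side before sending it. You can reproduce the effect with tiktoken. A sketch; the encoding name here is illustrative, not necessarily the one rag_api uses:

```python
import tiktoken

# Encode a sentence the way an OpenAI-style client does before sending it.
enc = tiktoken.get_encoding("cl100k_base")

token_ids = enc.encode("Why won't my local embedder accept this file?")
print(token_ids)              # a list of integers, like the log above
print(enc.decode(token_ids))  # round-trips back to the original string
```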
Additional troubleshooting:
Perhaps this notice is a lead... but I couldn't find any reference to it anywhere else.
https://docs.useanything.com/setup/embedder-configuration/local/lmstudio
"Heads up! LMStudio's inference server only allows you to load multiple LLMs or a single embedding model, but not both. This means LMStudio cannot be both your LLM and embedder."
I've asked LM Studio whether this is true: lmstudio-ai/lmstudio-bug-tracker#692
General thought on supporting LM Studio as much as (or more than) Ollama: GUI products like LibreChat are used by people who aren't comfortable working in a CLI all the time, so a LibreChat user is also likely to prefer LM Studio over Ollama. Please provide more compatibility with LM Studio! :-)
General thought on LibreChat fans: we are people who want to run things locally and privately. If we wanted to use cloud providers, we'd use those instead. Offering more configuration options for local solutions makes more sense, and it would be preferred over defaulting to cloud solutions like OpenAI, where our data gets processed by an external provider...
UPDATE / SOLUTION:
#3046
Making a copy of the config.py file in my LibreChat root folder and pointing the compose override at it solved the problem and let me use the OpenAI-style embeddings path with LM Studio (see the sketch below). I also seconded the feature request to make this an easy opt-in: danny-avila/rag_api#115
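For anyone who lands here later, here is roughly what that looks like. The compose override bind-mounts the local copy over the one baked into the image (something along the lines of `./config.py:/app/config.py` under the rag_api service's volumes), and the copy changes how the embeddings client is constructed. The sketch below is illustrative, not a verbatim diff: check_embedding_ctx_length is the langchain OpenAIEmbeddings parameter that controls client-side tokenization, which (if I read danny-avila/rag_api#115 correctly) is the root of the token-ID bodies. Confirm the exact wiring against the current rag_api source.

```python
from langchain_openai import OpenAIEmbeddings

# Build the embeddings client so it sends raw text instead of token IDs.
embeddings = OpenAIEmbeddings(
    model="text-embedding-nomic-embed-text-v1.5",
    api_key="lm-studio",                             # LM Studio accepts any key
    base_url="http://host.docker.internal:1234/v1",  # LM Studio from inside Docker
    check_embedding_ctx_length=False,                # send strings, not token IDs
)
```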