Reasoning models like QwQ or DeepSeek-R1 break chat title generation #6609
What happened?

When the UI call fails strangely, the LibreChat server logs don't show anything. I think this may be either a race condition or an incompatibility with how reasoning models work. In the debug logs, it looks like LibreChat is not calling the correct model: using a MITM proxy, I can see it is requesting the model from the incorrect endpoint. Instead of calling the model on the IPEX endpoint (see the librechat.yaml config below), it calls the same model on the current vLLM endpoint, which is not hosting that model. Is this a limitation or a bug?

Version Information

$ docker images | grep librechat
$ git rev-parse HEAD

Steps to Reproduce
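For context, a two-endpoint custom configuration of the kind described above might look roughly like this. This is an illustrative sketch, not the reporter's actual librechat.yaml; the IPEX host, port, and model lists are hypothetical placeholders, while `titleConvo`/`titleModel` are the standard custom-endpoint title options:

```yaml
version: 1.2.3
endpoints:
  custom:
    - name: "vLLM CUDA"
      apiKey: "none"
      baseURL: "http://jgpt.local:8000/v1"
      models:
        default: ["Qwen/QwQ-32B-AWQ"]
      titleConvo: true
      # The title model is hosted on a different endpoint, which is
      # the crux of this report:
      titleModel: "Qwen/Qwen2.5-7B-Instruct-AWQ"
    - name: "IPEX"
      apiKey: "none"
      baseURL: "http://ipex.local:8001/v1"   # hypothetical host/port
      models:
        default: ["Qwen/Qwen2.5-7B-Instruct-AWQ"]
```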
What browsers are you seeing the problem on?

Firefox

Relevant log output

2025-03-28T17:37:13.949Z warn: RAG API is either not running or not reachable at http://rag_api:8000, you may experience errors with file uploads.
2025-03-28T17:37:13.969Z info: [MCP] Initializing servers
2025-03-28T17:37:13.970Z info: [MCP][exa] Connection state: connecting
2025-03-28T17:37:13.973Z debug: [MCP][exa] Transport sending: {"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"librechat-mcp-cl... [truncated]
2025-03-28T17:37:14.306Z debug: [MCP][exa] Transport sending: {"method":"notifications/initialized","jsonrpc":"2.0"}
2025-03-28T17:37:14.306Z info: [MCP][exa] Connection state: connected
2025-03-28T17:37:14.306Z info: [MCP][exa] Capabilities: {"tools":{}}
2025-03-28T17:37:14.306Z debug: [MCP][exa] Transport sending: {"method":"tools/list","jsonrpc":"2.0","id":1}
2025-03-28T17:37:14.307Z info: [MCP][exa] Available tools: search
2025-03-28T17:37:14.308Z info: [MCP] Initialized 1/1 server(s)
2025-03-28T17:37:14.308Z info: [MCP][exa] ✓ Initialized
2025-03-28T17:37:14.308Z info: [MCP] All servers initialized successfully
2025-03-28T17:37:14.308Z debug: [MCP][exa] Transport sending: {"method":"tools/list","jsonrpc":"2.0","id":2}
2025-03-28T17:37:14.310Z info: No changes needed for 'USER' role permissions
2025-03-28T17:37:14.311Z info: No changes needed for 'ADMIN' role permissions
2025-03-28T17:37:14.311Z info: Outdated Config version: 1.1.4
Latest version: 1.2.3
Check out the Config changelogs for the latest options and features added.
https://... [truncated]
2025-03-28T17:37:14.314Z info: Server listening on all interfaces at port 80. Use http://localhost:80 to access it
2025-03-28T17:39:17.724Z debug: [validateJson] files
["\"Ai_PDF.json\"","\"BrowserOp.json\"","\"Dr_Thoths_Tarot.json\"","\"DreamInterpreter.json\"","\"VoxScript.json\"","\"askyourpdf.json\"","\"drink_maestro.json\"","\"earthImagesAndVisualizations.json\"","\"image_prompt_enhancer.json\"","\"qrCodes.json\"","\"scholarai.json\"","\"uberchord.json\"","\"web_search.json\""]
2025-03-28T17:39:54.319Z debug: [AskController]
{
text: "hi",
conversationId: null,
endpoint: "vLLM CUDA",
endpointType: "custom",
resendFiles: true,
modelOptions.model: "Qwen/QwQ-32B-AWQ",
modelsConfig: "exists",
}
2025-03-28T17:39:54.391Z debug: [BaseClient] Loading history:
{
conversationId: "c27b35f5-ad5c-499d-ae0d-66790db5c121",
parentMessageId: "00000000-0000-0000-0000-000000000000",
}
2025-03-28T17:39:54.430Z debug: [BaseClient] Context Count (1/2)
{
remainingContextTokens: 4087,
maxContextTokens: 4095,
}
2025-03-28T17:39:54.430Z debug: [BaseClient] Context Count (2/2)
{
remainingContextTokens: 4087,
maxContextTokens: 4095,
}
2025-03-28T17:39:54.430Z debug: [BaseClient] tokenCountMap:
{
6323f556-0dcb-4fc0-9d85-810385e4c875: 5,
}
2025-03-28T17:39:54.431Z debug: [BaseClient]
{
promptTokens: 8,
remainingContextTokens: 4087,
payloadSize: 1,
maxContextTokens: 4095,
}
2025-03-28T17:39:54.431Z debug: [BaseClient] tokenCountMap
{
6323f556-0dcb-4fc0-9d85-810385e4c875: 5,
instructions: undefined,
}
2025-03-28T17:39:54.431Z debug: [BaseClient] userMessage
{
messageId: "6323f556-0dcb-4fc0-9d85-810385e4c875",
parentMessageId: "00000000-0000-0000-0000-000000000000",
conversationId: "c27b35f5-ad5c-499d-ae0d-66790db5c121",
sender: "User",
text: "hi",
isCreatedByUser: true,
tokenCount: 5,
}
2025-03-28T17:39:54.432Z debug: [OpenAIClient] chatCompletion
{
baseURL: "http://jgpt.local:8000/v1",
modelOptions.model: "Qwen/QwQ-32B-AWQ",
modelOptions.user: "67d8962ce74996275c71b4e2",
modelOptions.stream: true,
// 1 message(s)
modelOptions.messages: [{"role":"user","content":"hi"}],
}
2025-03-28T17:39:54.435Z debug: Making request to http://jgpt.local:8000/v1/chat/completions
2025-03-28T17:39:54.443Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2025-03-28T17:40:01.556Z debug: [OpenAIClient] chatCompletion response
{
object: "chat.completion",
id: "chatcmpl-7c122dcd444f4fde8ee78a2e2ca3074c",
// 1 choice(s)
choices: [{"stop_reason":null,"message":{"role":"assistant","reasoning_content":".\n","content":"\n\nHello! 😊... [truncated]],
created: 1743183594,
model: "Qwen/QwQ-32B-AWQ",
}
2025-03-28T17:40:01.558Z debug: [spendTokens] conversationId: c27b35f5-ad5c-499d-ae0d-66790db5c121 | Context: message | Token usage:
{
promptTokens: 8,
completionTokens: 243,
}
2025-03-28T17:40:01.564Z debug: [spendTokens] No transactions incurred against balance
2025-03-28T17:40:01.567Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2025-03-28T17:40:01.572Z debug: [AskController] Request closed
2025-03-28T17:40:01.575Z debug: [OpenAIClient] chatCompletion
{
baseURL: "http://jgpt.local:8000/v1",
modelOptions.model: "Qwen/Qwen2.5-7B-Instruct-AWQ",
modelOptions.user: "67d8962ce74996275c71b4e2",
modelOptions.temperature: 0.2,
modelOptions.presence_penalty: 0,
modelOptions.frequency_penalty: 0,
modelOptions.max_tokens: 16,
// 1 message(s)
modelOptions.messages: [{"role":"system","content":"Please generate a concise, 5-word-or-less title for the conversation, us... [truncated]],
}
2025-03-28T17:40:01.576Z debug: Making request to http://jgpt.local:8000/v1/chat/completions
2025-03-28T17:40:01.578Z warn: [OpenAIClient.chatCompletion][create] API error
2025-03-28T17:40:01.578Z error
2025-03-28T17:40:01.578Z error: [OpenAIClient.chatCompletion] Unhandled error type Error: 404 status code (no body)
2025-03-28T17:40:01.578Z error: [OpenAIClient] There was an issue generating the title with the completion method Error: 404 status code (no body)
2025-03-28T17:40:01.578Z debug: [OpenAIClient] Convo Title: New Chat
2025-03-28T17:40:01.578Z debug: [saveConvo] api/server/services/Endpoints/openAI/addTitle.js

Screenshots

No response
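The logs show the title request for Qwen/Qwen2.5-7B-Instruct-AWQ being sent to the same baseURL as the conversation (http://jgpt.local:8000/v1), where that model is not hosted, hence the 404. A minimal sketch of the routing one might expect instead, looking up which configured endpoint actually lists the title model. The helper and data structures are hypothetical illustrations, not LibreChat's internal code, and the IPEX URL is a placeholder:

```python
def pick_base_url(model, endpoints):
    """Return the baseURL of the first endpoint that hosts `model`.

    `endpoints` mimics a parsed librechat.yaml custom-endpoint list;
    illustrative helper only, not LibreChat's actual logic.
    """
    for ep in endpoints:
        if model in ep.get("models", []):
            return ep["baseURL"]
    return None


endpoints = [
    {"name": "vLLM CUDA", "baseURL": "http://jgpt.local:8000/v1",
     "models": ["Qwen/QwQ-32B-AWQ"]},
    {"name": "IPEX", "baseURL": "http://ipex.local:8001/v1",  # hypothetical
     "models": ["Qwen/Qwen2.5-7B-Instruct-AWQ"]},
]

# The chat completion goes to the endpoint hosting the chat model,
# while the title request should go to the endpoint hosting the title model.
print(pick_base_url("Qwen/QwQ-32B-AWQ", endpoints))
print(pick_base_url("Qwen/Qwen2.5-7B-Instruct-AWQ", endpoints))
```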
Do those models accept system messages? That might be the issue you're having. It's not specific to reasoning models, as DeepSeek-R1 titling works from other providers.
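If a backend really did reject the "system" role, a common workaround is to fold the system prompt into the first user turn before sending. A sketch over OpenAI-style message dicts, assuming that hypothesis; this is not LibreChat code:

```python
def demote_system_message(messages):
    """Merge a leading system message into the first user turn, for
    backends that error on the "system" role. Illustrative only."""
    if not messages or messages[0].get("role") != "system":
        return messages
    system, rest = messages[0], messages[1:]
    if rest and rest[0].get("role") == "user":
        merged = {"role": "user",
                  "content": system["content"] + "\n\n" + rest[0]["content"]}
        return [merged] + rest[1:]
    # No user turn to merge into: send the instructions as a user message.
    return [{"role": "user", "content": system["content"]}] + rest


msgs = [{"role": "system",
         "content": "Please generate a concise, 5-word-or-less title."},
        {"role": "user", "content": "hi"}]
print(demote_system_message(msgs))
```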
Never mind, I understand the situation better now. You are expecting behavior that is not yet implemented:
#3321