GPT-5 on Azure seems to hang with long context #8963
Replies: 2 comments
-
We can’t control how long the model takes to respond, and it’s common for a longer context to increase time to first token. As for the “error connecting to server” message, a default timeout from your reverse proxy may be in effect. Search the discussions — this has come up before, especially with reasoning models, which take longer in general.
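If the reverse proxy in front of LibreChat is nginx (an assumption — your deployment may use a different proxy), a minimal sketch of extending the proxy timeouts might look like this; the port and location are assumptions based on LibreChat's default of 3080:

```nginx
# Hypothetical nginx site config for a LibreChat deployment; adjust to your setup.
location / {
    proxy_pass http://localhost:3080;  # default LibreChat port (assumption)
    # Reasoning models can take well over nginx's 60s default before the
    # first token arrives; raise the read/send timeouts accordingly.
    proxy_read_timeout 600s;
    proxy_send_timeout 600s;
}
```

If streaming is enabled, `proxy_buffering off;` in the same block is also worth considering so tokens are forwarded as they arrive.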
-
Even though OpenAI documents a 30,000 tokens-per-minute hard limit, it is in fact (at the time of writing) applied per conversation. They state that on reaching tier 2 (7 days after the first payment and $50 spent on their platform), limits will increase.
-
What happened?
Currently on v0.8.0-rc1, using the new GPT-5 model on Azure, GPT-5 works fine when I send short prompts.
But when I feed it a longer context (maybe 8,000 tokens), it seems to time out.
The gpt-5-chat model seems to be fine; it's specifically the GPT-5 model.
Version Information
:~/LibreChat# docker images | grep librechat
ghcr.io/danny-avila/librechat-dev latest a916c8e1148b 2 hours ago 1.17GB
ghcr.io/danny-avila/librechat-rag-api-dev latest 0e8a1478bb84 4 days ago 7.87GB
Steps to Reproduce
Prompt GPT-5 base model (Not gpt-5-chat) with ~8000 tokens
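To make the ~8,000-token prompt reproducible, a rough sketch (assuming roughly 0.75 words per token, so about 6,000 words of filler; the filename and filler text are arbitrary):

```shell
# Generate ~6,000 words of filler (~8,000 tokens at a rough 0.75 words/token estimate)
python3 -c "print('lorem ipsum dolor sit amet ' * 1200)" > long_prompt.txt
wc -w long_prompt.txt  # 6000 words
```

Pasting the contents of long_prompt.txt into a new conversation with the GPT-5 base model should approximate the failing case.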
What browsers are you seeing the problem on?
Firefox
Relevant log output
Screenshots
No response
Code of Conduct