Cannot change max token output when set in librechat.yaml #6419
Unanswered
frenzybiscuit asked this question in Troubleshooting
Replies: 1 comment · 4 replies
-
This is by design: the YAML file sets things at a global/system level. Remove it from the YAML if you want users to set their own limits.
-
What happened?
When you define max_tokens under addParams in librechat.yaml, it is sent correctly to the LLM backend:
In this case, mine is set to 1024.
However, when a user changes the max output tokens (to a higher or lower value), LibreChat stops sending the parameter to the LLM backend entirely:
The LLM backend remains at the earlier 1024, but that's only because it was set to that value on the previous prompt; max_tokens vanishes completely from new prompts.
The max output tokens setting works fine when it's not predefined in librechat.yaml.
The reason I think this is a bug is that the max_tokens value from addParams does vanish from the requests; LibreChat no longer sends it at all.
I also feel that max_tokens in the YAML should only define the upper limit, and the end-user should be able to set it lower when they want to, so either way this looks like a bug to me.
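For reference, the configuration in question looks roughly like this (a sketch based on the custom-endpoints schema; the endpoint name, URL, and model name are placeholders, only addParams/max_tokens are the settings under discussion):

```yaml
endpoints:
  custom:
    - name: "koboldcpp"                    # placeholder endpoint name
      apiKey: "not-needed"                 # placeholder; local backend
      baseURL: "http://localhost:5001/v1"  # placeholder URL
      models:
        default: ["koboldcpp"]             # placeholder model name
      addParams:
        max_tokens: 1024   # injected into every request, until a user overrides the UI setting
```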
Version Information
ghcr.io/danny-avila/librechat-dev latest c83689215440 5 hours ago 882MB
ghcr.io/danny-avila/librechat-dev e4979ae60fba 40 hours ago 866MB
ghcr.io/danny-avila/librechat-rag-api-dev latest 5f0a3f475b72 12 days ago 7.79GB
ghcr.io/danny-avila/librechat-rag-api-dev-lite latest 6550e7ddf180 12 days ago 1.3GB
Steps to Reproduce
Add max_tokens: 1024 under addParams for a custom model in librechat.yaml.
Use koboldcpp to verify the output.
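If you don't have koboldcpp handy, a minimal OpenAI-compatible stub can serve the same purpose. This is a debugging sketch (not part of LibreChat or koboldcpp): point the endpoint's baseURL at it and watch whether max_tokens arrives with each request.

```python
# Minimal OpenAI-compatible stub for inspecting LibreChat's outgoing requests.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class LogHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        # The key observation: with addParams set, this prints 1024; after the
        # user changes "max output tokens" in the UI, the field vanishes.
        print("max_tokens in request:", body.get("max_tokens", "<absent>"))
        reply = json.dumps(
            {"choices": [{"message": {"role": "assistant", "content": "ok"}}]}
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

# To listen on the port koboldcpp would normally use:
#   HTTPServer(("127.0.0.1", 5001), LogHandler).serve_forever()
```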
What browsers are you seeing the problem on?
Firefox
Relevant log output
Screenshots
No response
Code of Conduct