With the new TensorRT-LLM 0.9.0, ModelConfig in tensorrt_llm.runtime.generation now has new args max_batch_size and max_beam_width - how do we set these? #1505
Unanswered · digitalmonkey asked this question in Q&A · Replies: 0 comments
The new version of tensorrt_llm introduced new arguments to ModelConfig: max_batch_size and max_beam_width. How do we set these? Specifically, this change breaks the trt_llama_api.py script used to build the RAG demo on Windows. I'm trying to run that script on Ubuntu and want to stay current with tensorrt_llm's latest release.
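For context, here is a minimal sketch of what the call-site change might look like, assuming the new fields are plain constructor arguments on ModelConfig. The dataclass below is a stand-in that only mirrors the assumed fields; the real class lives in tensorrt_llm.runtime.generation, and the field names here come from the question while the defaults and other fields are illustrative guesses:

```python
from dataclasses import dataclass


# Stand-in mirroring the *assumed* shape of
# tensorrt_llm.runtime.generation.ModelConfig in 0.9.0.
# max_batch_size / max_beam_width are the new fields named in the
# question; vocab_size / num_layers and all defaults are hypothetical.
@dataclass
class ModelConfig:
    vocab_size: int
    num_layers: int
    max_batch_size: int = 1   # assumed new in 0.9.0
    max_beam_width: int = 1   # assumed new in 0.9.0


# A call site like the one in trt_llama_api.py would then pass the
# new arguments explicitly when constructing the config, matching
# whatever limits the engine was built with:
config = ModelConfig(
    vocab_size=32000,
    num_layers=32,
    max_batch_size=8,
    max_beam_width=1,
)
print(config.max_batch_size, config.max_beam_width)
```

If the real constructor instead requires these values positionally or reads them from the engine's build config, the fix would be to thread the values used at engine-build time through to this call site rather than hard-coding them.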