About vLLM AsyncLLMEngine support #69
RuntimeError217
started this conversation in
Ideas
Replies: 0 comments
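The official vLLM-based streaming inference example currently uses the offline inference engine, which cannot truly support high-concurrency streaming inference. It would be great to have an implementation based on AsyncLLMEngine.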
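To make the request concrete, here is a rough sketch of what an AsyncLLMEngine-based streaming path might look like. It is not the project's code. It assumes vLLM's `AsyncLLMEngine.from_engine_args(AsyncEngineArgs(...))` constructor and that `engine.generate(prompt, sampling_params, request_id)` is an async generator whose items carry the cumulative text at `.outputs[0].text`; both may differ between vLLM versions. The names `build_engine`, `stream_deltas`, and `stream_many` are hypothetical helpers made up for illustration.

```python
import asyncio
import uuid


def build_engine(model: str):
    """Create an async engine. vLLM is imported lazily, so the helpers
    below can be exercised without vLLM installed."""
    # Assumption: these import paths match the installed vLLM version.
    from vllm.engine.arg_utils import AsyncEngineArgs
    from vllm.engine.async_llm_engine import AsyncLLMEngine

    return AsyncLLMEngine.from_engine_args(AsyncEngineArgs(model=model))


async def stream_deltas(engine, prompt, sampling_params, request_id=None):
    """Yield only the newly generated text for a single request.

    Assumption: each item from engine.generate(...) exposes the text
    generated so far (cumulative) at .outputs[0].text.
    """
    request_id = request_id or uuid.uuid4().hex
    sent = 0  # number of characters already yielded to the caller
    async for output in engine.generate(prompt, sampling_params, request_id):
        text = output.outputs[0].text
        if len(text) > sent:
            yield text[sent:]
            sent = len(text)


async def stream_many(engine, prompts, sampling_params):
    """Run several requests concurrently on one engine and return the
    full text of each, in the same order as the prompts."""

    async def run_one(prompt):
        parts = [d async for d in stream_deltas(engine, prompt, sampling_params)]
        return "".join(parts)

    return await asyncio.gather(*(run_one(p) for p in prompts))
```

In a real service, each incoming request would iterate `stream_deltas` and forward every chunk to its client (for example over server-sent events), with all requests sharing the same engine instance so the engine can batch them.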