Replies: 3 comments
-
When calling stream_infer, just set both sequence_start and sequence_end to True.
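A minimal sketch of such a call. Only stream_infer, sequence_start, and sequence_end come from this thread; the TurboMind construction, the parameter names session_id and input_ids, and the model path are assumptions and may differ across lmdeploy versions:

```python
# Sketch only; not verified against a specific lmdeploy release.
from lmdeploy import turbomind as tm

tm_model = tm.TurboMind(model_path='/path/to/model')  # assumed constructor
generator = tm_model.create_instance()                # assumed instance API

input_ids = [1, 2, 3]  # token ids of the full prompt (tokenization elided)

# Both flags set to True: the engine treats this call as a complete sequence.
for outputs in generator.stream_infer(session_id=0,
                                      input_ids=input_ids,
                                      sequence_start=True,
                                      sequence_end=True):
    print(outputs)
```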
-
If sequence_end is always set to True, doesn't that disable multi-turn conversation? How can the KV cache be turned off while multi-turn conversation stays enabled? When loading a HuggingFace model, the KV cache is toggled via the use_cache option; does lmdeploy have a similar configuration option? @lvhan028
-
sequence_end does not mean multi-turn conversation is disabled. It means the caller (the user) is responsible for concatenating the history prompts with the current prompt.
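A hypothetical illustration of that caller-side concatenation. The chat template below is a placeholder, not lmdeploy's actual template; it only shows the pattern of rebuilding the full prompt from history on every turn:

```python
# Placeholder template; real deployments should use the model's own chat format.
def build_prompt(history, user_msg):
    """Concatenate previous (user, assistant) turns with the current user message."""
    parts = [f'USER: {u}\nASSISTANT: {a}\n' for u, a in history]
    parts.append(f'USER: {user_msg}\nASSISTANT: ')
    return ''.join(parts)

history = [('Hi', 'Hello! How can I help?')]
prompt = build_prompt(history, 'What does sequence_end control?')

# The resulting prompt would be tokenized and passed to stream_infer with
# sequence_start=True and sequence_end=True on every turn (as in the sketch
# above), since each call is treated as an independent sequence.
print(prompt)
```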