Commit 4f2de19 (1 parent: 8671553)

fix lmdeploy max_new_tokens

File tree

1 file changed: +1 −1 lines changed

gpt_server/model_backend/lmdeploy_backend.py

Lines changed: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ async def stream_chat(self, params: Dict[str, Any]) -> AsyncGenerator:
         top_p = float(params.get("top_p", 0.8))
         top_k = params.get("top_k", 50)

-        max_new_tokens = min(int(params.get("max_new_tokens", 1024 * 8)), 1024 * 4)
+        max_new_tokens = int(params.get("max_new_tokens", 1024 * 8))
         stop_str = params.get("stop", None)
         stop_token_ids = params.get("stop_words_ids", None) or []
         presence_penalty = float(params.get("presence_penalty", 0.0))
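The effect of this one-line fix can be sketched outside the backend. A minimal standalone illustration (the helper names `old_max_new_tokens` and `new_max_new_tokens` are hypothetical, not part of gpt_server): the removed `min(..., 1024 * 4)` silently clamped every request to 4096 tokens, even though the default itself was 8192, so the cap contradicted the default and ignored larger caller-supplied values.

```python
from typing import Any, Dict


def old_max_new_tokens(params: Dict[str, Any]) -> int:
    # Pre-fix behavior: min(..., 1024 * 4) caps the result at 4096,
    # overriding both larger caller values and the 8192 default.
    return min(int(params.get("max_new_tokens", 1024 * 8)), 1024 * 4)


def new_max_new_tokens(params: Dict[str, Any]) -> int:
    # Post-fix behavior: the caller's value (or the 8192 default) is used as-is.
    return int(params.get("max_new_tokens", 1024 * 8))


print(old_max_new_tokens({"max_new_tokens": 6000}))  # 4096 -- request silently reduced
print(new_max_new_tokens({"max_new_tokens": 6000}))  # 6000 -- request honored
print(old_max_new_tokens({}))  # 4096 -- even the default 8192 was clamped
print(new_max_new_tokens({}))  # 8192
```

Whether any upper bound should still be enforced (e.g. against the model's context window) is left to the engine; lmdeploy's own generation config validates lengths at inference time.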

0 commit comments