How to deploy InternLM-20B-4bit as a service with LMDeploy #451
vansin started this conversation in Show and tell · 4 comments · 7 replies
-
With the steps below you can quickly deploy InternLM-20B-Chat as a service and chat with the model online.

Step 1: Install lmdeploy

```bash
pip install 'lmdeploy>=0.0.9'
```

Step 2: Download the InternLM-20B-4bit model

```bash
git-lfs install
git clone https://huggingface.co/internlm/internlm-chat-20b
```

Step 3: Convert the model weight format

```bash
python3 -m lmdeploy.serve.turbomind.deploy internlm-chat \
    --model-path ./internlm-chat-20b
```

Step 4: Launch the gradio service

```bash
python3 -m lmdeploy.serve.gradio.app ./workspace --server_name {ip_addr} --server_port {port}
```
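Once conversion finishes, it can be worth sanity-checking the model from the terminal before exposing it through gradio. A minimal sketch, assuming `./workspace` was generated by step 3; the `lmdeploy.turbomind.chat` entry point matches lmdeploy of this vintage, but verify the module path against your installed version:

```bash
# Hedged sketch: interactive CLI chat with the converted model,
# assuming ./workspace came from the deploy step above.
python3 -m lmdeploy.turbomind.chat ./workspace
```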
-
```bash
cd internlm-chat-20b
```
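If this one-liner is meant as an addendum to step 2, the likely point (my reading, not stated in the thread) is that `git clone` can leave you with LFS pointer files only, so the actual weights have to be pulled from inside the cloned directory:

```bash
# Assumption: the clone only brought down LFS pointers; fetch the real
# weight files from inside the repository.
cd internlm-chat-20b
git lfs pull
```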
-
Quantized deployment throws this error (the screenshot it refers to was not captured in this export). Could any of the experts take a look at why?
-
With 24 GB of VRAM it seems I can't run the 4-bit quantized model. Or is there a parameter somewhere that needs to be set?
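For a rough sense of scale (back-of-envelope, not a measurement): 20B parameters at 4 bits is about 20e9 × 0.5 bytes ≈ 10 GB of weights, so a 24 GB card runs out only once the KV cache and activations claim the remainder. One knob worth checking, under the assumption that your workspace follows the default turbomind layout of this lmdeploy generation, is `cache_max_entry_count` in the generated config (an integer sequence count in this era; later releases changed its meaning to a GPU-memory fraction):

```bash
# Assumption: the deploy step wrote its config to this default path, and
# cache_max_entry_count is this version's KV-cache sequence-count knob.
# Lowering it trades concurrency for memory.
sed -i 's/^cache_max_entry_count.*/cache_max_entry_count = 8/' \
    workspace/triton_models/weights/config.ini
```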