Commit aca5a7c

Add OLLaMA doc (#1660)
1 parent 9f39915 commit aca5a7c

File tree

7 files changed: +330 −16 lines changed
docs/source/LLM/OLLAMA导出文档.md

Lines changed: 154 additions & 0 deletions
# OLLaMA Export Documentation

SWIFT supports exporting OLLaMA Modelfiles; this capability is merged into the `swift export` command.

## Contents

- [Environment Setup](#environment-setup)
- [Export](#export)
- [Points to Note](#points-to-note)

## Environment Setup

```shell
# Set the global pip mirror (to speed up downloads)
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# Install ms-swift
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e '.[llm]'
```

OLLaMA export requires no additional modules: SWIFT only exports the Modelfile, and users can handle the subsequent steps of running it themselves.

## Export

The OLLaMA export command line is as follows:

```shell
# model_type
swift export --model_type llama3-8b-instruct --to_ollama true --ollama_output_dir llama3-8b-instruct-ollama
# ckpt_dir; note that LoRA checkpoints additionally need --merge_lora true
swift export --ckpt_dir /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942 --to_ollama true --ollama_output_dir qwen-7b-chat-ollama --merge_lora true
```

After execution, a log like the following is printed:

```shell
[INFO:swift] Exporting to ollama:
[INFO:swift] If you have a gguf file, try to pass the file by :--gguf_file /xxx/xxx.gguf, else SWIFT will use the original(merged) model dir
[INFO:swift] Downloading the model from ModelScope Hub, model_id: LLM-Research/Meta-Llama-3-8B-Instruct
[WARNING:modelscope] Authentication has expired, please re-login with modelscope login --token "YOUR_SDK_TOKEN" if you need to access private models or datasets.
[WARNING:modelscope] Using branch: master as version is unstable, use with caution
[INFO:swift] Loading the model using model_dir: /mnt/workspace/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct
[INFO:swift] Save Modelfile done, you can start ollama by:
[INFO:swift] > ollama serve
[INFO:swift] In another terminal:
[INFO:swift] > ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/llama3-8b-instruct-ollama/Modelfile
[INFO:swift] > ollama run my-custom-model
[INFO:swift] End time of running main: 2024-08-09 17:17:48.768722
```

The log shows the model is ready to run. Open the generated Modelfile to inspect it:

```text
FROM /mnt/workspace/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct
TEMPLATE """{{ if .System }}<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ else }}<|begin_of_text|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|eot_id|>"
PARAMETER temperature 0.3
PARAMETER top_k 20
PARAMETER top_p 0.7
PARAMETER repeat_penalty 1.0
```

Users can modify the generated file for subsequent inference.
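For scripted tweaks, the `PARAMETER` lines follow a simple `PARAMETER <name> <value>` syntax. A minimal sketch of reading them into a dict (the helper below is illustrative, not part of SWIFT or OLLaMA):

```python
def parse_parameters(modelfile_text: str) -> dict:
    """Collect `PARAMETER <name> <value>` lines; repeated names (e.g. stop)
    accumulate into a list, since ollama allows several stop sequences."""
    params = {}
    for line in modelfile_text.splitlines():
        parts = line.split(None, 2)  # keep the value intact even if it contains spaces
        if len(parts) == 3 and parts[0] == "PARAMETER":
            _, name, value = parts
            params.setdefault(name, []).append(value)
    return params

example = """FROM /mnt/workspace/models/Meta-Llama-3-8B-Instruct
PARAMETER stop "<|eot_id|>"
PARAMETER temperature 0.3
PARAMETER top_k 20
"""
print(parse_parameters(example))
# → {'stop': ['"<|eot_id|>"'], 'temperature': ['0.3'], 'top_k': ['20']}
```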
### Using OLLaMA

To use the file above, install OLLaMA:

```shell
# https://github.com/ollama/ollama
curl -fsSL https://ollama.com/install.sh | sh
```

Start OLLaMA:

```shell
ollama serve
```

In another terminal, run:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/llama3-8b-instruct-ollama/Modelfile
```

After execution, a log like the following is printed:

```text
transferring model data
unpacking model metadata
processing tensors
converting model
creating new layer sha256:37b0404fb276acb2e5b75f848673566ce7048c60280470d96009772594040706
creating new layer sha256:2ecd014a372da71016e575822146f05d89dc8864522fdc88461c1e7f1532ba06
creating new layer sha256:ddc2a243c4ec10db8aed5fbbc5ac82a4f8425cdc4bd3f0c355373a45bc9b6cb0
creating new layer sha256:fc776bf39fa270fa5e2ef7c6782068acd858826e544fce2df19a7a8f74f3f9df
writing manifest
success
```

You can then run inference using the model name given in the create command:

```shell
ollama run my-custom-model
```

```shell
>>> who are you?
I'm LLaMA, a large language model trained by a team of researchers at Meta AI. My primary function is to understand and respond to human
input in a helpful and informative way. I'm a type of AI designed to simulate conversation, answer questions, and even generate text based
on a given prompt or topic.

I'm not a human, but rather a computer program designed to mimic human-like conversation. I don't have personal experiences, emotions, or
physical presence, but I'm here to provide information, answer your questions, and engage in conversation to the best of my abilities.

I'm constantly learning and improving my responses based on the interactions I have with users like you, so please bear with me if I make
any mistakes or don't quite understand what you're asking. I'm here to help and provide assistance, so feel free to ask me anything!
```
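Besides the interactive CLI, a created model can also be queried through OLLaMA's local REST API (it listens on port 11434 by default). A minimal sketch, assuming `ollama serve` is already running; the helper names are illustrative:

```python
import json
import urllib.request

def build_generate_request(model, prompt):
    # Payload for ollama's POST /api/generate endpoint; stream=False asks
    # for a single JSON response instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model, prompt, host="http://localhost:11434"):
    data = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(host + "/api/generate", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires a running server:
# print(generate("my-custom-model", "who are you?"))
```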
## Points to Note

1. Some models report an error during:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/qwen-7b-chat-ollama/Modelfile
```

The error reads:

```shell
Error: Models based on 'QWenLMHeadModel' are not yet supported
```

This is because ollama's conversion does not support every model type. In that case, export a gguf file yourself and change the FROM field of the Modelfile to point to it:

```shell
# Detailed conversion steps: https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/README.md
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# The model directory can be found in the `swift export` log, e.g.:
# Using model_dir: /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942-merged
python convert_hf_to_gguf.py /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942-merged
```
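After conversion, the Modelfile's `FROM` line must point at the generated gguf file instead of the original model directory. A minimal sketch of that edit (the helper and paths are illustrative):

```python
def point_from_at_gguf(modelfile_text, gguf_path):
    # Rewrite only the FROM line; TEMPLATE and PARAMETER lines are kept as-is.
    lines = modelfile_text.splitlines()
    lines = ["FROM " + gguf_path if line.startswith("FROM ") else line
             for line in lines]
    return "\n".join(lines)

before = "FROM /path/to/checkpoint-10942-merged\nPARAMETER top_k 20"
print(point_from_at_gguf(before, "/path/to/checkpoint-10942-merged/model.gguf"))
# → FROM /path/to/checkpoint-10942-merged/model.gguf
#   PARAMETER top_k 20
```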
Then re-run:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/qwen-7b-chat-ollama/Modelfile
```

docs/source/LLM/index.md

Lines changed: 9 additions & 8 deletions

@@ -8,14 +8,15 @@
 4. [Web-UI Training and Inference](../GetStarted/%E7%95%8C%E9%9D%A2%E8%AE%AD%E7%BB%83%E6%8E%A8%E7%90%86.md)
 5. [LLM Evaluation Documentation](LLM评测文档.md)
 6. [LLM Quantization Documentation](LLM量化文档.md)
-7. [VLLM Inference Acceleration and Deployment](VLLM推理加速与部署.md)
-8. [LmDeploy Inference Acceleration and Deployment](LmDeploy推理加速与部署.md)
-9. [LLM Experiment Documentation](LLM实验文档.md)
-10. [DPO Training Documentation](DPO训练文档.md)
-11. [ORPO Best Practices](ORPO算法最佳实践.md)
-12. [SimPO Best Practices](SimPO算法最佳实践.md)
-13. [Human Preference Alignment Training Documentation](人类偏好对齐训练文档.md)
-14. [Megatron Training Documentation](Megatron训练文档.md)
+7. [OLLAMA Export Documentation](OLLAMA导出文档.md)
+8. [VLLM Inference Acceleration and Deployment](VLLM推理加速与部署.md)
+9. [LmDeploy Inference Acceleration and Deployment](LmDeploy推理加速与部署.md)
+10. [LLM Experiment Documentation](LLM实验文档.md)
+11. [DPO Training Documentation](DPO训练文档.md)
+12. [ORPO Best Practices](ORPO算法最佳实践.md)
+13. [SimPO Best Practices](SimPO算法最佳实践.md)
+14. [Human Preference Alignment Training Documentation](人类偏好对齐训练文档.md)
+15. [Megatron Training Documentation](Megatron训练文档.md)

 ### ⭐️Best Practices Series

docs/source/index.rst

Lines changed: 1 addition & 0 deletions

@@ -26,6 +26,7 @@ Swift DOCUMENTATION
 LLM/人类偏好对齐训练文档.md
 LLM/LLM评测文档.md
 LLM/LLM量化文档.md
+LLM/OLLAMA导出文档.md
 LLM/VLLM推理加速与部署.md
 LLM/LmDeploy推理加速与部署.md
 LLM/Megatron训练文档.md
docs/source_en/LLM/OLLaMA-Export.md

Lines changed: 155 additions & 0 deletions
# OLLaMA Export Documentation

SWIFT now supports exporting OLLaMA Modelfiles, integrated into the `swift export` command.

## Contents

- [Environment Setup](#environment-setup)
- [Export](#export)
- [Points to Note](#points-to-note)

## Environment Setup

```shell
# Set the global pip mirror (to speed up downloads)
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# Install ms-swift
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e '.[llm]'
```

No additional modules are needed for OLLaMA export, as SWIFT only exports the Modelfile; users can handle subsequent operations themselves.

## Export

The OLLaMA export command line is as follows:

```shell
# model_type
swift export --model_type llama3-8b-instruct --to_ollama true --ollama_output_dir llama3-8b-instruct-ollama
# ckpt_dir; note that LoRA checkpoints additionally need --merge_lora true
swift export --ckpt_dir /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942 --to_ollama true --ollama_output_dir qwen-7b-chat-ollama --merge_lora true
```

After execution, the following log will be printed:

```shell
[INFO:swift] Exporting to ollama:
[INFO:swift] If you have a gguf file, try to pass the file by :--gguf_file /xxx/xxx.gguf, else SWIFT will use the original(merged) model dir
[INFO:swift] Downloading the model from ModelScope Hub, model_id: LLM-Research/Meta-Llama-3-8B-Instruct
[WARNING:modelscope] Authentication has expired, please re-login with modelscope login --token "YOUR_SDK_TOKEN" if you need to access private models or datasets.
[WARNING:modelscope] Using branch: master as version is unstable, use with caution
[INFO:swift] Loading the model using model_dir: /mnt/workspace/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct
[INFO:swift] Save Modelfile done, you can start ollama by:
[INFO:swift] > ollama serve
[INFO:swift] In another terminal:
[INFO:swift] > ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/llama3-8b-instruct-ollama/Modelfile
[INFO:swift] > ollama run my-custom-model
[INFO:swift] End time of running main: 2024-08-09 17:17:48.768722
```

Open the generated Modelfile to inspect it:

```text
FROM /mnt/workspace/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct
TEMPLATE """{{ if .System }}<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ else }}<|begin_of_text|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|eot_id|>"
PARAMETER temperature 0.3
PARAMETER top_k 20
PARAMETER top_p 0.7
PARAMETER repeat_penalty 1.0
```

Users can modify the generated file for subsequent inference.
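The Go-template syntax in `TEMPLATE` can be hard to read. As an illustration only (a Python stand-in for the template logic above, not code from SWIFT or OLLaMA), the prompt assembly is equivalent to:

```python
def render_llama3_prompt(system=None, prompt=None, response=""):
    # Mirrors the TEMPLATE above: optional system block (else a bare BOS
    # token), then the user turn, then the assistant header and response.
    out = ""
    if system:
        out += ("<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
                + system + "<|eot_id|>")
    else:
        out += "<|begin_of_text|>"
    if prompt:
        out += ("<|start_header_id|>user<|end_header_id|>\n\n"
                + prompt + "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n")
    out += response + "<|eot_id|>"
    return out

print(render_llama3_prompt(prompt="who are you?"))
```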
### Using OLLaMA

To use the above file, install OLLaMA:

```shell
# https://github.com/ollama/ollama
curl -fsSL https://ollama.com/install.sh | sh
```

Start OLLaMA:

```shell
ollama serve
```

In another terminal, run:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/llama3-8b-instruct-ollama/Modelfile
```

The following log will be printed after execution:

```text
transferring model data
unpacking model metadata
processing tensors
converting model
creating new layer sha256:37b0404fb276acb2e5b75f848673566ce7048c60280470d96009772594040706
creating new layer sha256:2ecd014a372da71016e575822146f05d89dc8864522fdc88461c1e7f1532ba06
creating new layer sha256:ddc2a243c4ec10db8aed5fbbc5ac82a4f8425cdc4bd3f0c355373a45bc9b6cb0
creating new layer sha256:fc776bf39fa270fa5e2ef7c6782068acd858826e544fce2df19a7a8f74f3f9df
writing manifest
success
```

You can then use the model name from the create command for inference:

```shell
ollama run my-custom-model
```

```shell
>>> who are you?
I'm LLaMA, a large language model trained by a team of researchers at Meta AI. My primary function is to understand and respond to human
input in a helpful and informative way. I'm a type of AI designed to simulate conversation, answer questions, and even generate text based
on a given prompt or topic.

I'm not a human, but rather a computer program designed to mimic human-like conversation. I don't have personal experiences, emotions, or
physical presence, but I'm here to provide information, answer your questions, and engage in conversation to the best of my abilities.

I'm constantly learning and improving my responses based on the interactions I have with users like you, so please bear with me if I make
any mistakes or don't quite understand what you're asking. I'm here to help and provide assistance, so feel free to ask me anything!
```
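To verify in a script that `ollama create` succeeded, the local REST API's `/api/tags` endpoint lists the models available to the server (assuming the default `ollama serve` address; the helpers below are illustrative):

```python
import json
import urllib.request

def model_names(tags_json):
    # /api/tags responds with {"models": [{"name": "my-custom-model:latest", ...}, ...]}
    return [m["name"] for m in tags_json.get("models", [])]

def has_model(names, wanted):
    # ollama appends ":latest" when no explicit tag is given
    return any(n == wanted or n == wanted + ":latest" for n in names)

def list_local_models(host="http://localhost:11434"):
    with urllib.request.urlopen(host + "/api/tags") as resp:
        return model_names(json.loads(resp.read()))

# Requires a running server:
# assert has_model(list_local_models(), "my-custom-model")
```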
## Points to Note

1. Some models may report an error during:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/qwen-7b-chat-ollama/Modelfile
```

Error message:

```shell
Error: Models based on 'QWenLMHeadModel' are not yet supported
```

This is because ollama's conversion does not support every model type. In that case, perform the gguf export yourself and change the FROM field of the Modelfile to point to it:

```shell
# Detailed conversion steps can be found at: https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/README.md
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# The model directory can be found in the `swift export` command log, similar to:
# Using model_dir: /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942-merged
python convert_hf_to_gguf.py /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942-merged
```
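The `FROM` fix-up after conversion can be scripted against the Modelfile on disk; a minimal sketch (hypothetical helper, illustrative gguf path):

```python
from pathlib import Path

def retarget_modelfile(modelfile_path, gguf_path):
    # Replace the FROM line in-place; all other lines are preserved.
    path = Path(modelfile_path)
    lines = path.read_text().splitlines()
    lines = ["FROM " + gguf_path if line.startswith("FROM ") else line
             for line in lines]
    path.write_text("\n".join(lines) + "\n")

# retarget_modelfile("qwen-7b-chat-ollama/Modelfile",
#                    "/path/to/checkpoint-10942-merged/model.gguf")
```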
Then re-execute:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/qwen-7b-chat-ollama/Modelfile
```

docs/source_en/LLM/index.md

Lines changed: 9 additions & 8 deletions

@@ -8,14 +8,15 @@
 4. [Web-UI Training and Inference](../GetStarted/Web-ui.md)
 5. [LLM Evaluation](LLM-eval.md)
 6. [LLM Quantization](LLM-quantization.md)
-7. [VLLM Inference and Deployment](VLLM-inference-acceleration-and-deployment.md)
-8. [LmDeploy Inference and Deployment](LmDeploy-inference-acceleration-and-deployment.md)
-9. [LLM Experimental](LLM-exp.md)
-10. [DPO Training](DPO.md)
-11. [ORPO Training](ORPO.md)
-12. [SimPO Training](SimPO.md)
-13. [Human Preference Alignment Training Documentation](Human-Preference-Alignment-Training-Documentation.md)
-14. [Megatron-training](Megatron-training.md)
+7. [OLLAMA Export](./OLLaMA-Export.md)
+8. [VLLM Inference and Deployment](VLLM-inference-acceleration-and-deployment.md)
+9. [LmDeploy Inference and Deployment](LmDeploy-inference-acceleration-and-deployment.md)
+10. [LLM Experimental](LLM-exp.md)
+11. [DPO Training](DPO.md)
+12. [ORPO Training](ORPO.md)
+13. [SimPO Training](SimPO.md)
+14. [Human Preference Alignment Training Documentation](Human-Preference-Alignment-Training-Documentation.md)
+15. [Megatron-training](Megatron-training.md)

 ### ⭐️Best Practices!

docs/source_en/index.rst

Lines changed: 1 addition & 0 deletions

@@ -26,6 +26,7 @@ Swift DOCUMENTATION
 LLM/Human-Preference-Alignment-Training-Documentation.md
 LLM/LLM-eval.md
 LLM/LLM-quantization.md
+LLM/OLLaMA-Export.md
 LLM/VLLM-inference-acceleration-and-deployment.md
 LLM/LmDeploy-inference-acceleration-and-deployment.md
 LLM/Megatron-training.md

swift/llm/export.py

Lines changed: 1 addition & 0 deletions

@@ -200,6 +200,7 @@ def llm_export(args: ExportArguments) -> None:
         model_dir = args.ckpt_dir
     else:
         model_dir = args.model_id_or_path
+    logger.info(f'Using model_dir: {model_dir}')
     _, tokenizer = get_model_tokenizer(
         args.model_type, model_id_or_path=model_dir, revision=args.model_revision, load_model=False)
     model_dir = tokenizer.model_dir
