# OLLaMA Export Documentation

SWIFT now supports exporting OLLaMA Modelfiles, integrated into the `swift export` command.

## Contents

- [Environment Setup](#environment-setup)
- [Export](#export)
- [Points to Note](#points-to-note)

## Environment Setup

```shell
# Set the pip global mirror (to speed up downloads)
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# Install ms-swift
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e '.[llm]'
```

No additional modules are needed for OLLaMA export, as SWIFT only generates the Modelfile; the subsequent steps (creating and running the model) are handled by the user with OLLaMA itself.
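
As a quick sanity check that the installation succeeded, you can print the export command's help text (a minimal check; the exact options listed depend on your ms-swift version):

```shell
# Verify the swift CLI is on PATH and the export subcommand is available
swift export --help
```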

## Export

The OLLaMA export command line is as follows:

```shell
# Export using model_type
swift export --model_type llama3-8b-instruct --to_ollama true --ollama_output_dir llama3-8b-instruct-ollama
# Export using ckpt_dir; note that for LoRA training you must add --merge_lora true
swift export --ckpt_dir /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942 --to_ollama true --ollama_output_dir qwen-7b-chat-ollama --merge_lora true
```

After execution, the following log will be printed:
```shell
[INFO:swift] Exporting to ollama:
[INFO:swift] If you have a gguf file, try to pass the file by :--gguf_file /xxx/xxx.gguf, else SWIFT will use the original(merged) model dir
[INFO:swift] Downloading the model from ModelScope Hub, model_id: LLM-Research/Meta-Llama-3-8B-Instruct
[WARNING:modelscope] Authentication has expired, please re-login with modelscope login --token "YOUR_SDK_TOKEN" if you need to access private models or datasets.
[WARNING:modelscope] Using branch: master as version is unstable, use with caution
[INFO:swift] Loading the model using model_dir: /mnt/workspace/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct
[INFO:swift] Save Modelfile done, you can start ollama by:
[INFO:swift] > ollama serve
[INFO:swift] In another terminal:
[INFO:swift] > ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/llama3-8b-instruct-ollama/Modelfile
[INFO:swift] > ollama run my-custom-model
[INFO:swift] End time of running main: 2024-08-09 17:17:48.768722
```

Check the Modelfile:
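
For example, assuming the `--ollama_output_dir` used above:

```shell
cat llama3-8b-instruct-ollama/Modelfile
```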

```text
FROM /mnt/workspace/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct
TEMPLATE """{{ if .System }}<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ else }}<|begin_of_text|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|eot_id|>"
PARAMETER temperature 0.3
PARAMETER top_k 20
PARAMETER top_p 0.7
PARAMETER repeat_penalty 1.0
```

Users can modify the generated file before running inference.
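
For instance, to make generation more deterministic and enlarge the context window, you might edit the generated parameters (a minimal sketch; `num_ctx` is a standard OLLaMA Modelfile parameter, and the path assumes the export directory used above):

```shell
# Lower the sampling temperature written by the export
sed -i 's/^PARAMETER temperature .*/PARAMETER temperature 0.1/' llama3-8b-instruct-ollama/Modelfile
# Append a context-window setting
echo 'PARAMETER num_ctx 4096' >> llama3-8b-instruct-ollama/Modelfile
```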

### Using OLLaMA

To use the above file, install OLLaMA:

```shell
# https://github.com/ollama/ollama
curl -fsSL https://ollama.com/install.sh | sh
```
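
You can confirm the installation succeeded (the version printed will vary with your install):

```shell
ollama --version
```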

Start OLLaMA:

```shell
ollama serve
```
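
By default the server listens on 127.0.0.1:11434. If you need it reachable from other machines, OLLaMA reads the `OLLAMA_HOST` environment variable (a sketch; adjust the address to your environment):

```shell
# Listen on all interfaces instead of the default loopback address
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```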

In another terminal, run:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/llama3-8b-instruct-ollama/Modelfile
```

The following log will be printed after execution:

```text
transferring model data
unpacking model metadata
processing tensors
converting model
creating new layer sha256:37b0404fb276acb2e5b75f848673566ce7048c60280470d96009772594040706
creating new layer sha256:2ecd014a372da71016e575822146f05d89dc8864522fdc88461c1e7f1532ba06
creating new layer sha256:ddc2a243c4ec10db8aed5fbbc5ac82a4f8425cdc4bd3f0c355373a45bc9b6cb0
creating new layer sha256:fc776bf39fa270fa5e2ef7c6782068acd858826e544fce2df19a7a8f74f3f9df
writing manifest
success
```
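
You can verify that the model was registered (the ID and size shown will differ on your machine):

```shell
# The newly created model should appear in the local model list
ollama list
```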

You can then use the model name for inference:

```shell
ollama run my-custom-model
```

```shell
>>> who are you?
I'm LLaMA, a large language model trained by a team of researchers at Meta AI. My primary function is to understand and respond to human
input in a helpful and informative way. I'm a type of AI designed to simulate conversation, answer questions, and even generate text based
on a given prompt or topic.

I'm not a human, but rather a computer program designed to mimic human-like conversation. I don't have personal experiences, emotions, or
physical presence, but I'm here to provide information, answer your questions, and engage in conversation to the best of my abilities.

I'm constantly learning and improving my responses based on the interactions I have with users like you, so please bear with me if I make
any mistakes or don't quite understand what you're asking. I'm here to help and provide assistance, so feel free to ask me anything!
```
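
Besides the interactive CLI, the running `ollama serve` instance also exposes an HTTP API on port 11434, so the same model can be queried programmatically (a minimal sketch using OLLaMA's standard `/api/generate` endpoint):

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "my-custom-model",
  "prompt": "who are you?",
  "stream": false
}'
```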

## Points to Note

1. Some models may report an error when running:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/qwen-7b-chat-ollama/Modelfile
```

Error message:

```shell
Error: Models based on 'QWenLMHeadModel' are not yet supported
```

This is because OLLaMA's built-in conversion does not support every model architecture. In that case, convert the model to GGUF yourself and change the FROM field in the Modelfile to point to the resulting file:

```shell
# Detailed conversion steps can be found at: https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/README.md
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
pip install -r requirements.txt
# The merged model directory can be found in the `swift export` command log, e.g.:
# Using model_dir: /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942-merged
python convert_hf_to_gguf.py /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942-merged
```
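
The conversion writes a `.gguf` file into the model directory. Point the Modelfile's FROM line at that file before re-creating the model (a sketch; the exact GGUF filename depends on your llama.cpp version, so substitute the file the script actually wrote):

```shell
# Replace the FROM line with the path of the converted GGUF file
# (the filename below is illustrative; use the one printed by convert_hf_to_gguf.py)
sed -i 's|^FROM .*|FROM /mnt/workspace/yzhao/tastelikefeet/swift/output/qwen-7b-chat/v141-20240331-110833/checkpoint-10942-merged/ggml-model-f16.gguf|' \
    /mnt/workspace/yzhao/tastelikefeet/swift/qwen-7b-chat-ollama/Modelfile
```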

Then re-execute:

```shell
ollama create my-custom-model -f /mnt/workspace/yzhao/tastelikefeet/swift/qwen-7b-chat-ollama/Modelfile
```