**High Accuracy and Efficiency Multi-task Fine-tuning Framework for Code LLMs.**
**CodeFuse-MFTCoder** is an open-source project of CodeFuse for accurate and efficient Multi-task Fine-tuning (MFT) on Large Language Models (LLMs), especially on Code-LLMs (large language models for code tasks).
Moreover, we open-source Code LLMs and code-related datasets along with the MFTCoder framework.
In MFTCoder, we release two codebases for fine-tuning Large Language Models:
- ```MFTCoder-accelerate``` is a framework built on Accelerate and DeepSpeed/FSDP. All of its tech stacks are open-source and actively maintained. We highly recommend you try this framework to make your fine-tuning accurate and efficient.
- ```MFTCoder-atorch``` is based on the [ATorch framework](https://github.com/intelligent-machine-learning/dlrover), a fast distributed training framework for LLMs.
The aim of this project is to foster collaboration and share advancements in large language models, particularly within the domain of code development.
:white_check_mark:**Multi-model**: It integrates state-of-the-art open-source models such as gpt-neox, llama, llama-2, baichuan, Qwen, chatglm2, and more. (These finetuned models will be released in the near future.)
:white_check_mark:**Multi-framework**: It provides support for both Accelerate (with DeepSpeed and FSDP) and [ATorch](https://github.com/intelligent-machine-learning/dlrover).
:white_check_mark:**Efficient fine-tuning**: It supports LoRA, QLoRA, and full-parameter training, enabling fine-tuning of large models with minimal resources. The training speed meets the demands of almost all fine-tuning scenarios.
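As an illustration of the idea only (this is not MFTCoder's own API), the sketch below shows what LoRA and QLoRA fine-tuning typically look like with the open-source ```peft``` and ```bitsandbytes``` libraries; the model name and hyperparameters are placeholders.

```python
# Illustrative LoRA/QLoRA sketch using open-source peft + bitsandbytes.
# NOTE: not MFTCoder's API; the model name and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "codellama/CodeLlama-7b-hf"  # placeholder base model

# QLoRA: load the frozen base model in 4-bit, then train a small LoRA adapter on top.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)

# Plain LoRA would skip the quantization config above and load the base model in fp16/bf16.
lora_config = LoraConfig(
    r=64,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```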
The main components of this project include:
- Support for both SFT (Supervised FineTuning) and MFT (Multi-task FineTuning). The current MFTCoder achieves data balance among multiple tasks, and future releases will achieve a balance between task difficulty and convergence speed during training.
- Support for QLoRA instruction fine-tuning, LoRA fine-tuning, as well as full-parameter fine-tuning.
- Support for most mainstream open-source large models, particularly those relevant to Code-LLMs, such as DeepSeek-coder, Mistral, Mixtral, Chatglm3, Code-LLaMA, Starcoder, Codegeex2, Qwen, GPT-Neox, and more.
- Support for weight merging between the LoRA adapter and base models, simplifying the inference process (a minimal merge sketch follows this list).
- Release of 2 high-quality code-related instruction fine-tuning datasets: [Evol-instruction-66k](https://huggingface.co/datasets/codefuse-ai/Evol-instruction-66k) and [CodeExercise-Python-27k](https://huggingface.co/datasets/codefuse-ai/CodeExercise-Python-27k).
- Release of many Code LLMs; please refer to our organizations: [codefuse-ai on huggingface](https://huggingface.co/codefuse-ai) or [codefuse-ai on modelscope](https://modelscope.cn/organization/codefuse-ai).
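For the LoRA weight-merging item above, here is a minimal sketch using the open-source ```peft``` API (not MFTCoder's own merge script; all paths are placeholders):

```python
# Illustrative LoRA-merge sketch with peft; not MFTCoder's merge script.
# All paths below are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "path/to/base_model"   # placeholder
adapter_path = "path/to/lora_adapter"    # placeholder
merged_path = "path/to/merged_model"     # placeholder

base_model = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base_model, adapter_path)

# Fold the LoRA weights into the base model so inference no longer needs peft.
merged_model = model.merge_and_unload()
merged_model.save_pretrained(merged_path)
AutoTokenizer.from_pretrained(base_model_path).save_pretrained(merged_path)
```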
## Requirements
Next, we have provided an init_env.sh script to simplify the installation of required packages.
```bash
sh init_env.sh
```
We highly recommend training with Flash Attention (version >= 2.1.0, preferably 2.3.6); please refer to the following link for installation instructions: https://github.com/Dao-AILab/flash-attention
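For reference, a recent ```transformers``` release can typically enable Flash Attention 2 directly when loading a model, as sketched below (the model name is a placeholder; support depends on the model architecture and your ```transformers``` version):

```python
# Illustrative sketch: enabling Flash Attention 2 when loading a Hugging Face model.
# Assumes flash-attn (>= 2.1.0) is installed and the model/GPU support it.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",              # placeholder model
    torch_dtype=torch.bfloat16,               # flash attention requires fp16/bf16
    attn_implementation="flash_attention_2",
)
```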
## Training
As mentioned above, we open-source two training frameworks. You can refer to their own READMEs for more details, as follows.
If you are familiar with the open-source ```transformers```, ```DeepSpeed```, or ```FSDP```, we highly recommend you try:
🚀🚀 [MFTCoder-accelerate: Accelerate + Deepspeed/FSDP Codebase for MFT(Multi-task Finetuning)](mftcoder_accelerate/README.md)
If you want to explore a newer framework like ATorch, you can check:
🚀 [MFTCoder-atorch: Atorch Codebase for MFT(Multi-task Finetuning)](mftcoder_atorch/README.md)