README.md: 32 additions & 23 deletions
@@ -41,6 +41,12 @@
## News
🔥🔥 [2024/01/17] We released MFTCoder v0.3.0, mainly for MFTCoder-accelerate. It now supports new models such as Mixtral (MoE), DeepSeek-Coder, and ChatGLM3, adds FSDP as an option, and introduces Self-paced Loss to balance convergence across tasks in multitask fine-tuning.
🔥🔥 [2024/01/17] [CodeFuse-Deepseek-33B](https://huggingface.co/codefuse-ai/CodeFuse-Deepseek-33B) has been released, achieving a pass@1 (greedy decoding) score of 78.7% on HumanEval and the top-1 win rate on the Bigcode Leaderboard.
🔥🔥 [2024/01/17] [CodeFuse-Mixtral-8x7B](https://huggingface.co/codefuse-ai/CodeFuse-Mixtral-8X7B) has been released, achieving a pass@1 (greedy decoding) score of 56.1% on HumanEval.
🔥🔥 [2023/11/07] [MFTCoder Paper](https://arxiv.org/abs/2311.02303) has been released on arXiv, disclosing the technical details of multi-task fine-tuning.
🔥🔥 [2023/10/20] [CodeFuse-QWen-14B](https://huggingface.co/codefuse-ai/CodeFuse-QWen-14B) has been released, achieving a pass@1 (greedy decoding) score of 48.8% on HumanEval, a 16% absolute improvement over the base model [Qwen-14B](https://huggingface.co/Qwen/Qwen-14B).
@@ -88,7 +96,7 @@ In MFTCoder, we released two codebases for finetuning Large Language Models:
The aim of this project is to foster collaboration and share advancements in large language models, particularly within the domain of code development.
### Frameworks
### Highlights
:white_check_mark:**Multi-task**: Train models on multiple tasks while maintaining a balance between them. The models can even generalize to new, previously unseen tasks.
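As a rough illustration of the balancing idea, a common approach is to average the loss within each task before averaging across tasks, so that tasks with more data do not dominate the gradient. The sketch below is generic, not MFTCoder's actual implementation, and the function and tensor names (`task_balanced_loss`, `token_losses`, `loss_mask`, `task_ids`) are made up for the example:

```python
import torch

def task_balanced_loss(token_losses: torch.Tensor,
                       loss_mask: torch.Tensor,
                       task_ids: torch.Tensor) -> torch.Tensor:
    # Illustrative only -- not MFTCoder's exact loss.
    # token_losses: [batch, seq] per-token cross-entropy values
    # loss_mask:    [batch, seq] 1.0 where a token should count, else 0.0
    # task_ids:     [batch] integer task id for each sample
    per_task_means = []
    for tid in task_ids.unique():
        rows = task_ids == tid                        # samples belonging to this task
        masked = token_losses[rows] * loss_mask[rows]
        denom = loss_mask[rows].sum().clamp(min=1.0)
        per_task_means.append(masked.sum() / denom)   # mean loss for this task
    # Equal weight per task, regardless of how many samples each task contributed.
    return torch.stack(per_task_means).mean()
```

Giving each task equal (or scheduled, as with the Self-paced Loss mentioned in the news above) weight is what keeps one large task from drowning out the others during multitask fine-tuning.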
@@ -133,17 +141,18 @@ If you want to explore some new framework like atorch, you could check:
## Models
We are excited to release the following two CodeLLMs trained by MFTCoder, now available on both HuggingFace and ModelScope:
| Model | Base Model | Num of examples trained | Batch Size | Seq Length |