`README.md` (2 additions, 1 deletion)
@@ -39,6 +39,7 @@ To facilitate use by users unfamiliar with deep learning, we provide a Gradio we
Additionally, we are expanding capabilities for other modalities. Currently, we support full-parameter training and LoRA training for AnimateDiff.
## 🎉 News
- 2024.04.10: Use SWIFT to fine-tune the qwen-7b-chat model to enhance its function-call capabilities, combining it with [Modelscope-Agent](https://github.com/modelscope/modelscope-agent); the best practice can be found [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM/Agent-best-practice.md#Usage-with-Modelscope_Agent).
- 🔥2024.04.09: Support ruozhiba dataset. Search `ruozhiba` in [this documentation](docs/source_en/LLM/Supported-models-datasets.md) to begin training!
- 2024.04.08: Support the fine-tuning and inference of XVERSE-MoE-A4.2B model, use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/xverse_moe_a4_2b/lora/sft.sh) to start training!
- 2024.04.04: Support **QLoRA+FSDP** to train a 70B model on two 24GB GPUs, use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh) to train.
@@ -431,7 +432,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
`docs/source_en/LLM/Agent-best-practice.md` (115 additions, 0 deletions)
@@ -10,6 +10,7 @@ SWIFT supports open-source models, especially small and medium-sized models (7B,
- [Data Preparation](#Data-Preparation)
- [Fine-tuning](#Fine-tuning)
- [Inference](#Inference)
- [Usage with Modelscope-Agent](#Usage-with-Modelscope_Agent)
- [Summary](#Summary)
## Environment Setup
@@ -421,7 +422,121 @@ print()
# response:
# Final Answer: There is fire in the image at coordinates [101.1, 200.9]
```
## Usage-with-Modelscope_Agent
In conjunction with [Modelscope-Agent](https://github.com/modelscope/modelscope-agent), you can fine-tune models for building Agents.
This section focuses on the interactive framework AgentFabric within Modelscope-Agent, fine-tuning the small model qwen-7b-chat to enable function-call capabilities.
Due to the mismatch between the system prompt in ms-agent and that in Modelscope-Agent, direct training yields suboptimal results. To address this, we have created a new dataset, [ms_agent_for_agentfabric](https://modelscope.cn/datasets/AI-ModelScope/ms_agent_for_agentfabric/summary), by converting the format from ms-agent; it is now integrated into SWIFT. The `ms-agent-for-agentfabric-default` subset includes 30,000 entries converted from ms-agent data, while `ms-agent-for-agentfabric-additional` contains 488 entries filtered from actual function-call access data of the open-source AgentFabric framework.
### Fine-tuning
Replace `dataset` with `ms-agent-for-agentfabric-default` and `ms-agent-for-agentfabric-additional`:
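A minimal sketch of what the fine-tuning command might look like, assuming the `swift sft` CLI flags used elsewhere in this repository (`--model_type`, `--dataset`, `--sft_type`, `--output_dir`); verify flag names against your installed SWIFT version:

```shell
# Hypothetical sketch: LoRA fine-tuning of qwen-7b-chat on the AgentFabric datasets.
# Flag names follow the swift sft convention used in this repo's example scripts.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type qwen-7b-chat \
    --dataset ms-agent-for-agentfabric-default ms-agent-for-agentfabric-additional \
    --sft_type lora \
    --output_dir output
```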
Note that if deploying with `swift deploy`, the value of `model` should be set to `qwen-7b-chat`.
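A deployment sketch consistent with the `swift deploy` invocation shown earlier; the checkpoint path is a placeholder to be replaced with your actual fine-tuned checkpoint directory:

```shell
# Hypothetical sketch: serve the fine-tuned model behind an OpenAI-compatible API.
# Replace the ckpt_dir placeholder with your real checkpoint path.
CUDA_VISIBLE_DEVICES=0 swift deploy \
    --model_type qwen-7b-chat \
    --ckpt_dir output/qwen-7b-chat/vx-xxx/checkpoint-xxx
```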
#### Launching AgentFabric
In the following practice, [Wanx Image Generation](https://help.aliyun.com/zh/dashscope/opening-service?spm=a2c4g.11186623.0.0.50724937O7n40B) and [Amap Weather]((https://lbs.amap.com/api/webservice/guide/create-project/get-key)) will be called, requiring manual setting of API KEY. After setting, start AgentFabric:
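One way the setup and launch might look, assuming AgentFabric's Gradio entry point lives at `apps/agentfabric/app.py` in the modelscope-agent repository and that the keys are read from the environment variables shown below (both assumptions; verify the exact variable names and path against the upstream AgentFabric README):

```shell
# Hypothetical sketch: export the two API keys, then start the AgentFabric web UI.
export DASHSCOPE_API_KEY=your_dashscope_key   # for Wanx Image Generation
export AMAP_TOKEN=your_amap_key               # for Amap Weather
git clone https://github.com/modelscope/modelscope-agent.git
cd modelscope-agent/apps/agentfabric
pip install -r requirements.txt
python app.py
```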
It can be seen that the fine-tuned model can correctly understand instructions and call tools.
## Summary
Through the Agent training capability supported by SWIFT, we fine-tuned the qwen-7b-chat model using ms-agent and ms-bench. After fine-tuning, the model retains its general knowledge question-answering ability, and when APIs are added to the system field, it can correctly call them and complete tasks. It should be noted that: