Before you start the second fine-tuning, you need to download a pre-trained model. We provide two: [InternVL-Chat-V1.2](https://huggingface.co/OpenGVLab/InternVL-Chat-Chinese-V1-2) and [InternVL-Chat-V1.2-Plus](https://huggingface.co/OpenGVLab/InternVL-Chat-Chinese-V1-2-Plus).
You can use the following command to download one of them; we recommend downloading the Plus version.
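As one option (a sketch, not necessarily the exact command from the original guide), you can fetch the Plus model with `huggingface-cli` into the `./pretrained/` directory that the fine-tuning scripts expect:

```sh
# Sketch: download the Plus model into ./pretrained/ using huggingface-cli
# (install the CLI first if needed: pip install -U huggingface_hub)
mkdir -p pretrained
huggingface-cli download OpenGVLab/InternVL-Chat-Chinese-V1-2-Plus \
  --local-dir pretrained/InternVL-Chat-Chinese-V1-2-Plus
```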
After downloading the pre-trained model, you need to prepare your customized SFT data. Write a JSON meta file in `internvl_chat/shell/data/`, following the format of [this file](./shell/data/data_yi34b_finetune.json).
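For illustration only, such a meta file might look roughly like the sketch below. The field names and the dataset name `my_sft_dataset` are assumptions, so verify the exact schema against the referenced `data_yi34b_finetune.json`:

```sh
# Illustrative sketch -- field names are assumptions; verify against
# internvl_chat/shell/data/data_yi34b_finetune.json before training.
cat > internvl_chat/shell/data/my_sft_data.json << 'EOF'
{
  "my_sft_dataset": {
    "root": "data/my_sft_dataset/images/",
    "annotation": "data/my_sft_dataset/train.jsonl",
    "repeat_time": 1,
    "length": 10000
  }
}
EOF
```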
You can fine-tune our pre-trained models using this [script (train full LLM)](./shell/hermes2_yi34b/internvl_chat_v1_2_hermes2_yi34b_448_finetune_continue.sh) or this [script (train LoRA adapter)](./shell/hermes2_yi34b/internvl_chat_v1_2_hermes2_yi34b_448_finetune_continue_lora.sh), depending on your available GPU devices.
Before fine-tuning, set `--meta_path` to the path of the JSON file you created in the previous step. The default pre-trained model in these shell scripts is `./pretrained/InternVL-Chat-Chinese-V1-2`; change it to `./pretrained/InternVL-Chat-Chinese-V1-2-Plus` if you want to fine-tune the Plus version.
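As a rough sketch, the two settings to check inside the chosen script could look like the lines below. Apart from `--meta_path`, the argument name for the pre-trained model path is an assumption, so confirm it against the script itself:

```sh
# Illustrative: arguments to adjust inside the fine-tuning shell script.
# --meta_path is named in this guide; the model-path flag below is an assumption.
  --meta_path ./shell/data/my_sft_data.json \
  --model_name_or_path ./pretrained/InternVL-Chat-Chinese-V1-2-Plus \
```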
> Note: fine-tuning the full LLM requires 16 A100 80G GPUs, while fine-tuning the LoRA adapter requires 2 A100 80G GPUs.
```sh
# using 16 GPUs with slurm system, fine-tune the full LLM
PARTITION='your partition' GPUS=16 sh shell/hermes2_yi34b/internvl_chat_v1_2_hermes2_yi34b_448_finetune_continue.sh

# using 2 GPUs, fine-tune the LoRA
CUDA_VISIBLE_DEVICES=0,1 sh shell/hermes2_yi34b/internvl_chat_v1_2_hermes2_yi34b_448_finetune_continue_lora.sh
```
If you run into any problems, please let me know and I will improve the training guide to make it easier to use.
You can continue to fine-tune the checkpoint from the previous training process using this [script](./shell/hermes2_yi34b/internvl_chat_v1_2_hermes2_yi34b_448_finetune_continue.sh).
See [CONTINUED_FINETUNE.md](CONTINUED_FINETUNE.md).
## 📊 Evaluation
- We found that incorrect images were used for training and testing in `AI2D`, meaning that for problems where `abcLabel` is True, `abc_images` were not utilized. We have now corrected the images used for testing, but the results may still be somewhat lower as a consequence.
| model | QLLaMA | LLM | res | COCO | Flickr | NoCaps | VQAv2 | GQA | VizWiz | TextVQA | MME | POPE | Download |