
Commit c452039

update gorilla reproduction code
1 parent 0b8f7ba commit c452039

File tree: 201 files changed, +180980 -0 lines changed

gorilla/README.md

Lines changed: 121 additions & 0 deletions
# Environment Setup

* Set up a new conda env and install the required packages:
```bash
# create conda env
conda create -n minigpt python=3.10 -y
conda activate minigpt
# install packages
pip install -r requirements.txt
```

* This project relies on [apex](https://github.com/NVIDIA/apex), which, unfortunately, you need to compile from source. Please follow the [official instructions](https://github.com/NVIDIA/apex#from-source) to compile it.
* Some tips from our experience for compiling it successfully:
1. `git clone https://github.com/NVIDIA/apex`
2. Make sure the CUDA version on your machine is equal to the CUDA version your installed PyTorch was built with (see the check after this list).
3. Make sure `pip >= 23.1`; otherwise run `pip install --upgrade pip`.
4. `pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./`
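
A quick way to check item 2, i.e. which CUDA version your installed PyTorch build expects (compare the output with `nvcc --version` on your machine); this is a convenience check, not part of the original setup steps:
```python
# Print the CUDA version PyTorch was built against and confirm a GPU is visible.
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version this PyTorch build was compiled with
print(torch.cuda.is_available())  # True if a CUDA device is usable
```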

* LLaMA checkpoint preparation: please request access to the pre-trained LLaMA weights via [this form](https://forms.gle/jk851eBVbX1m5TAv5), and organize the directory like:
```
{path/to/llama}/
|- consolidated.00.pth
|- params.json
|- tokenizer.model
```

* **The `torchhub_train.json` from the [Gorilla official repository](https://github.com/ShishirPatil/gorilla/tree/main/data/apibench) has a different format from `tensorflow_train.json` and `huggingface_train.json`, so we have not run experiments on it yet.**

# Full finetune

## Model training

* First specify `llama_path` in `finetune/scripts/finetune/finetune_7B_gorilla_{tf,hf,th}.sh`.

* Then run the script:
```bash
cd finetune
bash scripts/finetune/finetune_7B_gorilla_{tf,hf,th}.sh sdp 1
```
The last argument is the model parallel size; increase it if you run out of GPU memory. For A100/A6000 you can leave it at 1.

## Inference

* To evaluate model performance, we first need to generate responses with the finetuned model.

* First copy `params.json` into the output model folder:
```bash
cp {path/to/llama}/params.json finetune/output/{exp_name}/{epoch*}/
```

* Then run the command:
```bash
cd inference
torchrun --nproc_per_node 1 gorilla_inference_full_finetune.py --dataset_path ../gorilla-main/eval/eval-data/questions/{tensorflowhub, huggingface, torchhub}/questions_{tensorflowhub, huggingface, torchhub}_0_shot.jsonl --ckpt_dir ../finetune/output/{exp_name}/{epoch*}/ --tokenizer_path {path/to/llama}/tokenizer.model
```
**Note**: `ckpt_dir` should be a **FOLDER**, not a `.pth` file. Only single-GPU inference is supported.

# LLaMA adapter finetune

## Model training

* First specify `llama_path` in `alpaca_finetuning_v1/finetune_{tf,hf,th}.sh`.

* Then run the script:
```bash
cd alpaca_finetuning_v1
bash finetune_{tf,hf,th}.sh
```
Note: `--blr` is the base learning rate; the actual learning rate is computed as `lr = blr * eff_batch_size / 256` in `alpaca_finetuning_v1/finetuning.py` (line 237). Adjust `--blr` when you change the number of GPUs.
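
As a quick illustration of that scaling rule (the numbers below are hypothetical, not the script defaults, and `eff_batch_size` is assumed to be per-GPU batch size × number of GPUs × `accum_iter`):
```python
# Worked example of lr = blr * eff_batch_size / 256 with made-up values.
blr = 1e-3                  # hypothetical value passed via --blr
eff_batch_size = 4 * 8 * 1  # assumed: per-GPU batch size 4, 8 GPUs, accum_iter 1
lr = blr * eff_batch_size / 256
print(lr)                   # 0.000125
```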

* After adapter finetuning, extract the adapter parameters from the checkpoint (the sketch below shows what the output contains):
```bash
python extract_adapter_from_checkpoint.py --model_path ./checkpoint/{exp_name}/{pth_file}
```
This writes the extracted adapter weights to a `*-adapter.pth` file next to the original checkpoint.
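
For reference, a minimal sketch of inspecting the extracted file; the path is hypothetical, so substitute your own `{exp_name}` and checkpoint name:
```python
# Load the extracted adapter (hypothetical path) and check its contents:
# 32 per-layer attention gates plus adapter_query.weight.
import torch

adapter_path = "./checkpoint/exp_name/checkpoint-adapter.pth"
adapter = torch.load(adapter_path, map_location="cpu")
print(len(adapter))                           # 33 tensors
print(sorted(adapter.keys())[:3])
print(adapter["adapter_query.weight"].shape)
```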

## Inference

* Run the command:
```bash
cd inference
torchrun --nproc_per_node 1 gorilla_inference_llama_adapter_v1.py --ckpt_dir {path/to/llama} --tokenizer_path {path/to/llama}/tokenizer.model --adapter_path ../alpaca_finetuning_v1/checkpoint/{exp_name}/{adapter_pth_file} --dataset_path ../gorilla-main/eval/eval-data/questions/{tensorflowhub, huggingface, torchhub}/questions_{tensorflowhub, huggingface, torchhub}_0_shot.jsonl
```
**Note**: `ckpt_dir` should be a **FOLDER**, not a `.pth` file. Only single-GPU inference is supported.

# Evaluation

* Run the Gorilla official evaluation code:
```bash
cd gorilla-main/eval/eval-scripts/

# For full finetune
python ast_eval_{tf,hf,th}.py --api_dataset ../../data/api/{tensorflowhub_api, huggingface_api, torchhub_api}.jsonl --apibench ../../data/apibench/{tensorflow,huggingface,torchhub}_eval.json --llm_responses ../../../finetune/output/{exp_name}/{epoch*}/model_prediction_results.jsonl

# For llama-adapter
python ast_eval_{tf,hf,th}.py --api_dataset ../../data/api/{tensorflowhub_api, huggingface_api, torchhub_api}.jsonl --apibench ../../data/apibench/{tensorflow,huggingface,torchhub}_eval.json --llm_responses ../../../alpaca_finetuning_v1/checkpoint/{exp_name}/model_prediction_results.jsonl
```

# Results

Our finetuned LLaMA-adapter models and their predictions can be found at [this link](https://drive.google.com/drive/folders/1PN5QjOlMVnmSSFi68CubvQfGYeodvO8w?usp=sharing).

| Methods       | TensorFlow Hub overall $\uparrow$ | TensorFlow Hub hallu $\downarrow$ | HuggingFace overall $\uparrow$ | HuggingFace hallu $\downarrow$ |
| ------------- | --------------------------------- | --------------------------------- | ------------------------------ | ------------------------------ |
| Official      | 83.79                             | 5.40                              | 71.68                          | 10.95                          |
| Full finetune | 88.02                             | 1.02                              | 69.69                          | 10.29                          |
| LLaMA-adapter | 86.90                             | 0.74                              | 63.62                          | 11.83                          |
Lines changed: 132 additions & 0 deletions
import math
import sys
from typing import Iterable

import torch
import util.lr_sched as lr_sched
import util.misc as misc


def train_one_epoch(
    model: torch.nn.Module,
    data_loader: Iterable,
    optimizer: torch.optim.Optimizer,
    device: torch.device,
    epoch: int,
    loss_scaler,
    log_writer=None,
    args=None,
):

    model.train(True)
    metric_logger = misc.MetricLogger(delimiter=" ")
    metric_logger.add_meter("lr", misc.SmoothedValue(window_size=1, fmt="{value:.6f}"))
    header = "Epoch: [{}]".format(epoch)
    print_freq = 10

    # number of gradient-accumulation steps between optimizer updates
    accum_iter = args.accum_iter

    optimizer.zero_grad()

    if log_writer is not None:
        print("log_dir: {}".format(log_writer.log_dir))
    for data_iter_step, (examples, labels, example_mask) in enumerate(
        metric_logger.log_every(data_loader, print_freq, header)
    ):
        # we use a per iteration (instead of per epoch) lr scheduler
        if data_iter_step % accum_iter == 0:
            lr_sched.adjust_learning_rate(optimizer, data_iter_step / len(data_loader) + epoch, args)

        c_loss = model(examples, labels)
        loss = c_loss
        loss_value = loss.item()
        c_loss_value = c_loss.item()

        if not math.isfinite(loss_value):
            print("Loss is {}, stopping training".format(loss_value))
            sys.exit(1)

        # scale the loss so gradients accumulate to an average over accum_iter steps
        loss /= accum_iter

        # backward pass (and, when update_grad is True, the optimizer step) happens inside loss_scaler
        loss_scaler(loss, optimizer, parameters=model.parameters(), update_grad=(data_iter_step + 1) % accum_iter == 0)
        if (data_iter_step + 1) % accum_iter == 0:
            optimizer.zero_grad()

        torch.cuda.synchronize()

        metric_logger.update(closs=c_loss_value)

        lr = optimizer.param_groups[0]["lr"]
        metric_logger.update(lr=lr)

        misc.all_reduce_mean(loss_value)
        c_loss_value_reduce = misc.all_reduce_mean(c_loss_value)

        if log_writer is not None and (data_iter_step + 1) % accum_iter == 0:
            """We use epoch_1000x as the x-axis in tensorboard.
            This calibrates different curves when batch size changes.
            """
            epoch_1000x = int((data_iter_step / len(data_loader) + epoch) * 1000)
            log_writer.add_scalar("c_train_loss", c_loss_value_reduce, epoch_1000x)
            log_writer.add_scalar("lr", lr, epoch_1000x)

    # gather the stats from all processes
    metric_logger.synchronize_between_processes()
    print("Averaged stats:", metric_logger)
    return {k: meter.global_avg for k, meter in metric_logger.meters.items()}


def val_one_epoch(
    model: torch.nn.Module,
    data_loader: Iterable,
    optimizer: torch.optim.Optimizer,
    device: torch.device,
    epoch: int,
    loss_scaler,
    log_writer=None,
    args=None,
):
    model.eval()
    metric_logger = misc.MetricLogger(delimiter=" ")
    metric_logger.add_meter("lr", misc.SmoothedValue(window_size=1, fmt="{value:.6f}"))
    header = "Epoch: [{}]".format(epoch)
    print_freq = 10

    accum_iter = args.accum_iter

    if log_writer is not None:
        print("log_dir: {}".format(log_writer.log_dir))
    for data_iter_step, (examples, labels, example_mask) in enumerate(
        metric_logger.log_every(data_loader, print_freq, header)
    ):

        with torch.no_grad():
            c_loss = model(examples, labels)
        loss = c_loss
        loss_value = loss.item()

        c_loss_value = c_loss.item()

        if not math.isfinite(loss_value):
            print("Loss is {}, stopping training".format(loss_value))
            sys.exit(1)

        metric_logger.update(closs=c_loss_value)

        lr = optimizer.param_groups[0]["lr"]
        metric_logger.update(lr=lr)

        misc.all_reduce_mean(loss_value)
        c_loss_value_reduce = misc.all_reduce_mean(c_loss_value)
        if log_writer is not None and (data_iter_step + 1) % accum_iter == 0:
            """We use epoch_1000x as the x-axis in tensorboard.
            This calibrates different curves when batch size changes.
            """
            epoch_1000x = int((data_iter_step / len(data_loader) + epoch) * 1000)
            # note: validation loss reuses the "c_train_loss" tag from training
            log_writer.add_scalar("c_train_loss", c_loss_value_reduce, epoch_1000x)
            log_writer.add_scalar("lr", lr, epoch_1000x)

    # gather the stats from all processes
    metric_logger.synchronize_between_processes()
    print("Averaged stats:", metric_logger)
    return {k: meter.global_avg for k, meter in metric_logger.meters.items()}
Lines changed: 23 additions & 0 deletions
import argparse

import torch

# Extract the adapter-specific parameters (the per-layer attention gates and the
# shared adapter queries) from a finetuning checkpoint.
parser = argparse.ArgumentParser("extract", add_help=False)
parser.add_argument("--model_path", type=str)
args = parser.parse_args()

model = torch.load(args.model_path, map_location="cpu")

weight_list = ["layers." + str(i) + ".attention.gate" for i in range(32)]
weight_list = weight_list + ["adapter_query.weight"]

print(weight_list)
print(model["model"]["adapter_query.weight"].shape)

new_model = dict()
for name in weight_list:
    new_model[name] = model["model"][name]

# Save the extracted adapter next to the original checkpoint.
save_path = args.model_path.replace(".pth", "-adapter.pth")
torch.save(new_model, save_path)
