
Commit c4eaa98

Merge remote-tracking branch 'origin/main'
2 parents 90a7011 + e7a20ed commit c4eaa98

File tree

8 files changed: +135, -40 lines


README.md

Lines changed: 8 additions & 18 deletions
@@ -1,5 +1,7 @@
 # LLaMA-Adapter: Efficient Fine-tuning of LLaMA 🚀
 
+## Announcement: We release **[LLaMA2-Accessory](https://github.com/Alpha-VLLM/LLaMA2-Accessory)**, an open-source toolkit for **pre-training**, **fine-tuning** and **deployment** of **LLMs** and **multimodal LLMs**. 🔥
+
 Official implementation of ['LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention'](https://arxiv.org/pdf/2303.16199.pdf) and ['LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model'](https://arxiv.org/pdf/2304.15010.pdf).
 
 <p align="center"> <img src="docs/logo_v4.png"/ width="100%"> <br>
@@ -11,13 +13,15 @@ This repo proposes **LLaMA-Adapter (V2)**, a lightweight adaption method for fin
 Try out the web demo 🤗 of LLaMA-Adapter: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/csuhan/LLaMA-Adapter), [LLaMA-Adapter V2](http://llama-adapter.opengvlab.com/) and [ImageBind-LLM](http://imagebind-llm.opengvlab.com/).
 
 ## News
+- **[2023.07.24]** We release **[LLaMA2-Accessory](https://github.com/Alpha-VLLM/LLaMA2-Accessory)**, an open-source toolkit for **pre-training**, **fine-tuning** and **deployment** of **Large Language Models (LLMs)** and **multimodal LLMs**. Please check [Alpha-VLLM/LLaMA2-Accessory](https://github.com/Alpha-VLLM/LLaMA2-Accessory) for more details! 🔥🔥🔥
+- **[2023.07.05]** We release the pretrain/finetune code of [llama_adapter_v2_multimodal](https://github.com/OpenGVLab/LLaMA-Adapter/tree/main/llama_adapter_v2_multimodal).
 - **[2023.07.04]** We release the code for reproducing [Gorilla](https://github.com/ShishirPatil/gorilla) by both full finetune and LLaMA-Adapter; please see [gorilla/README.md](https://github.com/OpenGVLab/LLaMA-Adapter/blob/main/gorilla/README.md).
-- **[2023.06.08]** We release the [demo](http://imagebind-llm.opengvlab.com/) of ImageBind-LLM 🔥🔥🔥.
-- **[2023.06.06]** We release [Point-Bind](https://github.com/ZrrSkywalker/Point-Bind) 🔥🔥🔥 to extend ImageBind with 3D point clouds, which achieves 3D instruction-following capacity for [imagebind_LLM](imagebind_LLM).
+- **[2023.06.08]** We release the [demo](http://imagebind-llm.opengvlab.com/) of ImageBind-LLM.
+- **[2023.06.06]** We release [Point-Bind](https://github.com/ZrrSkywalker/Point-Bind) to extend ImageBind with 3D point clouds, which achieves 3D instruction-following capacity for [imagebind_LLM](imagebind_LLM).
 - **[2023.06.05]** We support the integration of LLaMA-Adapter (both V1 and V2) and [LangChain](https://python.langchain.com/en/latest/index.html). Check out the [Notebook](/docs/langchain_LLaMA_AdapterV2_demo.ipynb).
-- **[2023.05.29]** We release the code of ImageBind-LLM at [imagebind_LLM](imagebind_LLM) 🔥🔥🔥.
+- **[2023.05.29]** We release the code of ImageBind-LLM at [imagebind_LLM](imagebind_LLM).
 - **[2023.05.23]** We release the [demos](http://llama-adapter.opengvlab.com/) and [multi-modal code](llama_adapter_v2_multimodal) of LLaMA-Adapter V2!
-- **[2023.05.05]** We release the paper and code of our new work [Personalize Segment Anything](https://github.com/ZrrSkywalker/Personalize-SAM) 🔥🔥🔥, which efficiently fine-tunes Segment Anything within **10 seconds**, and improves DreamBooth for better **text-to-image generation**.
+- **[2023.05.05]** We release the paper and code of our new work [Personalize Segment Anything](https://github.com/ZrrSkywalker/Personalize-SAM), which efficiently fine-tunes Segment Anything within **10 seconds**, and improves DreamBooth for better **text-to-image generation**.
 - **[2023.04.30]** We noticed that GPT-4 evaluation has a strong positional bias in favor of the first response. We will soon update the paper to reveal the position bias. Great thanks to [Canwen Xu](https://scholar.google.com/citations?user=oopKCDMAAAAJ&hl=en).
 - **[2023.04.28]** We release **LLaMA-Adapter V2**, a multi-modal instruction model. Check out our [paper](https://arxiv.org/abs/2304.15010), [demos](#demos) and [code](llama_adapter_v2_chat65b)!
 - **[2023.03.28]** The [paper](https://arxiv.org/pdf/2303.16199.pdf) and [training code](alpaca_finetuning_v1) for **LLaMA-Adapter V1** are released. 📌
@@ -39,20 +43,6 @@ Try out the web demo 🤗 of LLaMA-Adapter: [![Hugging Face Spaces](https://img.
 + **ImageBind-dialog** will be released soon
 
 
-## <div id="demos">Demos (LLaMA-Adapter V2)</div>
-
-### -> Chatbot System
-
-<img src="docs/chat_demo.png" width="80%" />
-
-
-<!-- | <img src="docs/multi_model_example_1.png" /> | <img src="docs/multi_model_example_2.png" /> |
-|---|---|
-| <img src="docs/multi_model_example_3.png" /> | <img src="docs/multi_model_example_4.png" /> | -->
-
-
-
-
 ## Overview
 Efficiency Comparison:
 | Model | Parameters | Storage Space | Training Time

llama_adapter_v2_chat65b/README.md

Lines changed: 3 additions & 0 deletions
@@ -75,6 +75,9 @@ conda env create -f environment.yml
 
 * Use Ctrl+C to exit the demo at any time.
 
+## Demo
+<img src="../docs/chat_demo.png" width="80%" />
+
 ## Known issues
 
 * Some users may experience the error `RuntimeError: Expected is_sm80 to be true, but got false.` (mostly sm_86 GPU users, including A6000, A5000 and 3090). This is because we changed the attention module to use `torch.nn.functional.scaled_dot_product_attention` if it exists, but a [dispatch logic error](https://github.com/pytorch/pytorch/issues/94883) in PyTorch 2.0.0 causes failure on some GPU architectures. Affected users can upgrade to PyTorch >= 2.1.0 or the nightly build, in which the bug is fixed.
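For users who cannot upgrade right away, a minimal illustrative sketch of the kind of guard described above (this is not the repository's actual attention module; the function name and the `prefer_fused` flag are placeholders):

```python
import math

import torch
import torch.nn.functional as F


def attention(q, k, v, prefer_fused=True):
    """Scaled dot-product attention with an optional fused-kernel path."""
    if prefer_fused and hasattr(F, "scaled_dot_product_attention"):
        # Fused path (PyTorch >= 2.0); pass prefer_fused=False on builds
        # hit by the sm_86 dispatch bug until you can move to >= 2.1.0.
        return F.scaled_dot_product_attention(q, k, v)
    # Manual reference path, equivalent up to floating-point differences.
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    return torch.matmul(F.softmax(scores.float(), dim=-1).type_as(q), v)
```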

llama_adapter_v2_multimodal/README.md

Lines changed: 8 additions & 3 deletions
@@ -1,7 +1,7 @@
 # LLaMA-Adapter-V2 Multi-modal
 
 ## News
-
+* [July 5, 2023] Release pre-training and fine-tuning codes.
 * [May 26, 2023] Initial release.
 
 
@@ -23,7 +23,7 @@
 └── tokenizer.model
 ```
 
-## Usage
+## Inference
 
 Here is a simple inference script for LLaMA-Adapter V2. The pre-trained model will be downloaded directly from [Github Release](https://github.com/ZrrSkywalker/LLaMA-Adapter/releases/tag/v.2.0.0).
 
@@ -37,7 +37,9 @@ device = "cuda" if torch.cuda.is_available() else "cpu"
 
 llama_dir = "/path/to/LLaMA/"
 
+# choose from BIAS-7B, LORA-BIAS-7B
 model, preprocess = llama.load("BIAS-7B", llama_dir, device)
+model.eval()
 
 prompt = llama.format_prompt("Please introduce this painting.")
 img = Image.fromarray(cv2.imread("../docs/logo_v1.png"))
@@ -71,4 +73,7 @@ import llama
 print(llama.available_models())
 ```
 
-Now we provide `BIAS-7B`, which fine-tunes the `bias` and `norm` parameters of LLaMA. We will include more pretrained models in the future, such as the LoRA fine-tuning model `LoRA-7B` and partial-tuning model `PARTIAL-7B`.
+Now we provide `BIAS-7B`, which fine-tunes the `bias` and `norm` parameters of LLaMA, and `LORA-BIAS-7B`, which fine-tunes the `bias`, `norm` and `lora` parameters of LLaMA. We will include more pretrained models in the future, such as the LoRA fine-tuning model `LORA-7B` and the partial-tuning model `PARTIAL-7B`.
+
+## Pre-training & Fine-tuning
+See [train.md](docs/train.md)
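As a usage note, a minimal sketch of loading the newly added `LORA-BIAS-7B` checkpoint with the same API shown in this README (the LLaMA weights path is a placeholder):

```python
import cv2
import torch
from PIL import Image

import llama

device = "cuda" if torch.cuda.is_available() else "cpu"
llama_dir = "/path/to/LLaMA/"  # placeholder: directory with the original LLaMA weights

print(llama.available_models())  # should list BIAS-7B and LORA-BIAS-7B

# LORA-BIAS-7B additionally tunes LoRA weights on top of the bias/norm tuning
model, preprocess = llama.load("LORA-BIAS-7B", llama_dir, device)
model.eval()

prompt = llama.format_prompt("Please introduce this painting.")
img = Image.fromarray(cv2.imread("../docs/logo_v1.png"))
img = preprocess(img).unsqueeze(0).to(device)

print(model.generate(img, [prompt])[0])
```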

llama_adapter_v2_multimodal/demo.py

Lines changed: 3 additions & 2 deletions
@@ -7,13 +7,14 @@
 
 llama_dir = "/path/to/LLaMA/"
 
+# choose from BIAS-7B, LORA-BIAS-7B
 model, preprocess = llama.load("BIAS-7B", llama_dir, device)
 model.eval()
 
 prompt = llama.format_prompt('Please introduce this painting.')
-img = Image.fromarray(cv2.imread("./docs/logo_v1.png"))
+img = Image.fromarray(cv2.imread("../docs/logo_v1.png"))
 img = preprocess(img).unsqueeze(0).to(device)
 
 result = model.generate(img, [prompt])[0]
 
-print(result)
+print(result)
llama_adapter_v2_multimodal/docs/train.md

Lines changed: 40 additions & 14 deletions
@@ -1,9 +1,10 @@
-The training process of LLaMA-Adapter V2 consists of the pre-training and fine-tuning phases.
+The training process of LLaMA-Adapter V2 consists of the pre-training and fine-tuning phases.
 
 ## Pre-training
+
 ### Data
-* We use multiple datasets with **image-text pairs** for pre-training. The texts are English-only.
 
+* We use multiple datasets with **image-text pairs** for pre-training. The texts are English-only.
 * For each dataset, the meta file should be organized in the `.csv` format as following:
 
 ```
@@ -14,8 +15,8 @@ The training process of LLaMA-Adapter V2 consists of the pre-training and fine-t
 ```
 
 Alternatively, you may modify the [`PretrainDataset`](/data/dataset.py) implementation to adapt to your own meta file format.
-
 * Write a `.yaml` config file to specify the datasets for pre-training:
+
 ```
 META:
 - '/path/to/cc3m.csv'
@@ -25,29 +26,25 @@ The training process of LLaMA-Adapter V2 consists of the pre-training and fine-t
 
 ### Start pre-training
 
-We are now ready to start pre-training (please make sure that the original LLaMA / Open-Chinese-LLaMA weights are available in `/path/to/llama_model_weights`).
+We are now ready to start pre-training (please make sure that the original LLaMA weights are available in `/path/to/llama_model_weights`).
 
 ```bash
 . exps/pretrain.sh /path/to/llama_model_weights /path/to/pretrain-data-config.yaml /output/path
 ```
 
-
-
 ## Fine-tuning
 
 ### Data
 
 * We fine-tune LLaMA-Adapter V2 on text-only as well as image-text instruction following datasets.
-
 * The following lists the datasets we use for training our release weights:
 
-| Name | Link |
-| ------------------------ | ------------------------------------------------------------ |
-| alpaca_gpt4_data.json | [File Link](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/blob/main/data/alpaca_gpt4_data.json) |
+| Name | Link |
+| ------------------------ | ------------------------------------------------------------------------------------------------------------ |
+| alpaca_gpt4_data.json | [File Link](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/blob/main/data/alpaca_gpt4_data.json) |
 | alpaca_gpt4_data_zh.json | [File Link](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/blob/main/data/alpaca_gpt4_data_zh.json) |
-| llava_instruct_150k.json | [File Link](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/raw/main/llava_instruct_150k.json) |
-| alpaca_data_zh_51k.json | [File Link](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/data/alpaca_data_zh_51k.json) |
-
+| llava_instruct_150k.json | [File Link](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/raw/main/llava_instruct_150k.json) |
+| alpaca_data_zh_51k.json | [File Link](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/data/alpaca_data_zh_51k.json) |
 * Similar to pre-training, write a `.yaml` config file to specify the datasets for fine-tuning:
 
 ```
6158

6259
```bash
6360
. exps/finetune.sh \
64-
/path/to/llama_model_weights /path/to/pre-trained/checkopint.pth \
61+
/path/to/llama_model_weights /path/to/pre-trained/checkpoint.pth \
6562
/path/to/finetune-data-config.yaml /output/path
6663
```
6764

65+
### Test and Save
66+
67+
```python
68+
import os
69+
from llama.llama_adapter import LLaMA_adapter
70+
import util.misc as misc
71+
import util.extract_adapter_from_checkpoint as extract
72+
73+
device = "cuda" if torch.cuda.is_available() else "cpu"
74+
75+
llama_dir = "path/to/llama/"
76+
llama_type = '7B'
77+
llama_ckpt_dir = os.path.join(llama_dir, llama_type)
78+
llama_tokenzier_path = os.path.join(llama_dir, 'tokenizer.model')
79+
model = LLaMA_adapter(llama_ckpt_dir, llama_tokenzier_path)
80+
81+
misc.load_model(model, 'path/to/finetune/checkpoint.pth')
82+
model.eval()
83+
model.to(device)
84+
85+
prompt = llama.format_prompt('your prompt')
86+
img = Image.fromarray(cv2.imread("your image"))
87+
img = model.clip_transform(img).unsqueeze(0).to(device)
88+
89+
result = model.generate(img, [prompt])[0]
90+
print(result)
91+
92+
extract.save(model,'path/to/adapter-7B.pth','BIAS') # Please end it with -llama_type.pth
93+
```
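Note that the `Test and Save` snippet above also relies on `torch`, `llama`, `cv2` and `PIL.Image` without importing them. A self-contained variant of the same steps (all paths are placeholders) might look like:

```python
import os

import cv2
import torch
from PIL import Image

import llama
from llama.llama_adapter import LLaMA_adapter
import util.misc as misc
import util.extract_adapter_from_checkpoint as extract

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder paths: point these at your LLaMA weights and fine-tuned checkpoint.
llama_dir = "/path/to/llama/"
llama_type = "7B"
llama_ckpt_dir = os.path.join(llama_dir, llama_type)
llama_tokenizer_path = os.path.join(llama_dir, "tokenizer.model")

model = LLaMA_adapter(llama_ckpt_dir, llama_tokenizer_path)
misc.load_model(model, "/path/to/finetune/checkpoint.pth")
model.eval()
model.to(device)

prompt = llama.format_prompt("your prompt")
img = Image.fromarray(cv2.imread("/path/to/your_image.png"))
img = model.clip_transform(img).unsqueeze(0).to(device)

print(model.generate(img, [prompt])[0])

# Save only the adapter weights; the filename should end with -<llama_type>.pth.
extract.save(model, "/path/to/adapter-7B.pth", "BIAS")
```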

llama_adapter_v2_multimodal/llama/llama.py

Lines changed: 7 additions & 0 deletions
@@ -26,6 +26,7 @@ class ModelArgs:
     w_bias: bool = False # use bias tuning
     w_lora: bool = False # use lora tuning
     lora_rank: int = 16
+    w_new_gate: bool = False # for compatibility
 
 
 class RMSNorm(torch.nn.Module):
@@ -125,6 +126,10 @@ def __init__(self, args: ModelArgs):
         self.cache_v = None
 
         self.gate = torch.nn.Parameter(torch.zeros(1, self.n_local_heads, 1, 1))
+
+        self.w_new_gate = args.w_new_gate
+        if args.w_new_gate:
+            self.new_gate = torch.nn.Parameter(torch.ones(1, 1, 1, 1))
 
 
     def train(self, mode: bool = True):
@@ -194,6 +199,8 @@ def forward(self, x: torch.Tensor, start_pos: int, freqs_cis: torch.Tensor, mask
             if adapter_len > 1:
                 adapter_scores = torch.matmul(xq, adapter_k.transpose(2, 3)) / math.sqrt(self.head_dim)
                 adapter_scores = self.gate.tanh() * F.softmax(adapter_scores.float(), dim=-1).type_as(xq)
+                if self.w_new_gate:
+                    adapter_scores = self.new_gate * adapter_scores
                 output = output + torch.matmul(adapter_scores, adapter_v)
             else:
                 output = output + self.gate.tanh() * adapter_v

llama_adapter_v2_multimodal/llama/llama_adapter.py

Lines changed: 14 additions & 3 deletions
@@ -20,6 +20,9 @@ def __init__(self, llama_ckpt_dir, llama_tokenizer,
                  v_embed_dim=768, v_depth=8,
                  v_num_heads=16, v_mlp_ratio=4.0,
                  query_len=10, query_layer=31,
+                 w_bias=False,
+                 w_lora=False, lora_rank=16,
+                 w_new_gate=False,
                  phase="finetune"):
         super().__init__()
 
@@ -58,6 +61,9 @@ def __init__(self, llama_ckpt_dir, llama_tokenizer,
 
         # 5. llama
         model_args.w_bias = w_bias
+        model_args.w_lora = w_lora
+        model_args.lora_rank = lora_rank
+        model_args.w_new_gate = w_new_gate
         model_args.vocab_size = self.tokenizer.n_words
         torch.set_default_tensor_type(torch.cuda.HalfTensor)
         self.llama = Transformer(model_args)
@@ -268,8 +274,10 @@ def generate(
         return decoded
 
 
+
 _MODELS = {
     "BIAS-7B": "https://github.com/OpenGVLab/LLaMA-Adapter/releases/download/v.2.0.0/7fa55208379faf2dd862565284101b0e4a2a72114d6490a95e432cf9d9b6c813_BIAS-7B.pth",
+    "LORA-BIAS-7B": "https://github.com/OpenGVLab/LLaMA-Adapter/releases/download/v.2.0.0/1bcbffc43484332672092e0024a8699a6eb5f558161aebf98a7c6b1db67224d1_LORA-BIAS-7B.pth",
     # "LORA16-7B": "",
     # "PARTIAL-7B": ""
 }
@@ -284,10 +292,8 @@ def load(name, llama_dir, device="cuda" if torch.cuda.is_available() else "cpu",
     elif os.path.isfile(name):
         model_path = name
     else:
-        return RuntimeError(f"Model {name} not found; available models = {available_models()}")
+        return RuntimeError(f"Model {name} not found; available models = {available_models()}"), None
 
-    ckpt = torch.load(model_path, map_location='cpu')
-
     # BIAS-7B or https://xxx/sha256_BIAS-7B.pth -> 7B
     llama_type = name.split('.')[0].split('-')[-1]
     llama_ckpt_dir = os.path.join(llama_dir, llama_type)
@@ -296,6 +302,7 @@ def load(name, llama_dir, device="cuda" if torch.cuda.is_available() else "cpu",
     # load llama_adapter weights and model_cfg
     print(f'Loading LLaMA-Adapter from {model_path}')
     ckpt = torch.load(model_path, map_location='cpu')
+    model_cfg = ckpt.get('config', {})
 
     model = LLaMA_adapter(
         llama_ckpt_dir, llama_tokenzier_path,
@@ -304,6 +311,10 @@ def load(name, llama_dir, device="cuda" if torch.cuda.is_available() else "cpu",
         v_embed_dim=768, v_depth=8,
         v_num_heads=16, v_mlp_ratio=4.0,
         query_len=10, query_layer=31,
+        w_bias=model_cfg.get('w_bias', False),
+        w_lora=model_cfg.get('w_lora', False),
+        lora_rank=model_cfg.get('lora_rank', 16),
+        w_new_gate=model_cfg.get('w_lora', False), # for compatibility
         phase=phase)
 
     load_result = model.load_state_dict(ckpt['model'], strict=False)
llama_adapter_v2_multimodal/util/extract_adapter_from_checkpoint.py

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+import torch
+
+def save(full_model, path, model_type = 'BIAS'):
+    if model_type == 'BIAS':
+        keys = [
+            f'visual_blocks.{i}.{key}.{suffix}'
+            for i in range(8)
+            for key in ['norm1', 'attn.qkv', 'attn.proj', 'norm2', 'mlp.fc1', 'mlp.fc2']
+            for suffix in ['weight', 'bias']
+        ] + [
+            f'llama.layers.{i}.{key}'
+            for i in range(32)
+            for key in ['attention.gate', 'attention.wq.bias', 'attention.wo.bias', 'feed_forward.w1.bias', 'feed_forward.w2.bias', 'feed_forward.w3.bias', 'attention_norm.weight', 'ffn_norm.weight']
+        ] + [
+            f'{base_key}.{suffix}'
+            for base_key in ['clip_proj_norm', 'visual_proj_norm', 'visual_proj', 'clip_proj']
+            for suffix in ['weight', 'bias']
+        ] + ['llama.norm.weight', 'visual_query.weight', 'adapter_query.weight']
+
+
+    elif model_type == 'LORA':
+        keys = [
+            f'visual_blocks.{i}.{key}.{suffix}'
+            for i in range(8)
+            for key in [f'norm{j}' for j in range(1, 3)] + ['attn.qkv', 'attn.proj', 'mlp.fc1', 'mlp.fc2']
+            for suffix in ['weight', 'bias']
+        ] + [
+            f'llama.layers.{i}.{key}'
+            for i in range(32)
+            for key in ['attention.gate', 'attention.wq.bias', 'attention.wo.bias', 'feed_forward.w1.bias', 'feed_forward.w2.bias', 'feed_forward.w3.bias', 'attention_norm.weight', 'ffn_norm.weight']
+            + [f'attention.lora_wk_l{j}.weight' for j in range(1, 3)]
+            + [f'attention.lora_wo_l{j}.weight' for j in range(1, 3)]
+            + [f'feed_forward.lora_w{k}_l{j}.weight' for k in range(1, 4) for j in range(1, 3)]
+            + [f'attention.lora_wq_l{j}.weight' for j in range(1, 3)]
+            + [f'attention.lora_wv_l{j}.weight' for j in range(1, 3)]
+            + ['attention.new_gate']
+        ] + [
+            f'{base_key}.{suffix}'
+            for base_key in ['clip_proj_norm', 'visual_proj_norm', 'visual_proj', 'clip_proj']
+            for suffix in ['weight', 'bias']
+        ] + ['llama.norm.weight', 'visual_query.weight', 'adapter_query.weight']
+
+    ## TODO: Add other model types
+
+    full_model_state_dict = full_model.state_dict()
+    small_weights = {key: full_model_state_dict[key] for key in keys}
+    if model_type == 'BIAS':
+        wrapped_small_weights = {'model': small_weights, 'config': {'w_bias': True, 'w_lora': False, 'lora_rank': 16}}
+    elif model_type == 'LORA':
+        wrapped_small_weights = {'model': small_weights, 'config': {'w_bias': True, 'w_lora': True, 'lora_rank': 16}}
+    # Save the wrapped small weights
+    torch.save(wrapped_small_weights, path)
