
Commit ec4a59e

Add support for AdapterPlus (#746)
This PR adds support for AdapterPlus.

GitHub: https://github.com/visinf/adapter_plus
Paper: https://arxiv.org/pdf/2406.06820

Integrating AdapterPlus into the `adapters` library adds new parameters/options to `BnConfig`. Checklist of additions (a usage sketch follows below):

1. A new type of `scaling` called `channel`, which adds learnable scaling parameters along the channel/input-size dimension.
2. A new type of `init_weights` called `houlsby`, which initializes the projection matrices $W_{down}$ and $W_{up}$ from a zero-centered Gaussian with a standard deviation of $10^{-2}$, truncated at 2 standard deviations, and initializes biases to zero.
3. Support for `drop_path`, also known as stochastic depth, applicable **only** to vision-based tasks using residual networks; located under a new file, `/methods/vision.py`.
1 parent bf684ad commit ec4a59e
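For orientation, here is a minimal usage sketch of the new config on a ViT backbone, in the spirit of the ViT_AdapterPlus_FineTuning notebook added by this PR. The checkpoint and adapter names are illustrative, and `AdapterPlusConfig`'s defaults are whatever this commit defines:

```python
import adapters
from adapters import AdapterPlusConfig
from transformers import ViTForImageClassification

# Illustrative checkpoint; any ViT checkpoint works the same way.
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224-in21k")
adapters.init(model)  # enable adapter support on a plain Transformers model

# AdapterPlusConfig bundles the options added in this PR
# (channel scaling, houlsby init, stochastic depth for vision models).
model.add_adapter("adapter_plus", config=AdapterPlusConfig())
model.train_adapter("adapter_plus")  # freeze the backbone, train only the adapter
```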

File tree

9 files changed: +591 −9 lines


.github/workflows/tests_torch.yml

Lines changed: 3 additions & 3 deletions
@@ -63,7 +63,7 @@ jobs:
       - name: Install
         run: |
           pip install torch==2.3
-          pip install .[sklearn,testing,sentencepiece]
+          pip install .[sklearn,testing,sentencepiece,torchvision]
       - name: Test
         run: |
           make test-adapter-methods
@@ -86,7 +86,7 @@ jobs:
       - name: Install
         run: |
           pip install torch==2.3
-          pip install .[sklearn,testing,sentencepiece]
+          pip install .[sklearn,testing,sentencepiece,torchvision]
       - name: Test
         run: |
           make test-adapter-models
@@ -109,7 +109,7 @@ jobs:
       - name: Install
         run: |
           pip install torch==2.3
-          pip install .[sklearn,testing,sentencepiece]
+          pip install .[sklearn,testing,sentencepiece,torchvision]
           pip install conllu seqeval
       - name: Test Examples
         run: |

docs/classes/adapter_config.rst

Lines changed: 3 additions & 0 deletions
@@ -34,6 +34,9 @@ Single (bottleneck) adapters
 .. autoclass:: adapters.CompacterPlusPlusConfig
     :members:
 
+.. autoclass:: adapters.AdapterPlusConfig
+    :members:
+
 Prefix Tuning
 ~~~~~~~~~~~~~~~~~~~~~~~
 

docs/methods.md

Lines changed: 2 additions & 1 deletion
@@ -42,7 +42,7 @@ A visualization of further configuration options related to the adapter structur
 - [`DoubleSeqBnConfig`](adapters.DoubleSeqBnConfig), as proposed by [Houlsby et al. (2019)](https://arxiv.org/pdf/1902.00751.pdf) places adapter layers after both the multi-head attention and feed-forward block in each Transformer layer.
 - [`SeqBnConfig`](adapters.SeqBnConfig), as proposed by [Pfeiffer et al. (2020)](https://arxiv.org/pdf/2005.00052.pdf) places an adapter layer only after the feed-forward block in each Transformer layer.
 - [`ParBnConfig`](adapters.ParBnConfig), as proposed by [He et al. (2021)](https://arxiv.org/pdf/2110.04366.pdf) places adapter layers in parallel to the original Transformer layers.
-
+- [`AdapterPlusConfig`](adapters.AdapterPlusConfig), as proposed by [Steitz and Roth (2024)](https://arxiv.org/pdf/2406.06820), places an adapter layer after the multi-head attention block and uses channel-wise scaling and Houlsby weight initialization.
 _Example_:
 ```python
 from adapters import BnConfig
@@ -56,6 +56,7 @@ _Papers:_
 * [Parameter-Efficient Transfer Learning for NLP](https://arxiv.org/pdf/1902.00751.pdf) (Houlsby et al., 2019)
 * [Simple, Scalable Adaptation for Neural Machine Translation](https://arxiv.org/pdf/1909.08478.pdf) (Bapna and Firat, 2019)
 * [AdapterFusion: Non-Destructive Task Composition for Transfer Learning](https://aclanthology.org/2021.eacl-main.39.pdf) (Pfeiffer et al., 2021)
+* [Adapters Strike Back](https://arxiv.org/pdf/2406.06820) (Steitz and Roth, 2024)
 * [AdapterHub: A Framework for Adapting Transformers](https://arxiv.org/pdf/2007.07779.pdf) (Pfeiffer et al., 2020)
 
 ## Language Adapters - Invertible Adapters
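For illustration, the new options slot into the existing `BnConfig` example from this docs page. A minimal sketch, assuming only the option names given in the commit message; the `reduction_factor` and `non_linearity` values are placeholders, not taken from this commit:

```python
from adapters import BnConfig

# A bottleneck adapter after the multi-head attention block, AdapterPlus-style.
config = BnConfig(
    mh_adapter=True,         # adapter after the multi-head attention block
    output_adapter=False,    # no adapter after the feed-forward block
    reduction_factor=64,     # placeholder bottleneck ratio
    non_linearity="gelu",    # placeholder activation
    scaling="channel",       # new: learnable per-channel scaling
    init_weights="houlsby",  # new: zero-centered Gaussian, std 1e-2, truncated at 2 std
)
```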

notebooks/README.md

Lines changed: 1 addition & 0 deletions
@@ -35,3 +35,4 @@ As adapters is fully compatible with HuggingFace's Transformers, you can also us
 | [NER on Wikiann](https://github.com/Adapter-Hub/adapters/blob/main/notebooks/08_NER_Wikiann.ipynb) | Evaluating adapters on NER on the wikiann dataset | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/08_NER_Wikiann.ipynb) |
 | [Finetuning Whisper with Adapters](https://github.com/Adapter-Hub/adapters/blob/main/notebooks/Adapter_Whisper_Audio_FineTuning.ipynb) | Fine Tuning Whisper using LoRA | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/Adapter_Whisper_Audio_FineTuning.ipynb) |
 | [Adapter Training with ReFT](https://github.com/Adapter-Hub/adapters/blob/main/notebooks/ReFT_Adapters_Finetuning.ipynb) | Fine Tuning using ReFT Adapters | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/ReFT_Adapters_Finetuning.ipynb) |
+| [ViT Fine-Tuning with AdapterPlus](https://github.com/Adapter-Hub/adapters/blob/main/notebooks/ViT_AdapterPlus_FineTuning.ipynb) | ViT Fine-Tuning with AdapterPlus | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/ViT_AdapterPlus_FineTuning.ipynb) |
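The stochastic depth (`drop_path`) support called out in the commit message drops the adapter's residual branch for a random subset of samples during training and rescales the survivors. A minimal sketch of the standard technique, not necessarily the exact code in `methods/vision.py`:

```python
import torch

def drop_path(x: torch.Tensor, drop_prob: float = 0.0, training: bool = False) -> torch.Tensor:
    """Stochastic depth: randomly zero a residual branch per sample."""
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    # One Bernoulli draw per sample, broadcast over all remaining dims.
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = x.new_empty(shape).bernoulli_(keep_prob)
    return x * mask / keep_prob  # rescale so the expected output is unchanged
```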
