Update dependency peft to v0.17.1 #61
This PR contains the following updates:
peft: ==0.3.0 -> ==0.17.1
Release Notes
huggingface/peft (peft)
v0.17.1 (Compare Source)
This patch release contains a few fixes (via #2710) for the newly introduced target_parameters feature, which allows LoRA to target nn.Parameters directly (useful for mixture of expert layers). Most notably, adding multiple adapters (e.g. via model.add_adapter or model.load_adapter) did not work correctly. Since a solution is not trivial, PEFT now raises an error to prevent this situation.
v0.17.0: SHiRA, MiSS, LoRA for MoE, and more (Compare Source)
Highlights
New Methods
SHiRA
@kkb-code contributed Sparse High Rank Adapters (SHiRA, paper), which promise a potential gain in performance over LoRA; in particular, the concept loss observed when using multiple adapters is improved. Since the adapters only train on 1-2% of the weights and are inherently sparse, switching between adapters may be cheaper than with LoRA. (#2584)
MiSS
@JL-er added a new PEFT method, MiSS (Matrix Shard Sharing) in #2604. This method is an evolution of Bone, which, according to our PEFT method comparison benchmark, gives excellent results when it comes to performance and memory efficiency. If you haven't tried it, you should do so now.
At the same time, Bone will be deprecated in favor of MiSS and will be removed in PEFT v0.19.0. If you already have a Bone checkpoint, you can use scripts/convert-bone-to-miss.py to convert it into a MiSS checkpoint and proceed with training using MiSS.
Enhancements
LoRA for nn.Parameter
LoRA is now able to target nn.Parameter directly (#2638, #2665)! Ever had this complicated nn.Module with promising parameters inside, but it was too custom to be supported by your favorite fine-tuning library? No worries, now you can target nn.Parameters directly using the target_parameters config attribute, which works similarly to target_modules.
This option can be especially useful for models with Mixture of Expert (MoE) layers, as those often use nn.Parameters directly and cannot be targeted with target_modules. For example, for the Llama4 family of models, use a config along the following lines to target the MoE weights:
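A minimal sketch of such a config, assuming the expert weights are exposed as feed_forward.experts.gate_up_proj and feed_forward.experts.down_proj (these parameter names and the checkpoint path are illustrative and may differ for your model):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Hypothetical MoE checkpoint; substitute the Llama4 variant you actually use.
model = AutoModelForCausalLM.from_pretrained("path/to/llama4-moe-checkpoint")

# Target the expert nn.Parameters directly via target_parameters; the names
# below are assumptions and should be checked against the model's module tree.
config = LoraConfig(
    target_parameters=[
        "feed_forward.experts.gate_up_proj",
        "feed_forward.experts.down_proj",
    ],
)
peft_model = get_peft_model(model, config)
```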
Note that this feature is still experimental as it comes with a few caveats and therefore might change in the future. Also, MoE weights with many experts can be quite huge, so expect higher memory usage compared to targeting normal nn.Linear layers.
Injecting adapters based on a state_dict
Sometimes, it is possible that there is a PEFT adapter checkpoint but the corresponding PEFT config is not known for whatever reason. To inject the PEFT layers for this checkpoint, you would usually have to reverse-engineer the corresponding PEFT config, most notably the target_modules argument, based on the state_dict from the checkpoint. This can be cumbersome and error prone. To avoid this, it is also possible to call inject_adapter_in_model and pass the loaded state_dict as an argument:
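A minimal sketch, assuming the adapter checkpoint is a safetensors file and that the loaded state dict is passed via a state_dict keyword argument (check the PEFT docs for the exact signature):

```python
from safetensors.torch import load_file
from transformers import AutoModelForCausalLM
from peft import LoraConfig, inject_adapter_in_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model

# Adapter weights for which the original PEFT config is unknown.
state_dict = load_file("path/to/adapter_model.safetensors")  # placeholder path

# The layers to inject are derived from the state_dict instead of being
# reverse-engineered by hand via target_modules.
config = LoraConfig()
model = inject_adapter_in_model(config, base_model, state_dict=state_dict)
```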
Find more on state_dict-based injection in the docs.
Changes
Compatibility
A bug in prompt learning methods caused modules_to_save to be ignored. Especially classification tasks are affected, since they usually add the classification/score layer to modules_to_save. In consequence, these layers were neither trained nor stored after training. This has been corrected now. (#2646)
All Changes
New Contributors
Full Changelog: huggingface/peft@v0.16.0...v0.17.0
v0.16.0: LoRA-FA, RandLoRA, C³A, and much more (Compare Source)
Highlights
New Methods
LoRA-FA
In #2468, @AaronZLT added the LoRA-FA optimizer to PEFT. This optimizer is based on AdamW and increases the memory efficiency of LoRA training. This means that you can train LoRA with less memory, or, with the same memory budget, use higher LoRA ranks, potentially getting better results.
RandLoRA
Thanks to @PaulAlbert31, a new PEFT method called RandLoRA was added to PEFT (#2464). Similarly to VeRA, it uses non-learnable random low rank matrices that are combined through learnable matrices. This way, RandLoRA can approximate full rank updates of the weights. Training models quantized with bitsandbytes is supported.
C³A
@Phoveran added Circular Convolution Adaptation, C3A, in #2577. This new PEFT method can overcome the low-rank limitation seen e.g. in LoRA while still promising to be fast and memory efficient.
Enhancements
- Thanks to @gslama12 and @SP1029, LoRA now supports Conv2d layers with groups != 1. This requires the rank r to be divisible by groups (see the sketch after this list). See #2403 and #2567 for context.
- @dsocek added support for Intel Neural Compressor (INC) quantization to LoRA in #2499.
- DoRA now supports Conv1d layers thanks to @EskildAndersen (#2531).
- Passing init_lora_weights="orthogonal" now enables orthogonal weight initialization for LoRA (#2498).
- @gapsong brought us Quantization-Aware LoRA training in #2571. This can make QLoRA training more efficient; please check the included example. Right now, only GPTQ is supported.
- There has been a big refactor of Orthogonal Finetuning (OFT) thanks to @zqiu24 (#2575). This makes the PEFT method run more quickly and require less memory. It is, however, incompatible with old OFT checkpoints. If you have old OFT checkpoints, either pin the PEFT version to <0.16.0 or retrain them with the new PEFT version.
- Thanks to @keepdying, LoRA hotswapping with compiled models no longer leads to CUDA graph re-records (#2611).
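A minimal sketch of the grouped Conv2d support mentioned in the first item above; the toy module and dimensions are made up for illustration, and the key constraint is that r must be divisible by groups:

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class TinyConvNet(nn.Module):  # hypothetical toy model
    def __init__(self):
        super().__init__()
        # Grouped convolution with groups=4.
        self.conv = nn.Conv2d(16, 32, kernel_size=3, groups=4)

    def forward(self, x):
        return self.conv(x)

# r=8 is divisible by groups=4, so LoRA can wrap the grouped Conv2d layer.
config = LoraConfig(target_modules=["conv"], r=8)
model = get_peft_model(TinyConvNet(), config)
model.print_trainable_parameters()
```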
Changes
Compatibility
- required_grads_ of modules_to_save is now set to True when used directly with inject_adapter. This is relevant for PEFT integrations, e.g. Transformers or Diffusers.
- Transformers refactored its vision language model (VLM) classes; if you previously applied PEFT to vlm.language_model, it will no longer work, please apply it to vlm directly (see #2554 for context). Moreover, the refactor results in different checkpoints. We managed to ensure backwards compatibility in PEFT, i.e. old checkpoints can be loaded successfully. There is, however, no forward compatibility, i.e. loading checkpoints trained after the refactor is not possible with package versions from before the refactor (PEFT <0.16.0 and transformers <4.52.0, respectively). In this case, you need to upgrade PEFT and transformers. More context in #2574.
All Changes
- modules_to_save by @githubnemo in #2481
- add_weighted_adapter by @Beinsezii in #2512
- rank_pattern, rank_alpha for add_weighted_adapter by @Beinsezii in #2550
- prepare_model_for_gradient_checkpointing protected to public by @qgallouedec in #2569
New Contributors
Full Changelog: huggingface/peft@v0.15.2...v0.16.0
v0.15.2 (Compare Source)
This patch fixes a bug that caused prompt learning methods like P-tuning not to work (#2477).
v0.15.1 (Compare Source)
This patch includes a fix for #2450. In this bug, modules_to_save was not handled correctly when used in conjunction with DeepSpeed ZeRO stage 3, which resulted in those modules being placeholder values in the saved checkpoints.
Full Changelog: huggingface/peft@v0.15.0...v0.15.1
v0.15.0 (Compare Source)
Highlights
New Methods
CorDA: Context-Oriented Decomposition Adaptation
@iboing and @5eqn contributed CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning. This task-driven initialization method has two modes, knowledge-preservation and instruction-preservation, both using external data to select ranks intelligently. The former can be used to select those ranks that correspond to weights not affiliated with knowledge from, say, a QA dataset. The latter can be used to select those ranks that correspond most to the task at hand (e.g., a classification task). (#2231)
Trainable Tokens: Selective token update
The new Trainable Tokens tuner allows for selective training of tokens without re-training the full embedding matrix, e.g. when adding support for reasoning / thinking tokens. This is a lot more memory efficient and the saved checkpoint is much smaller. It can be used standalone or in conjunction with LoRA adapters by passing trainable_token_indices to LoraConfig. (#2376)
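A minimal sketch of combining LoRA with Trainable Tokens, assuming a plain list of token indices is accepted (the base model, target module, and token index below are placeholders; see the PEFT docs for the exact accepted formats):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model

# Train LoRA adapters plus a handful of embedding rows (e.g. newly added
# reasoning/thinking tokens) instead of the full embedding matrix.
config = LoraConfig(
    target_modules=["c_attn"],        # GPT-2 attention projection
    trainable_token_indices=[50256],  # placeholder token index
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```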
Enhancements
- LoRA now supports targeting multihead attention modules (but for now only those with _qkv_same_embed_dim=True). These modules were tricky as they may expose linear submodules but won't use their forward methods, therefore needing explicit support. (#1324)
- Hotswapping now allows different alpha scalings and ranks without recompilation of the model when the model is prepared using a call to prepare_model_for_compiled_hotswap() before compiling the model (see the sketch after this list). (#2177)
- GPTQModel support was added in #2247 as a replacement for AutoGPTQ, which is not maintained anymore.
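A minimal sketch of the compiled hotswap flow, assuming the helpers live in peft.utils.hotswap and that prepare_model_for_compiled_hotswap accepts a target_rank argument (adapter paths and the rank are placeholders; check the hotswapping docs for the exact signatures):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
from peft.utils.hotswap import hotswap_adapter, prepare_model_for_compiled_hotswap

base = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model
model = PeftModel.from_pretrained(base, "path/to/adapter_0")  # placeholder path

# Pad LoRA weights up to a common rank so that adapters with different ranks
# and alpha scalings can later be swapped in without recompilation.
prepare_model_for_compiled_hotswap(model, target_rank=32)
model = torch.compile(model)

# Swap in a second adapter in place of the first one; no recompilation needed.
hotswap_adapter(model, "path/to/adapter_1", adapter_name="default")
```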
Changes
- It is now possible to use all-linear as target_modules for custom (non-transformers) models (#2267). With this change comes a bugfix where it was possible that non-linear layers were selected when they shared the same name with a linear layer (e.g., bar.foo and baz.foo).
- Custom PEFT methods can now be registered via the register_peft_method() call. (#2282)
- PEFT_TYPE_TO_MODEL_MAPPING is now deprecated and should not be relied upon. Use PEFT_TYPE_TO_TUNER_MAPPING instead. (#2282)
- Fixed a bug where modules_to_save keys wrongly matched parts of the state dict if the key was a substring of another key (e.g., classifier and classifier2). (#2334)
- Input dtype casting can now be disabled by passing disable_input_dtype_casting=True. (#2353)
- The rank_pattern and alpha_pattern used by many adapters now support matching full paths as well by specifying the pattern with a caret in front, for example ^foo to target model.foo but not model.bar.foo (see the sketch below). (#2419)
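A minimal sketch of the caret-anchored pattern matching; the module names are hypothetical:

```python
from peft import LoraConfig

# rank_pattern/alpha_pattern keys are matched against module paths; a leading
# caret anchors the pattern at the root, so "^foo" matches model.foo but not
# model.bar.foo (which keeps the default r and alpha).
config = LoraConfig(
    target_modules=["foo"],      # hypothetical module name
    r=8,
    rank_pattern={"^foo": 32},   # only the top-level "foo" gets rank 32
    alpha_pattern={"^foo": 64},  # matching scaling override for the same layer
)
```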
What's Changed
- adapter_name conflict with tuner by @pzdkn in #2254
- "all-linear" to target custom models by @BenjaminBossan in #2267
- __all__ by @bluenote10 in #2280
- config.py by @innerlee in #2297
- prepare_model_for_kbit_training docstring by @NilBiescas in #2305
- resize_token_embeddings to docs by @bingwork in #2290
- get_peft_model() for in-place base model modification by @d-kleine in #2313
- low_cpu_mem_usage=True with 8bit bitsandbytes by @BenjaminBossan in #2325
- PEFT_TYPE_TO_MODEL_MAPPING variable with deprecation by @BenjaminBossan in #2328
- modules_to_save loading if substring by @BenjaminBossan in #2334
- modules_to_save by @BenjaminBossan in #2220
- torch.compile tests and docs by @BenjaminBossan in #2332
- nn.Conv1d by @CCLDArjun in #2333
- prepare_model_for_compiled_hotswap raises when no adapter was found by @BenjaminBossan in #2375
- hf_hub_download arguments are used when loading locally by @henryzhengr in #2373
Configuration
📅 Schedule: Branch creation - "after 5am on saturday" (UTC), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about these updates again.
To execute skipped test pipelines, write the comment /ok-to-test.
This PR has been generated by MintMaker (powered by Renovate Bot).