ENH: Adapter injection based on state_dict (#2637)
Make it possible to inject the PEFT adapters based on a state_dict
instead of the PEFT config.
See huggingface/diffusers#11874 for context.
Description
Right now, when creating a PEFT adapter like LoRA, the adapter layers
are injected based on the PEFT config, most notably the entries in
`target_modules`, but other arguments also play into this. Generally,
this is a good approach, but it breaks down in some situations. For
instance, in diffusers, we often encounter checkpoints that were created without PEFT/diffusers, so there is no PEFT config, only the `state_dict`. To load these checkpoints in diffusers, the current
approach is to reverse-engineer a valid PEFT config based on the keys in
the `state_dict`.
Unfortunately, this is error-prone. Moreover, not every combination of `state_dict` keys can easily be expressed in a PEFT config through a combination of `target_modules`, `exclude_modules`, etc. In theory, everything can be expressed by passing `target_modules=<regex_pattern>`, but reverse-engineering such a regex correctly and efficiently is very hard (and thus currently not done).
This PR implements a completely different approach to injecting adapters. Instead of relying on the PEFT config to determine which layers to target, it takes the `state_dict` directly as the source of truth. This should make it possible to match exactly what is desired.
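To make the proposed usage concrete, here is a minimal sketch based on the documentation change further below (the base model and the checkpoint path are placeholders, not part of this PR):

```python
from safetensors.torch import load_file

from peft import LoraConfig, inject_adapter_in_model, set_peft_model_state_dict

model = ...  # the base torch.nn.Module that the LoRA checkpoint was trained for
state_dict = load_file("adapter_model.safetensors")  # placeholder path to the checkpoint

# The state_dict keys, not the config, determine which layers get adapters injected,
# so target_modules can be left at its default.
lora_config = LoraConfig()
model = inject_adapter_in_model(lora_config, model, state_dict=state_dict)

# Injection only creates uninitialized adapter layers; the actual weights are loaded separately.
set_peft_model_state_dict(model, state_dict)
```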
Implementation details
I took care to implement this change so that if no `state_dict` is passed, exactly the same code path as before is taken. The risk of breaking anything should thus be minimal.
Technically, it is not necessary to pass the full `state_dict`; we are only interested in its keys. I still called the argument `state_dict`, since that is typically what we have at this point, but this can easily be changed.
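As a hypothetical illustration of that point (reusing the names from the sketch above): since only the keys are consulted during injection, a dict with dummy values should select the same layers, though the real values are of course still needed later to populate the weights.

```python
import torch

# Hypothetical, per the description above: injection only looks at the keys,
# so the tensor values passed here are irrelevant for deciding what to inject.
keys_only = {key: torch.empty(0) for key in state_dict}
model = inject_adapter_in_model(lora_config, model, state_dict=keys_only)
```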
I thought it might be a good idea, when the `state_dict` is used, to still check which modules would have been targeted had we used the PEFT config. The two results are then compared, and a warning is given if they differ. This lets the user see whether the PEFT config is not correctly specified. While running some diffusers tests, I never encountered this warning, which is good. However, if we plan, for instance, to get rid of all the reverse-engineering of the PEFT config in diffusers, it would make more sense not to give this warning.
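For anyone who wants to surface that comparison explicitly, a small sketch using standard `warnings` handling (the exact warning category and message are whatever PEFT emits; nothing here is specific to this PR's internals):

```python
import warnings

# Capture any warnings raised during injection, e.g. a mismatch between what the
# PEFT config would have targeted and what the state_dict keys actually contain.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model = inject_adapter_in_model(lora_config, model, state_dict=state_dict)

for w in caught:
    print(f"{w.category.__name__}: {w.message}")
```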
Caveats
When the original LoRA model was using `target_parameters`, injecting from a `state_dict` will not work correctly. The problem is that the `state_dict` looks the same whether a module or a parameter was targeted. Therefore, we cannot correctly determine the user's intent.
For now, what I decided to do is:
1. Always assume that `target_modules` is meant, as it's the far more
common occurrence.
2. When we detect `target_parameters` while using `state_dict` for
injection, we raise an error.
3. If we don't detect this, injection might just slip through, resulting
in modules being targeted (if they are valid modules) instead of
parameters.
4. Document that these two features don't work together.
I think overall, this is not too concerning, as both features are rather
niche and thus unlikely to be used in conjunction.
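To illustrate the ambiguity described above, here is a hypothetical sketch (the module and parameter names are made up): per this PR's description, the adapter `state_dict` produced by either config below looks the same, so the keys alone cannot reveal which of the two was intended.

```python
from peft import LoraConfig

# Targeting a module: LoRA wraps the whole nn.Module named "experts".
config_modules = LoraConfig(target_modules=["experts"])

# Targeting a parameter: LoRA targets an nn.Parameter inside that module.
config_params = LoraConfig(target_parameters=["experts.weight"])

# Both yield state_dict keys of the same form, so state_dict-based injection
# assumes target_modules and raises an error if target_parameters is detected.
```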
Related changes
While working on this PR, I made a couple of related, though not
strictly necessary, changes:
- Refactor tests in `test_low_level_api.py` to use pytest instead of
unittest
- Add default target modules for LoHa and LoKr (just copying LoRA)
- Most PEFT methods' model classes, like `LoraModel`, had an `__init__` that effectively just called `super()` with the same arguments. I removed these `__init__` methods (see the sketch below).
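As a rough sketch of the last point (the exact signatures varied per method), the removed methods looked approximately like this and added no behavior over the parent class:

```python
from peft.tuners.tuners_utils import BaseTuner

class LoraModel(BaseTuner):
    # An __init__ like this only forwarded its arguments, so it can be dropped
    # in favor of inheriting BaseTuner.__init__ directly.
    def __init__(self, model, config, adapter_name) -> None:
        super().__init__(model, config, adapter_name)
```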
Changed file: docs/source/developer_guides/low_level_api.md (+23 −1)
@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.

# Adapter injection

-With PEFT, you can inject trainable adapters into any `torch` module which allows you to use adapter methods without relying on the modeling classes in PEFT. Currently, PEFT supports injecting [LoRA](../conceptual_guides/adapter#low-rank-adaptation-lora), [AdaLoRA](../conceptual_guides/adapter#adaptive-low-rank-adaptation-adalora), and [IA3](../conceptual_guides/ia3) into models because for these adapters, inplace modification of the model is sufficient for finetuning it.
+With PEFT, you can inject trainable adapters into any `torch` module which allows you to use adapter methods without relying on the modeling classes in PEFT. This works for all adapters except for those based on prompt learning (e.g. prefix tuning or p-tuning).

Check the table below to see when you should inject adapters.
@@ -87,6 +87,28 @@ DummyModel(
)
```

+### Injection based on a `state_dict`
+
+Sometimes, it is possible that there is a PEFT adapter checkpoint but the corresponding PEFT config is not known for whatever reason. To inject the PEFT layers for this checkpoint, you would usually have to reverse-engineer the corresponding PEFT config, most notably the `target_modules` argument, based on the `state_dict` from the checkpoint. This can be cumbersome and error-prone. To avoid this, it is also possible to call [`inject_adapter_in_model`] and pass the loaded `state_dict` as an argument:
+
+```python
+model = inject_adapter_in_model(lora_config, model, state_dict=state_dict)
+```
+
+In this case, PEFT will use the `state_dict` as the reference for which layers to target instead of using the PEFT config. As a user, you don't have to set the exact `target_modules` of the PEFT config for this to work. However, you should still pass a PEFT config of the right type, in this example `LoraConfig`; you can leave the `target_modules` as `None`.
+
+Be aware that this still only creates the uninitialized PEFT layers; the values from the `state_dict` are not used to populate the model weights. To populate the weights, proceed with calling [`set_peft_model_state_dict`] as described below.
+
+⚠️ Note that if there is a mismatch between what is configured in the PEFT config and what is found in the `state_dict`, PEFT will warn you about this. You can ignore the warning if you know that the PEFT config is not correctly specified.
+
+> [!WARNING]
+> If the original PEFT adapter was using `target_parameters` instead of `target_modules`, injecting from a `state_dict` will not work correctly. In this case, it is mandatory to use the correct PEFT config for injection.
+
## Saving the model

To only save the adapter, use the [`get_peft_model_state_dict`] function: