Integration of PVeRA #2952
Conversation
…ation, and the tests.

Update: ran …
githubnemo left a comment:
Thanks for the PR, this already looks quite good!
Please check the copyright notices and make sure that they are up to date (they often say 2024 but should say 2025).
I've done a quick review and left a few comments. General remarks:
- let's add PVeRA to `tests/test_custom_models.py` (adding the important configurations to `TEST_CASES`, similar to VeRA; see the sketch below)
- if these pass, we can extend the coverage by adding PVeRA to `tests/test_decoder_models.py` and `tests/test_encoder_decoder_models.py`
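For concreteness, here is a rough sketch of what such `TEST_CASES` entries could look like, assuming the (test name, model id, config class, config kwargs) tuple layout used by the existing VeRA entries; the config kwargs and the `PveraConfig` import are placeholders based on this PR, not the final API:

```python
from peft import PveraConfig  # assumed import path once this PR is merged

# hypothetical entries mirroring the style of the existing TEST_CASES tuples
PVERA_TEST_CASES = [
    ("Vanilla MLP 1 PVeRA", "MLP", PveraConfig, {"target_modules": "lin0"}),
    ("Vanilla MLP 2 PVeRA", "MLP", PveraConfig, {"target_modules": ["lin0"], "pvera_dropout": 0.05}),
    ("Vanilla MLP 3 PVeRA", "MLP", PveraConfig, {"target_modules": ["lin0"], "d_initial": 0.2}),
]
```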
Once these are implemented I'll do a more thorough review. After that it'd be nice to have a runnable example and to integrate it into the method comparison to benchmark it against the other methods (and to check our expectations).
Heads up: I'll be rather off than on in the coming days, so merging and review will most likely happen next year.
Hello @githubnemo, thank you for your review! I added two commits: …

After all these changes, I re-ran …
githubnemo left a comment:
Hey, thanks for the update!
Implementation and tests look quite mature. I think that if we provide bitsandbytes support we also need at least one test in `tests/test_gpu_examples.py` for quality control. I think `test_causal_lm_training_4bit_vera` could be used as a base.
In general I think we should rename `PVeRA*` to `Pvera*` (e.g., `PVeRAModel` -> `PveraModel`) to be consistent with `VeraModel` and friends. It is quite hard to remember the spelling of the various abbreviations already :)
Rest of the review is in the comments.
After the comments are resolved and the CI is green I think it would be nice to integrate PVeRA into the MetaMathQA benchmark by adding an experiment file based on the VeRA experiment.
Edit: To fix the docs build, add the PVeRA entry to `docs/source/_toctree.yml`, similar to the other entries.
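Regarding the GPU test suggestion above, here is a very rough, hypothetical sketch of the kind of 4-bit smoke test that could be added; the model id, target modules, and `PveraConfig` arguments are placeholders, and the real test should follow the structure of `test_causal_lm_training_4bit_vera` rather than this simplified version (requires a CUDA GPU with bitsandbytes installed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PveraConfig  # assumed import path once this PR is merged
from peft import get_peft_model, prepare_model_for_kbit_training

model_id = "facebook/opt-125m"  # placeholder base model
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, PveraConfig(r=8, target_modules=["q_proj", "v_proj"]))

# single training step as a smoke test: the loss should be finite and backprop should work
batch = tokenizer("PEFT makes fine-tuning large models cheap.", return_tensors="pt").to(model.device)
output = model(**batch, labels=batch["input_ids"])
output.loss.backward()
assert torch.isfinite(output.loss)
```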
```python
super(nn.Linear, self).__init__()
PVeRALayer.__init__(self, base_layer, **kwargs)
self.fan_in_fan_out = fan_in_fan_out
self.sample_at_inference = sample_at_inference
```
For consistency I think this should be a dict so that I can toggle this behavior for each adapter.
Here, are you talking about `sample_at_inference`?
Ah, I always forget that GitHub doesn't highlight the commented lines. Yes, I was talking about `sample_at_inference`.
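To illustrate the per-adapter suggestion, here is a minimal, hypothetical sketch (not the actual PEFT layer) of keeping `sample_at_inference` in a dict keyed by adapter name, so it can be toggled independently for each adapter:

```python
import torch.nn as nn


class PveraLinearSketch(nn.Module):
    """Simplified stand-in for the PVeRA Linear layer, showing only the flag handling."""

    def __init__(self):
        super().__init__()
        # adapter name -> whether to sample from the learned distribution at inference
        self.sample_at_inference: dict[str, bool] = {}

    def update_layer(self, adapter_name: str, sample_at_inference: bool = False) -> None:
        self.sample_at_inference[adapter_name] = sample_at_inference


layer = PveraLinearSketch()
layer.update_layer("default", sample_at_inference=True)
layer.update_layer("other", sample_at_inference=False)
print(layer.sample_at_inference)  # {'default': True, 'other': False}
```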
```python
elif issubclass(config_cls, (VBLoRAConfig, RandLoraConfig, OSFConfig)):
    lr = 0.01  # otherwise we get nan
elif issubclass(config_cls, PVeRAConfig):  # needs very small lr to not get nan
    lr = 1e-6
```
This is indeed very small - do you have an idea which gradients are exploding and if this is a problem in practice?
I checked; the problem comes from `pvera_lambda_b`, whose parameters are NaN after the third epoch. This seems to be caused by the input, which has values up to 90; in practice this shouldn't occur since inputs are usually normalized.
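As an aside, here is a toy example (unrelated to the actual PVeRA code) showing how the gradient of a linear map scales with the input magnitude, which is why inputs up to ~90 can require a much smaller learning rate than normalized inputs:

```python
import torch

torch.manual_seed(0)
for scale in (1.0, 90.0):
    x = scale * torch.randn(16, 32)            # normalized vs. unnormalized inputs
    lam = torch.zeros(32, requires_grad=True)  # stand-in trainable vector, not the real pvera_lambda_b
    loss = ((x * lam).sum(dim=-1) - 1.0).pow(2).mean()
    loss.backward()
    print(f"input scale {scale:5.1f} -> grad norm {lam.grad.norm():.2f}")
```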
OK. Let's extend the comment to be more precise:
```python
# needs a very small lr to not get nan in pvera_lambda_b due to high input values in this test (up to 90)
```
`tests/testing_common.py` (Outdated)
```diff
  model = self.transformers_class.from_pretrained(model_id)
  model = get_peft_model(model, config)
- model = model.to(self.torch_device)
+ model = model.to(self.torch_device).eval()
```
This seems wrong since this is a training test - why put the model in eval mode?
Yes, you're right, it doesn't make sense to call `.eval()`. However, there is a problem: `model` is in training mode here, whereas `model_from_pretrained` is in eval mode. This means that for PVeRA one model will sample from the learned distribution and the other will not, therefore giving different results, which is why I had put them both in eval mode. Do you think we should maybe just remove this test for PVeRA?
I see. Let's check if we can make this consistent by setting the model to eval mode after getting the loss (i.e. in the context block where the comparison happens) and moving the `logits = ...` retrieval down there.
For example:
```python
with tempfile.TemporaryDirectory() as tmp_dirname:
    model.eval()
    logits = model(**inputs)[0][0]
    model.save_pretrained(tmp_dirname)
    model_from_pretrained = self.transformers_class.from_pretrained(model_id)
    model_from_pretrained = PeftModel.from_pretrained(model_from_pretrained, tmp_dirname).to(
        self.torch_device
    )
    logits_from_pretrained = model_from_pretrained(**inputs)[0][0]
    atol, rtol = 1e-4, 1e-4
    assert torch.allclose(logits, logits_from_pretrained, atol=atol, rtol=rtol)
```
IMO this should not change the test for the worse and should not break anything, but we'll see :)
`tests/testing_common.py` (Outdated)
```python
model_from_pretrained = (
    PeftModel.from_pretrained(model_from_pretrained, tmp_dirname).to(self.torch_device).eval()
```
same as above
Hello @githubnemo, thanks again for your review. I have updated the code with most of your comments, but the following issues still remain: …
githubnemo left a comment:
Thanks for the changes!
I've commented on hopefully all the outstanding issues and added a few nits.
CI seems to be passing except for the `.eval()` issue, which is hopefully resolved with the proposed fix.
`src/peft/tuners/pvera/__init__.py` (Outdated)
```python
from .model import PveraModel


__all__ = ["Linear", "PVeRALayer", "PveraConfig", "PveraModel"]
```
```diff
- __all__ = ["Linear", "PVeRALayer", "PveraConfig", "PveraModel"]
+ __all__ = ["Linear", "PveraLayer", "PveraConfig", "PveraModel"]
```
For the class itself as well, not just in `__all__`.
```python
def test_multiple_adapters_save_load_save_projection_true(self, mlp_same_prng, tmp_path):
    # check saving and loading works with multiple adapters and saved projection weights
    torch.manual_seed(0)
    input = torch.randn(5, 10)
    mlp_same_prng.set_adapter("default")
    mlp_same_prng.eval()
    output_default = mlp_same_prng(input)
    mlp_same_prng.set_adapter("other")
    output_other = mlp_same_prng(input)

    # sanity check
    assert not torch.allclose(output_default, output_other, atol=1e-3, rtol=1e-3)

    save_path = tmp_path / "pvera"
    mlp_same_prng.save_pretrained(save_path)
    assert os.path.exists(save_path / "adapter_config.json")
    assert os.path.exists(save_path / "other" / "adapter_config.json")

    torch.manual_seed(0)
    mlp = MLP()
    peft_model = PeftModel.from_pretrained(mlp, save_path)
    peft_model.load_adapter(save_path / "other", "other")
    peft_model.eval()

    peft_model.set_adapter("default")
    output_default_loaded = peft_model(input)
    peft_model.set_adapter("other")
    output_other_loaded = peft_model(input)

    assert torch.allclose(output_default, output_default_loaded, atol=1e-3, rtol=1e-3)
    assert torch.allclose(output_other, output_other_loaded, atol=1e-3, rtol=1e-3)
```
`test_multiple_adapters_save_load_save_projection_true` is probably already covered by the common tests, or am I missing something special?
To be honest, I integrated these (`test_multiple_adapters_save_load_save_projection_true` and `test_multiple_adapters_save_projection_true_contains_pvera_A_pvera_B`) because they were integrated for VeRA, but I agree that they do seem a bit redundant. I'm OK with removing them if you think they aren't useful.
| "r": 8, | ||
| "target_modules": None, | ||
| "pvera_dropout": 0.05, | ||
| "d_initial": 0.1, | ||
| "save_projection": True, | ||
| "bias": "none", | ||
| "task_type": "SEQ_2_SEQ_LM", |
Let's just mention the non-default values here.
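For example, assuming `pvera_dropout=0.0` and `d_initial=0.1` are the defaults (as shown in `config.py`) and that `save_projection=True`, `bias="none"`, and `target_modules=None` are defaults as well, a trimmed version might look like the following; which entries are truly non-default depends on the final defaults:

```python
# hypothetical trimmed kwargs, keeping only values assumed to differ from the defaults
pvera_kwargs = {
    "r": 8,
    "pvera_dropout": 0.05,
    "task_type": "SEQ_2_SEQ_LM",
}
```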
| "d_initial": 0.1, | ||
| "save_projection": True, | ||
| "bias": "none", | ||
| }, |
Same as below, just mention the non-default values.
`src/peft/tuners/pvera/config.py` (Outdated)
```python
        },
    )
    pvera_dropout: float = field(default=0.0, metadata={"help": "PVeRA dropout"})
    d_initial: float = field(default=0.1, metadata={"help": "Initial init value for d vector."})
```
```diff
- d_initial: float = field(default=0.1, metadata={"help": "Initial init value for d vector."})
+ d_initial: float = field(default=0.1, metadata={"help": "Initial value for d vector."})
```
But we can just use the docstring for this parameter from above verbatim
`src/peft/tuners/pvera/config.py` (Outdated)
| "List of module names or regex expression of the module names to replace with PVeRA." | ||
| "For example, ['q', 'v'] or '.*decoder.*(SelfAttention|EncDecAttention).*(q|v)$'. " | ||
| "Only linear layers are supported." |
Let's use the docstring values from above for all help values. In the end this makes it more maintainable and we're losing not much.
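One possible pattern for this (a hypothetical sketch; the simplest option is to copy the docstring text verbatim into each `help`) is to keep the wording identical in the class docstring and the field metadata so they stay in sync:

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class PveraConfigSketch:
    """
    Args:
        target_modules (`Optional[Union[list[str], str]]`):
            The names of the modules to apply PVeRA to. Only linear layers are supported.
    """

    target_modules: Optional[Union[list[str], str]] = field(
        default=None,
        # same wording as in the docstring above, kept in sync by hand
        metadata={"help": "The names of the modules to apply PVeRA to. Only linear layers are supported."},
    )
```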
Also make sure to add this file to `docs/source/_toctree.yml`, possibly next to VeRA.
Hello @githubnemo,
This PR is a continuation of issue #2948, for which we proposed the integration of the PVeRA adapter.
As recommended in the issue, we based our implementation on that of the VeRA adapter, as the two adapters are very close. Here is a list of the contributions from this PR:
- Adapted the `config.py`, `layer.py`, and `model.py` files from the ones from VeRA.
- Adapted `tests/test_vera.py` to `tests/test_pvera.py` and made sure it ran properly.

Note: because I'm running on a Mac, I was not able to run `make test` (I had an error with MPS).

@BenjaminBossan, could you please give me some feedback?
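For readers following along, here is a minimal, hypothetical usage sketch of the adapter proposed in this PR; the import path and config arguments follow the PR's current state and may still change before merge:

```python
from transformers import AutoModelForCausalLM
from peft import get_peft_model
from peft import PveraConfig  # assumed import path from this PR

# placeholder base model; PVeRA targets linear layers such as the attention projections
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
config = PveraConfig(r=8, target_modules=["q_proj", "v_proj"], pvera_dropout=0.05)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```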