
Conversation

@PaulAlbert31
Contributor

Follow-up to issue #2441.

This is an implementation of RandLoRA in PEFT, including some support for quantized training.

Some work is left on proper BnB configurations; the current implementation still works with --load_4bit configurations. I would need some guidance on this.

The 4-bit and 8-bit layer classes are, however, implemented.

Similarly to VeRA, RandLoRA uses random bases that are shared between layers. These bases are stored in BufferDict objects, but I had trouble saving them with safetensors. This may be due to an outdated transformers version (4.46.3) used during testing.

Happy to iterate as per feedback.

@PaulAlbert31 PaulAlbert31 marked this pull request as draft March 28, 2025 06:14
Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for the PR. I've only skimmed it so far, but it already looks quite good; well done (good job using the new method registration).

I think the best next step would be to add RandLoRA to the unit testing harness. This will allow us to easily see what already works and what doesn't. For this, check for instance how VeRA was added here:

########
# VeRA #
########
("Vanilla MLP 1 VeRA", "MLP", VeraConfig, {"target_modules": "lin0"}),
("Vanilla MLP 2 VeRA", "MLP", VeraConfig, {"target_modules": ["lin0"]}),
("Vanilla MLP 3 VeRA", "MLP", VeraConfig, {"target_modules": ["lin1"]}),
("Vanilla MLP 4 VeRA", "MLP", VeraConfig, {"target_modules": ["lin0", "lin1"]}),
(
    "Vanilla MLP 5 VeRA",
    "MLP",
    VeraConfig,
    {"target_modules": ["lin0"], "modules_to_save": ["lin1"]},
),
(
    "Embedding + transformers Conv1D 1 VeRA",
    "EmbConv1D",
    VeraConfig,
    {"target_modules": ["conv1d"]},
),

Let's add similar entries for RandLoRA. If we then find that some tests are failing, I'll guide you through the steps needed to make them pass.
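
For illustration, the RandLoRA entries could mirror the VeRA ones above; this is just a sketch to append to the same list of test cases (with RandLoraConfig imported from peft), and the exact kwargs are open to discussion:

############
# RandLoRA #
############
("Vanilla MLP 1 RandLora", "MLP", RandLoraConfig, {"target_modules": "lin0"}),
("Vanilla MLP 2 RandLora", "MLP", RandLoraConfig, {"target_modules": ["lin0"]}),
("Vanilla MLP 3 RandLora", "MLP", RandLoraConfig, {"target_modules": ["lin1"]}),
("Vanilla MLP 4 RandLora", "MLP", RandLoraConfig, {"target_modules": ["lin0", "lin1"]}),
(
    "Vanilla MLP 5 RandLora",
    "MLP",
    RandLoraConfig,
    {"target_modules": ["lin0"], "modules_to_save": ["lin1"]},
),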

Later, we'll probably also have to add some dedicated RandLoRA tests to ensure that the weight sharing works (see https://github.com/huggingface/peft/blob/main/tests/test_vera.py), maybe also something for the special grad handling. But it's fine to leave this for later.

Some work is left to do on proper BnB configurations, the current implementation still works on --load_4bit configurations. Would need some guidance on this.

I'm not quite sure what you mean by that; could you explain further? But we can also leave the bnb integration for later; it might be better to get the normal RandLoRA fully working first.

but I had trouble using safetensors to save them. This may be due to a deprecated transformers version (4.46.3) during testing.

Could you give a small snippet to reproduce this?

Below, I'll list a few more steps required before merging the PR. This is just to get an overview and can be left for later; for now, let's focus on the points mentioned above.

@BenjaminBossan
Member

@PaulAlbert31 Could you please run make style? Otherwise, the CI won't run the tests. Also, please always ping me once your changes are ready.

@PaulAlbert31
Contributor Author

Hey @BenjaminBossan, apologies for the previous update; I am still learning my way around these tools.
I have updated the draft after running make style and fixing errors.

Regarding the safetensors error I mentioned earlier, it seems to go away after upgrading transformers, so nothing to worry about there. It happens when saving the model after training, so I guess this would be covered by the existing tests. Here is a small snippet I used to test.

import torch
from peft import ( 
    RandLoraConfig,
    get_peft_model
)
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, Trainer, TextDataset, DataCollatorForLanguageModeling

config = RandLoraConfig(
            r=16,
            randlora_alpha=32,
            randlora_dropout=0.05,
            target_modules=["k_proj"],
            bias="none",
            task_type="CAUSAL_LM",
            projection_prng_key=int(torch.exp(torch.tensor(1))*3.1415*1000),
)

model = AutoModelForCausalLM.from_pretrained(
            "Qwen/Qwen2.5-0.5B",
            torch_dtype=torch.float16,
            trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B", trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token # Set pad token if it's missing

model = get_peft_model(model, config)

dummy_text = "This is a dummy text for training.\nAnother line of dummy text.\nAnd a third line."
dummy_data_path = "dummy_data.txt" 
with open(dummy_data_path, "w") as f:
    f.write(dummy_text)

train_dataset = TextDataset(
    tokenizer=tokenizer,
    file_path=dummy_data_path,
    block_size=1
)

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

training_args = TrainingArguments(
    output_dir="./output_trainer",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    optim="adamw_torch", 
    save_steps=1000,
    logging_steps=10, 
    learning_rate=1e-4,
    weight_decay=0.01,
    fp16=True, 
    bf16=False,
    max_steps=-1, 
    logging_dir="./logs", 
    dataloader_num_workers=0, 
    save_total_limit=1, 
    save_safetensors=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator,
    tokenizer=tokenizer,
)

trainer.train()
model.save_pretrained("test_randlora")

Regarding quantized training, I meant that I use the load_in_4bit argument of AutoModelForCausalLM.from_pretrained to quantize the base model in my code. My understanding is that a BitsAndBytesConfig is the updated way of doing it.

Since this config interacts with the base model and not the adapter, I guess this is not important for PEFT?

Paul

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for the updates, the PR is making good progress.

A lot of the added tests are still failing. You can run them locally with:

pytest tests/test_custom_models.py -k randlora -v

The error most of the time is something like:

RuntimeError: The size of tensor a (20) must match the size of tensor b (10) at non-singleton dimension 1

I'm fairly certain that is due to how the slices are determined in get_scaled_bases but I'm not sure what the correct fix is. Could you please investigate?

Regarding the safetensor error I was mentioning earlier, this seems to go away after upgrading transformers so nothing to worry about there.

Nice. I tested your snippet and it passes for me too.

Regarding quantized training, I meant that I use the load_4_bit argument to the AutoModelForCausalLM.from_pretrained

Ah I see, no worries about that. In the PEFT tests, we pass a proper BitsAndBytesConfig via quantization_config. We will later just copy one of the existing PEFT bnb tests and make adjustments for RandLoRA. But as mentioned above, let's first get the non-quantized code fully running.
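
For reference, a minimal sketch of that path (the model id and target modules are placeholders, and the RandLoraConfig defaults from this PR are assumed):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import RandLoraConfig, get_peft_model, prepare_model_for_kbit_training

# quantize the base model through quantization_config rather than a bare load_in_4bit flag
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B", quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, RandLoraConfig(target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))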

Comment on lines 139 to 144
# check input size
if randlora_A_param.shape[-1] < self.in_features:
    raise ValueError(error_tmpl.format("randlora_A", randlora_A_param.shape[1], self.in_features))
# check output size
if randlora_B_param.shape[0] < self.out_features:
    raise ValueError(error_tmpl.format("randlora_B", randlora_B_param.shape[0], self.out_features))
Member

I wonder if the logic is correct here. Below, we have:

F.linear(F.linear(dropout(x), update_B), update_A), so it seems like B is applied first, so should be compared to in_features, and A is applied last, so should be compared to out_features. WDYT?
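
A quick standalone shape check of that composition (a sketch with placeholder dimensions; inner_dim stands in for the intermediate rank dimension):

import torch
import torch.nn.functional as F

in_features, out_features, inner_dim = 10, 20, 4
x = torch.randn(2, in_features)
update_B = torch.randn(inner_dim, in_features)   # applied first, so its last dim must match in_features
update_A = torch.randn(out_features, inner_dim)  # applied last, so its first dim must match out_features

out = F.linear(F.linear(x, update_B), update_A)
assert out.shape == (2, out_features)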

Contributor Author

update_A will always be used to match the smallest dimension of the weight matrix, to reduce trainable parameters (randlora_gamma is applied to the randlora_A random base and contains most of the trainable parameters).
The size selection for randlora_A and randlora_B was wrong, so I corrected it and added a comment here.

)
if randlora_A_param.shape[0] < self.r[adapter_name]:
    raise ValueError(error_tmpl.format("randlora_A", randlora_A_param.shape[0], self.r[adapter_name]))
if randlora_B_param.shape[1] < self.r[adapter_name]:
Member

I think this should be:

Suggested change
if randlora_B_param.shape[1] < self.r[adapter_name]:
if randlora_B_param.shape[-1] < self.r[adapter_name]:

Contributor Author

Yes good catch, thanks

self.randlora_dropout.update(nn.ModuleDict({adapter_name: randlora_dropout_layer}))

# Actual trainable parameters
n = min(self.in_features, self.out_features) / r
Member

Can we use a more descriptive name than n please?

Contributor Author

Renamed n -> num_bases.

requires_grad=True,
)

self.scaling[adapter_name] = randlora_alpha / r / math.sqrt(self.n) # * 10#/ math.sqrt(self.n)
Member

What about this comment?

Contributor Author

Sorry, that's a leftover from a hyper-parameter study.

if adapter_name in self.randlora_lambda.keys():
    with torch.no_grad():
        nn.init.zeros_(self.randlora_lambda[adapter_name])
        nn.init.zeros_(self.randlora_gamma[adapter_name]).fill_(
Member

Is the zeros_ required if it is followed by a fill_?

Contributor Author

This is what is done in the VeRA implementation. I agree this is not very efficient.
Do you have another suggestion?

Member

Oh I see. zeros_ + fill_ could be replaced by constant_, right?
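
For illustration, a standalone sketch of that replacement (the fill value is a placeholder):

import torch
import torch.nn as nn

gamma = nn.Parameter(torch.empty(4, 8))

# instead of nn.init.zeros_(gamma).fill_(1e-3), a single call does the same:
nn.init.constant_(gamma, 1e-3)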

Comment on lines 306 to 310

# cast back the weights
# TODO: why?, taken from the VeRA implementation
self.randlora_lambda[adapter].data = self.randlora_lambda[adapter].to(dtype)
self.randlora_gamma[adapter].data = self.randlora_gamma[adapter].to(dtype)
Member

This can indeed be removed.

@BenjaminBossan
Member

We had some issues with the CI that should now be fixed, could you please merge with/rebase on the latest main?

@PaulAlbert31
Contributor Author

Hey @BenjaminBossan, thanks a lot for all your comments and catching some errors.
I have now updated the code according to your comments and all tests are passing.
I have also added tests for the sparse and very sparse bases.

Let me know if more changes are needed, and we can move on to more specific tests or to documentation/example scripts.

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for the updates. The unit tests are now passing, nice. Overall, this already looks quite good, I mostly have smaller comments. The main job is now to extend the testing and to ensure that those tests also pass. For this, please extend the test matrix here:

},
)
CLASSES_MAPPING = {
"ia3": (IA3Config, CONFIG_TESTING_KWARGS[0]),
"lora": (LoraConfig, CONFIG_TESTING_KWARGS[1]),
"prefix_tuning": (PrefixTuningConfig, CONFIG_TESTING_KWARGS[2]),
"prompt_encoder": (PromptEncoderConfig, CONFIG_TESTING_KWARGS[3]),
"prompt_tuning": (PromptTuningConfig, CONFIG_TESTING_KWARGS[4]),
"adalora": (AdaLoraConfig, CONFIG_TESTING_KWARGS[5]),
"boft": (BOFTConfig, CONFIG_TESTING_KWARGS[6]),
"vera": (VeraConfig, CONFIG_TESTING_KWARGS[7]),
"fourierft": (FourierFTConfig, CONFIG_TESTING_KWARGS[8]),
"hra": (HRAConfig, CONFIG_TESTING_KWARGS[9]),
"vblora": (VBLoRAConfig, CONFIG_TESTING_KWARGS[10]),
"oft": (OFTConfig, CONFIG_TESTING_KWARGS[11]),
"bone": (BoneConfig, CONFIG_TESTING_KWARGS[12]),
"lora+trainable_tokens": (LoraConfig, CONFIG_TESTING_KWARGS[13]),
}
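
A sketch of the corresponding RandLoRA entry (the index is hypothetical; it should point at whatever slot the new RandLoRA kwargs take in CONFIG_TESTING_KWARGS):

    "randlora": (RandLoraConfig, CONFIG_TESTING_KWARGS[14]),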

Moreover, let's add some examples for GPU training with and without BNB, like we have for VeRA here:

@pytest.mark.single_gpu_tests
def test_causal_lm_training_vera(self):
    r"""
    Same as test_causal_lm_training but with VeRA
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        model = AutoModelForCausalLM.from_pretrained(
            self.causal_lm_model_id,
            quantization_config=BitsAndBytesConfig(load_in_8bit=True),
            device_map="auto",
        )
        tokenizer = AutoTokenizer.from_pretrained(self.causal_lm_model_id)
        model = prepare_model_for_kbit_training(model)
        config = VeraConfig(
            r=16,
            target_modules=["q_proj", "v_proj"],
            vera_dropout=0.05,
            bias="none",
            task_type="CAUSAL_LM",
        )
        model = get_peft_model(model, config)
        data = load_dataset_english_quotes()
        data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
        trainer = Trainer(
            model=model,
            train_dataset=data["train"],
            args=TrainingArguments(
                per_device_train_batch_size=4,
                gradient_accumulation_steps=4,
                warmup_steps=2,
                max_steps=3,
                learning_rate=2e-4,
                fp16=True,
                logging_steps=1,
                output_dir=tmp_dir,
            ),
            data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
        )
        model.config.use_cache = False
        trainer.train()
        model.cpu().save_pretrained(tmp_dir)
        assert "adapter_config.json" in os.listdir(tmp_dir)
        assert SAFETENSORS_WEIGHTS_NAME in os.listdir(tmp_dir)
        # assert loss is not None
        assert trainer.state.log_history[-1]["train_loss"] is not None

@pytest.mark.single_gpu_tests
def test_causal_lm_training_4bit_vera(self):
    r"""
    Same as test_causal_lm_training_4bit but with VeRA
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        model = AutoModelForCausalLM.from_pretrained(
            self.causal_lm_model_id,
            quantization_config=BitsAndBytesConfig(load_in_4bit=True),
            device_map="auto",
        )
        tokenizer = AutoTokenizer.from_pretrained(self.causal_lm_model_id)
        model = prepare_model_for_kbit_training(model)
        config = VeraConfig(
            r=16,
            target_modules=["q_proj", "v_proj"],
            vera_dropout=0.05,
            bias="none",
            task_type="CAUSAL_LM",
        )
        model = get_peft_model(model, config)
        data = load_dataset_english_quotes()
        data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
        trainer = Trainer(
            model=model,
            train_dataset=data["train"],
            args=TrainingArguments(
                per_device_train_batch_size=4,
                gradient_accumulation_steps=4,
                warmup_steps=2,
                max_steps=3,
                learning_rate=2e-4,
                fp16=True,
                logging_steps=1,
                output_dir=tmp_dir,
            ),
            data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
        )
        model.config.use_cache = False
        trainer.train()
        model.cpu().save_pretrained(tmp_dir)
        assert "adapter_config.json" in os.listdir(tmp_dir)
        assert SAFETENSORS_WEIGHTS_NAME in os.listdir(tmp_dir)
        # assert loss is not None
        assert trainer.state.log_history[-1]["train_loss"] is not None

@pytest.mark.multi_gpu_tests
def test_causal_lm_training_multi_gpu_vera(self):
    r"""
    Same as test_causal_lm_training_multi_gpu but with VeRA
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        model = AutoModelForCausalLM.from_pretrained(
            self.causal_lm_model_id,
            device_map="auto",
            quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        )
        assert set(model.hf_device_map.values()) == set(range(device_count))
        model = prepare_model_for_kbit_training(model)
        setattr(model, "model_parallel", True)
        setattr(model, "is_parallelizable", True)
        config = VeraConfig(
            r=16,
            target_modules=["q_proj", "v_proj"],
            vera_dropout=0.05,
            bias="none",
            task_type="CAUSAL_LM",
        )
        model = get_peft_model(model, config)
        data = load_dataset_english_quotes()
        data = data.map(lambda samples: self.tokenizer(samples["quote"]), batched=True)
        trainer = Trainer(
            model=model,
            train_dataset=data["train"],
            args=TrainingArguments(
                per_device_train_batch_size=4,
                gradient_accumulation_steps=4,
                warmup_steps=2,
                max_steps=3,
                learning_rate=2e-4,
                fp16=True,
                logging_steps=1,
                output_dir=tmp_dir,
            ),
            data_collator=DataCollatorForLanguageModeling(self.tokenizer, mlm=False),
        )
        model.config.use_cache = False
        trainer.train()
        model.cpu().save_pretrained(tmp_dir)
        assert "adapter_config.json" in os.listdir(tmp_dir)
        assert SAFETENSORS_WEIGHTS_NAME in os.listdir(tmp_dir)
        # assert loss is not None
        assert trainer.state.log_history[-1]["train_loss"] is not None

@pytest.mark.multi_gpu_tests
def test_causal_lm_training_multi_gpu_4bit_vera(self):
    r"""
    Same as test_causal_lm_training_multi_gpu_4bit but with VeRA
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        model = AutoModelForCausalLM.from_pretrained(
            self.causal_lm_model_id,
            device_map="auto",
            quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        )
        assert set(model.hf_device_map.values()) == set(range(device_count))
        model = prepare_model_for_kbit_training(model)
        setattr(model, "model_parallel", True)
        setattr(model, "is_parallelizable", True)
        config = VeraConfig(
            r=16,
            target_modules=["q_proj", "v_proj"],
            vera_dropout=0.05,
            bias="none",
            task_type="CAUSAL_LM",
        )
        model = get_peft_model(model, config)
        data = load_dataset_english_quotes()
        data = data.map(lambda samples: self.tokenizer(samples["quote"]), batched=True)
        trainer = Trainer(
            model=model,
            train_dataset=data["train"],
            args=TrainingArguments(
                per_device_train_batch_size=4,
                gradient_accumulation_steps=4,
                warmup_steps=2,
                max_steps=3,
                learning_rate=2e-4,
                fp16=True,
                logging_steps=1,
                output_dir=tmp_dir,
            ),
            data_collator=DataCollatorForLanguageModeling(self.tokenizer, mlm=False),
        )
        model.config.use_cache = False
        trainer.train()
        model.cpu().save_pretrained(tmp_dir)
        assert "adapter_config.json" in os.listdir(tmp_dir)
        assert SAFETENSORS_WEIGHTS_NAME in os.listdir(tmp_dir)
        # assert loss is not None
        assert trainer.state.log_history[-1]["train_loss"] is not None

There are also some common GPU tests for VeRA that could be copied for RandLoRA (not high priority for me, though), starting here:

def test_vera_bnb_8bit_quantization(self):

Also, please ensure to run make style on your changes.

I also did a first run of RandLoRA using the PEFT method comparison suite. Here is the direct comparison to LoRA with rank 32:

| metric | LoRA rank 32 | RandLoRA rank 32 |
|---|---|---|
| cuda memory max | 22.3 GB | 22.8 GB |
| cuda memory reserved avg | 11.9 GB | 12.7 GB |
| cuda memory reserved 99th percentile | 17.7 GB | 18.4 GB |
| train loss | 0.607 | 0.618 |
| test accuracy | 47.8% | 46.8% |
| total time (sec) | 1847 | 2064 |

I used the default config and all other settings were also the defaults from that train script. As you can see, RandLoRA is roughly on par, though with slightly worse performance and runtime. If you have a suggestion for better settings, LMK. But even if not, this is just a single data point and not important for merging the PR.

Comment on lines 44 to 49
sparse (`bool`):
    Whether to use sparse random bases as described in the RandLora paper. The current implementation is a
    proof of concept where the sparseness is not used to improve speed or memory usage. Defaults to `False`.
very_sparse (`bool`):
    Whether to use very sparse random bases. The current implementation is a proof of concept where the
    sparseness is not used to improve speed or memory usage. Defaults to `False`.
Member

Since you've already done the work, I'm fine with keeping this. We should just clarify to the user what they should do with these parameters. If you add the info from this comment to the docstrings, it would help.

The name of the adapter for which the delta weight should be computed.
"""

update_B, update_A, dtype = self.get_scaled_bases(adapter)
Member

dtype is not being used.

if active_adapter not in self.randlora_lambda.keys():
    continue

update_B, update_A, dtype = self.get_scaled_bases(active_adapter)
Member

dtype is not being used

for active_adapter in self.active_adapters:
    if active_adapter not in self.randlora_lambda.keys():
        continue
    update_B, update_A, dtype = self.get_scaled_bases(active_adapter)
Member

dtype is not used
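
In all three spots, the unused return value can simply be discarded; a sketch of what that could look like (same call as in the quoted code):

update_B, update_A, _ = self.get_scaled_bases(active_adapter)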


# coding=utf-8
# Copyright 2023-present the HuggingFace Inc. team.
# Copyright 2025-present the HuggingFace Inc. team.
Member

No need for this, please revert.

from transformers.pytorch_utils import Conv1D


sys.path.append(os.path.join(os.getcwd(), "src"))
Member

Please remove.

**config_kwargs,
)
model = model.to(self.torch_device).eval()

Member

Please revert.

Member

You have changed the permissions on these files from 755 to 644. Could you please revert this?

@BenjaminBossan
Member

@PaulAlbert31 You may have seen the merge conflicts in test_decoder_models.py and test_encoder_decoder_models.py. This is because we refactored the test files to get rid of unittest and only use pytest instead. Overall, this should simplify the test structure. Don't hesitate to ping me if you have questions when resolving the conflict.

@PaulAlbert31
Contributor Author

Hey @BenjaminBossan
This new update runs the GPU tests correctly. I have also updated the default hyper-parameters of the RandLoRA config to be more comparable to a rank 32 LoRA. I haven't run the benchmark yet; I'll try to do that next week to ensure a fair comparison.
I haven't rebased on the current main yet to include the new bfloat16 tests; I am still working on that as well.

@PaulAlbert31
Contributor Author

Hey @BenjaminBossan
I have now merged with main and updated hyper-parameters to have better performance on the PEFT method comparison suite.

Here are the numbers for equal trainable parameters to LoRA:

| Metric | LoRA rank 32 | RandLoRA rank 32 |
|---|---|---|
| cuda memory max | 22.3 GB | 22.8 GB |
| cuda memory reserved avg | 11.9 GB | 12.7 GB |
| cuda memory reserved 99th percentile | 17.7 GB | 18.4 GB |
| train loss | 0.607 | 0.577 |
| test accuracy | 47.8% | 52.3% |

I had to pause the LoRA training halfway, so I do not have an accurate training time for it, but it should be close to the results you reported.

I am surprised that the memory usage for RandLoRA is higher than for LoRA; this goes against what I saw in my paper, but that could also be due to specific matrix sizes. I think the memory savings of RandLoRA are less pronounced on square matrices (q in the attention), but I have to write a small script to test this.
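
A minimal sketch of such a script, assuming the RandLoraConfig from this branch is installed; it only counts trainable parameters rather than measuring actual memory, and the layer shapes and r value are placeholders:

import torch.nn as nn
from peft import LoraConfig, RandLoraConfig, get_peft_model

def trainable_params(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

for in_features, out_features in [(1024, 1024), (1024, 4096)]:  # square vs rectangular layer
    lora = get_peft_model(nn.Sequential(nn.Linear(in_features, out_features)), LoraConfig(r=32, target_modules=["0"]))
    randlora = get_peft_model(nn.Sequential(nn.Linear(in_features, out_features)), RandLoraConfig(r=32, target_modules=["0"]))
    print((in_features, out_features), "LoRA:", trainable_params(lora), "RandLoRA:", trainable_params(randlora))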

Hopefully I conducted the merge with main the right way; let me know if this is not appropriate.

@BenjaminBossan
Member

Hi @PaulAlbert31 thanks for updating the PR. Unfortunately, it looks like something went wrong during the rebase, maybe the main branch was not up-to-date? As is, the PR contains 42 commits and 95 changed files, which makes it basically impossible to review. Could you please:

  1. Ensure that you fetched the latest main branch on your fork.
  2. Rebase against this latest main again.
  3. If you haven't already, make sure that git rerere is enabled, which could come in handy because of the merge conflicts.

This should hopefully get the PR back into a nice state. If worse comes to worst, there is always the option to create a new PR.

Here are the numbers for equal trainable parameters to LoRA:

Thanks for running this. I will also give it another try when the PR is ready.

I am surprised that the memory usage for RandLoRA is higher than LoRA, this goes against what I saw in my paper

Of course, this experiment is just a single data point, so we shouldn't overgeneralize from this.

@PaulAlbert31
Contributor Author

Hey @BenjaminBossan,
I have reset and redone the rebase properly. Things should be more readable now.

All contributions should still be here, but let me know if I am missing some tests. The bfloat16/float16 tests are passing as well.

Let me know if this is ok and we can move on to the next step 👍

Member

@BenjaminBossan BenjaminBossan left a comment

Thank you very much for rebasing again, otherwise it would have been very hard for me to do another in-depth review.

Overall, this PR is in a very good state; most things look very good. I mostly added some smaller comments, and there is an issue with multi-GPU training. Finally, I think we need some tests along the lines of test_vera.py because of the shared variables in RandLoRA. I think those tests can just be copied with small edits and we should be fine. The rest of the tests look good.
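
For example, a sharing test could look roughly like this (a sketch; the attribute names randlora_A/randlora_B and the shared BufferDict layout are assumptions based on this PR):

import torch.nn as nn
from peft import RandLoraConfig, get_peft_model

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(64, 64)
        self.lin1 = nn.Linear(64, 64)

model = get_peft_model(MLP(), RandLoraConfig(r=8, target_modules=["lin0", "lin1"]))
layer0 = model.base_model.model.lin0
layer1 = model.base_model.model.lin1
# both layers should reference the same shared random bases, not per-layer copies
assert layer0.randlora_A is layer1.randlora_A
assert layer0.randlora_B is layer1.randlora_B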

I will also re-run the method comparison suite and report the results later.

assert trainer.state.log_history[-1]["train_loss"] is not None

@pytest.mark.multi_gpu_tests
def test_causal_lm_training_multi_gpu_RandLora(self):
Member

Suggested change
def test_causal_lm_training_multi_gpu_RandLora(self):
def test_causal_lm_training_multi_gpu_8bit_randlora(self):

Comment on lines 33 to 35
Out = randlora_lambda[:, :, None] * randlora_A * randlora_gamma[None,]
ctx.save_for_backward(randlora_A, randlora_lambda, randlora_gamma)
return Out
Member

Suggested change
Out = randlora_lambda[:, :, None] * randlora_A * randlora_gamma[None,]
ctx.save_for_backward(randlora_A, randlora_lambda, randlora_gamma)
return Out
out = randlora_lambda[:, :, None] * randlora_A * randlora_gamma[None,]
ctx.save_for_backward(randlora_A, randlora_lambda, randlora_gamma)
return out

Or output.

assert trainer.state.log_history[-1]["train_loss"] is not None

@pytest.mark.single_gpu_tests
def test_causal_lm_training_randlora(self):
Member

Suggested change
def test_causal_lm_training_randlora(self):
def test_causal_lm_training_8bit_randlora(self):

Comment on lines 62 to 66
if self.merged:
    warnings.warn(
        f"Already following adapters were merged {','.join(self.merged_adapters)}. "
        f"You are now additionally merging {','.join(self.active_adapters)}."
    )
Member

Remove this, as check_adapters_to_merge already takes care of it (yes, it's not correct in VeRA and should be removed there too). Same for the other classes.


return largest_shape

def _init_randlora_A_randlora_B_sparse(self, config: RandLoraConfig, adapter_name: str, s: int = 3) -> None:
Member

Let's avoid one letter variable names like s and n. Please provide a more expressive name.

self.check_requires_grad(
    peft_model,
    "base_model.model.lin1.vera_lambda_b.default",
    "nobase_model.model.lin1.vera_lambda_b.default",
Member

This change is probably not supposed to be there?

prepare_layer_inputs_fn = fn
else:
prepare_layer_inputs_fn = {k: fn for k in prepare_layer_inputs_keys}
prepare_layer_inputs_fn = dict.fromkeys(prepare_layer_inputs_keys, fn)
Member

Please revert, this ruff rule is excluded in the pyproject.toml so should not be changed by ruff.

@BenjaminBossan
Member

Here are the new results I got, which are pretty much in line with what you posted above.

| metric | LoRA rank 32 | new RandLoRA rank 32 |
|---|---|---|
| cuda memory max | 22.3 GB | 22.8 GB |
| cuda memory reserved avg | 11.9 GB | 12.7 GB |
| cuda memory reserved 99th percentile | 17.7 GB | 18.4 GB |
| train loss | 0.607 | 0.578 |
| test accuracy | 47.8% | 51.9% |
| total time (sec) | 1847 | 1949 |

@PaulAlbert31
Contributor Author

Hey @BenjaminBossan, great to see you were able to reproduce the results!

I have updated the pull request with your suggested changes and added a test_randlora.py test file.

As I commented above, I am not sure how to proceed with the multi-GPU OOM error; let me know if you find an easy fix.

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for the latest changes and for adding the new tests. I only skimmed them, as they are mostly copy-pasted, right?

Apart from that, I only discovered a small mistake, the rest looks good.

As I commented above, I am not sure how to proceed with the multi-GPU OOM error, let me know if you find an easy fix.

Let's not worry too much about this, since it's only DP, which very few people still use. My suggestion is to just move to a smaller base model in the GPU tests, e.g. facebook/opt-350m, with a comment explaining that 6.7b results in OOM.

@PaulAlbert31
Contributor Author

Hi @BenjaminBossan,
Great to see we are getting close!

Yes, test_randlora.py is a direct copy/paste of test_vera.py.

I have not been able to reproduce the OOM error with DP as I only have 1 GPU available. Could you let me know where I can add a condition to switch to facebook/opt-350m for RandLoRA's tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan
Member

@PaulAlbert31 I tried running the tests from test_gpu_examples.py again and this time they passed without OOM. I'm not sure what's different this time around, but I think we're good now and no further changes are required.

For the CI to run, could you please call make style and fix the remaining linter issue?

@PaulAlbert31
Contributor Author

@BenjaminBossan Apologies, I keep forgetting to run make style.
Great to hear the OOM is gone for now!

@BenjaminBossan
Member

Thanks for the fix @PaulAlbert31. The CI now ran and is mostly green, but as you can see, all the Windows runs failed, all with regard to this test:

tests/test_custom_models.py::PeftCustomModelTester::test_safe_merge_143_Vanilla_MLP_4_RandLora

I'm not sure what's going on and I don't have a Windows machine to test this on. Most likely, it's a matter of lowering precision, but I'm not sure. If you also can't test it, in my book it would be fine to skip this specific test on Windows with a comment to explain, as this feels very much like an edge case.
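
If skipping does turn out to be necessary, a hedged sketch of what it could look like (the marker condition and reason wording are placeholders; the real test lives in the parametrized suite in tests/test_custom_models.py):

import platform

import pytest

@pytest.mark.skipif(
    platform.system() == "Windows",
    reason="safe_merge for RandLoRA is numerically flaky on Windows CI",
)
def test_safe_merge_randlora():
    ...  # existing test body unchanged; only the skip marker is new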

@PaulAlbert31
Contributor Author

PaulAlbert31 commented Apr 24, 2025

@BenjaminBossan
Damn, so close. I don't have access to a Windows system either.

I have had a similar problem before, and reducing the randlora_alpha coefficient helped. I have now reduced it even further. Is it OK to try the Windows tests again?

If the test still fails, I'll add a skip for Windows.

@BenjaminBossan
Member

This seems to have resolved the error, thanks @PaulAlbert31. I assume that lowering alpha helps with numerical stability. Is that something worth documenting? If not, please set the status from draft to ready, I'll do a final quick pass, and then we can hopefully merge the PR.

@PaulAlbert31
Contributor Author

@BenjaminBossan thanks for the suggestion, I have documented the possible instabilities with larger randlora_alpha values and moved the PR status to ready 🎉

@PaulAlbert31 PaulAlbert31 marked this pull request as ready for review April 25, 2025 01:11
Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for the final addition, the PR LGTM, great work and fine addition to PEFT.

Let's not forget to also add documentation and an example for RandLoRA to promote its adoption, but this can be done in a future PR.

@BenjaminBossan BenjaminBossan merged commit d3ff133 into huggingface:main Apr 25, 2025
14 checks passed
@PaulAlbert31 PaulAlbert31 deleted the randlora branch April 25, 2025 22:22
@PaulAlbert31 PaulAlbert31 restored the randlora branch May 6, 2025 02:01