src/peft/tuners/randlora/config.py (15 additions, 11 deletions)
@@ -29,26 +29,30 @@ class RandLoraConfig(PeftConfig):
 
     Args:
         r (`int`, *optional*, defaults to `32`):
-            RandLora's random basis rank dimension. Contrary to Lora, this parameter is inversely proportional to the amount of trainable
-            parameters as reducing it increases trainable parameters.
+            RandLora's random basis rank dimension. Contrary to Lora, this parameter is inversely proportional to the
+            amount of trainable parameters as reducing it increases trainable parameters.
         target_modules (`Union[list[str], str]`):
             The names of the modules to apply RandLora to. Only linear layers are supported.
         projection_prng_key (`int`):
             RandLora PRNG init key. Used for initialising basis_A and basis_B for new models or when loading a
             checkpoint that did not include these projections. Defaults to `0`.
         save_projection (`bool`):
             Whether to save the global basis_A / basis_B random basis in the state dict alongside per layer lambda /
-            gamma diagonal matrices. This will increase the size of the checkpoint, but guarantee that we can
-            reload the checkpoint on all system configurations. Defaults to `True`.
+            gamma diagonal matrices. This will increase the size of the checkpoint, but guarantee that we can reload
+            the checkpoint on all system configurations. Defaults to `True`.
         sparse (`bool`):
-            Whether to use sparse random bases as described in the RandLora paper. The bases are ternary sparse bases (only containing -1, 0 and 1) where the attribution probability is 1/6 for -1 and 1 and 2/3 for 0.
-            These sparse matrices aim to be used for matmul free computation in the future, see https://arxiv.org/pdf/2406.02528v1
-            The current implementation is a proof of concept however where the sparseness is not used to improve speed or memory usage. Using sparse matrices typically does not reduce performance and can even help reduce overfitting.
-            Defaults to `False`.
+            Whether to use sparse random bases as described in the RandLora paper. The bases are ternary sparse bases
+            (only containing -1, 0 and 1) where the attribution probability is 1/6 for -1 and 1 and 2/3 for 0. These
+            sparse matrices aim to be used for matmul-free computation in the future, see
+            https://arxiv.org/pdf/2406.02528v1. The current implementation is, however, a proof of concept where the
+            sparseness is not used to improve speed or memory usage. Using sparse matrices typically does not reduce
+            performance and can even help reduce overfitting. Defaults to `False`.
         very_sparse (`bool`):
-            Whether to use highly sparse random bases as described in the RandLora paper. The very sparse bases are ternary sparse bases (only containing -1, 0 and 1) given a matrix with smallest dimension d, the attribution probability is 1/√D for -1 and 1 and 1- 2/√D for 0.
-            Using these sparse matrices can further reduce overfitting over the `sparse` alternatives but will most likely decrease performance as a results. Use carefully.
-            Defaults to `False`.
+            Whether to use highly sparse random bases as described in the RandLora paper. The very sparse bases are
+            ternary sparse bases (only containing -1, 0 and 1): given a matrix with smallest dimension D, the
+            attribution probability is 1/√D for -1 and 1 and 1 - 2/√D for 0. Using these sparse matrices can further
+            reduce overfitting over the `sparse` alternatives but will most likely decrease performance as a result.
+            Use carefully. Defaults to `False`.
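
For intuition, here is a minimal sketch of how a ternary basis with the probabilities documented above could be sampled. This is an illustration only, not PEFT's actual implementation; the helper name `sample_sparse_basis` is hypothetical.

```python
import math

import torch


def sample_sparse_basis(rows: int, cols: int, very_sparse: bool = False) -> torch.Tensor:
    """Hypothetical helper: draw a ternary {-1, 0, 1} random basis.

    sparse:      P(-1) = P(1) = 1/6 and P(0) = 2/3.
    very_sparse: P(-1) = P(1) = 1/sqrt(D) and P(0) = 1 - 2/sqrt(D),
                 where D is the smallest matrix dimension (per the docstring above).
    """
    p = 1.0 / math.sqrt(min(rows, cols)) if very_sparse else 1.0 / 6.0
    probs = torch.tensor([p, 1.0 - 2.0 * p, p])  # probabilities of drawing -1, 0, 1
    idx = torch.multinomial(probs, rows * cols, replacement=True)
    return idx.reshape(rows, cols).float() - 1.0  # map indices {0, 1, 2} to values {-1, 0, 1}
```

Sampling from a fixed seed (cf. `projection_prng_key`) makes the basis reproducible, which is what allows `save_projection=False` to skip storing it, at the cost of depending on the PRNG behaving identically across system configurations.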
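As a usage sketch tying the options together: assuming `RandLoraConfig` is exported from the `peft` package root like other tuner configs, and with illustrative `target_modules` names for an OPT-style model, a configuration might look like this.

```python
from peft import RandLoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # example model

config = RandLoraConfig(
    r=32,                                 # random basis rank; smaller r -> MORE trainable parameters
    target_modules=["q_proj", "v_proj"],  # illustrative; only linear layers are supported
    projection_prng_key=0,                # seed used to (re)generate basis_A / basis_B
    save_projection=True,                 # store the shared bases in the checkpoint for portability
    sparse=False,                         # ternary sparse bases (proof of concept)
    very_sparse=False,                    # even sparser bases; may cost performance
)

peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```

The inverted role of `r` is worth flagging: per the docstring above, lowering `r` increases the number of trainable lambda/gamma scaling parameters, the opposite of LoRA's `r`.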