src/peft/tuners/gralora/config.py
Lines changed: 71 additions & 11 deletions
@@ -21,6 +21,57 @@
 @dataclass
 class GraloraConfig(PeftConfig):
+    """
+    This is the configuration class to store the configuration of a [`GraloraModel`].
+
+    Args:
+        r (`int`):
+            GraLoRA attention dimension determines the rank of the GraLoRA adapter. The total parameter count of the
+            GraLoRA adapter is the same as LoRA with the same rank r, while the expressivity is multiplied by
+            gralora_k.
+        hybrid_r (`int`):
+            Hybrid GraLoRA rank determines the rank allocated to the vanilla LoRA method when using the Hybrid
+            GraLoRA method. Hybrid GraLoRA, a combination of GraLoRA and vanilla LoRA, becomes available when
+            hybrid_r > 0. The parameter count of the GraLoRA adapter is r + hybrid_r.
+        target_modules (`Union[List[str], str]`):
+            List of module names or regex expression of the module names to replace with GraLoRA. For example,
+            ['q', 'v'] or '.*decoder.*(SelfAttention|EncDecAttention).*(q|v)$'. This can also be a wildcard
+            'all-linear' which matches all linear/Conv1D modules (if the model is a PreTrainedModel, the output
+            layer is excluded). If not specified, modules will be chosen according to the model architecture. If the
+            architecture is not known, an error will be raised -- in this case, you should specify the target
+            modules manually. To avoid targeting any modules (because you want to apply `target_parameters`), set
+            `target_modules=[]`.
+        gralora_alpha (`int`):
+            GraLoRA alpha is the scaling factor for the GraLoRA adapter. The scale becomes
+            gralora_alpha / (r + hybrid_r).
+        gralora_dropout (`float`):
+            GraLoRA dropout is the dropout probability for the GraLoRA adapter. It is used to prevent overfitting
+            and improve the generalization of the GraLoRA adapter.
+        gralora_k (`int`):
+            GraLoRA k determines the number of subblocks in the GraLoRA adapter. The rank r must be divisible by
+            gralora_k for the GraLoRA adapter to be valid. The total parameter count is preserved regardless of
+            gralora_k: the entire rank of the GraLoRA adapter is increased by gralora_k, while the rank of each
+            subblock is reduced by gralora_k. gralora_k=2 is recommended for rank 32 or lower, and gralora_k=4 is
+            recommended for rank 64 or higher.
+        fan_in_fan_out (`bool`):
+            Set this to True if the layer to replace stores weight like (fan_in, fan_out). For example, gpt-2 uses
+            `Conv1D`, which stores weights like (fan_in, fan_out), and hence this should be set to `True`.
+        bias (`str`):
+            Bias type for GraLoRA. Can be 'none', 'all' or 'gralora_only'. If 'all' or 'gralora_only', the
+            corresponding biases will be updated during training. Be aware that this means that, even when disabling
+            the adapters, the model will not produce the same output as the base model would have without
+            adaptation.
+        init_weights (`bool`):
+            Whether to initialize the weights of the GraLoRA layers with their default initialization. Don't change
+            this setting, except if you know exactly what you're doing.
+        layers_to_transform (`Union[List[int], int]`):
+            The layer indexes to transform. If this argument is specified, PEFT will transform only the layer
+            indexes that are specified in this list. If a single integer is passed, PEFT will transform only the
+            layer at this index. This only works when target_modules is a list of str.
+    """