Commit 430e896

UPDATE document format in GraLoRA
1 parent 3f69d8f commit 430e896

File tree: 1 file changed (+31 -33 lines)

src/peft/tuners/gralora/config.py

Lines changed: 31 additions & 33 deletions
@@ -26,50 +26,48 @@ class GraloraConfig(PeftConfig):

     Args:
         r (`int`):
-            GraLoRA attention dimension determines the rank of the GraLoRA adapter.
-            The total parameter count of the GraLoRA adapter is same as LoRA with same rank r, while the expressivitiy is multiplied by gralora_k.
+            GraLoRA attention dimension determines the rank of the GraLoRA adapter. The total parameter count of the
+            GraLoRA adapter is same as LoRA with same rank r, while the expressivitiy is multiplied by gralora_k.
         hybrid_r (`int`):
             Hybrid GraLoRA rank determines the rank allocated to vanilla LoRA method when using Hybrid GraLoRA method.
-            Hybrid GraLoRA, a combination of GraLoRA and vanilla LoRA, becomes available when hybrid_r > 0.
-            The parameter count of the GraLoRA adapter is r + hybrid_r.
+            Hybrid GraLoRA, a combination of GraLoRA and vanilla LoRA, becomes available when hybrid_r > 0. The
+            parameter count of the GraLoRA adapter is r + hybrid_r.
         target_modules (`Union[List[str], str]`):
-            List of module names or regex expression of the module names to replace with GraLoRA. "
-            For example, ['q', 'v'] or '.*decoder.*(SelfAttention|EncDecAttention).*(q|v)$'. "
-            This can also be a wildcard 'all-linear' which matches all linear/Conv1D "
-            "(if the model is a PreTrainedModel, the output layer excluded). "
-            If not specified, modules will be chosen according to the model architecture, If the architecture is "
-            not known, an error will be raised -- in this case, you should specify the target modules manually. "
-            To avoid targeting any modules (because you want to apply `target_parameters`), set "
-            `target_modules=[]`.
+            List of module names or regex expression of the module names to replace with GraLoRA. " For example, ['q',
+            'v'] or '.*decoder.*(SelfAttention|EncDecAttention).*(q|v)$'. " This can also be a wildcard 'all-linear'
+            which matches all linear/Conv1D " "(if the model is a PreTrainedModel, the output layer excluded). " If not
+            specified, modules will be chosen according to the model architecture, If the architecture is " not known,
+            an error will be raised -- in this case, you should specify the target modules manually. " To avoid
+            targeting any modules (because you want to apply `target_parameters`), set " `target_modules=[]`.
         gralora_alpha (`int`): GraLoRA alpha.
-            GraLoRA alpha is the scaling factor for the GraLoRA adapter.
-            Scale becomes gralora_alpha / (r + hybrid_r).
+            GraLoRA alpha is the scaling factor for the GraLoRA adapter. Scale becomes gralora_alpha / (r + hybrid_r).
         gralora_dropout (`float`):
-            GraLoRA dropout is the dropout probability for the GraLoRA adapter.
-            It is used to prevent overfitting and improve the generalization of the GraLoRA adapter.
+            GraLoRA dropout is the dropout probability for the GraLoRA adapter. It is used to prevent overfitting and
+            improve the generalization of the GraLoRA adapter.
         gralora_k (`int`):
-            GraLoRA k determines the number of subblocks in the GraLoRA adapter.
-            The rank r must be divisible by gralora_k for the GraLoRA adapter to be valid.
-            The total parameter count is preserved regardles of gralora_k.
-            The entire rank of the GraLoRA adapter is increased by gralora_k, while the rank of each subblock is reduced by gralora_k.
-            gralora_k=2 is recommended for rank 32 or lower, and gralora_k=4 is recommended for rank 64 or higher.
+            GraLoRA k determines the number of subblocks in the GraLoRA adapter. The rank r must be divisible by
+            gralora_k for the GraLoRA adapter to be valid. The total parameter count is preserved regardles of
+            gralora_k. The entire rank of the GraLoRA adapter is increased by gralora_k, while the rank of each
+            subblock is reduced by gralora_k. gralora_k=2 is recommended for rank 32 or lower, and gralora_k=4 is
+            recommended for rank 64 or higher.
         fan_in_fan_out (`bool`):
-            Set this to True if the layer to replace stores weight like (fan_in, fan_out).
-            For example, gpt-2 uses `Conv1D` which stores weights like (fan_in, fan_out) and hence this should be set to `True`.
+            Set this to True if the layer to replace stores weight like (fan_in, fan_out). For example, gpt-2 uses
+            `Conv1D` which stores weights like (fan_in, fan_out) and hence this should be set to `True`.
         bias (`str`):
-            Bias type for gralora. Can be 'none', 'all' or 'gralora_only'.
-            If 'all' or 'gralora_only', the corresponding biases will be updated during training.
-            Be aware that this means that, even when disabling the adapters, the model will not produce the same output as the base model would have without adaptation.
+            Bias type for gralora. Can be 'none', 'all' or 'gralora_only'. If 'all' or 'gralora_only', the
+            corresponding biases will be updated during training. Be aware that this means that, even when disabling
+            the adapters, the model will not produce the same output as the base model would have without adaptation.
         init_weights (`bool`):
-            Whether to initialize the weights of the GraLoRA layers with their default initialization.
-            Don't change this setting, except if you know exactly what you're doing.
+            Whether to initialize the weights of the GraLoRA layers with their default initialization. Don't change
+            this setting, except if you know exactly what you're doing.
         layers_to_transform (`Union[List[int], int]`):
-            The layer indexes to transform, is this argument is specified, PEFT will transform only the layers indexes that are specified inside this list.
-            If a single integer is passed, PEFT will transform only the layer at this index.
-            This only works when target_modules is a list of str.
+            The layer indexes to transform, is this argument is specified, PEFT will transform only the layers indexes
+            that are specified inside this list. If a single integer is passed, PEFT will transform only the layer at
+            this index. This only works when target_modules is a list of str.
         layers_pattern (`Optional[Union[List[str], str]]`):
-            The layer pattern name, used only if `layers_to_transform` is different to None and if the layer pattern is not in the common layers pattern.
-            This only works when target_modules is a list of str. This should target the `nn.ModuleList` of the model, which is often called `'layers'` or `'h'`.
+            The layer pattern name, used only if `layers_to_transform` is different to None and if the layer pattern is
+            not in the common layers pattern. This only works when target_modules is a list of str. This should target
+            the `nn.ModuleList` of the model, which is often called `'layers'` or `'h'`.
     """

     r: int = field(
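
For reference, the docstring rewrapped above describes how the GraLoRA hyperparameters interact: the adapter scale is gralora_alpha / (r + hybrid_r), and r must be divisible by gralora_k. Below is a minimal usage sketch, not part of this commit; it assumes the module path implied by src/peft/tuners/gralora/config.py is importable once the branch is installed, and the hyperparameter values and module names are illustrative only.

# Minimal usage sketch (not part of the commit). Assumes the module path implied by
# src/peft/tuners/gralora/config.py is importable; values and module names are illustrative.
from peft.tuners.gralora.config import GraloraConfig

r, gralora_k, hybrid_r, gralora_alpha = 32, 2, 4, 64

# The docstring requires r to be divisible by gralora_k (here 32 % 2 == 0).
assert r % gralora_k == 0

config = GraloraConfig(
    r=r,                                  # GraLoRA rank
    gralora_k=gralora_k,                  # number of subblocks; k=2 recommended for rank 32 or lower
    hybrid_r=hybrid_r,                    # rank given to vanilla LoRA; > 0 enables Hybrid GraLoRA
    gralora_alpha=gralora_alpha,          # scaling factor
    gralora_dropout=0.05,                 # dropout probability for the adapter
    target_modules=["q_proj", "v_proj"],  # hypothetical module names; adjust to the target model
)

# Per the docstring, the effective scale is gralora_alpha / (r + hybrid_r):
print(gralora_alpha / (r + hybrid_r))  # 64 / (32 + 4) ≈ 1.78

A config built this way would typically be handed to PEFT's get_peft_model together with a base model, provided the GraLoRA tuner is registered in the installed branch.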
