README.md (1 addition & 0 deletions)
@@ -41,6 +41,7 @@ Additionally, we are expanding capabilities for other modalities. Currently, we
SWIFT has rich documentation for users; please check [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM).
## 🎉 News

- 🔥2024.05.17: Support peft=0.11.0, along with three new tuners: `BOFT`, `Vera`, and `Pissa`. Use `--sft_type boft/vera` to enable BOFT or Vera, and use `--init_lora_weights pissa` together with `--sft_type lora` to enable Pissa (see the example command after this list).
- 2024.05.16: Support the Llava-Next (Stronger) series of models. For best practices, refer to [here](https://github.com/modelscope/swift/tree/main/docs/source_en/Multi-Modal/llava-best-practice.md).
- 🔥2024.05.13: Support the Yi-1.5 series of models; use `--model_type yi-1_5-9b-chat` to begin!
- 2024.05.11: Support QLoRA training and quantized inference using [hqq](https://github.com/mobiusml/hqq) and [eetq](https://github.com/NetEase-FuXi/EETQ). For more information, see the [LLM Quantization Documentation](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM/LLM-quantization.md).
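A minimal sketch of invoking one of the new tuners from the command line. It assumes the standard `swift sft` entry point; the dataset name (`alpaca-en`) is an illustrative assumption, so substitute your own model and data.

```bash
# Sketch: fine-tune with the BOFT tuner announced on 2024.05.17.
# The dataset name below is an assumed example; substitute your own.
swift sft \
    --model_type yi-1_5-9b-chat \
    --sft_type boft \
    --dataset alpaca-en
```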
docs/source_en/LLM/Command-line-parameters.md (18 additions & 0 deletions)
@@ -63,6 +63,7 @@

- `--lora_rank`: Default is `8`. Only takes effect when `sft_type` is 'lora'.
- `--lora_alpha`: Default is `32`. Only takes effect when `sft_type` is 'lora'.
- `--lora_dropout_p`: Default is `0.05`. Only takes effect when `sft_type` is 'lora'.
- `--init_lora_weights`: Method for initializing LoRA weights. Can be `true`, `false`, `gaussian`, `pissa`, or `pissa_niter_[number of iters]`. Default is `true` (see the example after this list).
- `--lora_bias_trainable`: Default is `'none'`; options: 'none', 'all'. Set to `'all'` to make all bias parameters trainable.
- `--lora_modules_to_save`: Default is `[]`. If you want to train embedding, lm_head, or layer_norm, you can set this parameter, e.g. `--lora_modules_to_save EMBEDDING LN lm_head`. If `'EMBEDDING'` is passed, the embedding layer will be added to `lora_modules_to_save`. If `'LN'` is passed, `RMSNorm` and `LayerNorm` will be added to `lora_modules_to_save`.
- `--lora_dtype`: Default is `'AUTO'`; specifies the dtype of the LoRA modules. If `AUTO`, the dtype of the original module is followed. Options: 'fp16', 'bf16', 'fp32', 'AUTO'.
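A hedged illustration of the new `--init_lora_weights` option combined with the LoRA parameters listed above; the model type, dataset, and the iteration count in `pissa_niter_16` are illustrative assumptions, not recommendations.

```bash
# Sketch: LoRA fine-tuning with Pissa initialization.
# `pissa_niter_16` is an example of the `pissa_niter_[number of iters]` form;
# the model type, dataset, and values below are assumed examples.
swift sft \
    --model_type yi-1_5-9b-chat \
    --sft_type lora \
    --init_lora_weights pissa_niter_16 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --lora_dtype AUTO \
    --dataset alpaca-en
```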
@@ -135,6 +136,23 @@
- `--sequence_parallel_size`: Default is `1`. A value greater than 1 can be used to split a sequence across multiple GPUs to reduce memory usage. The value must divide the GPU count.
### BOFT Parameters

- `--boft_block_size`: BOFT block size; default is 4.
- `--boft_block_num`: Number of BOFT blocks; cannot be used together with `boft_block_size`.
- `--boft_target_modules`: BOFT target modules. Default is `['DEFAULT']`. If `boft_target_modules` is set to `'DEFAULT'` or `'AUTO'`, `boft_target_modules` will be looked up in `MODEL_MAPPING` based on `model_type` (specified as qkv by default). If set to `'ALL'`, all Linear layers (excluding the head) will be designated as BOFT modules.
- `--boft_dropout`: Dropout value for BOFT; default is 0.0.
- `--boft_modules_to_save`: Additional modules to be trained and saved; default is `None`.
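A hedged sketch of a BOFT run using the parameters above; the model type, dataset, and chosen values are illustrative assumptions rather than recommendations.

```bash
# Sketch: BOFT fine-tuning. Set either --boft_block_size or --boft_block_num, not both.
# The model type, dataset, and values below are assumed examples.
swift sft \
    --model_type yi-1_5-9b-chat \
    --sft_type boft \
    --boft_block_size 4 \
    --boft_dropout 0.1 \
    --boft_target_modules ALL \
    --dataset alpaca-en
```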
### Vera Parameters

- `--vera_rank`: Size of the Vera attention; default is 256.
- `--vera_projection_prng_key`: Whether to store the Vera projection matrix; default is True.
- `--vera_target_modules`: Vera target modules. Default is `['DEFAULT']`. If `vera_target_modules` is set to `'DEFAULT'` or `'AUTO'`, `vera_target_modules` will be looked up in `MODEL_MAPPING` based on `model_type` (specified as qkv by default). If set to `'ALL'`, all Linear layers (excluding the head) will be designated as Vera modules. Vera modules must share the same shape.
- `--vera_dropout`: Dropout value for Vera; default is 0.0.
- `--vera_d_initial`: Initial value for Vera's d matrix; default is 0.1.
- `--vera_modules_to_save`: Additional modules to be trained and saved; default is `None`.
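Similarly, a hedged sketch of a Vera run with the parameters above; the model type, dataset, and values are illustrative assumptions.

```bash
# Sketch: Vera fine-tuning. The targeted modules must share the same shape.
# The model type, dataset, and values below are assumed examples.
swift sft \
    --model_type yi-1_5-9b-chat \
    --sft_type vera \
    --vera_rank 256 \
    --vera_dropout 0.1 \
    --vera_d_initial 0.1 \
    --dataset alpaca-en
```
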
### LoRA+ Fine-tuning Parameters
- `--lora_lr_ratio`: Default is `None`; recommended values are `10~16`. Specify this parameter when using LoRA to enable LoRA+.
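A brief hedged sketch of enabling LoRA+ via this parameter; the ratio shown is one value from the recommended range, and the model type and dataset are assumptions.

```bash
# Sketch: enable LoRA+ by setting --lora_lr_ratio on top of a LoRA run.
# The model type, dataset, and ratio below are assumed example values.
swift sft \
    --model_type yi-1_5-9b-chat \
    --sft_type lora \
    --lora_lr_ratio 16 \
    --dataset alpaca-en
```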