Signed-off-by: Sahil Modi <samodi@nvidia.com>
Signed-off-by: ruit <ruit@nvidia.com>
Signed-off-by: Jonas Yang <joyang@nvidia.com>
Co-authored-by: ruit <ruit@nvidia.com>
Co-authored-by: Jonas Yang <joyang@nvidia.com>
`docs/guides/sft.md`: 47 additions & 0 deletions
@@ -161,3 +161,50 @@ As long as your custom dataset has the `formatted_ds` and `task_spec` attributes
## Evaluate the Trained Model
Upon completion of the training process, you can refer to our [evaluation guide](eval.md) to assess model capabilities.
## LoRA Configuration
NeMo RL supports LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning. LoRA reduces trainable parameters by using low-rank matrices for weight updates while keeping the base model frozen.
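As background, the formulation from the LoRA paper linked at the end of this section keeps a pretrained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ frozen and learns a low-rank update (a conceptual summary, not NeMo RL-specific notation):

$$
h = W_0 x + \frac{\alpha}{r} B A x, \qquad A \in \mathbb{R}^{r \times k}, \quad B \in \mathbb{R}^{d \times r}, \quad r \ll \min(d, k)
$$

Only $A$ and $B$ are trained, so the per-layer trainable parameter count drops from $d \cdot k$ to $r (d + k)$; in the original paper $B$ is initialized to zero so training starts from the unmodified base model.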
Notes:
- LoRA is supported with the DTensor v2 and Megatron backends. DTensor v1 does not support LoRA, so ensure `policy.dtensor_cfg._v2=true` when using the DTensor backend (see the sketch after these notes).
- Triton kernels are only used in the DTensor v2 path. For TP > 1, Automodel currently does not support Triton kernels (see note below).
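For example, using the same dotted command-line overrides as the Example Usage section below, enabling LoRA on the DTensor v2 backend could look like the following sketch (combine it with your usual SFT settings):

```bash
# Sketch: enable LoRA on the DTensor v2 backend via command-line overrides.
# Both keys are referenced in the notes above.
uv run examples/run_sft.py \
  policy.dtensor_cfg._v2=true \
  policy.dtensor_cfg.lora_cfg.enabled=true
```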
### Configuration Parameters
The LoRA configuration is specified under the `policy.dtensor_cfg.lora_cfg` section:
```yaml
policy:
  dtensor_cfg:
    lora_cfg:
      enabled: False # Set to True to enable LoRA fine-tuning
      target_modules: [] # List of module names to apply LoRA to
      exclude_modules: [] # List of module names to exclude from LoRA
      match_all_linear: true # Apply LoRA to all linear layers
```
- **`dropout`** (float): Dropout probability for regularization
- **`dropout_position`** (str): Apply dropout before ("pre") or after ("post") LoRA
- **`lora_A_init`** (str): Initialization method for LoRA A matrix
- **`use_triton`** (bool): Use Triton-optimized kernels for better performance. Applies to the DTensor v2 path only. **Note**: [Automodel does not support Triton for TP > 1](https://github.com/NVIDIA-NeMo/Automodel/blob/b2db55eee98dfe81a8bfe5e23ac4e57afd8ab261/nemo_automodel/recipes/llm/train_ft.py#L199). Set this to `false` when `tensor_parallel_size > 1` to avoid compatibility issues, as shown in the sketch below.
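For tensor-parallel runs, a hedged sketch of combining these flags is shown below; the exact location of the `tensor_parallel_size` key is assumed here, so check your backend config for the real path:

```bash
# Sketch: LoRA with TP > 1 on DTensor v2, with Triton kernels disabled to avoid the
# Automodel limitation noted above. The tensor_parallel_size key path is an assumption.
uv run examples/run_sft.py \
  policy.dtensor_cfg._v2=true \
  policy.dtensor_cfg.lora_cfg.enabled=true \
  policy.dtensor_cfg.lora_cfg.use_triton=false \
  policy.dtensor_cfg.tensor_parallel_size=2
```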
### Example Usage
```bash
uv run examples/run_sft.py policy.dtensor_cfg.lora_cfg.enabled=true
```
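If you want LoRA on a subset of layers rather than every linear layer, a sketch combining `match_all_linear` and `target_modules` might look like this; the module names and the bracketed list-override syntax are illustrative and depend on your model and override parser:

```bash
# Sketch: apply LoRA only to the attention projections. Module names are illustrative
# and depend on the model being fine-tuned; the list-override syntax may differ.
uv run examples/run_sft.py \
  policy.dtensor_cfg.lora_cfg.enabled=true \
  policy.dtensor_cfg.lora_cfg.match_all_linear=false \
  'policy.dtensor_cfg.lora_cfg.target_modules=[q_proj,k_proj,v_proj,o_proj]'
```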
For more details on LoRA, see [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685).