* Please refer to our document on [training](./docs/training.md) to see how to start [Single GPU](./docs/training.md#single-gpu) or [Multi-GPU](./docs/training.md#multiple-gpus-with-fsdp) runs with fms-hf-tuning.
* You can also refer to a different [section](./docs/training.md#tips-on-parameters-to-set) of the same document for tips on setting various training arguments.
### *Debug recommendation:*
While training, if you encounter flash-attn errors such as `undefined symbol`, you can follow the steps below for a clean installation of the flash-attn binaries. This may occur when multiple environments share the pip cache directory, or when the torch version is updated.
For each tuning technique, we run testing on a single large model of each architecture type and claim support for the smaller models. For example, with the QLoRA technique, we tested on granite-34b GPTBigCode and claim support for granite-20b-multilingual.

While we expect most Hugging Face decoder models to work, we have primarily tested fine-tuning on the Granite, Llama, Mistral, and GPT-OSS families of models.
- LoRA layers supported: all the linear layers of a model, plus the output `lm_head` layer. Users can specify layers as a list or use `all-linear` as a shortcut. Layers are specific to a model architecture and can be specified as noted [here](https://github.com/foundation-model-stack/fms-hf-tuning?tab=readme-ov-file#lora-tuning-example).
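As an illustration, these two options map onto the `target_modules` field of the underlying Hugging Face PEFT library's `LoraConfig`. This is a minimal sketch rather than the exact fms-hf-tuning invocation, and the module names are placeholders for a Llama-style architecture:

```python
# Minimal sketch using Hugging Face PEFT directly; fms-hf-tuning exposes the
# same choice through its own LoRA configuration (see the linked example).
from peft import LoraConfig

# Option 1: list the target layers explicitly. Module names are
# architecture-specific; "q_proj"/"v_proj" are typical for Llama-style models.
explicit_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])

# Option 2: use the "all-linear" shortcut instead of naming layers one by one.
all_linear_config = LoraConfig(r=8, lora_alpha=16, target_modules="all-linear")
```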
For each of the requested trackers, the code expects you to pass a config to the `sft_trainer.train` function, which can be specified through the `tracker_configs` argument [here](https://github.com/foundation-model-stack/fms-hf-tuning/blob/a9b8ec8d1d50211873e63fa4641054f704be8712/tuning/sft_trainer.py#L78); details are given below.
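As an example, a call could be shaped like the sketch below. The config class and field names here (`TrackerConfigFactory`, `AimConfig`, and the argument objects) are assumptions drawn from the description above, not verified signatures; consult `sft_trainer.py` and the experiment-tracking document for the authoritative API:

```python
# Hypothetical sketch: the config class and field names are assumptions,
# not the verified fms-hf-tuning API; check sft_trainer.py for the real one.
from tuning import sft_trainer
from tuning.config.tracker_configs import AimConfig, TrackerConfigFactory

tracker_configs = TrackerConfigFactory(
    aim_config=AimConfig(experiment="my-experiment", aim_repo="/tmp/aim"),
)

sft_trainer.train(
    model_args=model_args,    # model arguments, defined elsewhere
    data_args=data_args,      # dataset arguments, defined elsewhere
    train_args=train_args,    # training arguments, defined elsewhere
    tracker_configs=tracker_configs,
)
```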
Experiment tracking in fms-hf-tuning allows users to track their experiments with a number of supported trackers.
The code currently supports the following trackers out of the box:
* `FileLoggingTracker`: A built-in tracker which supports logging training loss to a file.
  - Since this is built in, there is nothing to install.
* `Aimstack`: A popular open-source tracker which can be used to track any metrics or metadata from the experiments.
  - Install by running `pip install fms-hf-tuning[aim]`
* `MLflow Tracking`: Another popular open-source tracker which stores metrics, metadata, or even artifacts from experiments.
  - Install by running `pip install fms-hf-tuning[mlflow]`
* `ClearML Tracking`: Another open-source tracker which stores metrics, metadata, or even artifacts from experiments.
  - Install by running `pip install fms-hf-tuning[clearml]`
Note: All trackers expect some arguments or can be customized by passing command-line arguments, which are described in our document on [experiment tracking](./experiment-tracking.md). Refer to that document for further details on enabling and using the trackers.