feat: add online data mixing plugin #612
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -470,6 +470,7 @@ The list of configurations for various `fms_acceleration` plugins: | |
| - `--multipack`: technique for *multi-gpu training* to balance out number of tokens processed in each device, to minimize waiting time. | ||
| - [fast_moe_config](./tuning/config/acceleration_configs/fast_moe.py) (experimental): | ||
| - `--fast_moe`: trains MoE models in parallel with [Scatter MoE kernels](https://github.com/foundation-model-stack/fms-acceleration/tree/main/plugins/accelerated-moe#fms-acceleration-for-mixture-of-experts), increasing throughput and decreasing memory usage. | ||
| - [odm_config](./tuning/config/acceleration_configs/odm.py) (experimental): This plugin dynamically mixes datasets online during training, adapting the mix to training signals. See [advanced data preprocessing](./advanced-data-preprocessing.md#online-data-mixing) for usage with `data_config`. | ||
| Notes: | ||
| * `quantized_lora_config` requires that it be used along with LoRA tuning technique. See [LoRA tuning section](https://github.com/foundation-model-stack/fms-hf-tuning/tree/main?tab=readme-ov-file#lora-tuning-example) on the LoRA parameters to pass. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| dataprocessor: | ||
| type: odm | ||
| sampling_stopping_strategy: first_exhausted # ignored | ||
| seed: 66 | ||
| odm: | ||
| update_interval: 1 # update every step | ||
| sampling_interval: 1 # sample category for every sample | ||
| reward_type: validation_loss # uses eval loss of each dataset as reward | ||
| gamma: 0.1 # MAB hyper-parameter | ||
| eta: 0.2 # MAB hyper-parameter | ||
| datasets: | ||
| - name: dataset_1 | ||
| split: | ||
| train: 0.8 | ||
| validation: 0.2 | ||
| sampling: 0.3 # ignored | ||
| data_paths: | ||
| - "FILE_PATH" | ||
| data_handlers: | ||
| - name: tokenize_and_apply_input_masking | ||
| arguments: | ||
| remove_columns: all | ||
| batched: false | ||
| fn_kwargs: | ||
| input_column_name: input | ||
| output_column_name: output | ||
| - name: dataset_2 | ||
| split: | ||
| train: 0.6 | ||
| validation: 0.2 | ||
| sampling: 0.4 # ignored | ||
| data_paths: | ||
| - "FILE_PATH" | ||
| data_handlers: | ||
| - name: tokenize_and_apply_input_masking | ||
| arguments: | ||
| remove_columns: all | ||
| batched: false | ||
| fn_kwargs: | ||
| input_column_name: input | ||
| output_column_name: output | ||
| - name: dataset_3 | ||
| split: | ||
| train: 0.4 | ||
| validation: 0.1 | ||
| sampling: 0.3 # ignored | ||
| data_paths: | ||
| - "FILE_PATH" | ||
| data_handlers: | ||
| - name: tokenize_and_apply_input_masking | ||
| arguments: | ||
| remove_columns: all | ||
| batched: false | ||
| fn_kwargs: | ||
| input_column_name: input | ||
| output_column_name: output | ||
| - name: dataset_4 | ||
| split: | ||
| train: 0.0 | ||
| validation: 0.3 # ignored | ||
| data_paths: | ||
| - "FILE_PATH" | ||
| data_handlers: | ||
| - name: tokenize_and_apply_input_masking | ||
| arguments: | ||
| remove_columns: all | ||
| batched: false | ||
| fn_kwargs: | ||
| input_column_name: input | ||
| output_column_name: output | ||
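The config above labels `gamma` and `eta` only as "MAB hyper-parameter" and does not say which bandit algorithm the plugin uses. As a rough illustration of how online mixing could work, here is a minimal sketch assuming an Exp3-style multi-armed bandit, with `gamma` read as the exploration rate and `eta` as the weight-update step size. The class name `Exp3Mixer`, its methods, and the reward scaling are hypothetical and are not taken from the plugin.

```python
import math
import random


class Exp3Mixer:
    """Illustrative Exp3-style sampler over dataset categories.

    Hypothetical sketch: not the plugin's implementation.
    """

    def __init__(self, names, gamma=0.1, eta=0.2, seed=66):
        self.names = list(names)
        self.gamma = gamma  # assumed meaning: share of uniform exploration
        self.eta = eta      # assumed meaning: step size for weight updates
        self.log_weights = [0.0] * len(self.names)
        self.rng = random.Random(seed)

    def probabilities(self):
        # Softmax over log-weights, mixed with a uniform exploration term.
        top = max(self.log_weights)
        exp_w = [math.exp(w - top) for w in self.log_weights]
        total = sum(exp_w)
        k = len(self.names)
        return [(1 - self.gamma) * w / total + self.gamma / k for w in exp_w]

    def sample(self):
        # Draw the index of the dataset to take the next sample from.
        probs = self.probabilities()
        return self.rng.choices(range(len(self.names)), weights=probs, k=1)[0]

    def update(self, idx, reward):
        # Importance-weighted update: rarely sampled arms get larger steps.
        p = self.probabilities()[idx]
        self.log_weights[idx] += self.eta * reward / p


if __name__ == "__main__":
    mixer = Exp3Mixer(["dataset_1", "dataset_2", "dataset_3"])
    for _ in range(5):
        arm = mixer.sample()
        # Placeholder reward in [0, 1]; a real run would derive it from a
        # training signal such as the per-dataset validation loss.
        mixer.update(arm, reward=0.5)
    print(dict(zip(mixer.names, mixer.probabilities())))
```

With `update_interval: 1` and `sampling_interval: 1`, a loop like the one above would call `sample()` for every training sample and `update()` after every step. How `reward_type: validation_loss` is turned into a bounded reward is left open here.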
Can we make ODM a separate document so it's easy for users to find?
Done, I have made it into a new doc and changed references accordingly.