Release v3.0.0 · foundation-model-stack/fms-hf-tuning

Image: quay.io/modh/fms-hf-tuning:v3.0.0

Summary of Changes

Activated LoRA Support

Support for Activated LoRA model tuning
Usage is very similar to standard LoRA, with the key difference that an invocation_string must be specified
Available by setting --peft_method to alora
Inference with aLoRA models requires insuring that the invocation string is present in the input

Data Preprocessor Changes

Breaking Changes to the data preprocessor interface, now utilizing conventional handler and parameter names from HF datasets in data configs
rename and retain are now their own data handlers, not data config parameters
Add flexible train/test dataset splitting by using the split parameter in data configs
Merge offline data preprocessor script into main library, can now only preprocess data using --do_dataprocessing_only

Dependency Updates

peft from <0.14 to <0.15.2
flash-attn from <3.0 to <2.8
accelerate from <1.1 to <1.7
transformers from <4.51 to <=4.54.4
torch from <2.5 to <2.7

Additional Changes

Updates to tracker framework, additon of ClearML tracker

What's Changed

docs: add instructions on how to correctly specify the chat template by @HarikrishnanBalagopal in #549
feat: Data Handling v3 (Breaking change for data config interface) by @dushyantbehl in #494
docs: Update model architecture in README by @aluu317 in #550
fix: issues related to providing 2 datasets with diff types by @HarikrishnanBalagopal in #554
docs: Added gradient checkpointing to docs by @Luka-D in #552
feat: Add ALoRA support by @kgreenewald in #513
feat: make activated LoRA an optional flag in the Dockerfile by @HarikrishnanBalagopal in #555
fix: saving logic for alora by @kmehant in #559
fix: decouple and update concatenate_dataset functionality from load_dataset by @YashasviChaurasia in #557
chore: Upgrade transformers, torch, and accelerate version by @Akash-Nayak in #561
build(deps): Update peft requirement from <=0.14,>=0.8.0 to >=0.8.0,<=0.15.2 by @dependabot[bot] in #556
feat: add train_test_split functionality via dataconfig by @YashasviChaurasia in #560
fix: docs and minor code by @dushyantbehl in #570
fix: Update flash-attn version constraint to <2.8 for compatibility by @Akash-Nayak in #571
feat: update tracking framework to make it more flexible. Add clearml tracker by @dushyantbehl in #568
fix: Remove the additional closing curly bracket by @Akash-Nayak in #572
fix: Add ENABLE_MLFLOW build argument to Dockerfile to control MLflow integration by @Akash-Nayak in #573
fix: typo and enhanced warning message for jinja and chat template rendering by @dushyantbehl in #574
fix: trackers should be used only on main process by @dushyantbehl in #578
fix: Decouple offline data processing from collators by @dushyantbehl in #579
feat: merge offline processing into the main library by @dushyantbehl in #580
fix: change logging level to info and print flat arguments by @dushyantbehl in #582
feat: add error handling for split dataset feat by @YashasviChaurasia in #581
feat: TC Event to handle final checkpoint by @seshapad in #558
fix: Restructure and rewite sampling logic to be compatible with split. by @dushyantbehl in #587
chore(release): merge set of changes for v3.0.0 by @willmj in #588

New Contributors

@kgreenewald made their first contribution in #513
@Akash-Nayak made their first contribution in #561

Full Changelog: v2.8.2...v3.0.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v3.0.0

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Summary of Changes

Activated LoRA Support

Data Preprocessor Changes

Dependency Updates

Additional Changes

What's Changed

New Contributors

Contributors

Uh oh!