
Releases: foundation-model-stack/fms-hf-tuning

v3.1.1-rc1

08 Dec 09:36
890725c


Pre-release

What's Changed

Full Changelog: v3.1.0...v3.1.1-rc1

v3.1.0

11 Nov 17:56
9aca213


Image: quay.io/modh/fms-hf-tuning:v3.1.0

Summary

  1. Support the GPT-OSS class of models.
  2. Support the Granite 4 series of models.
  3. Support Mamba and hybrid-architecture models.
  4. Support Flash Attention 3 via the Hugging Face kernel hub.
  5. Support loading MxFP4-quantized models. Models in MxFP4 must be dequantized before training, as MxFP4 training is not supported in Hugging Face (see the sketch after this list).
  6. Fixed a major performance bug that caused memory usage to double in a few cases (#592).
  7. Support aLoRA (Activated LoRA) directly from upstream PEFT.
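A minimal sketch of the dequantize-on-load step referenced in item 5, assuming the installed Transformers provides Mxfp4Config with a dequantize flag; the checkpoint id is illustrative:

    # Sketch: load an MxFP4-quantized checkpoint dequantized to bf16 so it can be tuned.
    # Assumes transformers exposes Mxfp4Config(dequantize=...); the model id is illustrative.
    import torch
    from transformers import AutoModelForCausalLM, Mxfp4Config

    model = AutoModelForCausalLM.from_pretrained(
        "openai/gpt-oss-20b",                        # checkpoint stored in MxFP4
        quantization_config=Mxfp4Config(dequantize=True),
        torch_dtype=torch.bfloat16,
    )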

Data Preprocessor changes

  1. Supports passing a chat template as a .jinja file inside the data config (see the sketch after this list).
  2. Improved documentation and various bug fixes.
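A minimal sketch of a data config that points at a .jinja chat template file, as described in item 1; the key names and paths are illustrative and may differ from the exact schema:

    # data_config.yaml (sketch; key names and paths are illustrative)
    dataprocessor:
      type: default
      chat_template: templates/granite_chat.jinja   # .jinja file instead of an inline template string
    datasets:
      - name: chat_dataset
        data_paths:
          - data/train.jsonl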

Additional Changes

  1. Supports an experimental Online Data Mixing plugin, as presented at PyTorch Conference '25 by the IBM Research team: @kmehant @romitjain @seshapad

List of Changes

New Contributors

Full Changelog: v3.0.0...v3.1.0

v3.0.0

22 Jul 13:33
d8cb1cb


Image: quay.io/modh/fms-hf-tuning:v3.0.0

Summary of Changes

Activated LoRA Support

  • Support for Activated LoRA model tuning
  • Usage is very similar to standard LoRA, with the key difference that an invocation_string must be specified
  • Available by setting --peft_method to alora (see the sketch after this list)
  • Inference with aLoRA models requires ensuring that the invocation string is present in the input
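A minimal sketch of an aLoRA tuning run, assuming the standard sft_trainer entry point; apart from --peft_method alora, the flag names and values shown are illustrative:

    # Sketch: aLoRA tuning via the sft_trainer entry point.
    # --peft_method alora comes from the notes above; other flags and values are illustrative.
    python -m tuning.sft_trainer \
        --model_name_or_path ibm-granite/granite-3.1-8b-instruct \
        --training_data_path data/train.jsonl \
        --output_dir output/alora-run \
        --peft_method alora \
        --invocation_string "<|assistant|>"

At inference time, the same invocation string must appear in the input so the adapter is activated.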

Data Preprocessor Changes

  • Breaking changes to the data preprocessor interface, which now uses conventional handler and parameter names from HF datasets in data configs
  • rename and retain are now their own data handlers, not data config parameters
  • Added flexible train/test dataset splitting via the split parameter in data configs (see the sketch after this list)
  • Merged the offline data preprocessor script into the main library; preprocessing alone can now be run with --do_dataprocessing_only
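A minimal sketch of the split parameter inside a data config, as mentioned above; the key names and ratios are illustrative and may differ from the exact schema:

    # data_config.yaml (sketch; key names and ratios are illustrative)
    datasets:
      - name: chat_dataset
        data_paths:
          - data/all.jsonl
        split:
          train: 0.9        # fraction of the dataset used for training
          validation: 0.1   # fraction held out for evaluation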

Dependency Updates

  • peft from <0.14 to <0.15.2
  • flash-attn from <3.0 to <2.8
  • accelerate from <1.1 to <1.7
  • transformers from <4.51 to <=4.54.4
  • torch from <2.5 to <2.7

Additional Changes

  • Updates to the tracker framework; addition of a ClearML tracker

What's Changed

New Contributors

Full Changelog: v2.8.2...v3.0.0

v3.0.0-rc.2

21 Jul 15:43
8c16f2d


Pre-release

What's Changed

New Contributors

Full Changelog: v2.8.2...v3.0.0-rc.2

v3.0.0-rc.1

11 Jul 18:10
20185d1


Pre-release

What's Changed

New Contributors

Full Changelog: v2.8.2-rc.1...v3.0.0-rc.1

v2.8.2

30 Apr 20:03
ad594c7


Image: quay.io/modh/fms-hf-tuning:v2.8.2

Summary of Changes

Vision Model Tuning Support

  • Added support for full and LoRA tuning of vision-language models (Granite Vision, Llama Vision, LLaVA) using a chat-style image+text dataset format, with image and text field customization and model-specific configurations.
  • For vision model tuning, the --dataset_image_field flag has been added to select the column that contains images.
  • For vision model tuning, set "--gradient_checkpointing_kwargs": {"use_reentrant": false} as well as "accelerate_launch_args": { "fsdp_transformer_layer_cls_to_wrap": "<DecoderLayer>"} based on the model's architecture.
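Putting the above together, a minimal sketch of the relevant fragment of a tuning config for a vision model; gradient_checkpointing is implied by the kwargs, and the image column name and wrapped decoder-layer class are model-specific, illustrative values:

    {
      "gradient_checkpointing": true,
      "gradient_checkpointing_kwargs": { "use_reentrant": false },
      "dataset_image_field": "images",
      "accelerate_launch_args": {
        "fsdp_transformer_layer_cls_to_wrap": "LlamaDecoderLayer"
      }
    }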

ScatterMoE Updates

  • With the latest release of fms-acceleration, ScatterMoE for LoRA has been enabled for attention layers.
  • ScatterMoE has been added to the tuning image by default and no longer requires an additional install.
  • The new --fast_moe interface now accepts either an int or a bool (see the sketch after this list).
    • If a bool is passed, MoE kernels are toggled and expert shards are set to one.
    • If an int is passed, MoE kernels are turned on and expert shards are set to the value passed.
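For example, both of the following forms would be accepted in a tuning config (a sketch; surrounding keys are omitted). The first toggles the MoE kernels with expert shards defaulting to one; the second turns the kernels on and shards experts four ways:

    "fast_moe": true
    "fast_moe": 4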

Data Preprocessor

  • Templates and strings passed through the CLI are now correctly un-escaped.
  • Support for selecting a specific field from the dataset that contains multi-turn dialogue data by specifying --conversation_column.
  • Added an OpenInstruct-style data handler that applies the chat template with masking outside of the data collator: tokenize_and_apply_chat_template_with_masking.
  • Allow specifying the chat template as base64 to avoid escaping and templating issues (see the sketch after this list).
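A minimal sketch of producing the base64 form of a chat template; the template path is illustrative, and the exact flag or config field that consumes the encoded value may differ:

    # Sketch: base64-encode a Jinja chat template to sidestep shell escaping issues.
    # Pass the printed value wherever a chat template string is accepted.
    import base64
    from pathlib import Path

    template = Path("templates/granite_chat.jinja").read_text(encoding="utf-8")
    print(base64.b64encode(template.encode("utf-8")).decode("ascii"))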

Dependency Updates

  • trl from <0.15 to <0.18
  • pillow <0.12 added
  • transformers locked at <4.51

Additional Changes

  • Experimental support for sum loss trainer.

What's Changed

New Contributors

Full Changelog: v2.7.1...v2.8.2

v2.8.2-rc.1

30 Apr 18:52
dc77c63


Pre-release

Full Changelog: v2.8.1...v2.8.2-rc.1

v2.8.1

28 Apr 22:00
4fa54e1


We recommend using v2.8.2, which includes a bug fix for LoRA tuning. To view the set of changes, see v2.8.2.

Full Changelog: v2.7.1...v2.8.1

v2.8.1-rc.1

28 Apr 21:31
3c91290


Pre-release

What's Changed

  • feat: allow specifying the chat template as base64 to avoid weird escaping and templating issues by @HarikrishnanBalagopal in #534
  • ci: Install dnf-plugins-core in cuda-base stage by @aluu317 in #542

Full Changelog: v2.8.0...v2.8.1-rc.1

v2.8.0

28 Apr 18:07
3c17e8e


We recommend using v2.8.2; an additional dependency update was needed for the image upload. To view the set of changes, see v2.8.1.