
v2.8.2


@willmj willmj released this 30 Apr 20:03
ad594c7

Image: quay.io/modh/fms-hf-tuning:v2.8.2

Summary of Changes

Vision Model Tuning Support

  • Added support for full and LoRA tuning of vision-language models (granite vision, llama vision, llava) using a chat-style image+text dataset format, with image and text field customization and model-specific configurations.
  • For vision model tuning, the --dataset_image_field flag has been added to select the column which contains images.
  • For vision model tuning, set "--gradient_checkpointing_kwargs": {"use_reentrant": false} as well as "accelerate_launch_args": { "fsdp_transformer_layer_cls_to_wrap": "<DecoderLayer>"} based on the model's architecture.
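A minimal sketch of how these flags might be combined in a tuning config. Only `dataset_image_field`, `gradient_checkpointing_kwargs`, and `fsdp_transformer_layer_cls_to_wrap` come from the notes above; the model name and the `LlamaDecoderLayer` class are illustrative placeholders for whichever architecture you tune:

```python
import json

# Hypothetical tuning config combining the vision-tuning flags described above.
# "LlamaDecoderLayer" is an example only; substitute the decoder layer class
# of the model architecture you are actually tuning.
config = {
    "model_name_or_path": "llava-hf/llava-1.5-7b-hf",  # example model
    "dataset_image_field": "images",                   # column holding images
    "gradient_checkpointing_kwargs": {"use_reentrant": False},
    "accelerate_launch_args": {
        "fsdp_transformer_layer_cls_to_wrap": "LlamaDecoderLayer"
    },
}

print(json.dumps(config, indent=2))
```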

ScatterMoE Updates

  • With the latest release of fms-acceleration, ScatterMoE for LoRA has been enabled for attention layers.
  • ScatterMoE is now included in the tuning image by default and no longer requires an additional install.
  • The --fast_moe config now accepts either an int or a bool.
    • If a bool is passed, it toggles the MoE kernels and expert shards are set to one.
    • If an int is passed, MoE kernels are turned on and expert shards are set to the value passed.
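The int-or-bool rule above can be sketched as a small normalization helper (illustrative only, not the library's actual implementation; the function name is made up):

```python
def resolve_fast_moe(value):
    """Normalize a --fast_moe setting per the rules above.

    Returns (moe_kernels_enabled, expert_shards).
    """
    # Check bool first: in Python, bool is a subclass of int, so an
    # isinstance(value, int) test alone would also match True/False.
    if isinstance(value, bool):
        # A bool toggles the MoE kernels; expert shards default to one.
        return value, 1
    if isinstance(value, int):
        # An int turns the kernels on and sets the shard count to that value.
        return True, value
    raise TypeError("--fast_moe expects an int or a bool")

print(resolve_fast_moe(True))   # (True, 1)
print(resolve_fast_moe(4))      # (True, 4)
```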

Data PreProcessor

  • Templates and strings passed through the CLI are now correctly un-escaped.
  • Support for selecting a specific field from the dataset that contains multi-turn dialogue data by specifying --conversation_column.
  • Added an OpenInstruct-style data handler, tokenize_and_apply_chat_template_with_masking, which applies the chat template and masks outside of the data collator.
  • Allow specifying the chat template as base64 to avoid escaping and templating issues.
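As a sketch of why base64 helps here: a chat template full of Jinja braces and quotes round-trips losslessly through base64, sidestepping shell and JSON escaping. The template string below is just an example, and the decode step only mirrors what the tuner would do on its side:

```python
import base64

# A Jinja-style chat template with characters that are easy to mangle in
# shell quoting; encoding it as base64 avoids escaping it at all.
chat_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)

encoded = base64.b64encode(chat_template.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")
assert decoded == chat_template  # lossless round trip

print(encoded)
```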

Dependency Updates

  • trl from <0.15 to <0.18
  • pillow <0.12 added
  • transformers locked at <4.51

Additional Changes

  • Experimental support for sum loss trainer.

What's Changed

New Contributors

Full Changelog: v2.7.1...v2.8.2