v0.5.7: Gemma3 Support, XPU Tuning Enhancements, GRPO Improvements, and API Compatibility Fixes
What's Changed
- Gemma3 (Text and Multimodal) by @eljandoubi in #621
- Make FLCE compatible with latest XXXForCausalLM.forward() APIs by @Tcc0403 in #596
- do bias addition in tests in float32 to make testing code similar to torch compile by @shivam15s in #655
- [CI] fix siglip dummy config by @yundai424 in #658
- add XPU tuning to JSD by @rmukhopa in #649
- add XPU tuning to Rmsnorm and Layernorm by @Tarakarevu1 in #653
- Fix imports without transformers by @vaibhavjindal in #659
- Use TYPE_CHECKING to fix static-only imports in IDEs etc by @vaibhavjindal in #660
- [kl_div] Modified block and warp sizes for improved performance by @jgtong in #654
- [GRPO] add support for different loss types by @kashif in #662
- Remove unexpected kwargs passing to flce by @Tcc0403 in #651
- reduce number of tests for grpo by @shivam15s in #663
- Update pyproject.toml by @shivam15s in #665
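
For readers unfamiliar with the pattern named in #660, here is a minimal sketch of a TYPE_CHECKING-guarded import. It is a generic illustration of the technique, not the code from that PR: `describe_model` is a hypothetical helper, and `PreTrainedModel` stands in for any type from an optional dependency.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static analyzers and IDEs; skipped at runtime,
    # so this module still imports when transformers is not installed.
    from transformers import PreTrainedModel


def describe_model(model: "PreTrainedModel") -> str:
    # Hypothetical helper: the quoted annotation is never evaluated at
    # runtime, so the guarded name does not need to exist here.
    return type(model).__name__
```

The string annotation keeps the type visible to tooling while leaving the runtime import path free of the optional dependency.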
New Contributors
- @rmukhopa made their first contribution in #649
- @Tarakarevu1 made their first contribution in #653
- @jgtong made their first contribution in #654
Full Changelog: v0.5.6...v0.5.7