v0.2.1: NxD refactoring

dacorvo released this 27 Jun 07:58

c2f557c

What's Changed

Inference

Add qwen2 nxd by @dacorvo in #863
Support Qwen3 by @jlonge4 in #847
Add support for phi3 models using the nxd backend by @dacorvo in #867
Add pixart models to cache CI by @JingyaHuang in #869
Add granite nxd modeling and remove HLO backend by @dacorvo in #873
chore(mixtral): align compile options to NXDi by @tengomucho in #875
Refactoring T5 implementation for NxD support by @JingyaHuang in #876
Improve diffusers cache CIs by @JingyaHuang in #872

Training

Initial PR for peft by @michaelbenayoun in #839
Support for PP with custom modeling by @michaelbenayoun in #857
Cleanup legacy parallelism support by @michaelbenayoun in #866
Fix workflows for training by @tengomucho in #874
Remove optimum/neuron/distributed by @michaelbenayoun in #877

General

feat: add a tool to decode binary HLOs by @dacorvo in #870

Documentation

update guidellm version to reproduce examples properly by @jlonge4 in #852
Tutorial for Qwen3 Fine-tuning by @tengomucho in #865

New Contributors

@jlonge4 made their first contribution in #852

Full Changelog: v0.2.0...v0.2.1
