v0.2.1: NxD refactoring
What's Changed
Inference
- Add qwen2 nxd by @dacorvo in #863
- Support Qwen3 by @jlonge4 in #847
- Add support for phi3 models using the nxd backend by @dacorvo in #867
- Add pixart models to cache CI by @JingyaHuang in #869
- Add granite nxd modeling and remove HLO backend by @dacorvo in #873
- chore(mixtral): align compile options to NXDi by @tengomucho in #875
- Refactoring T5 implementation for NxD support by @JingyaHuang in #876
- Improve diffusers cache CIs by @JingyaHuang in #872
Training
- Initial PR for peft by @michaelbenayoun in #839
- Support for PP with custom modeling by @michaelbenayoun in #857
- Cleanup legacy parallelism support by @michaelbenayoun in #866
- Fix workflows for training by @tengomucho in #874
- Remove optimum/neuron/distributed by @michaelbenayoun in #877
General
Documentation
- Update guidellm version to reproduce examples properly by @jlonge4 in #852
- Tutorial for Qwen3 Fine-tuning by @tengomucho in #865
New Contributors
Full Changelog: v0.2.0...v0.2.1