v0.3.0
What's Changed
- Test decoder with long context by @kcirred in #117
- Get criteria from DT artifact by @lupalby in #118
- Open gpt.json read-only to support parallel reading by @gpaulsen in #126
- Drive Paged Program Script enhancements by @JRosenkranz in #128
- [dpp] Eliminate pad_token_id from print output by @kcirred in #130
- Add the ability to enforce homogeneous program ids in prefill in DPP script by @JRosenkranz in #131
- Update test scripts to work with the 4-layer micro model by @JRosenkranz in #134
- Fix inference.py for batch size 1 symbolic SDPA by @JRosenkranz in #135
- Make limits more flexible by @ani300 in #138
- Allow specific user prompts in DPP script by @JRosenkranz in #139
- Fix warmup to match vLLM by @JRosenkranz in #141
- Add ability in DPP script to select one or many programs that satisfy min batch and min sequence requirements by @JRosenkranz in #137
- Fix paged generate with too much padding by @ani300 in #142
- Fix incorrect token count after sampling caused by clean_up_tokenization_spaces=True (the default) by @JRosenkranz in #143 (see the sketch after this list)
- Fix issue where program_id was an int when it should have been a string by @JRosenkranz in #144
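
As background for the tokenization entry (#143), the sketch below shows how clean_up_tokenization_spaces changes decoded text and can break token-count accounting; it assumes a Hugging Face transformers tokenizer, and the gpt2 model name and example string are illustrative only, not taken from the PR.

```python
# Minimal sketch, assuming a Hugging Face transformers tokenizer; the model
# name and example text are illustrative, not taken from #143.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

# Token ids as they might come out of sampling.
sampled_ids = tok("I do n't know .").input_ids

# With cleanup enabled, decode() rewrites spacing around punctuation and
# contractions, so re-encoding the decoded text may not reproduce sampled_ids.
cleaned_text = tok.decode(sampled_ids, clean_up_tokenization_spaces=True)
raw_text = tok.decode(sampled_ids, clean_up_tokenization_spaces=False)

print(len(sampled_ids), len(tok(cleaned_text).input_ids))  # counts can differ
print(len(sampled_ids), len(tok(raw_text).input_ids))      # counts should match
```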
Full Changelog: v0.2.3...v0.3.0