
v0.3.0


@tharapalanivel released this 25 Sep 15:02
· 156 commits to main since this release
4b7f51b

What's Changed

  • Test decoder long ctx by @kcirred in #117
  • Get criteria from DT artifact by @lupalby in #118
  • Open gpt.json read-only to support parallel reading by @gpaulsen in #126
  • Drive Paged Program Script enhancements by @JRosenkranz in #128
  • [dpp] eliminated pad_token_id from print by @kcirred in #130
  • Add the ability to enforce homogeneous program ids in prefill in DPP script by @JRosenkranz in #131
  • Update test scripts to work with the 4-layer micro model by @JRosenkranz in #134
  • Fixed inference.py for batch size 1 symbolic SDPA by @JRosenkranz in #135
  • Make limits more flexible by @ani300 in #138
  • Allow specific user prompts in DPP script by @JRosenkranz in #139
  • Fix warmup to match vllm by @JRosenkranz in #141
  • Add ability in DPP script to select one or many programs that satisfy min batch and min sequence requirements by @JRosenkranz in #137
  • Fix paged generate with too much padding by @ani300 in #142
  • Fix for clean_up_tokenization_spaces=True (the default) causing an incorrect number of tokens after sampling (see the sketch after this list) by @JRosenkranz in #143
  • Fixed issue where program_id was an int when it should have been a string by @JRosenkranz in #144
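
The tokenization-cleanup fix in #143 concerns the Hugging Face tokenizer option clean_up_tokenization_spaces: when cleanup is applied during decode, spaces around punctuation can be dropped, so re-encoding the decoded text may not reproduce the sampled token ids or their count. A minimal sketch of that round-trip mismatch, assuming a standard Hugging Face tokenizer (the gpt2 tokenizer and example text are illustrative only, not from this release):

```python
from transformers import AutoTokenizer

# Illustrative tokenizer; any Hugging Face tokenizer shows the effect.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Token ids for text containing spaces around punctuation,
# standing in for ids produced by sampling.
token_ids = tokenizer("Hello , world !").input_ids

# With cleanup enabled, decode may collapse spaces around punctuation;
# with it disabled, the decoded text is left untouched.
cleaned = tokenizer.decode(token_ids, clean_up_tokenization_spaces=True)
raw = tokenizer.decode(token_ids, clean_up_tokenization_spaces=False)

# Re-encode and compare: the cleaned text often maps to different ids
# (and possibly a different count), while the raw text round-trips exactly.
print(tokenizer(cleaned).input_ids == token_ids)  # often False
print(tokenizer(raw).input_ids == token_ids)      # True
```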

New Contributors

Full Changelog: v0.2.3...v0.3.0