Releases: sdpython/onnx-diagnostic

0.8.3

01 Dec 16:28
cccd7cd

  • #331: adds a helper to convert an onnx model into dot
  • #330: fixes access to rope_parameters for transformers>=5
  • #329: supports lists with OnnxruntimeEvaluator
  • #326: use ConcatFromSequence in LoopMHA with the loop
  • #325: adds plug for LoopMHA, extends the unit tests to measure the discrepancies
  • #324: supports FunctionProto with arguments in OnnxruntimeEvaluator
  • #323: drops torch 2.8 on CI
  • #322: support rerunning onnx kernels with torch intermediate results in side-by-side
  • #314: fixes the modelbuilder download, needed after the change in microsoft/onnxruntime-genai#1862
  • #311: use custom and local function to use PackedMultiHeadAttention from onnxruntime
  • #310: splits patches into multiple files
  • #308: add option --save_ep to dump the exported program as well as torch input
  • #304, #306, #316, #317, #318, #319: improves side-by-side comparison, creates command line sbs

0.8.2

14 Nov 17:28
a682d15

  • #303: fix inputs for summarization, feature extraction tasks
  • #302: adds helpers to analyse onnxruntime profiling
  • #297: experiments with a higher-order op, loop_for
  • #292, #293, #294, #295: first version of new patches for Qwen models

0.8.1

07 Nov 17:00
11f2c83

  • #290: adds one prompt for text2text-generation
  • #289: adds command line option --exppo to give the exporter additional options
  • #287: adds input 'inputs_prompt' to test an LLM, meant to be used during validation
  • #288: add .contiguous in torch.cond branch (attention patch for sdpa implementation)
  • #286: adds variable to track random nodes in models

0.8.0

03 Nov 14:05
455c998

  • #283: fix historical aggregation when multiple input sets are used
  • #282: adds tools to better understand which functions were patched
  • #280: fixes patches for sdpa_attention_forward for different versions of transformers
  • #278: implements onnx_generate_with_genai
  • #277: changes the serialization for all caches to reorder the model outputs (key_1, value_1, key_2, ...)
  • #276: implements onnx_generate, which provides the method generate for an onnx model
  • #275: fixes function patched_vmap
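
The output order mentioned in #277 (key_1, value_1, key_2, ...) can be illustrated with a small standalone sketch. The function name `interleave_cache` and its list-based inputs are hypothetical and are not part of onnx-diagnostic; the sketch only shows what interleaving per-layer keys and values looks like, assuming one key and one value per layer.

```python
from typing import Any, List, Tuple


def interleave_cache(keys: List[Any], values: List[Any]) -> List[Tuple[str, Any]]:
    """Flatten per-layer keys and values as key_1, value_1, key_2, value_2, ...

    Hypothetical helper for illustration only, not the library's API.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same number of layers")
    flat: List[Tuple[str, Any]] = []
    # Layers are numbered from 1 to match the naming used in the release note.
    for i, (k, v) in enumerate(zip(keys, values), start=1):
        flat.append((f"key_{i}", k))
        flat.append((f"value_{i}", v))
    return flat


names = [name for name, _ in interleave_cache(["k_a", "k_b"], ["v_a", "v_b"])]
print(names)
```

Grouping each layer's key with its value keeps the two tensors of a layer adjacent in the flattened list.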

0.7.16

24 Oct 21:27
3f08c5f

  • #273: enables export with FakeTensor
  • #272: makes patches work with FakeTensor
  • #270: add export sample code to export a specific model id with the appropriate inputs
  • #269: adds one unit test to track a patch fixing broadcast output shape
  • #267: patches sdpa_attention_forward because of a control flow (transformers>=5.0)
  • #266: makes patch_torch an integer in torch_export_patches to enable more patches

0.7.15

17 Oct 16:36
05a2e93

  • #264: allows validating a model with inputs defined from another task
  • #261: first updates to support transformers>=5.0, not yet completed

0.7.14

10 Oct 16:58
fde0173

  • #257: patch to disable one exception in pytorch
  • #256: extract subfolder from modelid//subfolder
  • #252: adds new sets of inputs for task text-generation
  • #250: add variables to track sequence nodes
  • #249: patches _maybe_broadcast to support a corner case

0.7.13

03 Oct 15:58
d8e0dd8

  • #247: supports more gemma models with ModelBuilder
  • #246: adds a set of inputs checking that models work with an empty cache on task text-generation
  • #237: dummy inputs for google/gemma-3-4b-it (task image-text-to-text)
  • #244: add a patch to bypass the exception raised when the dynamic dimension is in {0,1}

0.7.12

26 Sep 16:48
4866853

  • #232: fixes --patch argument so that --patch=0 works
  • #231: better statistics about fusions
  • #227: better support for model_id//pretrained, adds speed up when running command validate
  • #226: fix input order for models created with modelbuilder

0.7.11

19 Sep 17:25
c7afba2

  • #224: support model_id with // to specify a subfolder
  • #223: adds task image-to-video
  • #220: adds option --ort-logs to display onnxruntime logs when creating the session
  • #220: adds a patch for PR huggingface/transformers#40791 in transformers
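
The `model_id//subfolder` convention from #224 can be sketched as a plain string split. The helper `split_model_id` and the example identifier below are hypothetical and do not reflect how onnx-diagnostic implements it; the sketch assumes the first `//` separates the repository id from the subfolder.

```python
from typing import Optional, Tuple


def split_model_id(model_id: str) -> Tuple[str, Optional[str]]:
    """Split 'repo_id//subfolder' into (repo_id, subfolder).

    Hypothetical helper for illustration only, not the library's API.
    Returns (model_id, None) when no non-empty subfolder is given.
    """
    repo_id, sep, subfolder = model_id.partition("//")
    if sep and subfolder:
        return repo_id, subfolder
    return repo_id, None


# The identifier below is made up for the example.
print(split_model_id("some-org/some-model//transformer"))
print(split_model_id("some-org/some-model"))
```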

0.7.10

  • #218: patches sdpa_mask_recent_torch, used from _vmap_for_bhqkv

0.7.9

  • #214: fix modelbuilder export
  • #213: use DYNAMIC on batch size

0.7.8

  • #210: add utilities to investigate models
  • #208: add a patch for Qwen3 (rewrite a loop)