Releases · sdpython/onnx-diagnostic · GitHub

01 Dec 16:28

sdpython

0.8.3 Latest

#331: adds a helper to convert an onnx model into dot
#330: fixes access rope_parameters for transformers>=5
#329: supports lists with OnnxruntimeEvaluator
#326: use ConcatFromSequence in LoopMHA with the loop
#325: adds plug for LoopMHA, extends the unit tests to measure the discrepancies
#324: supports FunctionProto with arguments in OnnxruntimeEvaluator
#323: drops torch 2.8 on CI
#322: support rerunning onnx kernels with torch intermediate results in side-by-side
#314: fix modelbuilder download, needed after the change in microsoft/onnxruntime-genai#1862
#311: use custom and local function to use PackedMultiHeadAttention from onnxruntime
#310: splits patches into multiple files
#308: add option --save_ep to dump the exported program as well as torch input
#304, #306, #316, #317, #318, #319: improves side-by-side comparison, creates command line sbs

14 Nov 17:28

sdpython

0.8.2

#303: fix inputs for summarization, feature extraction tasks
#302: adds helpers to analyse onnxruntime profiling
#297: experiment around a higher order op loop_for
#292, #293, #294, #295: first version of new patches for Qwen models

07 Nov 17:00

sdpython

0.8.1

#290: adds one prompt for text2text-generation
#289: adds command line options --exppo to give the exporter additional options
#287: adds input 'inputs_prompt' to test a LLM, meant to be used during validation
#288: add .contiguous in torch.cond branch (attention patch for sdpa implementation)
#286: adds variable to track random nodes in models

03 Nov 14:05

sdpython

0.8.0

#283: fix historical aggregation when multiple input sets are used
#282: add tools to better understand which functions were patched
#280: fixes patches for sdpa_attention_forward for different versions of transformers
#278: implements onnx_generate_with_genai
#277: changes the serialization for all caches to reorder the model outputs (key_1, value_1, key_2, ...)
#276: implements onnx_generate, which provides the generate method for an onnx model
#275: fixes function patched_vmap
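
The output order mentioned in #277 (key_1, value_1, key_2, ...) can be illustrated with a minimal sketch. The helper name `interleave_cache` is hypothetical and is not taken from onnx-diagnostic; it only shows how per-layer keys and values would be interleaved, using plain Python lists in place of tensors.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def interleave_cache(keys: Sequence[T], values: Sequence[T]) -> List[T]:
    """Hypothetical helper: flatten per-layer keys and values
    in the order key_1, value_1, key_2, value_2, ..."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same number of layers")
    flat: List[T] = []
    for k, v in zip(keys, values):
        # One (key, value) pair per layer, kept adjacent in the output.
        flat.extend((k, v))
    return flat


if __name__ == "__main__":
    print(interleave_cache(["key_1", "key_2"], ["value_1", "value_2"]))
```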

24 Oct 21:27

xadupre

0.7.16

#273: enables export with FakeTensor
#272: makes patches work with FakeTensor
#270: add export sample code to export a specific model id with the appropriate inputs
#269: adds one unit test to track a patch fixing broadcast output shape
#267: patches sdpa_attention_forward because of a control flow (transformers>=5.0)
#266: makes patch_torch an integer in torch_export_patches to enable more patches

17 Oct 16:36

xadupre

0.7.15

#264: allows validating a model with inputs defined from another task
#261: first updates to support transformers>=5.0, not yet completed

10 Oct 16:58

sdpython

0.7.14

#257: patch to disable one exception in pytorch
#256: extract subfolder from modelid//subfolder
#252: adds new sets of inputs for task text-generation
#250: add variables to track sequence nodes
#249: patches _maybe_broadcast to support a corner case
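
The `modelid//subfolder` convention from #256 can be sketched as follows. This is an assumption about the general idea, not the library's actual code: the function name `split_model_id` and the example ids are hypothetical.

```python
from typing import Optional, Tuple


def split_model_id(model_id: str) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: split 'modelid//subfolder' into its two parts.

    Returns (model_id, None) when no non-empty subfolder is given.
    """
    repo, sep, subfolder = model_id.partition("//")
    if sep and subfolder:
        return repo, subfolder
    return repo, None


if __name__ == "__main__":
    # Example ids are made up for illustration.
    print(split_model_id("some-org/some-model//onnx"))
    print(split_model_id("some-org/some-model"))
```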

03 Oct 15:58

sdpython

0.7.13

#247: supports more gemma models with ModelBuilder
#246: add a set of inputs checking that a model works with an empty cache on task text-generation
#237: dummy inputs for google/gemma-3-4b-it (task image-text-to-text)
#244: add a patch to bypass the exception raised when the dynamic dimension is in {0,1}

26 Sep 16:48

sdpython

0.7.12

#232: fixes --patch argument so that --patch=0 works
#231: better statistics about fusions
#227: better support for model_id//pretrained, adds speed up when running command validate
#226: fix input order for models created with modelbuilder

19 Sep 17:25

sdpython

0.7.11

0.7.11

#224: support model_id with // to specify a subfolder
#223: adds task image-to-video
#220: adds option --ort-logs to display onnxruntime logs when creating the session
#220: adds a patch for PR huggingface/transformers#40791 in transformers

0.7.10

#218: patches sdpa_mask_recent_torch used from _vmap_for_bhqkv

0.7.9

#214: fix modelbuilder export
#213: use DYNAMIC on batch size

0.7.8

#210: add utilities to investigate models
#208: add a patch for Qwen3 (rewrite a loop)
