2 files changed, +25 -0 lines changed
@@ -4,6 +4,7 @@ Change Logs
 0.7.0
 +++++
 
+* :pr:`149`: supports for StaticCache
 * :pr:`147`: simplified log processing
 * :pr:`146`: patch for IdeficsAttention, IdeficsEmbedding
 * :pr:`145`: patch for _compute_dynamic_ntk_parameters (Phi3RotaryEmbedding)
@@ -121,3 +121,27 @@ of function :func:`onnx_diagnostic.torch_models.validate.run_ort_fusion`.
     from onnx_diagnostic._command_lines_parser import main
 
     main("validate -m arnir0/Tiny-LLM --run -v 1 --export onnx-dynamo -o dump_models --patch --opt ir --ortfusiontype ALL".split())
+
+Sdpa or Eager implementation or Use a StaticCache
+++++++++++++++++++++++++++++++++++++++++++++++++++
+
+Add ``--mop cache_implementation=static --iop cls_cache=StaticCache`` to use a StaticCache instead of a DynamicCache (default).
+Add ``--mop attn_implementation=eager`` to explicitly select eager implementation for attention.
+
+.. code-block:: bash
+
+    python -m onnx_diagnostic validate \
+        -m google/gemma-2b \
+        --run \
+        -v 1 \
+        --export custom \
+        -o dump_test \
+        --dtype float16 \
+        --device cpu \
+        --patch \
+        --no-quiet \
+        --opt default \
+        --rewrite \
+        --mop attn_implementation=eager \
+        --mop cache_implementation=static \
+        --iop cls_cache=StaticCache
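The same invocation can probably be driven from Python, following the ``main("...".split())`` pattern used earlier in this file. The sketch below only assembles the argument list; ``build_validate_args`` is a hypothetical helper introduced for illustration, not part of onnx_diagnostic. All flags and values are copied from the bash example above, and the final call is left commented out because it needs the package and the model to be available.

```python
def build_validate_args(model_id, mop=None, iop=None):
    """Hypothetical helper: build the argument list for the ``validate`` command.

    ``mop`` and ``iop`` are dicts turned into repeated ``--mop key=value`` and
    ``--iop key=value`` options, in insertion order.
    """
    args = [
        "validate",
        "-m", model_id,
        "--run",
        "-v", "1",
        "--export", "custom",
        "-o", "dump_test",
        "--dtype", "float16",
        "--device", "cpu",
        "--patch",
        "--no-quiet",
        "--opt", "default",
        "--rewrite",
    ]
    for key, value in (mop or {}).items():
        args += ["--mop", f"{key}={value}"]
    for key, value in (iop or {}).items():
        args += ["--iop", f"{key}={value}"]
    return args


args = build_validate_args(
    "google/gemma-2b",
    # eager attention plus a static cache, as in the bash example
    mop={"attn_implementation": "eager", "cache_implementation": "static"},
    iop={"cls_cache": "StaticCache"},
)
print(" ".join(args))

# To run the validation itself (requires onnx_diagnostic and the model):
# from onnx_diagnostic._command_lines_parser import main
# main(args)
```

Dropping the ``cache_implementation`` and ``cls_cache`` entries from the dictionaries should fall back to the default DynamicCache described above.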