Commit f3f167a

documentation (#150)
* documentation * spelling
1 parent 1f0ca98 commit f3f167a

2 files changed: 25 additions, 0 deletions

CHANGELOGS.rst

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ Change Logs
 0.7.0
 +++++
 
+* :pr:`149`: supports for StaticCache
 * :pr:`147`: simplified log processing
 * :pr:`146`: patch for IdeficsAttention, IdeficsEmbedding
 * :pr:`145`: patch for _compute_dynamic_ntk_parameters (Phi3RotaryEmbedding)

_doc/cmds/validate.rst

Lines changed: 24 additions & 0 deletions
@@ -121,3 +121,27 @@ of function :func:`onnx_diagnostic.torch_models.validate.run_ort_fusion`.
     from onnx_diagnostic._command_lines_parser import main
 
     main("validate -m arnir0/Tiny-LLM --run -v 1 --export onnx-dynamo -o dump_models --patch --opt ir --ortfusiontype ALL".split())
+
+Sdpa or Eager implementation or Use a StaticCache
++++++++++++++++++++++++++++++++++++++++++++++++++
+
+Add ``--mop cache_implementation=static --iop cls_cache=StaticCache`` to use a StaticCache instead of a DynamicCache (default).
+Add ``--mop attn_implementation=eager`` to explicitly select eager implementation for attention.
+
+.. code-block:: bash
+
+    python -m onnx_diagnostic validate \
+        -m google/gemma-2b \
+        --run \
+        -v 1 \
+        --export custom \
+        -o dump_test \
+        --dtype float16 \
+        --device cpu \
+        --patch \
+        --no-quiet \
+        --opt default \
+        --rewrite \
+        --mop attn_implementation=eager \
+        --mop cache_implementation=static \
+        --iop cls_cache=StaticCache
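
The new section passes each model option as a repeated ``--mop key=value`` flag and each input option as ``--iop key=value``. As a minimal sketch of the same invocation from Python, the snippet below assembles that argument list from dictionaries. The helper ``build_validate_args`` is hypothetical and not part of onnx_diagnostic; only the flags shown in the diff are used, and the final call to ``main`` is left commented out because it assumes the package is installed and follows the ``main("...".split())`` pattern visible in the context lines of the diff.

```python
import shlex


def build_validate_args(model_id, mop=None, iop=None, extra=()):
    """Hypothetical helper: build the argument list for the validate command.

    mop and iop are dictionaries rendered as repeated
    ``--mop key=value`` and ``--iop key=value`` flags.
    """
    args = ["validate", "-m", model_id, *extra]
    for key, value in (mop or {}).items():
        args += ["--mop", f"{key}={value}"]
    for key, value in (iop or {}).items():
        args += ["--iop", f"{key}={value}"]
    return args


args = build_validate_args(
    "google/gemma-2b",
    # eager attention and a static cache, as in the documented example
    mop={"attn_implementation": "eager", "cache_implementation": "static"},
    iop={"cls_cache": "StaticCache"},
    extra=["--run", "-v", "1", "--export", "custom", "-o", "dump_test", "--patch"],
)

# Show the equivalent shell command line (without the "python -m onnx_diagnostic" prefix).
print(shlex.join(args))

# Assumption: with onnx_diagnostic installed, the list can be passed to main,
# mirroring the pattern in the existing documentation:
# from onnx_diagnostic._command_lines_parser import main
# main(args)
```

Keeping the options in dictionaries makes it easy to toggle a single setting, for example dropping ``cache_implementation`` and ``cls_cache`` to fall back to the default DynamicCache.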
