documentation (#150)

xadupre · web-flow · commit f3f167ad7887 · 2025-06-17T11:57:31.000+02:00
* documentation

* spelling
diff --git a/CHANGELOGS.rst b/CHANGELOGS.rst
@@ -4,6 +4,7 @@ Change Logs
 0.7.0
 +++++
 
+* :pr:`149`: support for StaticCache
 * :pr:`147`: simplified log processing
 * :pr:`146`: patch for IdeficsAttention, IdeficsEmbedding
 * :pr:`145`: patch for _compute_dynamic_ntk_parameters (Phi3RotaryEmbedding)
diff --git a/_doc/cmds/validate.rst b/_doc/cmds/validate.rst
@@ -121,3 +121,27 @@ of function :func:`onnx_diagnostic.torch_models.validate.run_ort_fusion`.
     from onnx_diagnostic._command_lines_parser import main
 
     main("validate -m arnir0/Tiny-LLM --run -v 1 --export onnx-dynamo -o dump_models --patch --opt ir --ortfusiontype ALL".split())
+
+SDPA or eager attention, or a StaticCache
++++++++++++++++++++++++++++++++++++++++++
+
+Add ``--mop cache_implementation=static --iop cls_cache=StaticCache`` to use a StaticCache instead of the default DynamicCache.
+Add ``--mop attn_implementation=eager`` to explicitly select the eager attention implementation.
+
+.. code-block:: bash
+
+    python -m onnx_diagnostic validate \
+                -m google/gemma-2b \
+                --run \
+                -v 1 \
+                --export custom \
+                -o dump_test \
+                --dtype float16 \
+                --device cpu \
+                --patch \
+                --no-quiet \
+                --opt default \
+                --rewrite \
+                --mop attn_implementation=eager \
+                --mop cache_implementation=static \
+                --iop cls_cache=StaticCache