[FA] Add option to specify tuning parameters (#3293)
Add options for specifying tuning parameters. Users can override the
default parameters by passing a list of values to autotune over. For
example, passing `-BLOCK-M 64 32 128` means that these values of
`BLOCK_M` are used for autotuning.
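As a rough illustration of what "a list of values to autotune over" could mean, the sketch below expands per-parameter lists into candidate configurations. The commit message does not say how the lists are combined; the Cartesian product, the helper name `candidate_configs`, and the keys `num_stages` and `num_warps` are assumptions made for this example only.

```python
from itertools import product


def candidate_configs(block_m, block_n, stages, warps):
    """Hypothetical helper: build one candidate per combination of values.

    Assumption: the autotuner explores the full Cartesian product of the
    lists given on the command line. The real benchmark may combine them
    differently.
    """
    return [
        {"BLOCK_M": bm, "BLOCK_N": bn, "num_stages": s, "num_warps": w}
        for bm, bn, s, w in product(block_m, block_n, stages, warps)
    ]


# `-BLOCK-M 64 32 128` from the message above; the other lists are
# placeholder values chosen for illustration.
configs = candidate_configs([64, 32, 128], [64], [3], [8, 16])
```

With three `BLOCK_M` values, one `BLOCK_N` value, one stage count, and two warp counts, this sketch yields six candidates.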
Also split the options into two option groups, so the help string
looks something like:
```
usage: flash-attention [-h] -Z Z -H H -N-CTX N_CTX -D-HEAD D_HEAD [-causal] [-backward] [-BLOCK-M BLOCK_M [BLOCK_M ...]] [-BLOCK-N BLOCK_N [BLOCK_N ...]] [-stages STAGES [STAGES ...]] [-warps WARPS [WARPS ...]]

Run Intel XPU Flash-Attention implementation

options:
  -h, --help            show this help message and exit

Model description:
  Options setting different model metaparameters

  -Z Z                  Batch size
  -H H                  Head count
  -N-CTX N_CTX          Sequence length
  -D-HEAD D_HEAD        Embedding dimension
  -causal               Run causal attention
  -backward             Run backward attention

Tuning configuration:
  Options setting different tuning parameters

  -BLOCK-M BLOCK_M [BLOCK_M ...]
                        Sizes of M
  -BLOCK-N BLOCK_N [BLOCK_N ...]
                        Sizes of N
  -stages STAGES [STAGES ...]
                        Numbers of stages
  -warps WARPS [WARPS ...]
                        Numbers of warps
```
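A help string with this shape can be produced with `argparse` argument groups and `nargs="+"`. The following is a minimal sketch reconstructed from the help text only, not the code in this commit: the function name `build_parser`, the `int` types, and `default=None` (standing for "fall back to the built-in tuning values") are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the help output shown above."""
    parser = argparse.ArgumentParser(
        prog="flash-attention",
        description="Run Intel XPU Flash-Attention implementation",
    )

    # First group: model shape and mode.
    model = parser.add_argument_group(
        "Model description", "Options setting different model metaparameters"
    )
    model.add_argument("-Z", type=int, required=True, help="Batch size")
    model.add_argument("-H", type=int, required=True, help="Head count")
    model.add_argument("-N-CTX", type=int, required=True, help="Sequence length")
    model.add_argument("-D-HEAD", type=int, required=True, help="Embedding dimension")
    model.add_argument("-causal", action="store_true", help="Run causal attention")
    model.add_argument("-backward", action="store_true", help="Run backward attention")

    # Second group: each option takes one or more values to autotune over.
    # Assumption: None means "use the default tuning values".
    tuning = parser.add_argument_group(
        "Tuning configuration", "Options setting different tuning parameters"
    )
    tuning.add_argument("-BLOCK-M", type=int, nargs="+", default=None, help="Sizes of M")
    tuning.add_argument("-BLOCK-N", type=int, nargs="+", default=None, help="Sizes of N")
    tuning.add_argument("-stages", type=int, nargs="+", default=None, help="Numbers of stages")
    tuning.add_argument("-warps", type=int, nargs="+", default=None, help="Numbers of warps")
    return parser


# Model sizes here are placeholder values; `-BLOCK-M 64 32 128` is the
# example from the commit message.
args = build_parser().parse_args(
    ["-Z", "1", "-H", "16", "-N-CTX", "1024", "-D-HEAD", "64",
     "-BLOCK-M", "64", "32", "128"]
)
```

Since the options use a single dash, `argparse` derives the attribute names by stripping the dash and replacing inner dashes with underscores, so the values land in `args.N_CTX`, `args.D_HEAD`, `args.BLOCK_M`, and so on.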
---------
Signed-off-by: victor-eds <[email protected]>