This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Commit ed702af

Create cuda-32.json
Add a CUDA int4 quantization config with groupsize 32 (gs=32) for use with stories15M.
1 parent 79c4a23 commit ed702af

1 file changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+"executor": {"accelerator": "cuda"},
+"precision": {"dtype": "bf16"},
+"linear:int4": {"groupsize" : 32}
+}
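
The commit message says the group size of 32 is chosen for stories15M but does not say why. One plausible reading is that group-wise int4 quantization needs each linear layer's input width to divide evenly into groups. The sketch below parses the config from this commit and checks that condition. The layer widths used (288 and 768) are an assumption taken from the commonly published stories15M hyperparameters, not from this commit, and `groups_per_row` is a hypothetical helper, not part of the repository's API.

```python
import json

# The exact JSON added by this commit.
CONFIG_TEXT = """
{
"executor": {"accelerator": "cuda"},
"precision": {"dtype": "bf16"},
"linear:int4": {"groupsize" : 32}
}
"""

# ASSUMPTION: stories15M layer input widths (embedding dim and FFN hidden dim).
# These values are not stated in the commit.
ASSUMED_STORIES15M_WIDTHS = (288, 768)


def groups_per_row(in_features: int, groupsize: int) -> int:
    """Hypothetical helper: number of quantization groups in one weight row.

    Raises ValueError if the row cannot be split into whole groups.
    """
    if groupsize <= 0:
        raise ValueError("groupsize must be positive")
    if in_features % groupsize != 0:
        raise ValueError(
            f"in_features={in_features} is not divisible by groupsize={groupsize}"
        )
    return in_features // groupsize


config = json.loads(CONFIG_TEXT)
groupsize = config["linear:int4"]["groupsize"]

for width in ASSUMED_STORIES15M_WIDTHS:
    print(width, "->", groups_per_row(width, groupsize), "groups")
```

Under the assumed widths, a group size of 32 divides both 288 and 768 evenly, whereas a larger group size such as 64 would not divide 288, which may be the motivation for a dedicated `cuda-32.json`.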
