Commit a7fbfac: Update README.md
1 parent a95b81d
File tree: 1 file changed, +34 -1 lines

README.md

Lines changed: 34 additions & 1 deletion

# AutoFP8

Example model with static scales for activations and weights: https://huggingface.co/nm-testing/Meta-Llama-3-8B-Instruct-FP8

Command to produce it:
```bash
python quantize.py --model-id meta-llama/Meta-Llama-3-8B-Instruct --save-dir Meta-Llama-3-8B-Instruct-FP8
```
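
The resulting checkpoint can then be served by an inference engine that understands this format. As a hedged sketch (the model path and the `quantization="fp8"` argument are assumptions about a recent vLLM build, not something defined by this repo):

```python
# Usage sketch, assuming a vLLM build with FP8 support.
from vllm import LLM, SamplingParams

llm = LLM(model="Meta-Llama-3-8B-Instruct-FP8", quantization="fp8")
outputs = llm.generate(["What is FP8 quantization?"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```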

## Checkpoint structure

Here we detail the experimental structure of the FP8 checkpoints.

The following is added to `config.json`:
```json
"quantization_config": {
  "quant_method": "fp8",
  "activation_scheme": "static" or "dynamic"
},
```
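
A loader can branch on `activation_scheme` to decide whether to expect stored activation scales. A minimal sketch using only the standard library (the checkpoint path is illustrative):

```python
import json

# Inspect the quantization config of a saved checkpoint.
with open("Meta-Llama-3-8B-Instruct-FP8/config.json") as f:
    config = json.load(f)

quant_cfg = config["quantization_config"]
assert quant_cfg["quant_method"] == "fp8"

# Static checkpoints store a per-layer act_scale; dynamic ones do not.
expects_act_scale = quant_cfg["activation_scheme"] == "static"
print(f"scheme={quant_cfg['activation_scheme']}, act_scale stored: {expects_act_scale}")
```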

Each quantized layer in the state_dict will have:

If the config has `"activation_scheme": "static"`:
```
model.layers.0.mlp.down_proj.weight < F8_E4M3
model.layers.0.mlp.down_proj.act_scale < F32
model.layers.0.mlp.down_proj.weight_scale < F32
```
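
To make the roles of these tensors concrete, here is a sketch of per-tensor symmetric FP8 quantization, which is my reading of how `weight` and `weight_scale` relate (assumes PyTorch >= 2.1 for `torch.float8_e4m3fn`; not code from this repo):

```python
import torch

FP8_E4M3_MAX = 448.0  # largest finite value in float8_e4m3fn

def quantize_per_tensor(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Per-tensor symmetric quantization: returns (FP8 tensor, F32 scale)."""
    scale = x.abs().max().clamp(min=1e-12) / FP8_E4M3_MAX
    q = (x / scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX).to(torch.float8_e4m3fn)
    return q, scale.float()

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover an approximation of the original tensor.
    return q.float() * scale

w = torch.randn(4096, 4096)
w_fp8, weight_scale = quantize_per_tensor(w)  # what weight / weight_scale store
w_hat = dequantize(w_fp8, weight_scale)
print((w - w_hat).abs().max())  # quantization error bounded by the FP8 grid
```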

If the config has `"activation_scheme": "dynamic"`:
```
model.layers.0.mlp.down_proj.weight < F8_E4M3
model.layers.0.mlp.down_proj.weight_scale < F32
```
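
No `act_scale` is stored in the dynamic scheme because the activation scale is recomputed from each batch at runtime. A sketch of the difference, reusing the hypothetical `quantize_per_tensor` helper above:

```python
# Sketch only: `act_scale` stands in for the checkpoint's stored per-layer scale.

def quantize_activations_static(x: torch.Tensor, act_scale: torch.Tensor) -> torch.Tensor:
    # Static: reuse the calibrated scale stored in the checkpoint.
    return (x / act_scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX).to(torch.float8_e4m3fn)

def quantize_activations_dynamic(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    # Dynamic: derive the scale from the current batch, so nothing is stored.
    return quantize_per_tensor(x)
```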
