Then we can run ModelOpt in the Docker pod, following the [TRT-LLM example](https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/models/core/deepseek_v3/README.md?plain=1).
Note that simply using the latest DeepSeek-V3.git is fine, because commit 1398800 has a dtype bug in the bias proto.
#### W4AFP8 for V3.1
The basic operation is the same as V3.
But there are two points to note:
1. Use `config_v3.1.json`, or add `"scale_fmt": "ue8m0"` to `config_671B.json`. `ue8m0` is a key setting because it was used in the training of V3.1.
2. Set `gemm_impl` to `fp8` (the default is `bf16`) to enable the ue8m0 quant kernel.
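
If `config_v3.1.json` is not available, the config edit in point 1 could be scripted roughly as follows. This is only a sketch: the helper names `with_ue8m0_scale_fmt` and `patch_config_file` are made up for illustration, and the file paths are assumed to point at wherever your checkout keeps its configs. It does not cover point 2 (`gemm_impl`).

```python
import json
from pathlib import Path


def with_ue8m0_scale_fmt(config: dict) -> dict:
    """Return a copy of a model config with "scale_fmt" set to "ue8m0".

    Hypothetical helper; the input dict is left unmodified.
    """
    patched = dict(config)
    patched["scale_fmt"] = "ue8m0"
    return patched


def patch_config_file(src: str, dst: str) -> None:
    """Read the JSON config at src, add scale_fmt, and write the result to dst."""
    config = json.loads(Path(src).read_text())
    patched = with_ue8m0_scale_fmt(config)
    Path(dst).write_text(json.dumps(patched, indent=4) + "\n")
```

For example, `patch_config_file("config_671B.json", "config_671B_ue8m0.json")` would write a patched copy under an assumed output name and leave the original file untouched.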