Update docs/source/en/optimization/para_attn.md

chengzeyi · stevhliu · web-flow · commit 8e9ba97c2213 · 2025-01-16T11:06:38.000+08:00
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
diff --git a/docs/source/en/optimization/para_attn.md b/docs/source/en/optimization/para_attn.md
@@ -140,7 +140,7 @@ First Block Cache reduced the inference speed to 2271.06 seconds compared to the
 </hfoption>
 </hfoptions>
 
-### FP8 Quantization
+## fp8 quantization
 
 fp8 with dynamic quantization further speeds up inference and reduces memory usage. Both the activations and weights must be quantized in order to use the 8-bit [NVIDIA Tensor Cores](https://www.nvidia.com/en-us/data-center/tensor-cores/).