Float8 (FP8) is an 8-bit floating point data type, which is used to reduce memory footprint.
Two formats are used in FP8 training and inference to meet the value range and precision requirements of activations, weights, and gradients in Deep Neural Networks (DNNs): E4M3 (sign-exponent-mantissa) for activations and weights, and E5M2 for gradients. Both formats are defined in [FP8 FORMATS FOR DEEP LEARNING](https://arxiv.org/pdf/2209.05433.pdf).
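As an illustration of the two layouts, stock PyTorch (2.1 and later) exposes them as the `torch.float8_e4m3fn` and `torch.float8_e5m2` dtypes; the sketch below only inspects their numeric ranges and is separate from the extension's FP8 API shown later:

```python
import torch

# E4M3: 4 exponent bits, 3 mantissa bits -- more precision, narrower range,
# suited to activations and weights
print(torch.finfo(torch.float8_e4m3fn).max)   # 448.0
# E5M2: 5 exponent bits, 2 mantissa bits -- less precision, wider range,
# suited to gradients
print(torch.finfo(torch.float8_e5m2).max)     # 57344.0
```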
At the current stage, the FP8 data type is used for memory storage only; it is converted to the BFloat16 data type for computation.
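A minimal sketch of this storage-only pattern, again using the upstream float8 dtypes rather than the extension's internal conversion machinery:

```python
import torch

x = torch.randn(4, 4)
x_fp8 = x.to(torch.float8_e4m3fn)     # stored in 8 bits per element
x_bf16 = x_fp8.to(torch.bfloat16)     # widened before any arithmetic
y = x_bf16 @ x_bf16.t()               # the matmul itself runs in BFloat16
print(x_fp8.element_size(), y.dtype)  # 1, torch.bfloat16
```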
## FP8 Quantization
On GPU, online Dynamic Quantization is used for FP8 data compression and decompression, and the Delayed Scaling algorithm is used to accelerate the quantization process.
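The document does not spell out the recipe's internals; as a rough sketch of the delayed-scaling idea (hypothetical helper names, with 448.0 as the E4M3 maximum), the scale for the current step is derived from the amax history of previous steps, so quantization does not have to wait for the current tensor's statistics:

```python
import torch

E4M3_MAX = 448.0  # largest representable E4M3 value

def delayed_scale(amax_history, margin=0):
    """Hypothetical sketch: derive the scale from *past* amax values."""
    amax = torch.stack(amax_history).max()
    return E4M3_MAX / (amax * 2.0 ** margin)

amax_history = [torch.tensor(3.1), torch.tensor(2.7)]  # from earlier steps
x = torch.randn(16)

scale = delayed_scale(amax_history)      # no dependency on x itself
x_scaled = (x * scale).clamp(-E4M3_MAX, E4M3_MAX)
x_fp8 = x_scaled.to(torch.float8_e4m3fn) # quantize
x_deq = x_fp8.to(torch.float32) / scale  # dequantize
amax_history.append(x.abs().max())       # record stats for future steps
```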
The usage example below applies FP8 through the extension's `fp8_autocast` context manager and converts the model with `convert_fp8_model` (it assumes a BERT-style `model` and its input tensors have already been prepared):

```python
from intel_extension_for_pytorch.xpu.fp8.fp8 import fp8_autocast
from intel_extension_for_pytorch.xpu.fp8.recipe import DelayedScaling
from intel_extension_for_pytorch.nn.utils._fp8_convert import convert_fp8_model

# 'fp8_autocast' is the handler of FP8
with fp8_autocast(enabled=True, fp8_recipe=DelayedScaling()):
    # The original model is automatically converted to a new model
    # with FP8 operators by 'convert_fp8_model'
    convert_fp8_model(model)
    outputs = model(input_ids=input_ids,
                    token_type_ids=segment_ids,
                    attention_mask=input_mask,
                    labels=masked_lm_labels,
                    next_sentence_label=next_sentence_labels)
```