From 9f4824576f42109736a5df963c3174f56018f998 Mon Sep 17 00:00:00 2001 From: "Sidorov, Dmitry" Date: Fri, 24 Oct 2025 12:03:16 -0700 Subject: [PATCH 1/2] [DOC][SPIR-V] Add mini-float extensions Signed-off-by: Sidorov, Dmitry --- .../SPV_INTEL_float4.asciidoc | 258 ++++++++++++++ .../SPV_INTEL_fp_conversions.asciidoc | 322 ++++++++++++++++++ .../mini_float_conversions_env.asciidoc | 145 ++++++++ 3 files changed, 725 insertions(+) create mode 100644 sycl/doc/design/spirv-extensions/SPV_INTEL_float4.asciidoc create mode 100644 sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc create mode 100644 sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc diff --git a/sycl/doc/design/spirv-extensions/SPV_INTEL_float4.asciidoc b/sycl/doc/design/spirv-extensions/SPV_INTEL_float4.asciidoc new file mode 100644 index 000000000000..0ea88f7bdb13 --- /dev/null +++ b/sycl/doc/design/spirv-extensions/SPV_INTEL_float4.asciidoc @@ -0,0 +1,258 @@ +:extension_name: SPV_INTEL_float4 + +:hf4_capability_name: Float4E2M1INTEL +:hf4_capability_token: 6212 +:hf4_matrix_capability_name: Float4E2M1CooperativeMatrixINTEL +:hf4_matrix_capability_token: 6213 +:hf4_encoding: 6214 + +:khr_matrix_capability_name: CooperativeMatrixKHR + +:joint_matrix_url: https://github.com/intel/llvm/tree/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_joint_matrix.asciidoc +:fp_conv_url: https://github.com/intel/llvm/tree/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc +:coop_matrix_url: https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_cooperative_matrix.html +:bfloat16_url: https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_bfloat16.html +:fp8_url: https://github.khronos.org/SPIRV-Registry/extensions/EXT/SPV_EXT_float8.html + +{extension_name} +================ + + +== Name Strings + +{extension_name} + +== Contributors + +- Dmitry Sidorov, Intel + +- Victor Mustya, Intel + +- Ben Ashbaugh, Intel + +- Dounia Khaldi, Intel + 
+- Joe Garvey, Intel + +- Greg Lueck, Intel + +- Pawel Jurek, Intel + + +Notice +------ + +Copyright (c) 2025 Intel Corporation. All rights reserved. + +Status +------ + +* Working Draft + +This is a preview extension specification, intended to provide early access to +a feature for review and community feedback. When the feature matures, this +specification may be released as a formal extension. + +Because the interfaces defined by this specification are not final and are +subject to change, they are not intended to be used by shipping software +products. If you are interested in using this feature in your software product, +please let us know! + +== Version + +[width="40%",cols="25,25"] +|======================================== +| Last Modified Date | 2025-10-24 +| Revision | 2 +|======================================== + +== Dependencies + +This extension is written against the SPIR-V Specification, +Version 1.6 Revision 4. + +This extension interacts with {coop_matrix_url}[*SPV_KHR_cooperative_matrix*] extension. + +This extension interacts with {joint_matrix_url}[*SPV_INTEL_joint_matrix*] extension. + +This extension interacts with {bfloat16_url}[*SPV_KHR_bfloat16*] extension. + +This extension interacts with {fp8_url}[*SPV_EXT_float8*] extension. + +This extension interacts with {fp_conv_url}[*SPV_INTEL_fp_conversions*] extension. + +This extension requires SPIR-V 1.0. + +Overview +-------- + +This extension extends the *OpTypeFloat* instruction to enable the definition of the `FP4E2M1` +floating-point format, which has one sign bit, two exponent bits and one mantissa bit. + +The `FP4E2M1` special values are defined by the table below. 
+ +[options="header"] +[width="80%"] +[cols="1,2"] +|==== +| ^| `FP4E2M1` +| Exponent Bias | 1 +| Max normal +| S.11.1 = 6.0 (1.5 * 2^2^) + +| Min normal +| S.01.0 = 1.0 (1.0 * 2^0^) + +| Max subnormal +| S.00.1 = 0.5 (0.5 * 2^0^) + +| Min subnormal +| S.00.1 = 0.5 (0.5 * 2^0^) + +| Infinity | N/A +| NaN | N/A + +|==== + +== Modifications to the SPIR-V Specification, Version 1.6 + +Binary Form +~~~~~~~~~~~ + +FP Encoding +~~~~~~~~~~~ + +Add a new enum: + +-- +[cols="^2,14,2,4",options="header",width = "100%"] +|==== +2+^.^| FP Encoding | Width(s) | Enabling Capabilities +| {hf4_encoding} | *Float4E2M1INTEL* + +The floating point type is encoded as a 4-bit float type. +This is encoded with the following encoding parameters: + + + - _bias_ is 1 + + + - _sign bit_ is 1 + + + - _w_ (exponent) is 2 + + + - _t_ (significand) is 1 + + + - _k_ (width) is 4 +| 4 | *Float4E2M1INTEL* + +|==== +-- + +=== Capabilities + +Modify Section 3.31, Capability, adding rows to the Capability table: + +-- +[options="header"] +|==== +2+^| Capability ^| Implicitly Declares +| {hf4_capability_token} | *{hf4_capability_name}* + +Uses *Float4E2M1INTEL* floating-point encoding. + +| +| {hf4_matrix_capability_token} | *{hf4_matrix_capability_name}* | *{khr_matrix_capability_name}* +|==== +-- + +=== Memory Layout + +Add the following 4-bit `FP4E2M1` layout to Section 2.18.1, Memory Layout: + +Scalar floating point variables with a `Width` of 4 can only be declared in the `Private` or `Function` storage classes. +In other storage classes, they must be included in an `OpTypeVector` with an even `Component Count`, where the first component in every pair is in bits 0-3 of the corresponding byte, and the second component is in bits 4-7. + +=== Instructions + +==== 3.42.11. 
Conversion Instructions + +* Add the following paragraphs to *OpFConvert*: + + +When converting to floating-point values with the *Float4E2M1INTEL* encoding, out-of-range +values and infinities are converted to the largest representable finite value with a matching sign. +Conversion from NaNs is implementation-defined. + + + + +==== 3.49.6. Type-Declaration Instructions + +Add the following requirement to *OpTypeCooperativeMatrixKHR*: + +If _Component Type_ has a *Float4E2M1INTEL* encoding then *{hf4_matrix_capability_name}* must be declared. + +Validation Rules +~~~~~~~~~~~~~~~~ + +Add the following bullets to section 2.16.1, Universal Validation Rules: + + * Variables with a type that is or includes a floating-point type with the *Float4E2M1INTEL* encoding must only be used with the following instructions: + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_miscellaneous_instructions[Miscellaneous Instructions] : + *** OpUndef + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_constant_creation_instructions[Constant Creation Instructions] : + *** OpConstant + *** OpConstantNull + *** OpConstantOp + *** OpConstantComposite + *** OpConstantCompositeContinuedINTEL + *** OpCooperativeMatrixConstructCheckedINTEL + *** OpSpecConstant + *** OpSpecConstantOp + *** OpSpecConstantComposite + *** OpSpecConstantCompositeContinuedINTEL + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_arithmetic_instructions[Arithmetic Instructions] : + *** OpCooperativeMatrixMulAddKHR + *** OpCooperativeMatrixMulAddScaledINTEL + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_composite_instructions[Composite Instructions] : + *** OpVectorExtractDynamic + *** OpVectorInsertDynamic + *** OpVectorShuffle + *** OpCompositeConstruct + *** OpCompositeExtract + *** OpCompositeInsert + *** OpCopyObject + *** OpCopyLogical + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_memory_instructions[Memory Instructions] : + 
*** OpPtrEqual + *** OpPtrNotEqual + *** OpPtrDiff + *** OpCooperativeMatrixLoadKHR + *** OpCooperativeMatrixStoreKHR + *** OpCooperativeMatrixLoadCheckedINTEL + *** OpCooperativeMatrixStoreCheckedINTEL + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_function_instructions[Function Instructions] : + *** OpFunction + *** OpFunctionParameter + *** OpFunctionCall + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_conversion_instructions[Conversion Instructions] : + *** OpConvertSToF + *** OpFConvert + *** OpConvertPtrToU + *** OpConvertUToPtr + *** OpPtrCastToGeneric + *** OpGenericCastToPtr + *** OpGenericCastToPtrExplicit + *** OpBitcast + *** OpClampConvertFToFINTEL + *** OpBiasedRoundFToFINTEL + *** OpClampBiasedRoundFToFINTEL + *** OpBiasedRoundFToSINTEL + ** https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_control_flow_instructions[Control-Flow Instructions] : + *** OpReturnValue + *** OpSelect + *** OpPhi + *** OpLifetimeStart + *** OpLifetimeStop + +=== Issues + +- + +Revision History +---------------- + +[cols="5,15,15,70"] +[grid="rows"] +[options="header"] +|======================================== +|Rev|Date|Author|Changes +|1|2024-06-15|Dmitry Sidorov|Initial revision +|2|2025-10-24|Dmitry Sidorov|Prepare to publish +|======================================== diff --git a/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc b/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc new file mode 100644 index 000000000000..4093fa813948 --- /dev/null +++ b/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc @@ -0,0 +1,322 @@ +:extension_name: SPV_INTEL_fp_conversions + +:convert_capability_name: FloatConversionsINTEL +:convert_capability_token: 6215 +:OpClampConvertFToFINTEL_token: 6216 +:OpClampConvertFToSINTEL_token: 6424 +:OpStochasticRoundFToFINTEL_token: 6217 +:OpClampStochasticRoundFToFINTEL_token: 6218 +:OpClampStochasticRoundFToSINTEL_token: 6219 + 
+:coop_matrix_url: https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_cooperative_matrix.html +:bfloat16_url: https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_bfloat16.html +:fp8_url: https://github.khronos.org/SPIRV-Registry/extensions/EXT/SPV_EXT_float8.html +:fp4_url: https://github.com/intel/llvm/tree/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_float4.asciidoc + +{extension_name} +================ + + +== Name Strings + +{extension_name} + +== Contributors + +- Dmitry Sidorov, Intel + +- Victor Mustya, Intel + +- Ben Ashbaugh, Intel + +- Dounia Khaldi, Intel + +- Joe Garvey, Intel + +- Greg Lueck, Intel + +- Pawel Jurek, Intel + +- John Lu, Intel + +- Eyal Radiano, Intel + +- Mateusz Garbowski, Intel + + +Notice +------ + +Copyright (c) 2024 Intel Corporation. All rights reserved. + +Status +------ + +* Working Draft + +This is a preview extension specification, intended to provide early access to +a feature for review and community feedback. When the feature matures, this +specification may be released as a formal extension. + +Because the interfaces defined by this specification are not final and are +subject to change they are not intended to be used by shipping software +products. If you are interested in using this feature in your software product, +please let us know! + +== Version + +[width="40%",cols="25,25"] +|======================================== +| Last Modified Date | 2025-10-24 +| Revision | 2 +|======================================== + +== Dependencies + +This extension is written against the SPIR-V Specification, +Version 1.6 Revision 4. + +This extension interacts with {coop_matrix_url}[*SPV_KHR_cooperative_matrix*] extension. + +This extension interacts with {bfloat16_url}[*SPV_KHR_bfloat16*] extension. + +This extension interacts with {fp8_url}[*SPV_EXT_float8*] extension. + +This extension interacts with {fp4_url}[*SPV_INTEL_float4*] extension. + +This extension requires SPIR-V 1.0. 
+ +Overview +-------- + +== Modifications to the SPIR-V Specification, Version 1.6 + +=== Capabilities + +Modify Section 3.31, Capability, adding rows to the Capability table: + +-- +[options="header"] +|==== +2+^| Capability ^| Implicitly Declares +| {convert_capability_token} | *{convert_capability_name}* + +Uses *OpClampConvertFToFINTEL*, *OpStochasticRoundFToFINTEL*, *OpClampStochasticRoundFToFINTEL* and *OpStochasticRoundFToSINTEL* +instructions. + +| +|==== +-- + +=== Instructions + +==== 3.42.11. Conversion Instructions + +[cols="1a,1,3*3",width="100%"] +|===== +4+|[[OpClampConvertFToFINTEL]]*OpClampConvertFToFINTEL* + + + +Converts numerically one floating point value to another. +In case of overflow, the positive result clamps to maximum normal value. +The negative result clamps to lowest negative normal value, which is equal to +maximum normal value multiplied by -1. + + + +_Result Type_ is the type of the converted object, it must be a scalar or +vector of _float type_. + + + +_Value_ must be a scalar or vector of _float type_. It must have a wider range +than the _Result Type_ and it must have the same number of components as 'Result Type'. + + + +Results are computed per component. + + + +1+|Capability: + +*{convert_capability_name}* +1+| 4 | {OpClampConvertFToFINTEL_token} +| __ + +_Result Type_ +| _Result _ +| __ + +_Value_ +|===== + +[cols="1a,1,3*3",width="100%"] +|===== +4+|[[OpClampConvertFToSINTEL]]*OpClampConvertFToSINTEL* + + + +Converts numerically a floating point value to integer. +In case of overflow, the positive result is saturated to INT_MAX or INT_MIN depending on a sign bit. + + + +_Result Type_ is the type of the converted object, it must be a scalar or +vector of _integer type_. + + + +_Value_ must be a scalar or vector of _float type_. +It must have the same number of components as 'Result Type'. + + + +Results are computed per component. 
+ + + +1+|Capability: + +*{convert_capability_name}* +1+| 4 | {OpClampConvertFToSINTEL_token} +| __ + +_Result Type_ +| _Result _ +| __ + +_Value_ +|===== + +[cols="1a,1,5*",width="100%"] +|===== +6+|[[OpStochasticRoundFToFINTEL]]*OpStochasticRoundFToFINTEL* + + + +Converts numerically one floating point value to another using stochastic rounding. + + +Stochastic rounding is performed by adding a pseudo-random bias value to the mantissa +of the converted value as follows. The bias is first added to the mantissa of the converted value. +If this causes the mantissa to overflow, the the exponent of the converted value +is increased by 1 and the mantissa bits are shifted right. The value is then converted +to the _Result Type_, rounding towards zero. If the exponent overflows when converting +to the _Result Type_, the result of the conversion is +/- Inf. If _Result Type_ doesn't have +Inf representation, then in case of overflow the result saturates to max normal value representable +by the type preserving the sign. + + + +As described above, each input requires a bias value in order to perform the conversion. +These bias values are generated by executing an implementation-defined algorithm +that produces pseudo-random values that uses _Seed_ as a starting point. This algorithm is +guaranteed to produce repeatable bias values when the same value is passed for _Seed_. + + + +The instruction also returns a value in _Next Seed_, which client code can use to generate +good quality random biases. If the client intends to call *OpStochasticRoundFToFINTEL* +again from the same kernel invocation, it should use this value as a new seed that it +passes as _Seed_ in that next call. + + + +_Result Type_ is the type of the converted object, it must be a scalar or +vector of _float type_. + + + +_Value_ must be a scalar or vector of _float type_. It must have a wider range +than the _Result Type_ and it must have the same number of components as 'Result Type'. 
+ + + +_Seed_ must be a 32-bit scalar _integer type_. + + + +_Next Seed_ must be of a _pointer type_ with *Function* storage class and 32-bit scalar _integer_ element type. + + + +Results are computed per component. + + + + +1+|Capability: + +*{convert_capability_name}* +1+| 4+ | {OpStochasticRoundFToFINTEL_token} +| __ + +_Result Type_ +| _Result _ +| __ + +_Value_ +| __ + +_Seed_ +| Optional __ + +_Next Seed_ +|===== + + +[cols="1a,1,5*3",width="100%"] +|===== +6+|[[OpClampStochasticRoundFToFINTEL]]*OpClampStochasticRoundFToFINTEL* + + + +Has the same semantics as *OpStochasticRoundFToFINTEL*, with an addition, that +in case of overflow, the positive result clamps to maximum normal value. +The negative result clamps to lowest negative normal value, which is equal to +maximum normal value multiplied by -1. + +This instruction may be used for stochastic rounding operation, if a producer passes +pseudo-random _Seed_ value. + + + +_Result Type_ is the type of the converted object, it must be a scalar or +vector of _float type_. + + + +_Value_ must be a scalar or vector of _float type_. It must have a wider range +than the _Result Type_ and it must have the same number of components as 'Result Type'. + + + +_Seed_ must be a 32-bit scalar _integer type_. + + + +_Next Seed_ must be of a _pointer type_ with *Function* storage class and 32-bit scalar _integer_ element type. + + + +Results are computed per component. + + + +1+|Capability: + +*{convert_capability_name}* +1+| 5+ | {OpClampStochasticRoundFToFINTEL_token} +| __ + +_Result Type_ +| _Result _ +| __ + +_Value_ +| __ + +_Seed_ +| Optional __ + +_Next Seed_ +|===== + + +[cols="1a,1,5*3",width="100%"] +|===== +6+|[[OpClampStochasticRoundFToSINTEL]]*OpStochasticRoundFToSINTEL* + + + +Converts a floating point value to integer using stochastic rounding. +Has the same semantics as *OpStochasticRoundFToFINTEL*. +In case of overflow, the positive result is saturated to INT_MAX or INT_MIN depending on a sign bit. 
+This instruction may be used for stochastic rounding operation, if a producer +passes pseudo-random _Seed_ value. + + +_Result Type_ is the type of the converted object, it must be a scalar or +vector of _integer type_. + + +_Value_ must be a scalar or vector of _float type_. It must have a wider range +than the _Result Type_ and it must have the same number of components as 'Result Type'. + + +_Seed_ must be a 32-bit scalar _integer type_. + + +Results are computed per component. + + +_Next Seed_ must be of a _pointer type_ with *Function* storage class and 32-bit scalar _integer_ element type. + + +1+|Capability: + +*{convert_capability_name}* +1+| 4+ | {OpStochasticRoundFToSINTEL_token} +| __ + +_Result Type_ +| _Result _ +| __ + +_Value_ +| __ + +_Seed_ +| Optional __ + +_Next Seed_ +|===== + + +Validation Rules +~~~~~~~~~~~~~~~~ + +Add the following bullets to section 2.16.11, Universal Validation Rules: + + * Variables with a type that is or includes a floating-point type with the *BFloat16KHR*, *Float8E4M3EXT* and *Float8E5M2EXT* encodings can also be used with the following instructions: + ** *OpClampConvertFToFINTEL* + + * Variables with a type that is or includes a floating-point type with the *BFloat16KHR*, *Float8E5M2EXT* and *Float4E2M1INTEL* encodings can also be used with the following instructions: + ** *OpStochasticRoundFToFINTEL* + ** *OpClampStochasticRoundFToFINTEL* + + * Variables with a type that is or includes a floating-point type with the *BFloat16KHR* encoding can also be used with the following instructions: + ** *OpClampConvertFToSINTEL* and *OpClampStochasticRoundFToSINTEL* + + +== Interactions with SPV_KHR_cooperative_matrix + +When the *CooperativeMatrixKHR* capability is declared, a _cooperative matrix_ may be converted +using the instructions added by this extension. 
+ +If _Value_ is a _cooperative matrix_, then the _Result Type_ must be a _cooperative matrix type_ +with the same _Rows_, _Columns_, _Scope_ and _Use_ operands. The _Seed_ operand can be non-uniform; all +other operands to these instructions must be dynamically +uniform within every instance of the _Scope_ of the _cooperative matrix_. + + +=== Issues + +- + +Revision History +---------------- + +[cols="5,15,15,70"] +[grid="rows"] +[options="header"] +|======================================== +|Rev|Date|Author|Changes +|1|2024-06-15|Dmitry Sidorov|Initial revision +|2|2025-10-24|Dmitry Sidorov|Prepare to publish +|======================================== diff --git a/sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc b/sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc new file mode 100644 index 000000000000..10bc39e3012d --- /dev/null +++ b/sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc @@ -0,0 +1,145 @@ +Mini-float Types and Conversions Environment Specification +========================================================== + +This document provides a list of supported conversions for types and instructions +added in *SPV_EXT_float8*, *SPV_INTEL_int4*, *SPV_INTEL_float4* and *SPV_INTEL_fp_conversions* extensions +in Level-Zero and OpenCL Environments for Intel platforms. + + +Conversion from NaNs to `Float4E2M1INTEL` +----------------------------------------- + +NaNs are converted to the largest representable finite value with a matching sign. + +Float to float conversions via OpFConvert +----------------------------------------- + +Conversions to *OpTypeFloat* with *Float4E2M1INTEL* encoding are performed with the +round to the nearest even (RTE) mode by default. It is illegal to put any *FPRoundingMode* +decoration other than *RTE* on the instruction in these cases. The *RoundingModeRTZ* +execution mode has no effect on these conversions. 
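As an illustration only (this sketch is not part of the patch), the conversion rules stated above can be modelled in a few lines of Python. The function names `decode_e2m1` and `encode_e2m1_rte` are hypothetical, and the magnitude table is derived from the encoding parameters given in SPV_INTEL_float4 (bias 1, two exponent bits, one mantissa bit). The sketch assumes that every magnitude at or above 6.0 saturates, and it applies the NaN rule from this environment specification; a real implementation may differ in details not covered by the text.

```python
import math

# Non-negative FP4E2M1 magnitudes, indexed by the low 3 bits (exponent, mantissa).
# Derived from bias = 1: codes 0 and 1 are zero and the single subnormal (0.5).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(bits):
    """Decode a 4-bit FP4E2M1 code (sign in bit 3) to a Python float."""
    sign = -1.0 if bits & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[bits & 0x7]


def encode_e2m1_rte(value):
    """Encode a float as FP4E2M1 using round-to-nearest-even (illustrative)."""
    sign_bit = 0x8 if math.copysign(1.0, value) < 0 else 0x0
    # NaN, infinity and out-of-range inputs map to the largest finite
    # magnitude (6.0) with a matching sign; the format has no Inf or NaN.
    if math.isnan(value) or math.isinf(value) or abs(value) >= 6.0:
        return sign_bit | 0x7
    mag = abs(value)
    # Pick the nearest representable magnitude; on a tie prefer the code
    # whose mantissa bit (least significant bit) is 0, i.e. the even one.
    code = min(range(8), key=lambda i: (abs(E2M1_MAGNITUDES[i] - mag), i & 1))
    return sign_bit | code


print(decode_e2m1(encode_e2m1_rte(2.5)))
print(decode_e2m1(encode_e2m1_rte(float("-inf"))))
```

With these assumptions, 2.5 lies exactly between 2.0 and 3.0 and resolves to 2.0 (the even code), while negative infinity saturates to -6.0.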
+ + +Only the following conversions via *OpFConvert* to or from 4-bit floating-point values with the `Float4E2M1INTEL` encoding +are supported: + + + + +[cols="1,1", options="header"] +|=== +| To *Float4E2M1INTEL* 'Result' | From *Float4E2M1INTEL* 'Value' +| From 16-bit *IEEE754* | To 16-bit *IEEE754* +| From 16-bit *BFloat16KHR* | To 16-bit *BFloat16KHR* +| | To 8-bit *Float8E4M3EXT* +| | To 8-bit *Float8E5M2EXT* +|=== + +Only the following conversions via *OpFConvert* to or from 8-bit floating-point values with the `Float8E4M3EXT` and `Float8E5M2EXT` encodings +are supported: + + + + +[cols="1,1", options="header"] +|=== +| To *Float8E4M3EXT* 'Result' | From *Float8E4M3EXT* 'Value' +| From 16-bit *IEEE754* | To 16-bit *IEEE754* +| From 16-bit *BFloat16KHR* | To 16-bit *BFloat16KHR* +| From 4-bit *Float4E2M1INTEL* | +|=== + +[cols="1,1", options="header"] +|=== +| To *Float8E5M2EXT* 'Result' | From *Float8E5M2EXT* 'Value' +| From 16-bit *IEEE754* | To 16-bit *IEEE754* +| From 16-bit *BFloat16KHR* | To 16-bit *BFloat16KHR* +| From 4-bit *Float4E2M1INTEL* | +|=== + +Float to integer conversions via OpConvertFToS +---------------------------------------------- + +Only the following conversions via *OpConvertFToS* from float to 4-bit integer values are supported: + + + + +[cols="1,1", options="header"] +|=== +| _Result_ | 'Value' +| 4-bit integer | 16-bit *IEEE754* +| 4-bit integer | 16-bit *BFloat16KHR* +|=== + +Float to float conversions via OpClampConvertFToFINTEL +------------------------------------------------------ + +Only the following conversions via *OpClampConvertFToFINTEL* are supported: + + + + +[cols="1,1", options="header"] +|=== +| _Result_ | 'Value' +| 16-bit *IEEE754* | 32-bit *IEEE754* +| 8-bit *Float8E5M2EXT* | 16-bit *IEEE754* +| 8-bit *Float8E5M2EXT* | 16-bit *BFloat16KHR* +| 8-bit *Float8E4M3EXT* | 16-bit *IEEE754* +| 8-bit *Float8E4M3EXT* | 16-bit *BFloat16KHR* +| 4-bit *Float4E2M1INTEL* | 16-bit *IEEE754* +| 4-bit *Float4E2M1INTEL* | 16-bit 
*BFloat16KHR* +|=== + +Float to integer conversions via OpClampConvertFToS +--------------------------------------------------- + +Only the following conversions via *OpClampConvertFToS* from float to 4-bit integer values are supported: + + + + +[cols="1,1", options="header"] +|=== +| _Result_ | 'Value' +| 4-bit integer | 16-bit *IEEE754* +| 4-bit integer | 16-bit *BFloat16KHR* +|=== + +Float to float conversions via OpStochasticRoundFToFINTEL +--------------------------------------------------------- + +Only the following conversions via *OpStochasticRoundFToFINTEL* are supported: + + + + +[cols="1,1", options="header"] +|=== +| _Result_ | 'Value' +| 16-bit *IEEE754* | 32-bit *IEEE754* +| 8-bit *Float8E5M2EXT* | 16-bit *IEEE754* +| 8-bit *Float8E5M2EXT* | 16-bit *BFloat16KHR* +| 8-bit *Float8E4M3EXT* | 16-bit *IEEE754* +| 8-bit *Float8E4M3EXT* | 16-bit *BFloat16KHR* +| 4-bit *Float4E2M1INTEL* | 16-bit *IEEE754* +| 4-bit *Float4E2M1INTEL* | 16-bit *BFloat16KHR* +|=== + +Float to float conversions via OpClampStochasticRoundFToFINTEL +-------------------------------------------------------------- + +Only the following conversions via *OpClampStochasticRoundFToFINTEL* are supported: + + + +[cols="1,1", options="header"] +|=== +| _Result_ | 'Value' +| 16-bit *IEEE754* | 32-bit *IEEE754* +| 8-bit *Float8E5M2EXT* | 16-bit *IEEE754* +| 8-bit *Float8E5M2EXT* | 16-bit *BFloat16KHR* +| 8-bit *Float8E4M3EXT* | 16-bit *IEEE754* +| 8-bit *Float8E4M3EXT* | 16-bit *BFloat16KHR* +| 4-bit *Float4E2M1INTEL* | 16-bit *IEEE754* +| 4-bit *Float4E2M1INTEL* | 16-bit *BFloat16KHR* +|=== + + +Float to integer conversions via OpClampStochasticRoundFToSINTEL +---------------------------------------------------------------- + +Only the following conversions via *OpStochasticRoundFToSINTEL* from float to 4-bit integer values are supported: + + + + +[cols="1,1", options="header"] +|=== +| _Result_ | 'Value' +| 4-bit integer | 16-bit *IEEE754* +| 4-bit integer | 16-bit *BFloat16KHR* +|=== From 
2e4e1b90c0fb384bb79b338ff3f8da5b8b2a3b3e Mon Sep 17 00:00:00 2001 From: "Sidorov, Dmitry" Date: Mon, 3 Nov 2025 05:25:42 -0800 Subject: [PATCH 2/2] fix few typos Signed-off-by: Sidorov, Dmitry --- .../SPV_INTEL_fp_conversions.asciidoc | 18 +++++++++--------- .../mini_float_conversions_env.asciidoc | 8 ++++---- 2 files changed, 13 insertions(+), 13 deletions(-) diff --git a/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc b/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc index 4093fa813948..6fa3f4986787 100644 --- a/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc +++ b/sycl/doc/design/spirv-extensions/SPV_INTEL_fp_conversions.asciidoc @@ -37,7 +37,7 @@ Notice ------ -Copyright (c) 2024 Intel Corporation. All rights reserved. +Copyright (c) 2025 Intel Corporation. All rights reserved. Status ------ @@ -90,7 +90,7 @@ Modify Section 3.31, Capability, adding rows to the Capability table: |==== 2+^| Capability ^| Implicitly Declares | {convert_capability_token} | *{convert_capability_name}* + -Uses *OpClampConvertFToFINTEL*, *OpStochasticRoundFToFINTEL*, *OpClampStochasticRoundFToFINTEL* and *OpStochasticRoundFToSINTEL* +Uses *OpClampConvertFToFINTEL*, *OpStochasticRoundFToFINTEL*, *OpClampStochasticRoundFToFINTEL* and *OpClampStochasticRoundFToSINTEL* instructions. + | |==== @@ -132,7 +132,7 @@ _Value_ 4+|[[OpClampConvertFToSINTEL]]*OpClampConvertFToSINTEL* + + Converts numerically a floating point value to integer. -In case of overflow, the positive result is saturated to INT_MAX or INT_MIN depending on a sign bit. + +In case of overflow, the result is saturated to INT_MAX or INT_MIN depending on a sign bit. + + _Result Type_ is the type of the converted object, it must be a scalar or vector of _integer type_. 
+ @@ -160,7 +160,7 @@ Converts numerically one floating point value to another using stochastic roundi + Stochastic rounding is performed by adding a pseudo-random bias value to the mantissa of the converted value as follows. The bias is first added to the mantissa of the converted value. -If this causes the mantissa to overflow, the the exponent of the converted value +If this causes the mantissa to overflow, then the exponent of the converted value is increased by 1 and the mantissa bits are shifted right. The value is then converted to the _Result Type_, rounding towards zero. If the exponent overflows when converting to the _Result Type_, the result of the conversion is +/- Inf. If _Result Type_ doesn't have @@ -245,11 +245,11 @@ _Next Seed_ [cols="1a,1,5*3",width="100%"] |===== -6+|[[OpClampStochasticRoundFToSINTEL]]*OpStochasticRoundFToSINTEL* + +6+|[[OpClampStochasticRoundFToSINTEL]]*OpClampStochasticRoundFToSINTEL* + + Converts a floating point value to integer using stochastic rounding. Has the same semantics as *OpStochasticRoundFToFINTEL*. -In case of overflow, the positive result is saturated to INT_MAX or INT_MIN depending on a sign bit. +In case of overflow, the result is saturated to INT_MAX or INT_MIN depending on a sign bit. This instruction may be used for stochastic rounding operation, if a producer passes pseudo-random _Seed_ value. 
+ + @@ -267,7 +267,7 @@ _Next Seed_ must be of a _pointer type_ with *Function* storage class and 32-bit + 1+|Capability: + *{convert_capability_name}* -1+| 4+ | {OpStochasticRoundFToSINTEL_token} +1+| 4+ | {OpClampStochasticRoundFToSINTEL_token} | __ + _Result Type_ | _Result _ @@ -285,10 +285,10 @@ Validation Rules Add the following bullets to section 2.16.11, Universal Validation Rules: - * Variables with a type that is or includes a floating-point type with the *BFloat16KHR*, *Float8E4M3EXT* and *Float8E5M2EXT* encodings can also be used with the following instructions: + * Variables with a type that is or includes a floating-point type with the *BFloat16KHR*, *Float8E4M3EXT*, *Float8E5M2EXT* and *Float4E2M1INTEL* encodings can also be used with the following instructions: ** *OpClampConvertFToFINTEL* - * Variables with a type that is or includes a floating-point type with the *BFloat16KHR*, *Float8E5M2EXT* and *Float4E2M1INTEL* encodings can also be used with the following instructions: + * Variables with a type that is or includes a floating-point type with the *BFloat16KHR*, *Float8E4M3EXT*, *Float8E5M2EXT* and *Float4E2M1INTEL* encodings can also be used with the following instructions: ** *OpStochasticRoundFToFINTEL* ** *OpClampStochasticRoundFToFINTEL* diff --git a/sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc b/sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc index 10bc39e3012d..a1be11304109 100644 --- a/sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc +++ b/sycl/doc/design/spirv-extensions/mini_float_conversions_env.asciidoc @@ -82,10 +82,10 @@ Only the following conversions via *OpClampConvertFToFINTEL* are supported: + | 4-bit *Float4E2M1INTEL* | 16-bit *BFloat16KHR* |=== -Float to integer conversions via OpClampConvertFToS ---------------------------------------------------- +Float to integer conversions via OpClampConvertFToSINTEL +-------------------------------------------------------- 
-Only the following conversions via *OpClampConvertFToS* from float to 4-bit integer values are supported: + +Only the following conversions via *OpClampConvertFToSINTEL* from float to 4-bit integer values are supported: + + [cols="1,1", options="header"] @@ -134,7 +134,7 @@ Only the following conversions via *OpClampStochasticRoundFToFINTEL* are support Float to integer conversions via OpClampStochasticRoundFToSINTEL ---------------------------------------------------------------- -Only the following conversions via *OpStochasticRoundFToSINTEL* from float to 4-bit integer values are supported: + +Only the following conversions via *OpClampStochasticRoundFToSINTEL* from float to 4-bit integer values are supported: + + [cols="1,1", options="header"]