Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 14 additions & 4 deletions flang/lib/Optimizer/CodeGen/CodeGen.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -835,10 +835,20 @@ struct ConvertOpConversion : public fir::FIROpConversion<fir::ConvertOp> {
return mlir::success();
}
if (mlir::isa<mlir::IntegerType>(toTy)) {
if (toTy.isUnsignedInteger())
rewriter.replaceOpWithNewOp<mlir::LLVM::FPToUIOp>(convert, toTy, op0);
else
rewriter.replaceOpWithNewOp<mlir::LLVM::FPToSIOp>(convert, toTy, op0);
// NOTE: We are checking the fir type here because toTy is an LLVM type
// which is signless, and we need to use the intrinsic that matches the
// sign of the output in fir.
if (toFirTy.isUnsignedInteger()) {
auto intrinsicName =
mlir::StringAttr::get(convert.getContext(), "llvm.fptoui.sat");
rewriter.replaceOpWithNewOp<mlir::LLVM::CallIntrinsicOp>(
convert, toTy, intrinsicName, op0);
} else {
auto intrinsicName =
mlir::StringAttr::get(convert.getContext(), "llvm.fptosi.sat");
rewriter.replaceOpWithNewOp<mlir::LLVM::CallIntrinsicOp>(
convert, toTy, intrinsicName, op0);
}
Comment on lines +841 to +851
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you know what these saturation intrinsics get lowered to? And is there a big difference in performance? Would it be possible to use the saturation intrinsic only when necessary? Or can that not be determined at compile time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They produce more instructions on x86 (when they cannot be const-folded away) (x86 godbolt link, more instructions, aarch64 godbolt link, both using fcvtzs), and if someone converted reals to integers in a hot loop they might see worse performance, however I was unable to find a difference in the performance tests that I ran. I'll be watching performance numbers after this is merged in case something comes up.

Would it be possible to use the saturation intrinsic only when necessary?

As long as we want the correct semantics for values only known at runtime, I don't think so. However, especially if performance issues come up, I think it would make sense to use the fptosi/fptoui instructions under some flag, maybe enabled by default above some optimization level. Do you think using the instructions instead of the saturated intrinsics under (for example) -ffast-math would be a good compromise if performance issues show up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think using the instructions instead of the saturated intrinsics under (for example) -ffast-math would be a good compromise if performance issues show up?

Personally, I agree with that approach. I think it is better to avoid having too many code generation paths unless there is an actual use case for it, in which case -ffast-math would sounds like the right flag to deviate from the requirements.

Please wait for @kiranchandramohan's feedback on the matter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ashermancinelli for the reply. Just a few points, thinking out loud.

Would it be possible to use the saturation intrinsic only when necessary? Or can that not be determined at compile time?

This question I was asking here was about inferring from the real and integer types involved in the conversion. Like if we are converting from real(kind=2)/half-precision to integer(kind=4) then probably integer(kind=4) can hold all values without saturation.

gfortran (without fast-math) seems to be calling __fixtfsi.

There is also a question of whether vectorisation will work for these saturation intrinsics. I can see one issue filed against this topic by the rust community.
#59682

Do you think using the instructions instead of the saturated intrinsics under (for example) -ffast-math would be a good compromise if performance issues show up?

Personally, I agree with that approach. I think it is better to avoid having too many code generation paths unless there is an actual use case for it, in which case -ffast-math would sounds like the right flag to deviate from the requirements.

Please wait for @kiranchandramohan's feedback on the matter.

Makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like if we are converting from real(kind=2)/half-precision to integer(kind=4) then probably integer(kind=4) can hold all values without saturation.

I see what you mean now and that seems like a great idea. I see you've approved this PR so I'll merge for now, but I would like to address this in a follow-up. Thanks!

return mlir::success();
}
} else if (mlir::isa<mlir::IntegerType>(fromTy)) {
Expand Down
18 changes: 13 additions & 5 deletions flang/test/Fir/convert-to-llvm.fir
Original file line number Diff line number Diff line change
Expand Up @@ -701,6 +701,10 @@ func.func @convert_from_float(%arg0 : f32) {
%7 = fir.convert %arg0 : (f32) -> i16
%8 = fir.convert %arg0 : (f32) -> i32
%9 = fir.convert %arg0 : (f32) -> i64
%10 = fir.convert %arg0 : (f32) -> ui8
%11 = fir.convert %arg0 : (f32) -> ui16
%12 = fir.convert %arg0 : (f32) -> ui32
%13 = fir.convert %arg0 : (f32) -> ui64
return
}

Expand All @@ -711,11 +715,15 @@ func.func @convert_from_float(%arg0 : f32) {
// CHECK: %{{.*}} = llvm.fpext %[[ARG0]] : f32 to f64
// CHECK: %{{.*}} = llvm.fpext %[[ARG0]] : f32 to f80
// CHECK: %{{.*}} = llvm.fpext %[[ARG0]] : f32 to f128
// CHECK: %{{.*}} = llvm.fptosi %[[ARG0]] : f32 to i1
// CHECK: %{{.*}} = llvm.fptosi %[[ARG0]] : f32 to i8
// CHECK: %{{.*}} = llvm.fptosi %[[ARG0]] : f32 to i16
// CHECK: %{{.*}} = llvm.fptosi %[[ARG0]] : f32 to i32
// CHECK: %{{.*}} = llvm.fptosi %[[ARG0]] : f32 to i64
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptosi.sat"(%[[ARG0]]) : (f32) -> i1
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptosi.sat"(%[[ARG0]]) : (f32) -> i8
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptosi.sat"(%[[ARG0]]) : (f32) -> i16
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptosi.sat"(%[[ARG0]]) : (f32) -> i32
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptosi.sat"(%[[ARG0]]) : (f32) -> i64
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptoui.sat"(%[[ARG0]]) : (f32) -> i8
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptoui.sat"(%[[ARG0]]) : (f32) -> i16
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptoui.sat"(%[[ARG0]]) : (f32) -> i32
// CHECK: %{{.*}} = llvm.call_intrinsic "llvm.fptoui.sat"(%[[ARG0]]) : (f32) -> i64

// -----

Expand Down
236 changes: 236 additions & 0 deletions flang/test/Integration/fp-convert.f90
Original file line number Diff line number Diff line change
@@ -0,0 +1,236 @@
! RUN: %flang -funsigned %s -o %t && %t | FileCheck %s
! RUN: %flang -funsigned -emit-llvm -S -o - %s | FileCheck %s --check-prefix=LLVMIR

module fp_convert_m
implicit none
interface set_and_print
module procedure set_and_print_r16
module procedure set_and_print_r8
end interface
contains
subroutine set_and_print_r16(value)
real(kind=16), intent(in) :: value
integer(kind=1) :: i8
integer(kind=2) :: i16
integer(kind=4) :: i32
integer(kind=8) :: i64
integer(kind=16) :: i128
unsigned(kind=1) :: u8
unsigned(kind=2) :: u16
unsigned(kind=4) :: u32
unsigned(kind=8) :: u64
unsigned(kind=16) :: u128
print *, "Original real(16) value:", value
i8 = int(value, kind=1)
i16 = int(value, kind=2)
i32 = int(value, kind=4)
i64 = int(value, kind=8)
i128 = int(value, kind=16)
u8 = uint(value, kind=1)
u16 = uint(value, kind=2)
u32 = uint(value, kind=4)
u64 = uint(value, kind=8)
u128 = uint(value, kind=16)
print *, "Converted to 8-bit integer:", i8
print *, "Converted to 16-bit integer:", i16
print *, "Converted to 32-bit integer:", i32
print *, "Converted to 64-bit integer:", i64
print *, "Converted to 128-bit integer:", i128
print *, "Converted to 8-bit unsigned integer:", u8
print *, "Converted to 16-bit unsigned integer:", u16
print *, "Converted to 32-bit unsigned integer:", u32
print *, "Converted to 64-bit unsigned integer:", u64
print *, "Converted to 128-bit unsigned integer:", u128
end subroutine

subroutine set_and_print_r8(value)
real(kind=8), intent(in) :: value
integer(kind=1) :: i8
integer(kind=2) :: i16
integer(kind=4) :: i32
integer(kind=8) :: i64
integer(kind=16) :: i128
unsigned(kind=1) :: u8
unsigned(kind=2) :: u16
unsigned(kind=4) :: u32
unsigned(kind=8) :: u64
unsigned(kind=16) :: u128
print *, "Original real(8) value:", value
i8 = int(value, kind=1)
i16 = int(value, kind=2)
i32 = int(value, kind=4)
i64 = int(value, kind=8)
i128 = int(value, kind=16)
u8 = uint(value, kind=1)
u16 = uint(value, kind=2)
u32 = uint(value, kind=4)
u64 = uint(value, kind=8)
u128 = uint(value, kind=16)
print *, "Converted to 8-bit integer:", i8
print *, "Converted to 16-bit integer:", i16
print *, "Converted to 32-bit integer:", i32
print *, "Converted to 64-bit integer:", i64
print *, "Converted to 128-bit integer:", i128
print *, "Converted to 8-bit unsigned integer:", u8
print *, "Converted to 16-bit unsigned integer:", u16
print *, "Converted to 32-bit unsigned integer:", u32
print *, "Converted to 64-bit unsigned integer:", u64
print *, "Converted to 128-bit unsigned integer:", u128
end subroutine
end module fp_convert_m

program fp_convert
use ieee_arithmetic, only: ieee_value, ieee_quiet_nan, ieee_positive_inf, ieee_negative_inf
use fp_convert_m, only: set_and_print
implicit none

real(kind=8) :: nan, inf, ninf
nan = ieee_value(nan, ieee_quiet_nan)
inf = ieee_value(inf, ieee_positive_inf)
ninf = ieee_value(ninf, ieee_negative_inf)

call set_and_print(huge(0.0_8))
call set_and_print(-huge(0.0_8))
call set_and_print(huge(0.0_16))
call set_and_print(-huge(0.0_16))
call set_and_print(tiny(0.0_8))
call set_and_print(-tiny(0.0_8))
call set_and_print(tiny(0.0_16))
call set_and_print(-tiny(0.0_16))
call set_and_print(nan)
call set_and_print(inf)
call set_and_print(ninf)

end program fp_convert

! LLVMIR: call i8 @llvm.fptosi.sat.i8.f128(fp128 %{{.+}})
! LLVMIR: call i16 @llvm.fptosi.sat.i16.f128(fp128 %{{.+}})
! LLVMIR: call i32 @llvm.fptosi.sat.i32.f128(fp128 %{{.+}})
! LLVMIR: call i64 @llvm.fptosi.sat.i64.f128(fp128 %{{.+}})
! LLVMIR: call i128 @llvm.fptosi.sat.i128.f128(fp128 %{{.+}})
! LLVMIR: call i8 @llvm.fptoui.sat.i8.f128(fp128 %{{.+}})
! LLVMIR: call i16 @llvm.fptoui.sat.i16.f128(fp128 %{{.+}})
! LLVMIR: call i32 @llvm.fptoui.sat.i32.f128(fp128 %{{.+}})
! LLVMIR: call i64 @llvm.fptoui.sat.i64.f128(fp128 %{{.+}})
! LLVMIR: call i128 @llvm.fptoui.sat.i128.f128(fp128 %{{.+}})
! LLVMIR: call i8 @llvm.fptosi.sat.i8.f64(double %{{.+}})
! LLVMIR: call i16 @llvm.fptosi.sat.i16.f64(double %{{.+}})
! LLVMIR: call i32 @llvm.fptosi.sat.i32.f64(double %{{.+}})
! LLVMIR: call i64 @llvm.fptosi.sat.i64.f64(double %{{.+}})
! LLVMIR: call i128 @llvm.fptosi.sat.i128.f64(double %{{.+}})
! LLVMIR: call i8 @llvm.fptoui.sat.i8.f64(double %{{.+}})
! LLVMIR: call i16 @llvm.fptoui.sat.i16.f64(double %{{.+}})
! LLVMIR: call i32 @llvm.fptoui.sat.i32.f64(double %{{.+}})
! LLVMIR: call i64 @llvm.fptoui.sat.i64.f64(double %{{.+}})
! LLVMIR: call i128 @llvm.fptoui.sat.i128.f64(double %{{.+}})

! CHECK: Converted to 8-bit integer: 127
! CHECK: Converted to 16-bit integer: 32767
! CHECK: Converted to 32-bit integer: 2147483647
! CHECK: Converted to 64-bit integer: 9223372036854775807
! CHECK: Converted to 128-bit integer: 170141183460469231731687303715884105727
! CHECK: Converted to 8-bit unsigned integer: 255
! CHECK: Converted to 16-bit unsigned integer: 65535
! CHECK: Converted to 32-bit unsigned integer: 4294967295
! CHECK: Converted to 64-bit unsigned integer: 18446744073709551615
! CHECK: Converted to 128-bit unsigned integer: 340282366920938463463374607431768211455
! CHECK: Converted to 8-bit integer: -128
! CHECK: Converted to 16-bit integer: -32768
! CHECK: Converted to 32-bit integer: -2147483648
! CHECK: Converted to 64-bit integer: -9223372036854775808
! CHECK: Converted to 128-bit integer: -170141183460469231731687303715884105728
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
! CHECK: Converted to 8-bit integer: 127
! CHECK: Converted to 16-bit integer: 32767
! CHECK: Converted to 32-bit integer: 2147483647
! CHECK: Converted to 64-bit integer: 9223372036854775807
! CHECK: Converted to 128-bit integer: 170141183460469231731687303715884105727
! CHECK: Converted to 8-bit unsigned integer: 255
! CHECK: Converted to 16-bit unsigned integer: 65535
! CHECK: Converted to 32-bit unsigned integer: 4294967295
! CHECK: Converted to 64-bit unsigned integer: 18446744073709551615
! CHECK: Converted to 128-bit unsigned integer: 340282366920938463463374607431768211455
! CHECK: Converted to 8-bit integer: -128
! CHECK: Converted to 16-bit integer: -32768
! CHECK: Converted to 32-bit integer: -2147483648
! CHECK: Converted to 64-bit integer: -9223372036854775808
! CHECK: Converted to 128-bit integer: -170141183460469231731687303715884105728
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
! CHECK: Converted to 8-bit integer: 0
! CHECK: Converted to 16-bit integer: 0
! CHECK: Converted to 32-bit integer: 0
! CHECK: Converted to 64-bit integer: 0
! CHECK: Converted to 128-bit integer: 0
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
! CHECK: Converted to 8-bit integer: 0
! CHECK: Converted to 16-bit integer: 0
! CHECK: Converted to 32-bit integer: 0
! CHECK: Converted to 64-bit integer: 0
! CHECK: Converted to 128-bit integer: 0
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
! CHECK: Converted to 8-bit integer: 0
! CHECK: Converted to 16-bit integer: 0
! CHECK: Converted to 32-bit integer: 0
! CHECK: Converted to 64-bit integer: 0
! CHECK: Converted to 128-bit integer: 0
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
! CHECK: Converted to 8-bit integer: 0
! CHECK: Converted to 16-bit integer: 0
! CHECK: Converted to 32-bit integer: 0
! CHECK: Converted to 64-bit integer: 0
! CHECK: Converted to 128-bit integer: 0
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
! CHECK: Converted to 8-bit integer: 0
! CHECK: Converted to 16-bit integer: 0
! CHECK: Converted to 32-bit integer: 0
! CHECK: Converted to 64-bit integer: 0
! CHECK: Converted to 128-bit integer: 0
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
! CHECK: Converted to 8-bit integer: 127
! CHECK: Converted to 16-bit integer: 32767
! CHECK: Converted to 32-bit integer: 2147483647
! CHECK: Converted to 64-bit integer: 9223372036854775807
! CHECK: Converted to 128-bit integer: 170141183460469231731687303715884105727
! CHECK: Converted to 8-bit unsigned integer: 255
! CHECK: Converted to 16-bit unsigned integer: 65535
! CHECK: Converted to 32-bit unsigned integer: 4294967295
! CHECK: Converted to 64-bit unsigned integer: 18446744073709551615
! CHECK: Converted to 128-bit unsigned integer: 340282366920938463463374607431768211455
! CHECK: Converted to 8-bit integer: -128
! CHECK: Converted to 16-bit integer: -32768
! CHECK: Converted to 32-bit integer: -2147483648
! CHECK: Converted to 64-bit integer: -9223372036854775808
! CHECK: Converted to 128-bit integer: -170141183460469231731687303715884105728
! CHECK: Converted to 8-bit unsigned integer: 0
! CHECK: Converted to 16-bit unsigned integer: 0
! CHECK: Converted to 32-bit unsigned integer: 0
! CHECK: Converted to 64-bit unsigned integer: 0
! CHECK: Converted to 128-bit unsigned integer: 0
Loading