Commit e2cb6a3
committed
[AArch64] Tweak truncate costs for some scalable vector types
* We were previously returning an invalid cost when truncating
anything to <vscale x 2 x i1>, which is incorrect since we can
generate perfectly good code for this.
* The costs for truncating legal or unpacked types to predicates
seemed overly optimistic. For example, when truncating
<vscale x 8 x i16> to <vscale x 8 x i1> we typically do
something like
and z0.h, z0.h, #0x1
cmpne p0.h, p0/z, z0.h, #0
I guess it might depend upon whether the input value is
generated in the same block or not and if we can avoid the
inreg zero-extend. However, it feels safe to take the more
conservative cost here.
* The costs for some truncates such as
trunc <vscale x 2 x i32> %a to <vscale x 2 x i16>
were 1, whereas in actual fact they are free and no instructions
are required. Also, for this
trunc <vscale x 8 x i32> %a to <vscale x 8 x i16>
it's just a single uzp1 instruction so I reduced the cost to 1.
In general, I've added costs for all cases where the destination
type is legal or unpacked. One unfortunate side effect of this
is the costs for some fixed-width truncates when using SVE now
look too optimistic.1 parent 8408492 commit e2cb6a3
File tree
4 files changed
+84
-67
lines changed- llvm
- lib/Target/AArch64
- test/Analysis/CostModel/AArch64
4 files changed
+84
-67
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2766 | 2766 | | |
2767 | 2767 | | |
2768 | 2768 | | |
2769 | | - | |
2770 | | - | |
2771 | | - | |
2772 | | - | |
2773 | | - | |
2774 | | - | |
2775 | | - | |
2776 | | - | |
2777 | | - | |
2778 | | - | |
2779 | | - | |
2780 | | - | |
2781 | | - | |
2782 | | - | |
2783 | | - | |
2784 | | - | |
| 2769 | + | |
| 2770 | + | |
| 2771 | + | |
| 2772 | + | |
| 2773 | + | |
| 2774 | + | |
| 2775 | + | |
| 2776 | + | |
| 2777 | + | |
| 2778 | + | |
| 2779 | + | |
| 2780 | + | |
| 2781 | + | |
| 2782 | + | |
| 2783 | + | |
| 2784 | + | |
| 2785 | + | |
| 2786 | + | |
| 2787 | + | |
| 2788 | + | |
| 2789 | + | |
| 2790 | + | |
| 2791 | + | |
| 2792 | + | |
| 2793 | + | |
| 2794 | + | |
| 2795 | + | |
| 2796 | + | |
| 2797 | + | |
| 2798 | + | |
| 2799 | + | |
| 2800 | + | |
| 2801 | + | |
2785 | 2802 | | |
2786 | 2803 | | |
2787 | 2804 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
629 | 629 | | |
630 | 630 | | |
631 | 631 | | |
632 | | - | |
| 632 | + | |
633 | 633 | | |
634 | | - | |
| 634 | + | |
635 | 635 | | |
636 | 636 | | |
637 | 637 | | |
638 | | - | |
| 638 | + | |
639 | 639 | | |
640 | | - | |
| 640 | + | |
641 | 641 | | |
642 | 642 | | |
643 | 643 | | |
644 | | - | |
| 644 | + | |
645 | 645 | | |
646 | | - | |
| 646 | + | |
647 | 647 | | |
648 | 648 | | |
649 | 649 | | |
650 | | - | |
| 650 | + | |
651 | 651 | | |
652 | | - | |
| 652 | + | |
653 | 653 | | |
654 | 654 | | |
655 | 655 | | |
| |||
674 | 674 | | |
675 | 675 | | |
676 | 676 | | |
677 | | - | |
| 677 | + | |
678 | 678 | | |
679 | 679 | | |
680 | 680 | | |
681 | | - | |
| 681 | + | |
682 | 682 | | |
683 | | - | |
| 683 | + | |
684 | 684 | | |
685 | 685 | | |
686 | 686 | | |
687 | | - | |
| 687 | + | |
688 | 688 | | |
689 | | - | |
| 689 | + | |
690 | 690 | | |
691 | 691 | | |
692 | 692 | | |
| |||
711 | 711 | | |
712 | 712 | | |
713 | 713 | | |
714 | | - | |
| 714 | + | |
715 | 715 | | |
716 | 716 | | |
717 | 717 | | |
718 | | - | |
| 718 | + | |
719 | 719 | | |
720 | | - | |
| 720 | + | |
721 | 721 | | |
722 | 722 | | |
723 | 723 | | |
724 | | - | |
| 724 | + | |
725 | 725 | | |
726 | | - | |
| 726 | + | |
727 | 727 | | |
728 | 728 | | |
729 | 729 | | |
| |||
0 commit comments