Commit 2b6e103
Automerge: [flang] Inline hlfir.matmul[_transpose]. (#122821)
Inlining `hlfir.matmul` as `hlfir.eval_in_mem` does not get rid of
the temporary array in many cases, but it can still be much better,
because it allows us to:
* Get rid of any overhead related to calling runtime MATMUL
  (such as descriptor creation).
* Use a CPU-specific vectorization cost model for matmul loops,
  which the Fortran runtime cannot currently do.
* Optimize matmul of known-size arrays by complete unrolling.
One of the drawbacks of `hlfir.eval_in_mem` inlining is that
the ops inside it that have store memory effects block the current
MLIR CSE, so I decided to run this inlining late in the pipeline.
There is a source comment explaining the CSE issue in more detail.
Straightforward inlining of `hlfir.matmul` as an `hlfir.elemental`
is not good for performance, and I got performance regressions
with it compared to the Fortran runtime implementation. I put it
under an engineering option for experiments.
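
To make the two inlining shapes concrete, here is a minimal sketch in Python rather than HLFIR. The function names, the flat column-major layout, and the loop orders are assumptions made for illustration; the commit message does not spell out the generated loop nests or say why the elemental form regressed. The first function computes each result element as an independent reduction, which is the natural shape of an elemental operation. The second zeroes a result buffer and accumulates into it, which is one way an in-memory evaluation can keep the innermost loop contiguous for column-major data.

```python
def idx(i, j, rows):
    """Offset of element (i, j) in a column-major array with `rows` rows."""
    return i + j * rows


def matmul_elemental(a, b, m, n, p):
    """Elemental-style shape: one independent k-reduction per result element."""
    r = [0.0] * (m * p)
    for j in range(p):
        for i in range(m):
            acc = 0.0
            for k in range(n):
                # a(i, k) is read with stride m in this inner loop.
                acc += a[idx(i, k, m)] * b[idx(k, j, n)]
            r[idx(i, j, m)] = acc
    return r


def matmul_in_memory(a, b, m, n, p):
    """In-memory shape: zero the result, then accumulate with i innermost."""
    r = [0.0] * (m * p)
    for j in range(p):
        for k in range(n):
            bkj = b[idx(k, j, n)]
            for i in range(m):
                # a(:, k) and r(:, j) are both walked contiguously here.
                r[idx(i, j, m)] += a[idx(i, k, m)] * bkj
    return r
```

Both functions return the same values; only the memory access pattern differs, and that pattern is what a vectorizer sees after inlining.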
At the same time, inlining `hlfir.matmul_transpose` as `hlfir.elemental`
seems to be a good approach, e.g. it allows getting rid of a temporary
array in cases like: `A(:)=B(:)+MATMUL(TRANSPOSE(C(:,:)),D(:))`.
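
The effect on the example above can be sketched as follows. This is Python with hypothetical function names and a flat column-major layout, not what the compiler emits. With `C` of shape n x m, element `i` of `MATMUL(TRANSPOSE(C), D)` is the dot product of column `i` of `C` with `D`, so it can be computed on demand inside the loop that adds `B` and stores into `A`.

```python
def with_temporary(b, c, d, n, m):
    """Materialize MATMUL(TRANSPOSE(C), D) first, then add B."""
    tmp = [0.0] * m
    for i in range(m):
        for k in range(n):
            # C(k, i) sits at offset k + i * n in column-major storage.
            tmp[i] += c[k + i * n] * d[k]
    return [b[i] + tmp[i] for i in range(m)]


def fused(b, c, d, n, m):
    """Elemental-style evaluation: no intermediate array is created."""
    a = [0.0] * m
    for i in range(m):
        acc = 0.0
        for k in range(n):
            # Column i of C is read contiguously.
            acc += c[k + i * n] * d[k]
        a[i] = b[i] + acc
    return a
```

The inner reduction in the fused form reads a column of `C` contiguously, which may be part of why the elemental form suits the transposed case better than plain `hlfir.matmul`; the commit message itself only reports the observed result.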
This patch improves performance of galgel and tonto a little bit.

File tree
10 files changed, +1183 -3 lines changed
- flang
  - include/flang/Optimizer
    - Builder
    - HLFIR
  - lib/Optimizer
    - Builder
    - HLFIR/Transforms
    - Passes
  - test
    - Driver
    - Fir
    - HLFIR