Commit cea35da
[AMD] Add basics to allow bypass LDS for dot RHS (#5350)
The AMDBypassLDSForDotOperandPass implements a strategy to bypass
the Local Data Share (LDS) for one of the operands of an MFMA dot operation.
Under certain conditions, the dot layout of one of the operands allows direct
loading from HBM to VGPRs in the MFMA dot layout, without losing
vectorization of global loads or increasing the number of global loads due to
data shared between threads.
---------
Co-authored-by: Ognjen Plavsic <[email protected]>
Co-authored-by: Ognjen Plavsic <[email protected]>
1 parent 734d9f2 commit cea35da
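The two conditions named in the commit message (global loads stay vectorized, and no extra global loads arise from data shared between threads) can be illustrated with a small standalone sketch. This is not the pass's actual logic: `DotOperandInfo`, `can_bypass_lds`, and the `min_vector_bits` threshold are hypothetical names and values chosen for illustration, and the model assumes that a dot operand tile is reused only by warps tiled along the operand's non-K output dimension.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DotOperandInfo:
    """Hypothetical summary of one dot operand (not a Triton data structure)."""
    op_idx: int                    # 0 = LHS (A, shape MxK), 1 = RHS (B, shape KxN)
    k_contiguous: bool             # True if K is the fastest-varying dim in global memory
    k_width: int                   # consecutive K elements held by one thread
    elem_bits: int                 # element size in bits (e.g. 16 for fp16)
    warps_per_cta: Tuple[int, int]  # (warps along M, warps along N)


def can_bypass_lds(info: DotOperandInfo, min_vector_bits: int = 64) -> bool:
    """Toy predicate for the two conditions in the commit message.

    Assumption: min_vector_bits is an illustrative threshold, not a value
    taken from the Triton source.
    """
    # Condition 1: each thread's K elements are contiguous in memory and wide
    # enough that the direct global load remains a vector load.
    vectorized = (
        info.k_contiguous
        and info.k_width * info.elem_bits >= min_vector_bits
    )

    # Condition 2: no other warp needs the same tile. The RHS tile is reused
    # by warps tiled along M, the LHS tile by warps tiled along N, so the
    # warp count along that dimension must be 1 to avoid duplicate loads.
    sharing_dim = 0 if info.op_idx == 1 else 1
    not_shared = info.warps_per_cta[sharing_dim] == 1

    return vectorized and not_shared


# RHS operand, fp16, 8 K-contiguous elements per thread, all warps along N.
rhs = DotOperandInfo(op_idx=1, k_contiguous=True, k_width=8,
                     elem_bits=16, warps_per_cta=(1, 4))
print(can_bypass_lds(rhs))
```

In this model, moving the warps to the M dimension (`warps_per_cta=(4, 1)`) makes every warp need the same RHS tile, so going through LDS once would be cheaper than having each warp reload it from global memory.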
File tree
14 files changed, +473 -57 lines changed
- bin
- include/triton
  - Dialect/TritonGPU/Transforms
  - Tools/Sys
- lib/Dialect/TritonGPU/Transforms
- test/TritonGPU
  - amd
- third_party/amd
  - backend
  - include/TritonAMDGPUTransforms
  - lib/TritonAMDGPUTransforms
  - python