Commit d40f3c8
[AUTOGENERATED] [release/2.5] [ROCm][layer_norm] Use __builtin_amdgcn_rcpf(x) instead of 1.f/x (#1800)
Cherry-pick of #1688
Co-authored-by: Michael Halkenhäuser <[email protected]>
Co-authored-by: Hashem Hashemi <[email protected]>
(cherry picked from commit f8544af)
(cherry picked from commit ed48754)
(cherry picked from commit d62a39e)
1 parent e4d62b1, commit d40f3c8
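
The commit title describes swapping a `1.f/x` division for the `__builtin_amdgcn_rcpf(x)` hardware reciprocal in the ROCm layer norm path. Since the diff body is not reproduced here, the following is only a minimal sketch of how such a swap might be guarded, not the code from this commit. `SKETCH_FAST_RCP`, `SKETCH_RCP`, `Welford`, and `welford_step` are hypothetical names chosen for illustration, and the running-mean update is an assumed example of where a reciprocal could appear. On non-AMD targets the macro falls back to the plain division, so the sketch compiles with any host C++ compiler.

```cpp
#include <cassert>

// SKETCH_FAST_RCP is a hypothetical opt-in flag, not the option added by the commit.
// __AMDGCN__ is defined by clang when compiling for an AMD GPU target.
#if defined(__AMDGCN__) && defined(SKETCH_FAST_RCP)
  // Approximate hardware reciprocal; faster than a full-precision divide.
  #define SKETCH_RCP(x) __builtin_amdgcn_rcpf(x)
#else
  // Portable fallback: ordinary floating-point division.
  #define SKETCH_RCP(x) (1.f / (x))
#endif

// Running statistics for a Welford-style online mean/variance (assumed use case).
struct Welford {
  float mean = 0.f;
  float m2 = 0.f;     // sum of squared deviations from the running mean
  float count = 0.f;
};

// Fold one sample into the running statistics.
// In a GPU kernel this function would additionally be marked __device__.
inline void welford_step(Welford& w, float x) {
  w.count += 1.f;
  assert(w.count > 0.f);
  const float delta = x - w.mean;
  // The reciprocal replaces `delta / w.count`.
  w.mean += delta * SKETCH_RCP(w.count);
  w.m2 += delta * (x - w.mean);
}
```

The trade-off in a swap like this is precision for speed: a hardware reciprocal is typically an approximation, which is presumably why a change of this kind would sit behind a build-time switch rather than being unconditional.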
File tree
3 files changed: +28 -0 lines changed
- aten/src/ATen/native/cuda
- cmake
[Diff bodies not recoverable; only the positions of the added lines remain.]

| File (in page order) | Added lines (new-file numbering) | Lines added |
|---|---|---|
| 1 | 134-136, 138, 152-154, 156 | 8 |
| 2 | 1051-1066 | 16 |
| 3 | 156-159 | 4 |