Commit 2ce9bdd
avoid pointer mutation in layer_norm kernel (#1006)
## Summary
Rewrite the layer_norm kernel to compute explicit channel offsets instead
of mutating the X/Y base pointers inside loops.
Removing the loop-carried pointer dependencies gives the Triton compiler
more room to optimize and makes memory access patterns more predictable.
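
To illustrate the difference between the two indexing styles, here is a minimal sketch in plain Python rather than Triton. It is not the kernel from this commit: the function names, the flat-list buffer, the `eps` default, and the omission of the weight/bias terms are all assumptions made for illustration. An integer index stands in for the X pointer.

```python
import math


def layer_norm_row_mutating(x, row_start, n_cols, block, eps=1e-5):
    """Hypothetical 'before' style: a running index is advanced in every loop."""
    row_end = row_start + n_cols

    ptr, total = row_start, 0.0
    for _ in range(0, n_cols, block):
        total += sum(x[ptr:min(ptr + block, row_end)])
        ptr += block  # loop-carried: the next iteration depends on this update
    mean = total / n_cols

    ptr, sq = row_start, 0.0  # the index must be rewound before each pass
    for _ in range(0, n_cols, block):
        sq += sum((v - mean) ** 2 for v in x[ptr:min(ptr + block, row_end)])
        ptr += block
    rstd = 1.0 / math.sqrt(sq / n_cols + eps)

    ptr, y = row_start, []
    for _ in range(0, n_cols, block):
        y.extend((v - mean) * rstd for v in x[ptr:min(ptr + block, row_end)])
        ptr += block
    return y


def layer_norm_row_offsets(x, row_start, n_cols, block, eps=1e-5):
    """Hypothetical 'after' style: the base index is never modified; each
    block derives its addresses from the loop variable alone."""

    def load(col_start):
        # Stand-in for a masked load: columns at or past n_cols are skipped.
        cols = range(col_start, min(col_start + block, n_cols))
        return [x[row_start + c] for c in cols]

    total = 0.0
    for col_start in range(0, n_cols, block):
        total += sum(load(col_start))
    mean = total / n_cols

    sq = 0.0
    for col_start in range(0, n_cols, block):
        sq += sum((v - mean) ** 2 for v in load(col_start))
    rstd = 1.0 / math.sqrt(sq / n_cols + eps)

    y = []
    for col_start in range(0, n_cols, block):
        y.extend((v - mean) * rstd for v in load(col_start))
    return y


# Two rows of five columns in one flat buffer; normalize the second row.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0]
old = layer_norm_row_mutating(x, row_start=5, n_cols=5, block=2)
new = layer_norm_row_offsets(x, row_start=5, n_cols=5, block=2)
assert all(abs(a - b) < 1e-12 for a, b in zip(old, new))
```

In the second version every iteration's addresses depend only on `row_start` and `col_start`, so no state has to be carried or reset between loops. That property is what the summary above refers to.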
## Testing Done
<img width="1810" height="448" alt="image"
src="https://github.com/user-attachments/assets/958b2c21-2f1c-42a2-af56-49a22e5c1b83"
/>
- Hardware Type: Ascend NPU 910B4
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

1 parent 41d4bcf commit 2ce9bdd
1 file changed: 11 additions, 13 deletions

| Old lines | New lines | Change |
|---|---|---|
| | 11 | 1 line added |
| 127-132 | 128-134 | 6 lines removed, 7 lines added |
| 134 | | 1 line removed |
| 163-168 | | 6 lines removed |
| | 252-253 | 2 lines added |
| | 301 | 1 line added |