Commit 8da9324
[Mosaic GPU] Fuse slicing into s4 -> bf16 upcasts
This allows us to significantly simplify the generated PTX/SASS,
which is currently cluttered with LLVM trying to align slices to
start at bit 0 and failing to CSE the right shifts.
PiperOrigin-RevId: 7379678901 parent 7a459f0 commit 8da9324
File tree
4 files changed
+78
-18
lines changed- jaxlib/mosaic/gpu
- jax/experimental/mosaic/gpu
4 files changed
+78
-18
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1373 | 1373 | | |
1374 | 1374 | | |
1375 | 1375 | | |
1376 | | - | |
1377 | | - | |
1378 | | - | |
1379 | | - | |
1380 | | - | |
1381 | | - | |
1382 | | - | |
1383 | | - | |
1384 | | - | |
| 1376 | + | |
| 1377 | + | |
| 1378 | + | |
| 1379 | + | |
| 1380 | + | |
| 1381 | + | |
| 1382 | + | |
| 1383 | + | |
| 1384 | + | |
| 1385 | + | |
| 1386 | + | |
| 1387 | + | |
| 1388 | + | |
| 1389 | + | |
| 1390 | + | |
| 1391 | + | |
| 1392 | + | |
| 1393 | + | |
| 1394 | + | |
| 1395 | + | |
| 1396 | + | |
| 1397 | + | |
| 1398 | + | |
| 1399 | + | |
1385 | 1400 | | |
1386 | 1401 | | |
1387 | 1402 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
346 | 346 | | |
347 | 347 | | |
348 | 348 | | |
349 | | - | |
| 349 | + | |
350 | 350 | | |
351 | 351 | | |
352 | 352 | | |
| |||
1237 | 1237 | | |
1238 | 1238 | | |
1239 | 1239 | | |
1240 | | - | |
1241 | 1240 | | |
1242 | 1241 | | |
1243 | | - | |
| 1242 | + | |
1244 | 1243 | | |
1245 | | - | |
1246 | | - | |
1247 | | - | |
1248 | | - | |
1249 | | - | |
1250 | | - | |
| 1244 | + | |
| 1245 | + | |
| 1246 | + | |
| 1247 | + | |
| 1248 | + | |
1251 | 1249 | | |
1252 | 1250 | | |
1253 | 1251 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
65 | 65 | | |
66 | 66 | | |
67 | 67 | | |
| 68 | + | |
68 | 69 | | |
69 | 70 | | |
70 | 71 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| 17 | + | |
17 | 18 | | |
18 | 19 | | |
19 | 20 | | |
| |||
23 | 24 | | |
24 | 25 | | |
25 | 26 | | |
| 27 | + | |
26 | 28 | | |
27 | 29 | | |
28 | 30 | | |
| |||
36 | 38 | | |
37 | 39 | | |
38 | 40 | | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
39 | 84 | | |
40 | 85 | | |
41 | 86 | | |
| |||
58 | 103 | | |
59 | 104 | | |
60 | 105 | | |
| 106 | + | |
61 | 107 | | |
62 | 108 | | |
63 | 109 | | |
| |||
0 commit comments