 connect: "[Link to zoom](https://princeton.zoom.us/j/92811687505?pwd=Zk13akpuUlowbjhqK0t1TDVPbS9Ydz09)"
 label: caas_01Feb2024
 agenda:
-  - title: "Exocompilation for productive programming of hardware accelerators"
+  - title: "User-Schedulable Languages: Exo and Beyond"
     speaker:
       image: "https://media.licdn.com/dms/image/C5603AQHzxnclCX9tEw/profile-displayphoto-shrink_800_800/0/1597949617547?e=1712188800&v=beta&t=-vVNhjYV5MvUcua07Go3u6nPi43NVmPwnaIyULx8IOI"
       name: "Yuka Ikarashi"
...
         She previously worked at Apple, Amazon, and CERN. She received
         Masason Foundation Fellowship and Funai Foundation Fellowship awards.
     description: |
-      High-performance kernel libraries are critical to exploiting
-      accelerators and specialized instructions in many applications. Because
-      compilers are difficult to extend to support diverse and
-      rapidly-evolving hardware targets, and automatic optimization is often
-      insufficient to guarantee state-of-the-art performance, these libraries
-      are commonly still coded and optimized by hand, at great expense, in
-      low-level C and assembly. To better support development of
-      high-performance libraries for specialized hardware, we propose a new
-      programming language, Exo, based on the principle of exocompilation:
-      externalizing target-specific code generation support and optimization
-      policies to user-level code. Exo allows custom hardware instructions,
-      specialized memories, and accelerator configuration state to be defined
-      in user libraries. It builds on the idea of user scheduling to
-      externalize hardware mapping and optimization decisions. Schedules are
-      defined as composable rewrites within the language, and we develop a
-      set of effect analyses which guarantee program equivalence and memory
-      safety through these transformations. We show that Exo enables rapid
-      development of state-of-the-art matrix-matrix multiply and convolutional
-      neural network kernels, for both an embedded neural accelerator and x86
-      with AVX-512 extensions, in a few dozen lines of code each.
+      Single-core performance has long been saturated, and it is critical to
+      exploit the peak performance of heterogeneous accelerators and
+      specialized instructions in many applications. Because compilers are
+      difficult to extend to support diverse and rapidly evolving hardware
+      targets, and automatic optimization is often insufficient to guarantee
+      state-of-the-art performance, high-performance libraries are commonly
+      still coded and optimized by hand, at great expense, in low-level C and
+      assembly. User-schedulable languages (USLs) have been proposed to
+      address this challenge by decoupling algorithms from scheduling. In
+      this talk, I will focus on one such USL, Exo, which is based on the
+      principle of exocompilation: externalizing target-specific code
+      generation support and optimization policies to user-level code. Exo
+      allows custom hardware instructions, specialized memories, and
+      accelerator configuration state to be defined in user libraries.
+      I will also talk about other projects that borrow ideas from USLs,
+      and lessons we learned from the industry adoption of Exo.
   - title: Update
     speaker: Vassil Vassilev
   - title: Next meeting