CS 217: Hardware Accelerators for Machine Learning, Stanford University

This course explores the design, programming, and performance of modern AI accelerators. It covers architectural techniques, dataflow, tensor processing, memory hierarchies, compilation for accelerators, and emerging trends in AI computing, as well as modern AI/ML algorithms such as convolutional neural networks and Transformer-based models/LLMs. We consider both training and inference for these models and discuss how parameters such as batch size, precision, sparsity, and compression affect model accuracy. Students will become familiar with hardware techniques that exploit parallelism, locality, and low precision to implement the core computational kernels used in ML, and will develop the intuition needed to make system-level trade-offs when designing energy-efficient accelerators. Students will apply these concepts by implementing block-level designs in C++/SystemC, synthesizing them via high-level synthesis (HLS), and then instrumenting and evaluating the resulting system on cloud FPGAs.

Prerequisites: CS 149 or EE 180. CS 229 is ideal but not required.
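To give a concrete taste of that flow, below is a minimal sketch (not from the course materials) of the kind of block-level design this entails: a fixed-point multiply-accumulate (MAC) unit, the inner kernel of ML matrix multiplies, written in SystemC in a form an HLS tool could take as input. The module name, port widths, and reset behavior are illustrative assumptions, not the course's actual assignment interface.

```cpp
// Illustrative sketch only: a fixed-point multiply-accumulate (MAC)
// unit in SystemC. Port widths and reset behavior are assumptions,
// not the course's actual assignment interface.
#include <systemc.h>

SC_MODULE(Mac) {
    sc_in<bool>         clk;   // clock
    sc_in<bool>         rst;   // synchronous reset clears the accumulator
    sc_in<sc_int<8> >   a, b;  // low-precision (8-bit) operands
    sc_out<sc_int<32> > acc;   // running sum of products

    sc_int<32> acc_reg;        // internal accumulator register

    void step() {
        if (rst.read()) {
            acc_reg = 0;
        } else {
            // One MAC per clock cycle; widen to int before multiplying
            // so the 16-bit product is not truncated.
            acc_reg += (int)a.read() * (int)b.read();
        }
        acc.write(acc_reg);
    }

    SC_CTOR(Mac) {
        acc_reg = 0;           // known state before reset is applied
        SC_METHOD(step);
        sensitive << clk.pos();
    }
};
```

Designs like this are where the course themes meet: the 8-bit operands reflect low precision, the single-cycle MAC is the unit of parallelism that gets replicated into arrays, and the accumulator register is the innermost level of the memory hierarchy.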