The course involves 8 programming lab assignments of steadily increasing complexity. Every assignment involves programming a massively parallel GPU system in CUDA, a popular commercial extension of C/C++ for GPU programming. The labs cover tasks such as matrix multiplication, convolution, reduction, histogram calculation, and sparse matrix-vector multiplication. During the final third of the semester, students work on a larger, more complex project/competition.
Lab0: Device Query
Lab1: Vector Addition
Lab2: Simple Matrix Multiplication
Lab3: Tiled Matrix Multiplication
Lab4: 3D Convolution
Lab5: List Reduction
Lab6: Scan
Lab7: Histogram
Lab8: Sparse Matrix-Vector Multiplication
Project1: Simple Convolution
Project2: Unrolled Matrix Optimization
Project3: CUDA Kernel Optimizations
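
To give a feel for the kind of code the early labs involve, here is a minimal sketch of a vector addition in CUDA, in the spirit of Lab1. It is not the official lab skeleton: the names vecAdd and runVecAdd, the block size of 256, and the test data are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread adds one pair of elements. The bounds check matters because
// the grid is rounded up and may contain more threads than elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: runs the kernel on n elements and returns true
// if every result matches the sum computed on the host.
bool runVecAdd(int n) {
    std::vector<float> hA(n), hB(n), hC(n);
    for (int i = 0; i < n; ++i) {
        hA[i] = static_cast<float>(i);
        hB[i] = static_cast<float>(2 * i);
    }

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    if (cudaMalloc(&dA, bytes) != cudaSuccess ||
        cudaMalloc(&dB, bytes) != cudaSuccess ||
        cudaMalloc(&dC, bytes) != cudaSuccess) {
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        return false;
    }

    // Copy inputs from host memory to device memory.
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    // Assumed block size of 256 threads; round the grid size up so that
    // all n elements are covered.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(dA, dB, dC, n);

    // Copy the result back; this call also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        if (hC[i] != hA[i] + hB[i]) return false;
    }
    return true;
}

int main() {
    bool ok = runVecAdd(1 << 10);
    std::printf("%s\n", ok ? "vector addition matches host result"
                           : "vector addition failed");
    return ok ? 0 : 1;
}
```

The same pattern of allocating device memory, copying inputs, launching a grid of threads, and copying results back recurs in the later labs, which mainly change what each thread computes and how threads cooperate.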