Course Description
This course is a graduate-level seminar in computer architecture with special topics in hardware acceleration. It surveys the landscape of hardware acceleration, from historical context to recent trends in system design, spanning a collection of architectural techniques (e.g. stream processing, dataflow architecture, parallelism applied to acceleration) and a variety of application domains (e.g. GPU, ML, database, graph, genomics). It also covers the taxonomy of accelerators, the hardware-software co-design of accelerators, and the deployment of accelerators using the AWS cloud.
The goal of the course is to provide an essential background in architectural concepts that are applicable to accelerator designs, survey the latest trends in accelerated systems, and introduce a simple example hardware design flow, from conceptualizing an accelerator architecture and specifying it in Chisel to deploying it on an AWS EC2 F2 instance (FPGAs-in-the-cloud).
Students are expected to read the assigned papers (about 3 papers a week), provide written reviews/discussion questions for the papers, participate in the paper discussions in class, prepare three lectures on three papers of choice throughout the semester, complete three labs to familiarize themselves with Chisel and the accelerator deployment infrastructure, and complete a course project on a hardware-accelerated system (one to two students per project). The project will be assessed through a project proposal checkpoint, an end-of-semester poster presentation, a CAL (Computer Architecture Letters)-style written report, and a demo of the accelerator deployed on the AWS cloud.
When
Spring 2025 Tuesdays and Thursdays 1:25-2:40pm
Where
208 Hudson Hall
Instructor
Lisa Wu Wills
Office Hour Tuesdays 2:45-4pm @ D304 LSRC (please schedule a 15-min appointment using my Calendly link), or email me to make an appointment outside of my OH
Teaching Assistant
Chris Kjellqvist
Office Hour Thursdays and Fridays 3-4pm @ D309 LSRC
Prerequisite
Computer Architecture (e.g. CS/ECE 250 or CS 550/ECE 552) and Digital Logic Design (e.g. CS/ECE 350 or ECE 550) or consent of instructor
Resources
CS/ECE 557 Spring 2025 Canvas Website is a supplement to the main course website for posting lecture slides, paper/project presentations, lab assignments, Ed, and your gradebook.
CS/ECE 557 (this course used to be CS/ECE 590) Past Class Projects website contains brief overviews and results of previous accelerated systems completed by students in this course to serve as an inspiration and example for your projects.
Sign Up Sheets
Paper Reading Presentations and Summaries
Chisel Related Resources
Course Syllabus
The course schedule is tentative and subject to change. Email Professor Wills if you have paper suggestions.
-
Jan 9 · 14
Historical Evolution of Hardware Accelerators
-
Jan 16 · 21
Why Accelerators
-
Jan 23 · 28
Accelerator Taxonomy and Integration
-
Jan 30 · Feb 4
Accelerated System Performance
-
Feb 6
Stream Processing
● 2.6 The Imagine Stream Processor
● (Optional) Programmable Stream Processor
● 2.6 Merrimac: Supercomputing with Streams
● (Optional) Executing Irregular Scientific Applications on Stream Architectures
-
Feb 11
Parallelism and Efficiency
-
Feb 13
Vector Processing and GPU
-
Feb 18
Project Proposal Checkpoint
-
Feb 20 · Feb 25
Dataflow Architecture
-
Feb 27 · Mar 4 · Mar 6
Domain Specific Accelerators: Machine Learning
● 2.27 DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning
● 2.27 Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks
● 3.4 In-Datacenter Performance Analysis of a Tensor Processing Unit
● 3.4 Ten Lessons From Three Generations Shaped Google’s TPUv4i
● 3.6 A Configurable Cloud-Scale DNN Processor for Real-Time AI
● 3.6 A3: Accelerating Attention Mechanisms in Neural Networks with Approximation
Mar 18
Domain Specific Accelerators: Natural Language Processing
Mar 20 · 25
Domain Specific Accelerators: Database and Graph Analytics, Robot Motion Planning, and Molecular Dynamics Simulation
● 3.20 Q100: The Architecture and Design of a Database Processing Unit
● (Optional) The Mondrian Data Engine
● 3.20 Graphicionado: A High-Performance and Energy-Efficient Accelerator for Graph Analytics
● 3.25 The Microarchitecture of a Real-Time Robot Motion Planning Accelerator
● 3.27 Anton, a Special-Purpose Machine for Molecular Dynamics Simulation
Mar 27 · Apr 1
Domain Specific Languages and Compilers, Reconfigurable Architectures
Apr 8 · 10
Misc: Computing in Space, Accelerating Genomics Analytics, Synthesis Prediction
Apr 10
Course Summary
Apr 15
Poster Session and Demo