8 changes: 4 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
@@ -27,7 +27,7 @@ Sessions will be conducted by **graduate students and faculty** from the **RRC L
[![Detailed Schedule + Topics](https://img.shields.io/badge/View%20Detailed%20Schedule%20%2B%20Topics-Google%20Sheets-34A853?logo=google-sheets&logoColor=white&style=flat-square)](https://docs.google.com/spreadsheets/d/1qjU-zWitD6S8JJlbWS90PVDoHJdfmojjqB4BuxkT4w8/edit?usp=sharing)


| # | Date | Topic | Presenter(s) | Lecture Notes | Assignments |
| # | Date | Topic | Presenter(s) | Lecture Notes | Assignments/Demos |
|----|--------------|-----------------------------------|------------------------------------|---------------|-------------|
| 1 | May 17, 2025 | Introduction | Prof. Madhava Krishna | -- | -- |
| 2 | May 19, 2025 | Linear Algebra & Probability | Vishal |[Linear Algebra Resources](lectures/02-linear-algebra-probability/README.md) | [Linear Algebra Problem Set](lectures/02-linear-algebra-probability/lec-02-linear-algebra-problems.pdf) |
@@ -50,9 +50,9 @@ Sessions will be conducted by **graduate students and faculty** from the **RRC L
| 19 | Jun 10, 2025 (PM) | Motion Planning - III | Meet | [Motion Planning - III Resources](lectures/19-motion-planning-3/README.md) | [🚀 Collision Cones & Velocity Obstacles Interactive Demo](https://roboticsiiith.github.io/summer-school-2025/demos/lec-19-collision-cones-vo/)|
| 20 | Jun 11, 2025 | ROS - I | Tarun, Soham | [ROS Deployment - I Resources](lectures/20-ros-deployment-1/README.md) | 🎓 Capstone 1/2 <br> Robot Tele-operation <br> [![Start Project](https://img.shields.io/badge/Start-Project-blue?logo=ros&logoColor=white)](lectures/20-ros-deployment-1/README.md#-capstone-project---part-1)|
| 21 | Jun 12, 2025 | ROS - II | Tarun, Soham | [ROS Deployment - II Resources](lectures/21-ros-deployment-2/README.md) | 🎓 Capstone 2/2 <br> Autonomous Navigation <br> [![Launch](https://img.shields.io/badge/Start-Project-blue?logo=ros&logoColor=white)](lectures/21-ros-deployment-2/README.md#-capstone-project---part-2) |
| 22 | Jun 13, 2025 | Reinforcement Learning | Vishal | | |
| 23 | Jun 14, 2025 | Diffusion Models - Basics | Anant | | 🚧 WIP |
| 24 | Jun 14, 2025 | Diffusion Models for Robotics | Jayaram | | |
| 22 | Jun 13, 2025 | Reinforcement Learning | Vishal, Tejas | [Reinforcement Learning Resources](lectures/22-reinforcement-learning/README.md) | [🧠 Policy Gradient & Actor-Critic Colab Walkthrough](https://colab.research.google.com/drive/1TWPHz3udlKqsdSyMvTiZG9Y5P7VrY3gH?usp=sharing) |
| 23 | Jun 14, 2025 | Diffusion Models - Basics | Anant | [Diffusion Models - Basics Resources](lectures/23-diffusion-basics/README.md) | [DDPM & Stable Diffusion Walkthroughs](lectures/23-diffusion-basics/README.md#-assignment) |
| 24 | Jun 14, 2025 | Diffusion Models for Robotics | Jayaram | [Diffusion Models for Robotics Resources](lectures/24-diffusion-robotics/README.md) | [Diffusion Policy for Robot Manipulation Hands-On Colab](https://colab.research.google.com/drive/1gxdkgRVfM55zihY9TFLja97cSVZOZq2B?usp=sharing) <br> [🤗 HF Push Task Demo](https://huggingface.co/lerobot/diffusion_pusht)|

📌 **Note:**
The schedule will be regularly updated with slides, reference materials, and coding assignments as sessions conclude. Stay tuned by clicking on **Watch** for this repository or subscribing to its RSS feed.
3 changes: 3 additions & 0 deletions lectures/06-dynamics-control-2/README.md
@@ -47,4 +47,7 @@ Please raise doubts or engage in discussion on the **`#module-2-dynamics-control
|----------------------------------|----------------------------------------------------------------------------------------|
| Lecture Slides (Sarthak) - Controls - Introduction | [lec-06-controls-introduction.pdf](./lec-06-controls-introduction.pdf) |
| Lecture Slides (Astik) - Controls - PID, LQR | [lec-06-controls-pid-lqr.pdf](./lec-06-controls-pid-lqr.pdf) |
| **Modern Robotics: Mechanics, Planning, and Control** – Kevin M. Lynch & Frank C. Park (Northwestern University) | [![Textbook](https://img.shields.io/badge/Open-Textbook-blue?logo=readthedocs)](https://hades.mech.northwestern.edu/index.php/Modern_Robotics)<br>[![Videos](https://img.shields.io/badge/Watch-Lecture_Videos-red?logo=youtube&logoColor=white)](https://hades.mech.northwestern.edu/index.php/Modern_Robotics_Videos) |


---
2 changes: 1 addition & 1 deletion lectures/15-learning-3d-vision/README.md
@@ -1,6 +1,6 @@
# Lecture 15: Learning for 3D Vision

**Instructor:** Akash Kumbar
**Instructor:** Akash Kumbar ([portfolio](https://akash-kumbar.github.io/))
**Date:** June 4, 2025

---
100 changes: 100 additions & 0 deletions lectures/22-reinforcement-learning/README.md
@@ -0,0 +1,100 @@
# Lecture 22: Reinforcement Learning
**Instructors:** Vishal, Tejas
**Date:** June 13, 2025

## 📖 Topics Covered:

- **1. Why Reinforcement Learning?**
- Why is it hard to generate data for robots with frequently changing morphologies?
- Why are traditional approaches (e.g., explicit physics models, controllers) inefficient for skill learning?
- How does RL (and supervised learning) help bridge this gap?
- What is an example where RL enabled fast adaptation (e.g., quadrupeds using rapid motor adaptation)?

- **2. RL Notation and Terminology**
- What are stochastic processes and the Markovian property?
- What is a Markov Decision Process (MDP), and how is it defined?

- **3. Anatomy of the Reinforcement Learning Pipeline**
- How do we collect samples from the environment using the current policy?
- What does model fitting or sample evaluation involve?
- How is the policy improved based on evaluation?
- How do modern simulators and sim-to-real transfer help overcome sample collection bottlenecks?

- **4. Policy Gradient Methods**
  - **4.1 Goal of RL**
    - What is the objective function $J(\theta)$ in RL?
    - How does the formulation differ in finite vs. infinite horizon settings?
    - Why is the goal to maximize expected return?
  - **4.2 Policy Gradient**
    - How do we compute the gradient of the objective function?
    - What is the REINFORCE trick and algorithm?

- **5. Reducing Variance in REINFORCE**
- Why does REINFORCE have high variance despite being unbiased?
- How does the reward-to-go trick exploit causality to reduce variance?
- What are baseline methods for variance reduction?
- How do we choose an optimal baseline to minimize variance?
- What are actor-critic methods, and how do they combine value estimation with policy updates?

- **6. Value-Based Methods**
- Value function and Q-function
- What are SARSA and Q-learning?
- How does Deep Q-Learning extend traditional Q-learning?
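The Q-learning update from the value-based methods above fits in a few lines. Below is a toy sketch on a hypothetical 4-state chain MDP (the environment, hyperparameters, and reward structure are illustrative assumptions, not material from the session); only the update rule itself is the standard one.

```python
import numpy as np

# Toy chain MDP (an assumption for illustration): states 0..3,
# action 0 moves left, action 1 moves right, reward 1 for reaching
# the terminal state 3.
n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    r = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, r, s_next == n_states - 1

for episode in range(200):
    s, done = 0, False
    while not done:
        # epsilon-greedy exploration
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next, r, done = step(s, a)
        # Q-learning: bootstrap from the greedy next-state value (off-policy)
        target = r + (0.0 if done else gamma * np.max(Q[s_next]))
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next

print(np.argmax(Q[:n_states - 1], axis=1))  # greedy policy in non-terminal states
```

Note the contrast with SARSA, which would bootstrap from the action actually taken in `s_next` rather than the greedy one.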


## 📄 Assignment

- 🧠 **Policy Gradient & Actor-Critic Walkthrough:**
Open the following Colab notebook to implement and experiment with Policy Gradient methods from scratch:
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1TWPHz3udlKqsdSyMvTiZG9Y5P7VrY3gH?usp=sharing)

This walkthrough is designed to help you implement a working **Policy Gradient agent** using PyTorch on environments like *CartPole*.

---

**📚 What You'll Learn**
- Core ideas behind Policy Gradient algorithms
- How to implement and train a neural network policy
- How to collect rollouts and compute returns
- Policy updates using gradient ascent
- (Optional) Baseline methods & Generalized Advantage Estimation (GAE)

**🛠 Prerequisites**
- Python + PyTorch basics
- Key RL concepts: Policy, Reward, Return, Advantage, Value Function

**🗂 Notebook Structure**
- **Environment Setup**: Logging and configuration
- **Policy Network**: Implementation and sampling
- **Training Loop**: Computing returns and updating the policy
- **Variance Reduction (Optional)**: Baselines, GAE for stability
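The "computing returns" step in the training loop typically uses the reward-to-go trick described in Section 5. A minimal sketch (NumPy here for brevity; the Colab itself uses PyTorch and its exact code may differ):

```python
import numpy as np

# Reward-to-go: weight each timestep by the discounted return from
# that point onward, not the full-episode return. This exploits
# causality (actions cannot affect past rewards) to reduce variance.
def rewards_to_go(rewards, gamma=0.99):
    rtg = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):  # accumulate backwards
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

print(rewards_to_go([1.0, 0.0, 2.0], gamma=0.5))  # [1.5, 1.0, 2.0]
```

Subtracting a baseline (e.g., a learned value function) from these returns gives the advantage estimates used by actor-critic methods.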

**👨‍🏫 Tips for Students**
- Run cells in order — don’t skip!
- Print out observations, actions, rewards to debug.
- Try different hyperparameters and Gym environments.
- Use TensorBoard or video logs to visualize progress.

> 📘 Inspired by [CS285: Deep RL (Berkeley)](https://rail.eecs.berkeley.edu/deeprlcourse/)

_Courtesy: Tejas_

📢 Do post doubts on the `#module-7-robot-learning` Slack channel!

## 🔗 Resources

| 📚 Topic | 🔗 Link |
|----------|---------|
| Lecture Slides – Reinforcement Learning | See Lectures 4–7 of the RAIL course (linked below) |
| 🎓 Deep Reinforcement Learning – Sergey Levine (RAIL, Berkeley) | [![Website](https://img.shields.io/badge/Open-Course-blue?logo=googleclassroom)](https://rail.eecs.berkeley.edu/deeprlcourse/) |
| 🧠 Policy Gradient Algorithms – Lilian Weng | [![Blog](https://img.shields.io/badge/Read-Blog-orange?logo=readthedocs)](https://lilianweng.github.io/posts/2018-04-08-policy-gradient/) |
| ⚙️ PPO Implementation Details – ICLR Blog Track | [![Blog](https://img.shields.io/badge/Read-PPO_Insights-orange?logo=readthedocs)](https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/) |
| 📘 Mathematical Foundations of RL – Shiyu Zhao (Westlake University) | [![GitHub](https://img.shields.io/badge/View-on_GitHub-181717?logo=github)](https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning) |
| ⚡ RL Quickstart Guide – Joseph Suarez (Pufferlib Creator) | [![X/Twitter](https://img.shields.io/badge/View-Quickstart_Guide-1DA1F2?logo=x)](https://x.com/jsuarez5341/status/1854855861295849793) |
| 📦 Stable Baselines3 – RL Library (DLR-RM) | [![GitHub](https://img.shields.io/badge/View-Stable--Baselines3-181717?logo=github)](https://github.com/DLR-RM/stable-baselines3) |
| 🧼 CleanRL – Minimal RL Implementations | [![GitHub](https://img.shields.io/badge/View-CleanRL-181717?logo=github)](https://github.com/vwxyzjn/cleanrl) |
| 🐉 Decisions & Dragons – FAQs About RL | [![Website](https://img.shields.io/badge/Explore-Decisions_&_Dragons-blueviolet?logo=readthedocs)](https://www.decisionsanddragons.com/) |

---
59 changes: 59 additions & 0 deletions lectures/23-diffusion-basics/README.md
@@ -0,0 +1,59 @@
# Lecture 23: Diffusion Models Basics
**Instructor:** Anant Garg
**Date:** June 14, 2025

## Topics Covered:

- Noise, Gaussians + Setup
- Autoencoders, VAE, Reparameterization Trick
- The Forward Process: Adding Noise Step-by-Step
- The Reverse Process: Learning to Denoise
- DDPM: Predicting Noise to Reconstruct Data
- Guidance: Making Diffusion Outputs Useful
- Classifier-Based
- Classifier-Free
- Score Matching
- Latent Diffusion
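The forward process above has a closed form: rather than adding noise step by step, $x_t$ can be sampled directly from $x_0$ as $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$, where $\bar\alpha_t$ is the cumulative product of $(1-\beta_t)$. A minimal sketch (the linear beta schedule below is the one used in DDPM, but treat the exact constants as an assumption):

```python
import numpy as np

# Closed-form forward noising from the DDPM setup:
# x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def q_sample(x0, t, rng):
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = np.ones(8)
# Early timesteps barely perturb x0; by t = T-1 the sample is
# essentially pure Gaussian noise (alpha_bar has decayed to ~0).
print(np.sqrt(alpha_bar[0]), np.sqrt(alpha_bar[-1]))
```

This is exactly the quantity whose added noise the DDPM network is trained to predict.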

## 📄 Assignment

- 🎨 **Diffusion Models – DDPM & Stable Diffusion Walkthroughs:**
Clone and run through the following two PyTorch implementations to understand the fundamentals of diffusion models:

- 📦 **DDPM (Denoising Diffusion Probabilistic Models):**
[explainingai-code/DDPM-PyTorch](https://github.com/explainingai-code/DDPM-Pytorch)

This repo walks through the original DDPM algorithm in PyTorch. Run the code, visualize the forward and reverse diffusion process, and study how noise schedules influence generation.

🔁 **Extension Task – Implement DDIM:**
Read the [DDIM paper (arXiv:2010.02502)](https://arxiv.org/abs/2010.02502) and extend the code to include deterministic sampling via DDIM.
Suggested steps:
- Modify the sampling loop to use DDIM's non-Markovian formulation
- Add support for fewer inference steps (fast sampling)
- Compare image quality vs sampling speed with DDPM

- 🎨 **Stable Diffusion (from scratch):**
[explainingai-code/StableDiffusion-PyTorch](https://github.com/explainingai-code/StableDiffusion-PyTorch)

This repo walks through a simplified but faithful re-implementation of Stable Diffusion.
Explore how text prompts are encoded, how the UNet denoiser operates, and how the latent diffusion process differs from vanilla DDPM.

💡 Feel free to experiment with prompts, noise schedules, and decoder resolutions! Post your findings and doubts on the `#module-7-robot-learning` Slack channel.
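For the DDIM extension task, the deterministic ($\eta = 0$) update is a good starting point: predict $x_0$ from the current sample and the model's noise estimate, then jump directly to an earlier timestep without injecting fresh noise. A sketch, assuming `alpha_bar` comes from the repo's existing noise schedule (variable names in the repo will differ):

```python
import numpy as np

# DDIM deterministic step (eta = 0, Eq. 12 of arXiv:2010.02502):
# 1) invert the forward process to estimate x0 from x_t and eps_pred,
# 2) re-noise that estimate to the earlier timestep t_prev.
def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred

# Sanity check with a *perfect* noise prediction: jumping straight to
# alpha_bar_prev = 1 (i.e., t = 0) recovers the true x0 exactly.
rng = np.random.default_rng(0)
x0, eps = rng.standard_normal(4), rng.standard_normal(4)
a_t = 0.3
x_t = np.sqrt(a_t) * x0 + np.sqrt(1.0 - a_t) * eps
print(np.allclose(ddim_step(x_t, eps, a_t, 1.0), x0))  # True
```

Because consecutive steps no longer need to be adjacent, you can sample over a short sub-sequence of timesteps, which is where DDIM's speedup over DDPM comes from.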

## 🔗 Resources

| 📚 Topic | 🔗 Link |
|----------|--------|
| 📑 Lecture Slides – Diffusion Basics | [![PDF](https://img.shields.io/badge/Open-Slides-red?logo=adobeacrobatreader&logoColor=white)](./lec-23-diffusion-basics.pdf) |
| 🧠 From Autoencoder to Beta-VAE – Lilian Weng | [![Blog](https://img.shields.io/badge/Read-Blog-orange?logo=readthedocs)](https://lilianweng.github.io/posts/2018-08-12-vae/) |
| 🌫️ What Are Diffusion Models? – Lilian Weng | [![Blog](https://img.shields.io/badge/Read-Blog-orange?logo=readthedocs)](https://lilianweng.github.io/posts/2021-07-11-diffusion-models/) |
| 📄 DDPM – Denoising Diffusion Probabilistic Models (Ho et al.) | [![PDF](https://img.shields.io/badge/Open-Paper-blue?logo=readthedocs)](https://hojonathanho.github.io/diffusion/) |
| 🧬 Latent Diffusion Models – High-Res Image Synthesis | [![arXiv](https://img.shields.io/badge/arXiv-2112.10752-b31b1b?logo=arxiv)](https://arxiv.org/pdf/2112.10752) |
| 🎥 Explaining Diffusion – YouTube Playlist | [![YouTube](https://img.shields.io/badge/Watch-Playlist-red?logo=youtube&logoColor=white)](https://www.youtube.com/playlist?list=PL8VDJoEXIjpo2S7X-1YKZnbHyLGyESDCe) |
| 🧪 Stable Diffusion (from scratch) – PyTorch Codebase | [![GitHub](https://img.shields.io/badge/View-Code-181717?logo=github)](https://github.com/explainingai-code/StableDiffusion-PyTorch) |
| 🌊 Introduction to Flow Matching & Diffusion Models – MIT 6.S184 (Generative AI with SDEs) | [![Website](https://img.shields.io/badge/Open-Course-blue?logo=mit&logoColor=white)](https://diffusion.csail.mit.edu/) |
| 🎥 Diffusion Models – Paper Explanation & Math | [![YouTube](https://img.shields.io/badge/Watch-Video-red?logo=youtube&logoColor=white)](https://www.youtube.com/watch?v=HoKDTa5jHvg) |
| 🎓 CS 198-126: Lecture 12 – Diffusion Models (ML@Berkeley) | [![YouTube](https://img.shields.io/badge/Watch-Lecture-red?logo=youtube&logoColor=white)](https://www.youtube.com/watch?v=687zEGODmHA&t=23s) |


---
Binary file not shown.
52 changes: 52 additions & 0 deletions lectures/24-diffusion-robotics/README.md
@@ -0,0 +1,52 @@
# Lecture 24: Diffusion Models for Robotics
**Instructor:** Jayaram Reddy
**Date:** June 14, 2025

## Topics Covered:
- Why Diffusion for Control
- Diffusion Policies
- Diffusion for Motion Planning (EDMP)
- Diffusion for World-Modeling
- Tradeoffs: Autoregressive vs. Diffusion Models
- Latent Diffusion
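At inference time, a diffusion policy generates an action sequence by iteratively denoising pure noise conditioned on the current observation, then executes only a prefix of it before replanning (receding horizon). The schematic sketch below uses a placeholder noise predictor so it runs standalone; in the actual Diffusion Policy work the predictor is a trained conditional UNet/transformer, and the schedule constants here are assumptions:

```python
import numpy as np

# Illustrative noise schedule (assumed values, not from the lecture code)
T = 50
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def sample_actions(obs, eps_model, horizon=16, act_dim=2, rng=None):
    rng = rng or np.random.default_rng()
    a = rng.standard_normal((horizon, act_dim))   # start from pure noise
    for t in reversed(range(T)):                  # DDPM-style reverse process
        eps = eps_model(a, obs, t)                # observation-conditioned prediction
        mean = (a - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(a.shape) if t > 0 else 0.0
        a = mean + np.sqrt(betas[t]) * noise
    return a

dummy_eps = lambda a, obs, t: np.zeros_like(a)    # stand-in for a trained model
plan = sample_actions(obs=np.zeros(4), eps_model=dummy_eps,
                      rng=np.random.default_rng(0))
executed = plan[:8]   # execute the first few actions, then replan
print(plan.shape, executed.shape)
```

Generating a whole action chunk at once, rather than one action per step, is one of the tradeoffs versus autoregressive policies discussed above.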

## 📄 Assignments

- 🤖 **Diffusion Policy for Robot Manipulation:**
Explore how diffusion models can be applied to learn robotic manipulation behaviors, such as pushing, directly from demonstrations.

- 📓 **Official Colab Notebook:**
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1gxdkgRVfM55zihY9TFLja97cSVZOZq2B?usp=sharing)

- 🤗 **Hugging Face Playground – Push Task Demo:**
[![Open on Hugging Face](https://img.shields.io/badge/Launch-HF_Notebook-blueviolet?logo=huggingface&logoColor=white)](https://huggingface.co/lerobot/diffusion_pusht)

**What to Do:**
- Run the official notebook to understand the structure of the Diffusion Policy model and how it leverages conditional generation for trajectory prediction.
- Try out the push environment on Hugging Face to see the learned policy in action.
- Reflect on how diffusion-based imitation compares with classical behavioral cloning.
- (Optional) Try swapping out the dataset or varying inference steps to observe differences in performance.

> 🧠 This exercise builds intuition for how generative models can drive robotic agents with flexibility and generalization.

📢 Please feel free to post all questions over at the `#module-7-robot-learning` Slack channel.

## 🔗 Resources

| 📚 Topic | 🔗 Link |
|----------|--------|
| 📑 Lecture Slides – Diffusion for Robotics | [![PDF](https://img.shields.io/badge/Open-PDF-red?logo=adobeacrobatreader&logoColor=white)](./lec-24-diffusion-robotics.pdf) [![Slides](https://img.shields.io/badge/Open-Google_Slides-yellow?logo=googleslides&logoColor=white)](https://docs.google.com/presentation/d/1YjRIxj32OXhiaPgXihWKW40aPGhIcgBFr4CPkJLY19Q/edit?usp=sharing) |
| 🤖 Diffusion Policy – Columbia University | [![Website](https://img.shields.io/badge/Open-Project-blue?logo=googlechrome)](https://diffusion-policy.cs.columbia.edu/) |
| 🧠 Imitating Human Behavior with Diffusion Models (2023) | [![arXiv](https://img.shields.io/badge/arXiv-2301.10677-b31b1b?logo=arxiv)](https://arxiv.org/abs/2301.10677) |
| 📐 Geometry of Diffusion Models for Robotics – Sander Dieleman | [![Blog](https://img.shields.io/badge/Read-Blog-orange?logo=readthedocs)](https://sander.ai/2023/08/28/geometry.html) |
| 🧩 Ensemble of Costs for Diffusion Planning | [![Website](https://img.shields.io/badge/Open-Project-blue?logo=googlechrome)](https://ensemble-of-costs-diffusion.github.io/) |
| 💎 DIAMOND – Diffusion for World Modeling | [![Website](https://img.shields.io/badge/Open-Project-blue?logo=googlechrome)](https://diamond-wm.github.io/) |
| 🚗 Imagine2Drive – Open Vocabulary Driving Skills | [![Website](https://img.shields.io/badge/Open-Project-blue?logo=googlechrome)](https://anantagrg.github.io/Imagine-2-Drive.github.io/) |
| 🧞 Genie – Generative Interactive Environments (Google DeepMind) | [![Website](https://img.shields.io/badge/Open-Project-blue?logo=googlechrome)](https://sites.google.com/view/genie-2024/home) |
| 🌌 DreamGen – Scene-Level Robot Imagination (NVIDIA) | [![Website](https://img.shields.io/badge/Open-Project-blue?logo=googlechrome)](https://research.nvidia.com/labs/gear/dreamgen/) |
| 🌀 Diffusion Forcing – Next-Token Prediction Meets Full-Sequence Diffusion | [![Website](https://img.shields.io/badge/Open-Project-blue?logo=googlechrome)](https://boyuan.space/diffusion-forcing/) |
| 🔄 Flow Matching – Machine Learning Group, University of Cambridge | [![Blog](https://img.shields.io/badge/Read-Blog-orange?logo=readthedocs)](https://mlg.eng.cam.ac.uk/blog/2024/01/20/flow-matching.html) |



---
Binary file not shown.