🤖 MLX8-W3-Transformers

Week 3: Advanced Transformer Architectures & Implementation

Building state-of-the-art transformer models from scratch with modern MLOps practices

📺 Educational Resources

🎥 Neural Networks Fundamentals - 3Blue1Brown Series

🧠 Neural Networks Fundamentals Playlist by 3Blue1Brown

Click the image above to watch the complete series 📹

📚 What you'll learn from this series:

🎯 Core Concepts

Neural network basics
Gradient descent intuition
Backpropagation explained
Mathematical foundations

🔬 Visual Understanding

Interactive visualizations
Mathematical animations
Intuitive explanations
Beautiful graphics

🚀 Foundation for Transformers

Building blocks of deep learning
Optimization principles
Network architecture design
Mathematical rigor

🎬 Series Breakdown:

Episode	Topic	Duration	Key Concepts
1	But what is a neural network?	19 min	Neurons, layers, MNIST
2	Gradient descent, how neural networks learn	21 min	Cost functions, optimization
3	What is backpropagation really doing?	14 min	Chain rule, derivatives
4	Backpropagation calculus	10 min	Mathematical details

🎓 Course Content - Advanced Transformers

🎥 Week 3 Main Lecture: From Neural Networks to Transformers

📋 Advanced Transformer Implementation Workshop

📋 What you'll learn in this video:

🔧 Transformer Architecture Deep Dive: Understanding attention mechanisms, positional encoding, and layer normalization
🚀 Implementation from Scratch: Building transformers with PyTorch, including multi-head attention and feed-forward networks
📊 Training Strategies: Advanced techniques for training large transformer models efficiently
🎯 Fine-tuning & Transfer Learning: Adapting pre-trained models for specific tasks
🛠️ MLOps Integration: Using modern tools like UV for dependency management and reproducible environments
📈 Performance Optimization: Memory management, gradient checkpointing, and distributed training
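
To make the first two topics in the list above concrete, here is a minimal, dependency-free sketch of scaled dot-product attention and sinusoidal positional encoding. It is not the workshop's PyTorch code: the function names (`softmax`, `scaled_dot_product_attention`, `sinusoidal_positional_encoding`) and the list-of-lists representation are assumptions chosen for illustration, and a real implementation would use batched tensors.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values, causal=False):
    """Single-head attention on plain lists (illustrative, not optimized).

    queries: list of vectors, each of length d_k
    keys:    list of vectors, each of length d_k
    values:  list of vectors, one per key
    Returns (outputs, weights), where weights[i][j] is how much
    query i attends to key j.
    """
    d_k = len(keys[0])
    scale = 1.0 / math.sqrt(d_k)
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Scaled dot product between this query and every key.
        scores = [scale * sum(qc * kc for qc, kc in zip(q, k)) for k in keys]
        if causal:
            # Mask out positions after i so a token cannot see the future.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


def sinusoidal_positional_encoding(seq_len, d_model):
    """Fixed sine/cosine position encoding: sin on even dims, cos on odd dims."""
    encoding = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        encoding.append(row)
    return encoding


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens, causal=True)
    for row in weights:
        print([round(w, 3) for w in row])
```

With `causal=True`, each row of `weights` sums to 1 and is zero above the diagonal, which is the masking pattern used in decoder-style transformers. Multi-head attention runs several such heads in parallel on learned projections of the input and concatenates the results.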

🚀 Quick Start

Prerequisites

Python 3.12+ (3.13 for GPU environments)
CUDA 12.6+ (for GPU training)
UV Package Manager
Recommended: Watch the 3Blue1Brown series first! 🎥
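
One way to confirm these prerequisites before going further is a small check script. This is a sketch under assumptions: `check_prerequisites` is a hypothetical helper that is not part of this repository, and it only reports whether PyTorch can see a GPU, not which CUDA toolkit version is installed.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_python=(3, 12)):
    """Return a dict describing whether the local machine meets the prerequisites."""
    report = {
        # Interpreter version, compared as (major, minor).
        "python_ok": sys.version_info[:2] >= min_python,
        # True if the `uv` executable can be found on PATH.
        "uv_on_path": shutil.which("uv") is not None,
        # True if PyTorch is importable in the current environment.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Stays None when PyTorch is not installed.
        "cuda_available": None,
    }
    if report["torch_installed"]:
        import torch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```

Saved under a name of your choice, the script runs with the system interpreter before setup, or with `uv run python` inside the synced environment to check the PyTorch and CUDA entries.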

🏃‍♂️ Get Running in 60 Seconds

# 1. Clone the repository
git clone https://github.com/your-username/MLX8-W3-Transformers.git
cd MLX8-W3-Transformers

# 2. Install UV (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 3. Setup environment (auto-detects your platform)
uv sync

# 4. Run your first transformer!
uv run python examples/basic_transformer.py

🖥️ Platform-Specific Setup

🪟 Windows 11 Development

uv python pin 3.12
uv sync --extra dev
uv run python examples/cpu_training.py

🍎 macOS (Intel & Apple Silicon)

echo "3.12" > .python-version
uv sync --extra dev
uv run python examples/cpu_training.py

🐧 Ubuntu 22.04 + CUDA 12.6

echo "3.13" > .python-version
export UV_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu121"
uv sync --extra gpu-dev
uv run python examples/gpu_training.py

🐧 Ubuntu 24.04 + CUDA 12.8

echo "3.13" > .python-version
export UV_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu128"
uv sync --extra gpu-dev
uv run python examples/gpu_training.py
