
DataTalksClub/machine-learning-zoomcamp


Machine Learning Zoomcamp: A Free 4-Month Course on ML Engineering


Course platform with deadlines and submission forms for homework assignments and projects • Course Channel on Slack (#course-ml-zoomcamp) • Telegram Announcements • Course Playlist • FAQ • Tweet about the Course

Learn machine learning engineering end-to-end, from core models to deploying real applications.

Build regression and classification models in Python, work with key algorithms like linear/logistic regression, decision trees, and deep learning, and then take them to production using Docker, FastAPI, Kubernetes, and AWS Lambda.


About ML Zoomcamp

Machine Learning Zoomcamp teaches you machine learning engineering end to end, covering the entire pipeline from building models with Python to deploying them in production environments.

ML Zoomcamp course overview showing progression from ML algorithms (Python, NumPy, Pandas, Scikit-learn) to deployment (Docker, FastAPI, Kubernetes)

You’ll master the key ML algorithms like linear regression, logistic regression, decision trees, and deep learning with TensorFlow and PyTorch, then learn to containerize with Docker, build APIs with FastAPI, and scale with Kubernetes and AWS Lambda.

Prerequisites

You'll need:

  • Prior programming experience (at least 1 year)
  • Comfort with command line basics

You don't need any prior experience with machine learning. We'll start from the basics.

Technical setup: For machine learning modules, you only need a laptop with an internet connection. For deep learning sections, we'll use cloud resources for more intensive computations.

How to Join

You can join ML Zoomcamp either by following a live cohort or learning at your own pace.

All materials are freely available in this repository. Each module has its own folder (e.g., 01-intro, 03-classification), and cohort-specific homework and deadlines are in the cohorts directory. Lectures are pre-recorded and available in this YouTube playlist.

flowchart TD
    A["Want to learn Machine Learning Zoomcamp"] --> B{"Do you want<br/>deadlines & a certificate?"}

    B -->|Yes| C["Join Live Cohort"]
    B -->|No / Not sure| D["Self-Paced Learning"]

    C --> C1["Fixed schedule (Sept–Dec)"]
    C --> C2["Scored homework + leaderboard"]
    C --> C3["2 projects + peer review"]
    C --> C4["Eligible for certificate"]

    D --> D1["Start anytime, go at your own pace"]
    D --> D2["Unscored homework & optional projects"]
    D --> D3["No certificate"]

Option 1: Self-Paced Learning

Start anytime. You get full access to materials and community support on Slack.

Complete homework assignments: homework and solutions are available on the course platform. Build a project for your portfolio.

Under self-paced learning, homework isn't scored, your project isn't peer-reviewed, and you can't earn a certificate.

Option 2: Live Cohort

2025 Cohort: Starts September 15. To register, fill in the registration form.

Runs once per year (September–December).

Includes:

  • Updated homework
  • Automatic homework scoring and a leaderboard
  • Project peer review
  • Eligibility for a certificate after meeting all requirements

Even if you join after the official start date, you can still follow along, but note that some homework forms may already be closed. All active deadlines are listed on the course platform.

To earn a certificate, you'll need enough time to complete two projects and the required peer reviews. Details are in the Projects and Certificate sections.

Comparison

We summarized the key differences between the two joining options in this table:

| Feature | Self-Paced | Live Cohort |
|---|---|---|
| Timing | Learn at your own pace, start anytime | Fixed 4-month schedule (September–December each year) |
| Course Materials | Full access to GitHub repository and YouTube lectures | Full access to GitHub repository and YouTube lectures |
| Community | Access to Slack community (#course-ml-zoomcamp) | Access to Slack community (#course-ml-zoomcamp) |
| Homework | Available but not scored | Scored automatically, appears on leaderboard |
| Projects | Build on your own, no evaluation | Submit 2 projects (midterm + capstone OR two capstones) with peer review |
| Certificate | Not available | Available after completing projects and peer reviews |
| Structure | Flexible, no deadlines | Weekly rhythm with deadlines and peer accountability |

Ready to start? Join the 2025 cohort or start with Module 1

Syllabus

| Module | Description | Topics |
|---|---|---|
| Module 1: Introduction to Machine Learning | Learn the fundamentals: what ML is, when to use it, and how to approach ML problems using the CRISP-DM framework. | ML vs rule-based systems; Supervised learning basics; CRISP-DM methodology; Model selection concepts; Environment setup |
| Module 2: Machine Learning for Regression | Build a car price prediction model while learning linear regression, feature engineering, and regularization. | Linear regression (from scratch and with scikit-learn); Exploratory data analysis; Feature engineering; Regularization techniques; Model validation |
| Module 3: Machine Learning for Classification | Create a customer churn prediction system using logistic regression and learn about feature selection. | Logistic regression; Feature importance and selection; Categorical variable encoding; Model interpretation |
| Module 4: Evaluation Metrics for Classification | Learn how to properly evaluate classification models and handle imbalanced datasets. | Accuracy, precision, recall, F1-score; ROC curves and AUC; Cross-validation; Confusion matrices; Class imbalance handling |
| Module 5: Deploying Machine Learning Models | Turn your models into web services and deploy them with Docker and cloud platforms. | Model serialization with Pickle; FastAPI web services; Docker containerization; Cloud deployment |
| Module 6: Decision Trees & Ensemble Learning | Learn tree-based models and ensemble methods for better predictions. | Decision trees; Random Forest; Gradient boosting (XGBoost); Hyperparameter tuning; Feature importance |
| Midterm Project | | |
| Module 8: Neural Networks & Deep Learning | Introduction to neural networks using TensorFlow and Keras, including CNNs and transfer learning. | Neural network fundamentals; PyTorch; TensorFlow & Keras; Convolutional Neural Networks; Transfer learning; Model optimization |
| Module 9: Serverless Deep Learning | Deploy deep learning models using serverless technologies like AWS Lambda. | Serverless concepts; Deploying Scikit-Learn models with AWS Lambda; Deploying TensorFlow and PyTorch models with AWS Lambda; API Gateway |
| Module 10: Kubernetes & TensorFlow Serving | Learn to serve ML models at scale using Kubernetes and TensorFlow Serving. | Kubernetes basics; TensorFlow Serving; Model deployment and scaling; Load balancing |
| Capstone project 1 | | |
| Capstone project 2 | | |
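To give a feel for the Module 4 topics (confusion matrices, accuracy, precision, recall, F1-score, and class imbalance), here is a minimal sketch in plain Python. It is not taken from the course materials: the function names and the toy labels are assumptions made for illustration, and the course itself works with libraries such as scikit-learn.

```python
def confusion_matrix(y_true, y_pred):
    """Count outcomes for binary labels (1 = positive, 0 = negative)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            counts["tp"] += 1
        elif predicted == 1 and actual == 0:
            counts["fp"] += 1
        elif predicted == 0 and actual == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(cm):
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


def precision(cm):
    # Share of predicted positives that are truly positive.
    # Returns 0.0 when nothing was predicted positive (a convention, not a law).
    predicted_positive = cm["tp"] + cm["fp"]
    return cm["tp"] / predicted_positive if predicted_positive else 0.0


def recall(cm):
    # Share of actual positives that the model found.
    actual_positive = cm["tp"] + cm["fn"]
    return cm["tp"] / actual_positive if actual_positive else 0.0


def f1_score(cm):
    # Harmonic mean of precision and recall.
    p, r = precision(cm), recall(cm)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Toy imbalanced dataset (hypothetical): 8 negatives, 2 positives.
# A model that always predicts "negative" still looks accurate.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
cm = confusion_matrix(y_true, y_pred)
print(accuracy(cm))  # 0.8
print(recall(cm))    # 0.0
```

The always-negative model scores 80% accuracy while finding none of the positive cases, which is why accuracy alone can mislead on imbalanced data.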

Projects

flowchart LR
    Idea["Choose problem & dataset"]
    EDA["Explore & clean data"]
    Model["Train & validate model"]
    Deploy["Expose model via FastAPI"]
    Docker["Containerize with Docker"]
    Cloud["Deploy to cloud / Kubernetes / Lambda"]
    Share["Document & share project"]

    Idea --> EDA --> Model --> Deploy --> Docker --> Cloud --> Share

Choose a problem that interests you, find a suitable dataset, develop your model, and deploy it as a web service.
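As a rough illustration of the "train, serialize, serve" part of this workflow, the sketch below uses only the Python standard library. Everything in it is an assumption made for illustration: the `train` function returns made-up parameters instead of fitting a model, and `handle_request` stands in for the route handler that a web framework such as FastAPI would wrap in a real project.

```python
import pickle


def train():
    """Return hypothetical 'trained' parameters; a real project fits these from data."""
    return {"weights": {"age": 0.5, "mileage": -0.25}, "bias": 10.0}


def predict(model, features):
    """Apply a linear model to a dict of numeric features (missing ones count as 0)."""
    score = model["bias"]
    for name, weight in model["weights"].items():
        score += weight * features.get(name, 0.0)
    return score


# Training side: serialize the model so the serving side can load it later.
# In practice the bytes would be written to a file shipped inside the Docker image.
model_bytes = pickle.dumps(train())

# Serving side: load the model once at startup, not on every request.
# Only unpickle data you trust, since pickle can execute code while loading.
MODEL = pickle.loads(model_bytes)


def handle_request(payload):
    """Stand-in for a web route: JSON-like dict in, JSON-like dict out."""
    return {"prediction": predict(MODEL, payload)}


print(handle_request({"age": 4, "mileage": 8}))  # {'prediction': 10.0}
```

Keeping the prediction logic in a plain function makes it easy to test before adding the web layer and the container around it.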

There will be 3 projects:

  1. Midterm Project after Module 6: Decision Trees & Ensemble Learning
  2. Capstone project 1 at the end of the course, after Module 10: Kubernetes & TensorFlow Serving
  3. Capstone project 2 at the end of the course, after Module 10: Kubernetes & TensorFlow Serving

These projects allow you to apply everything you've learned and make a great addition to your GitHub profile and portfolio.

Project Examples from Past Cohorts


A local deployment architecture using Kubernetes with Kind from one of the students' projects

Some of the course projects from past cohorts:

  • Blood cell classifier for cancer prediction: an end-to-end tool that segments and classifies blood cells from microscope images to assist in detecting signs of acute lymphoblastic leukemia (ALL)
  • Waste classifier: an Xception-based image classifier on ~15,000 waste images, reaching 93.3% test accuracy, and serving predictions via a Flask API packaged in Docker

Certificate

Machine Learning Zoomcamp certificate of completion awarded after successfully completing projects and peer reviews


To receive a certificate, you'll need to meet three requirements:

  1. Complete two projects: Submit either a midterm project and a capstone project, OR two capstone projects
  2. Submit on time: Meet the project submission deadlines to qualify for certification
  3. Peer review: Evaluate and provide feedback on 3 fellow students' projects during the peer review process

Testimonials

Machine Learning Zoomcamp was exhaustive, with very comprehensive content that covered concepts in depth. You can learn everything from the simplest concepts to preparing and deploying an ML model for production. Additionally, the entire community behind this course is highly participative and collaborative. I would like to thank Alexey Grigorev for all the knowledge he shared with us and his team for providing the support we needed to solve each problem we faced.

Machine Learning Zoomcamp has been an incredible journey, thanks to the expert guidance of Alexey Grigorev. Hugely grateful to Alexey, Timur, and the entire DataTalksClub team for this course, and to my cohort batchmates for the invaluable support that enriched my learning experience. I’m thankful for this programme, which provided challenging coursework that is taught in a very structured and lucid way. The timely assignments & hands-on projects instill the sense of timely delivery, besides equipping us with practical acumen to solve real-life problems.

Balancing the intensive Machine Learning Zoomcamp with my other engagements was no easy task, but the experience deepened my expertise in machine learning engineering, reinforced my passion for ML deployment and cloud technologies, and strengthened my resilience in handling real-world ML challenges. Thank you, Alexey Grigorev, for this course!

Highly recommend the ML Zoomcamp for anyone wanting a structured path to production-ready machine learning. A big thank you to Alexey Grigorev and to the team at DataTalksClub for providing such a well-structured and engaging course.

A huge thank you to Alexey Grigorev for creating such an amazing course, and making it free! It’s truly inspiring.

Huge thanks to Alexey Grigorev and the DataTalksClub community for the incredible support and clarity throughout. The open-source spirit and collaborative notes made the learning experience even richer.


Community & Getting Help

Where to Get Help

Community Guidelines

Learning in Public

We encourage sharing your progress! Write blog posts, create videos, post on social media with #mlzoomcamp. It helps you learn better and builds your professional network.

Bonus: You can earn extra points for sharing your learning experience publicly.

Learn more: Learning in Public

Sponsors

Interested in sponsoring? Contact [email protected].

About DataTalks.Club


DataTalks.Club is a global online community of data enthusiasts. It's a place to discuss data, learn, share knowledge, ask and answer questions, and support each other.

Website • Join Slack Community • Newsletter • Upcoming Events • YouTube • GitHub • LinkedIn • Twitter

Most of the activity at DataTalks.Club happens on Slack. We post updates there and discuss different aspects of data, career questions, and more.