DataTalksClub
diff --git a/_podcast/s03e04-interviewing-300-data-scientists.md b/_podcast/s03e04-interviewing-300-data-scientists.md
Lines changed: 606 additions & 637 deletions
diff --git a/_podcast/s03e07-market-yourself.md b/_podcast/s03e07-market-yourself.md
Lines changed: 559 additions & 593 deletions
diff --git a/_podcast/s04e08-freelancing.md b/_podcast/s04e08-freelancing.md
Lines changed: 153 additions & 160 deletions
diff --git a/_podcast/s05e02-data-engineering-acronyms.md b/_podcast/s05e02-data-engineering-acronyms.md
Lines changed: 596 additions & 616 deletions
diff --git a/_podcast/s06e01-solopreneur.md b/_podcast/s06e01-solopreneur.md
Lines changed: 686 additions & 725 deletions
diff --git a/_podcast/s06e02-non-technical-interviews.md b/_podcast/s06e02-non-technical-interviews.md
Lines changed: 807 additions & 854 deletions
diff --git a/_podcast/s07e05-machine-learning-system-design-interview.md b/_podcast/s07e05-machine-learning-system-design-interview.md
Lines changed: 40 additions & 20 deletions
@@ -2,8 +2,17 @@
 episode: 5
 guests:
 - valeriybabushkin
-intro: "In this episode, Valerii Babushkin—then Head of Data Science at Blockchain.com and Kaggle Grandmaster—breaks down how to approach machine learning system design at scale. He shares insights from building ML systems at Meta, Alibaba, and Yandex, explaining how to move beyond algorithms to focus on end-to-end design, feature engineering, and evaluation. Valerii walks through a real-world fraud detection example, discusses how to structure interview answers, and outlines the core principles from his book Machine Learning System Design. You’ll learn how to think like a senior ML engineer and design robust, production-ready systems."
-description: "Master ML system design interviews with Valerii Babushkin, ex-Meta Head of Data Science. Learn fraud detection systems, feature engineering, metrics selection, and production ML best practices for FAANG interviews."
+intro: In this episode, Valerii Babushkin—then Head of Data Science at Blockchain.com
+  and Kaggle Grandmaster—breaks down how to approach machine learning system design
+  at scale. He shares insights from building ML systems at Meta, Alibaba, and Yandex,
+  explaining how to move beyond algorithms to focus on end-to-end design, feature
+  engineering, and evaluation. Valerii walks through a real-world fraud detection
+  example, discusses how to structure interview answers, and outlines the core principles
+  from his book Machine Learning System Design. You’ll learn how to think like a senior
+  ML engineer and design robust, production-ready systems.
+description: Master ML system design interviews with Valerii Babushkin, ex-Meta Head
+  of Data Science. Learn fraud detection systems, feature engineering, metrics selection,
+  and production ML best practices for FAANG interviews.
 topics:
 - machine learning
 - career growth
@@ -18,8 +27,10 @@ links:
   youtube: https://www.youtube.com/watch?v=0RsmRjar66E
 season: 7
 short: Machine Learning System Design Interview
-title: "Machine Learning System Design & Interview Strategies for Senior ML Engineers"
+title: Machine Learning System Design & Interview Strategies for Senior ML Engineers
 transcript:
+- header: Podcast Introduction & Episode Overview
+- header: 'Valerii Background: Career Snapshot and Kaggle Achievements'
 - line: This week, we'll talk about machine learning system design interviews. We
     have a special guest today, Valerii. Valerii works at Blockchain.com as a head
     of data science. Before that, he worked in quite a few places. More recently at
@@ -47,15 +58,14 @@ transcript:
   sec: 182
   time: '3:02'
   who: Valerii
-- header: "Guest Introduction: Head of Data Science & Kaggle Grandmaster"
 - line: '[laughs] Okay, so I briefly already told everyone about your background.
     But before we go into our main topic of machine learning system design, maybe
     let''s talk a bit more about your career journey in detail. Can you tell us a
     bit about that?'
   sec: 186
   time: '3:06'
   who: Alexey
-- header: "Blockchain.com History & Broad Head of Data Science Role"
+- header: 'Blockchain.com Role: Scope, Responsibilities, and Data Ownership'
 - line: Well, sure. Let's start from the current time. As you said, I'm head of data
     science at Blockchain. So a bit about blockchain, first. It's a very old crypto
     company. When I say very old – it is very, very old. It was founded in 2011. Try
@@ -94,7 +104,7 @@ transcript:
   sec: 342
   time: '5:42'
   who: Alexey
-- header: "Career Shift: Retail to Facebook Privacy & Large-Scale Systems"
+- header: 'Transition to Meta: User Privacy Work and Large-Scale ML Experience'
 - line: To some extent, yes, because it's everything related to data – from infrastructure
     to applications. From analytics to visualization. Before that, I was working in
     – well, I joined Facebook and left Meta. I will just rotate my screen a bit –
@@ -123,6 +133,7 @@ transcript:
   sec: 450
   time: '7:30'
   who: Alexey
+- header: 'Hiring Experience: Conducting High-Volume Interviews and Team Leadership'
 - line: 'Live interview? Okay. I don''t think it''s about Blockchain’s mission. That''s
     it. What else? I was leading quite a big team in my time – the biggest team I
     was leading was almost 150 people: machine learning engineers, data analysts,
@@ -163,7 +174,7 @@ transcript:
   sec: 547
   time: '9:07'
   who: Valerii
-- header: "ML System Design: Target Audience (Level 5 Senior MEs)"
+- header: 'Candidate Targeting: Who Faces ML System Design Interviews'
 - line: Okay. Let's talk about machine learning system design. This is a part of the
     interview process and you said you did a lot of interviews as the interviewer.
     I imagine also, when you were joining Facebook before that, you also had to take
@@ -205,6 +216,7 @@ transcript:
   sec: 680
   time: '11:20'
   who: Alexey
+- header: 'Interview Structure: 45-Minute Narrative and Evaluation Goals'
 - line: Yeah, true. Good catch. Yes, level five is a Senior in terms of the level
     on Facebook, which means that, if you're on this level, it is an honorary thing
     to be on this level forever. So if you ended on level four, it was probably because
@@ -241,7 +253,6 @@ transcript:
   sec: 798
   time: '13:18'
   who: Valerii
-- header: "System Design vs. ML Design & Focusing on Machine Learning"
 - line: 'I think this is what happened to me, but this is something that I prepared
     for later. So, you said that important interviews for detecting, or assessing
     your level are: behavioral interview, system design interview, and machine learning
@@ -250,6 +261,9 @@ transcript:
   sec: 816
   time: '13:36'
   who: Alexey
+- header: 'Contrast: Software System Design Versus ML System Design'
+- header: 'Fraud Detection Case Study: Probabilities, Loss Functions, and Real-Time
+    Needs'
 - line: Okay, let's try to determine the disparity between those two. First of all,
     when you're asked to do a system design interview, you're usually asked about
     data structures, about different server-side components, like “What are the databases?
@@ -271,7 +285,6 @@ transcript:
   sec: 838
   time: '13:58'
   who: Valerii
-- header: "Fraud Detection Walkthrough: Loss Functions, Metrics, Modeling"
 - line: 'Now we can say that we know that we have to put not zero or one, but some
     score between zero and one, when we have a transaction. When we have a transaction
     now, that probably means we''d like to have the system in real time. Okay, let''s
@@ -295,6 +308,7 @@ transcript:
   sec: 838
   time: '13:58'
   who: Valerii
+- header: Labeling, Class Imbalance, and Feature Engineering Tradeoffs
 - line: Fortunately, the very basic log loss is good here. So we know that we might
     start from log loss. We also know that we might start from a very basic linear
     regression model. Why is that? Because we know that it has to be very fast – in
@@ -360,6 +374,7 @@ transcript:
   sec: 1003
   time: '16:43'
   who: Valerii
+- header: 'Interview Tactics: Stating Assumptions and Getting Alignment'
 - line: That's quite a lot of information. I was trying to process this. That's quite
     a lot of things. So this was an example of machine learning system design. The
     interview starts and then the person – the interviewer – asks you, "Let's design
@@ -368,7 +383,6 @@ transcript:
   sec: 1233
   time: '20:33'
   who: Alexey
-- header: "Interview Strategy: Making Assumptions & System vs. ML Design Examples"
 - line: The best way is not even to ask, but to say "My assumption is that. Do you
     agree with that or not?” You see, you asked the question, but actually, you’ve
     made an assumption. You say “Are you okay with that?” Because you've been given
@@ -381,6 +395,7 @@ transcript:
   sec: 1270
   time: '21:10'
   who: Valerii
+- header: 'Example: Points-of-Interest System vs Personalized Recommender'
 - line: Yeah, indeed. So, the original question I actually asked you is about the
     difference between system design and machine learning system design and I think
     it's very clear what machine learning system design is. It requires some domain
@@ -443,7 +458,7 @@ transcript:
   sec: 1467
   time: '24:27'
   who: Valerii
-- header: "ML System is the Whole Pipeline & Interview Failure: Too Much Heuristics"
+- header: 'End-to-End ML Pipeline: Metrics, Baselines, and A/B Testing'
 - line: But where does system design actually come into the picture here? Because
     here, we talked about selecting the right metric, which was the important thing,
     as you said. You said it was log loss for this specific case. Or even before log
@@ -531,7 +546,6 @@ transcript:
   sec: 1690
   time: '28:10'
   who: Valerii
-- header: "Securing the Interview: Iterative Baseline Design & Technical Depth"
 - line: '[laughs] I might be wrong with using these words. I think the recruiter probably
     used different words. But the reason for me failing the process – the whole interview
     – was machine learning system design. Not the others. I was afraid about the others.
@@ -543,6 +557,7 @@ transcript:
   sec: 1708
   time: '28:28'
   who: Alexey
+- header: 'Securing the Interview: Iterative Baselines and Signposting Depth'
 - line: Let's be honest, the interviewer was a human, and humans are subjective. Maybe
     they had a bad day. However, to some extent, I'm surprised because it's hard to
     say the interview was nodding. Maybe, again, the way you remember it and the way
@@ -586,6 +601,7 @@ transcript:
   sec: 1869
   time: '31:09'
   who: Alexey
+- header: 'Appropriate Depth: Practical ML Decisions vs Research-Level Detail'
 - line: Well, it's an interesting question for which there is no single answer. It
     depends. My opinion is that the interview has to be as close to the real job –
     the real work – as it can be. So, to be honest, in applied machine learning, you
@@ -623,7 +639,7 @@ transcript:
   sec: 1999
   time: '33:19'
   who: Valerii
-- header: "ML System Prep: Experience, Mock Interviews, Dealing with Unknown Domains"
+- header: 'Preparation Strategies: Mock Interviews, Resources, and Experience'
 - line: Okay. [laughs] So, how do I actually prepare for machine learning system design
     interviews? It feels as though just being a practitioner is not enough. Because,
     first, you never know what exactly is expected. I guess you need to ask that.
@@ -709,7 +725,7 @@ transcript:
   sec: 2248
   time: '37:28'
   who: Valerii
-- header: "Tool: ML Project Checklist & Defining Goal, Proxy Metrics, Long-Term Health"
+- header: 'Industry Checklist: Core ML Project Review Items and Patterns'
 - line: Speaking of this mock interview – a while ago, I had a mock interview with
     Valerii, where Valerii interviewed me. The question was about designing a fraud
     detection system.
@@ -755,7 +771,7 @@ transcript:
   sec: 2353
   time: '39:13'
   who: Valerii
-- header: "Post-Goal Steps: Features, Validation, A/B Testing, Monitoring, Fallbacks"
+- header: 'Defining Goals and Proxy Metrics: Business Alignment and Long-Term Health'
 - line: So about this checklist – let's say we need to design a system, not necessarily
     for an interview, but just design a system. What is the first thing we need to
     do? Do you remember what is in this checklist?
@@ -839,6 +855,7 @@ transcript:
   sec: 2641
   time: '44:01'
   who: Alexey
+- header: Features, Labels, Model Selection, and Validation Workflow
 - line: Let's say we know what we would like to do. We know how we can try to optimize
     it in this way. What does that mean? That means that if my model improves, there
     is a high chance that my metric of interest will be better. Now, I need to think
@@ -868,6 +885,7 @@ transcript:
   sec: 2651
   time: '44:11'
   who: Valerii
+- header: 'Production Robustness: Monitoring, Distribution Shift, and Fallbacks'
 - line: Perhaps if you cover all these parts during your system design interview,
     you're already in quite a good position. Right?
   sec: 2762
@@ -914,7 +932,7 @@ transcript:
   sec: 2868
   time: '47:48'
   who: Valerii
-- header: "ML System Components: Algorithms are 1-5% & Features are Paramount"
+- header: 'System Components: Why Features Matter More Than Model Architecture'
 - line: Okay. So let's go to the questions. We have quite a few of them. The first
     question we have is, “What are the typical components of a machine learning system?
     And what percentage of it are machine learning algorithms?”
@@ -968,6 +986,7 @@ transcript:
   sec: 2997
   time: '49:57'
   who: Valerii
+- header: 'Engineering Integration: Serving Models, Embeddings, and MLOps Roles'
 - line: Thank you. Let's go to the next one, “How to make machine learning algorithms
     work with other parts of systems to solve real world problems?” I guess the question
     is more about, “Okay, we have this model that we just discussed. This model for
@@ -1000,7 +1019,7 @@ transcript:
   sec: 3134
   time: '52:14'
   who: Alexey
-- header: "Concept: Avoiding ML & Tool: Machine Learning Design Patterns Book"
+- header: When to Avoid ML and Useful Design Pattern References
 - line: Do we really need machine learning here exactly? Maybe we can be lucky and
     we can just avoid it.
   sec: 3145
@@ -1052,7 +1071,7 @@ transcript:
   sec: 3239
   time: '53:59'
   who: Valerii
-- header: "New Grad Interviews: No System Design & Focus on Coding (LeetCode)"
+- header: 'New Grad Expectations: Coding Focus and Limited System Design'
 - line: Yeah, so another question from Alvaro. Alvaro is graduating soon and he is
     a machine learning intern at a startup. He's starting a job hunt, hopefully [inaudible].
     So how much system design should he expect as a new grad?
@@ -1130,7 +1149,7 @@ transcript:
   sec: 3440
   time: '57:20'
   who: Valerii
-- header: "Validation in Production: A/B Tests, Human Labels, Practitioner Experience"
+- header: 'Validating in Production: A/B Tests, Causality, and Human Labels'
 - line: Okay. I don't think we have a lot of time for more questions. There is an
     interesting question from Vijay, which is about, “What is the best way to validate
     the model performance in production? Do we need humans for that or are there other
@@ -1177,6 +1196,7 @@ transcript:
   sec: 3527
   time: '58:47'
   who: Valerii
+- header: 'Career Path: Moving from Data Science Practice to System Design'
 - line: Yeah, so the question is, “With this profile, you're very good at doing data
     science stuff. How did you transition from data science to being good at system
     design?”
@@ -1203,6 +1223,7 @@ transcript:
   sec: 3583
   time: '59:43'
   who: Valerii
+- header: Closing Remarks and Contact Information
 - line: '[laughs] Okay, I think that''s all we have time for. So maybe last one –
     How can people find you?'
   sec: 3603
@@ -1260,7 +1281,6 @@ transcript:
   time: '1:00:51'
   who: Valerii
 ---
-
 Links:
 
 * [Valerii's telegram channel (in Russian)](https://t.me/cryptovalerii){:target="_blank"}