Commit c0bc3d0

All with intro texts
1 parent 04baec9 commit c0bc3d0

File tree

166 files changed

+2538
-113
lines changed

_podcast/no-timestamps/s01e03-building-ds-team.md

Lines changed: 16 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -14,15 +14,22 @@ links:
1414
anchor: https://anchor.fm/datatalksclub/episodes/Building-a-Data-Science-Team---Dat-Tran-enlmef
1515
spotify: https://open.spotify.com/episode/0daFpY1z2J4Uop1XdMNsnY
1616
apple: https://podcasts.apple.com/us/podcast/building-a-data-science-team-dat-tran/id1541710331?i=1000502061864
17-
intro: In this episode, Dat Tran, Partner and CTO at DATANOMIQ, shares his journey
18-
from economics and gaming to leading AI and data science teams at companies like
19-
idealo and Axel Springer. He discusses how to scale AI from prototype to production,
20-
build strong product cultures, and balance generalists vs. specialists when hiring.
21-
Drawing on his experience founding Priceloop, Dat dives into MLOps in production,
22-
open-source collaboration, explainable AI, and how to retain top talent in competitive
23-
markets. Packed with lessons on leadership, data strategy, and sustainable AI systems,
24-
this episode is a must-listen for data professionals aiming to build real impact
25-
with machine learning.
17+
intro: How do you build an MLOps‑ready data team while shipping a transparent, white‑box
18+
dynamic pricing product for a startup? In this episode Dat Tran—Partner & CTO at
19+
DATANOMIQ, former Head of Data at idealo, and co‑founder of Priceloop—walks through
20+
the practical tradeoffs of moving from prototypes to production ML. <br><br> Dat
21+
traces his path from economics and early coding to production ML at Accenture, Axel
22+
Springer and idealo, and explains the “day‑two” operations mindset required for
23+
model maintenance and MLOps. We cover building a Head of Data role, hiring strategies
24+
for early‑stage startups (T‑shaped generalists first, specialists later), and how
25+
to align hiring with product uncertainty. Dat also outlines Priceloop’s white‑box
26+
AI approach to dynamic pricing—human‑centric systems that augment pricing managers
27+
rather than replace them—and the role of open research and open‑source in competitive
28+
advantage. <br><br> Tune in for concrete guidance on team composition (ML engineers,
29+
data engineers, PMs), take‑home assessments, project prioritization, retention,
30+
and educating leadership on realistic AI capabilities. Listeners will leave with
31+
actionable steps to create production‑grade MLOps teams and build transparent dynamic
32+
pricing solutions.
2633
transcript:
2734
- header: Intro
2835
- line: Today we have pleasure to have Dat as a guest. Dat needs no introduction.

_podcast/no-timestamps/s01e04-standing-out-as-a-data-scientist.md

Lines changed: 14 additions & 0 deletions
@@ -1067,4 +1067,18 @@ transcript:
10671067
who: Alexey
10681068
description: Master data scientist resume, portfolio & interview tactics to get interviews,
10691069
prove business impact and negotiate higher salary with recruiter tips.
1070+
intro: 'How do you get hired — or hire — for a data scientist role when expectations,
1071+
titles, and hiring processes differ so widely? In this episode Luke Whipps, co‑founder
1072+
of Neural.AI and host of the AI Game Changer podcast, draws on 8+ years recruiting
1073+
data, analytics and AI talent to answer that question. We walk through a six‑stage
1074+
recruitment workflow from role definition to offer, and tackle practical hiring
1075+
and job‑seeking topics: writing a data scientist resume and CV (format, length,
1076+
audience fit), building a portfolio that links tech stack to concrete projects,
1077+
and shaping a career narrative that demonstrates real business impact. Luke breaks
1078+
down shortlist and interview preparation, candidate funnel strategies, junior hiring
1079+
tips, targeted outreach (email, LinkedIn) and focus strategies for approaching fewer
1080+
companies. He also covers salary signals and negotiation, transitioning from academia
1081+
or web development, job‑hopping concerns, and how to align job titles without misrepresenting
1082+
experience. Listen to gain actionable interview preparation, portfolio and salary
1083+
negotiation strategies for data science hiring and career progression.'
10701084
---

_podcast/no-timestamps/s01e05-mentoring.md

Lines changed: 16 additions & 0 deletions
@@ -19,6 +19,22 @@ links:
1919
anchor: https://anchor.fm/datatalksclub/episodes/Mentoring---Rahul-Jain-eo7cmu
2020
spotify: TODO
2121
apple: TODO
22+
intro: How do you find a mentor, turn mentoring into paid work, and grow as a technical
23+
leader? In this episode Rahul Jain—Senior Solutions Engineer at Snowflake with 15+
24+
years in data and AI—walks through practical steps for mentorship and leadership
25+
development grounded in his career from mining engineering to data engineering and
26+
management. We define mentoring (purpose, types, sponsorship), explore ways to find
27+
a mentor via networks, cold outreach, and platforms, and share cold outreach best
28+
practices like specificity, background, and follow‑up. Rahul outlines how to prepare
29+
effective mentoring sessions (goals, agendas), compares one‑off advice to long‑term
30+
relationships, and covers benefits of being a mentor including listening and pattern
31+
recognition. Listeners will also learn people‑skills essentials (empathy, avoiding
32+
the “advice monster”), balancing technical work with leadership, addressing common
33+
mentee challenges like imposter syndrome, and when to use external coaches. Practical
34+
guidance on setting boundaries, starting paid mentorship, pricing and accountability,
35+
building reciprocal relationships, and maintaining development plans rounds out
36+
the episode—ideal for engineers and aspiring technical leaders seeking actionable
37+
mentoring and career growth strategies.
2238
---
2339

2440
Today we're discussing mentoring with [Rahul Jain](/people/rahuljain.html), a technical leader with about 20 years of experience building and running software products. He currently leads the Business Intelligence and Data Engineering units at Omio, a ticket-booking company, and mentors engineers and managers through The Mentoring Club.

_podcast/no-timestamps/s02e01-writing.md

Lines changed: 17 additions & 0 deletions
@@ -21,6 +21,23 @@ links:
2121
anchor: https://anchor.fm/datatalksclub/episodes/The-Importance-of-Writing-in-a-Tech-Career---Eugene-Yan-ep17du
2222
spotify: TODO
2323
apple: TODO
24+
intro: How do you publish developer-focused posts weekly without sacrificing depth
25+
or your day job? In this episode Eugene Yan — an Applied Scientist at Amazon who
26+
builds pragmatic ML systems and previously led data science teams at Lazada and
27+
uCare.ai — walks through a practical, outline-first approach to sustainable developer
28+
blogging and building a technical portfolio. <br><br> We cover Eugene’s career pivot
29+
into public writing, motivations for sharing knowledge, and how to target readers,
30+
peers, and future teammates. Listen for his 7-day weekly writing cadence, time-budgeting
31+
advice (including tips to avoid over-editing), and the outline-first method for
32+
filtering ideas and rewriting from memory. He also breaks down idea sourcing, title
33+
and length decisions, getting started tactics, and recommended blogging tools (Medium,
34+
Substack, WordPress, Jekyll/GitHub Pages). You’ll hear routines for morning reps
35+
and weekend deep work, distribution strategies via Twitter and LinkedIn, and how
36+
to translate work artifacts into press-release-style docs, decision logs, and clearer
37+
technical documentation. Plus, actionable portfolio best practices—clear README,
38+
quick-start guide, and repo tours—to make your code and writing discoverable. <br><br>
39+
Tune in to learn a repeatable workflow for weekly developer blogging, technical
40+
writing, and portfolio building that scales with your career.
2441
---
2542

2643
Today we're discussing technical writing, logging, documentation, and more. Our special guest is [Eugene Yan](/people/eugeneyan). Eugene works at the intersection of machine learning and product, building pragmatic ML systems while writing and speaking about effective data science, ML in production, and career growth.

_podcast/no-timestamps/s02e02-developer-advocacy.md

Lines changed: 16 additions & 0 deletions
@@ -951,4 +951,20 @@ transcript:
951951
who: Alexey
952952
description: 'Discover DevRel tactics for Data Science: community growth, reproducibility,
953953
and content strategy—practical metrics, safety practices, and career growth tips.'
954+
intro: How do you practice developer relations for data science while balancing reproducibility,
955+
community growth, and content strategy? In this episode Elle O’Brien — a data scientist
956+
at Iterative (working on DVC and CML) and a lecturer at the University of Michigan
957+
with a PhD in neuroscience and computational modeling from UW — walks through practical
958+
DevRel for data-focused tools and teaching. <br><br> We cover her shift from a viral
959+
StyleGAN project into DevRel, the scope of a solo developer advocate (product work,
960+
docs, PRs, videos, hiring), and how she prioritizes releases versus evergreen content.
961+
Elle shares promotion tactics (Hacker News, Reddit, social), approaches to community
962+
safety and moderation, and the emotional realities of online work. She explains
963+
community metrics, role distinctions between DevRel/advocate/evangelist, and core
964+
skills like technical credibility and rapid learning. We also dig into content strategy
965+
for teaching—curriculum design, reusable video content, recording lectures as open
966+
educational resources, and practical ways to get started blogging and building a
967+
developer portfolio. <br><br> Listen to gain actionable guidance on community growth,
968+
reproducibility best practices, content planning, and the trade-offs of DevRel work
969+
in open source data science.
954970
---

_podcast/no-timestamps/s02e03-open-source.md

Lines changed: 17 additions & 0 deletions
@@ -26,6 +26,23 @@ links:
2626
anchor: https://anchor.fm/datatalksclub/episodes/Getting-Started-with-Open-Source---Vincent-Warmerdam-epk60j
2727
spotify: https://open.spotify.com/episode/1dsbDeVncfsEg3m3cYB927
2828
apple: https://podcasts.apple.com/us/podcast/getting-started-with-open-source-vincent-warmerdam/id1541710331?i=1000507024598
29+
intro: 'How do you start contributing to open source ML projects like scikit-learn
30+
pipelines—or move from curious user to confident contributor on Rasa’s conversational
31+
AI stack? In this episode Vincent Warmerdam, Research Advocate at Rasa and creator
32+
of The Algorithm Whiteboard and calmcode.io, walks through practical, hands-on advice
33+
for contributing to open source ML. <br><br> Vincent shares his career pivot from
34+
design student to data scientist and highlights projects (evol, clumper, memo, whatlies,
35+
scikit-lego) that illustrate small-tools-to-impact workflows. We deep-dive into
36+
scikit-learn–compatible pipeline components, design principles for low-maintenance
37+
APIs, and common mistakes such as publishing to PyPI too early. You’ll get a documentation
38+
checklist (README, guides, API reference, examples), guidance on filing reproducible
39+
issues, and step-by-step preparation for pull requests: testing, CI, packaging,
40+
and pre-commit hooks. <br><br> Listeners will leave with concrete strategies for
41+
finding the right project, balancing large vs. small repositories, community stewardship
42+
and contribution etiquette, and ways OSS work can boost career visibility through
43+
talks, blogs, and meetups. If you want actionable next steps for contributing to
44+
open source ML, scikit-learn pipelines, PRs, docs, or Rasa conversational AI, this
45+
episode maps the path.'
2946
---
3047

3148
Today we're talking open source with our guest, **Vincent Warmerdam**. Vincent is a Research Advocate at Rasa. If you check his LinkedIn, you'll see a lot: he's made Reddit's front page, runs calmcode.io for learning to code, has organized PyData Amsterdam and AI Saturdays Amsterdam, and he's a data evangelist and open-source enthusiast who's created and maintains several open-source packages. And—last but not least—he has over 80 LinkedIn endorsements for "awesomeness." Welcome, Vincent!

_podcast/no-timestamps/s02e04-mlops.md

Lines changed: 15 additions & 0 deletions
@@ -1100,4 +1100,19 @@ transcript:
11001100
who: 'Theo:'
11011101
description: 'Master MLOps with Kubeflow: monitor data drift, automate retraining
11021102
and scale pipelines using KFServing, Katib & Prometheus for production-ready ML.'
1103+
intro: How do you detect model drift, trigger retraining, and scale ML pipelines in
1104+
production? In this episode Theofilos Papapanagiotou — a systems engineer with 20
1105+
years’ experience (mostly in telcos) now building tools to run ML workloads and
1106+
an active Kubeflow advocate — walks through practical MLOps patterns and tooling
1107+
to answer that question. <br><br> We define MLOps as culture, process, and technology,
1108+
contrast DevOps vs MLOps across model lifecycle and data drift, and unpack monitoring
1109+
for drift, fairness, and retraining triggers. Hear about monitoring stacks (Prometheus/Grafana,
1110+
inference sensors), commoditizing inference monitoring, and how monitoring can feed
1111+
new training data. Theofilos explains team composition and the “MLOps engineer”
1112+
debate, maturity models from manual training to automated, data‑driven retraining,
1113+
and traceability via MLMD metadata and model versioning. <br><br> We also explore
1114+
the Kubeflow ecosystem — Pipelines, KFServing, Feast, Katib, and TFX — plus hyperparameter
1115+
search, cloud‑managed pipelines, edge/mobile considerations, and practical tips
1116+
for small teams. Listen to learn concrete approaches to detect model drift, automate
1117+
retraining, and scale pipelines with Kubeflow and related MLOps practices.
11031118
---

_podcast/no-timestamps/s02e05-feature-stores.md

Lines changed: 16 additions & 0 deletions
@@ -27,6 +27,22 @@ links:
2727
anchor: https://anchor.fm/datatalksclub/episodes/Feature-Stores-Cutting-through-the-Hype---Willem-Pienaar-ept6m8/a-a4hlg3r
2828
spotify: https://open.spotify.com/episode/05YnfTWbplXwOwicR2doy3
2929
apple: https://podcasts.apple.com/us/podcast/feature-stores-cutting-through-the-hype-willem-pienaar/id1541710331?i=1000508782957
30+
intro: How do you reliably build and serve real‑time features for production ML without
31+
rework, duplication, or training/serving skew? In this episode Willem Pienaar —
32+
engineering lead at Tecton and creator of Feast — walks through what feature stores
33+
solve in MLOps and how they enable real‑time feature engineering. We define feature
34+
stores, compare feature creation vs retrieval (SQL, Python, APIs, on‑demand transforms),
35+
and illustrate a production real‑time fraud detection lookup. Willem separates hype
36+
from value, explains organizational challenges like team silos and speed to production,
37+
and outlines the platform role across materialization, serving, and validation.
38+
<br><br> You’ll get practical coverage of Feast (open‑source) and Tecton (enterprise),
39+
architecture components (transform engine, storage, serving, registry, monitoring),
40+
and when online tabular use cases require a feature store versus when it’s overkill.
41+
The episode also covers integrations (dbt, Kubeflow, Airflow), streaming vs batch
42+
(Flink, Spark), validation and monitoring (drift detection, Great Expectations,
43+
TFDV), backfilling strategies, ownership and governance, and getting started resources
44+
(feast.dev, Docker). Listen to learn when to adopt a feature store and concrete
45+
next steps for productionizing features in your MLOps stack.
3046
---
3147

3248
In this episode, we dive deeper into feature stores with Willem, creator of Feast (an open-source feature store). Previously, Willem led the Data Science Platform team at Gojek and now works at Tecton, which develops feature store technology.

_podcast/no-timestamps/s02e06-decision-optimization.md

Lines changed: 17 additions & 0 deletions
@@ -18,4 +18,21 @@ links:
1818
description: 'Learn prescriptive analytics & robust optimization for supply chain
1919
pricing: align ML predictions to decisions, scale models, pick solvers, and boost
2020
revenue.'
21+
intro: 'How do you turn machine learning predictions into better real-world decisions—especially
22+
under uncertainty in supply chains and pricing? In this episode Dan Becker, Founder
23+
& CEO of Decision AI and former Google data scientist and Product Director at DataRobot,
24+
walks through prescriptive analytics and decision optimization for practical business
25+
impact. With a background that includes top Kaggle performance and contributions
26+
to TensorFlow and Keras, Dan explains how to formulate optimization problems, choose
27+
objectives and constraints, and integrate ML forecasts into prescriptive and robust
28+
optimization models. <br><br> We cover robust vs. stochastic optimization, aligning
29+
loss functions with business objectives, and the solvers and tools that make this
30+
work—OR-Tools, Gurobi, Pyomo and open-source options. Dan also digs into scalability,
31+
approximation techniques, and deployment: pipelines, monitoring, and feedback loops.
32+
Use cases include supply chain optimization, resource allocation, and pricing/bidding
33+
strategies, plus operational, legal, and ethical constraints. Listeners will get
34+
practical guidance on evaluation metrics, common pitfalls like mis-specified objectives
35+
and overfitting decisions, and the cross-functional skills needed—data science,
36+
operations research, and software engineering—to get started with prescriptive optimization
37+
projects.'
2138
---

_podcast/no-timestamps/s02e07-abc-data-science.md

Lines changed: 15 additions & 0 deletions
@@ -1246,6 +1246,21 @@ transcript:
12461246
description: 'Master the Data Science ABC Framework: Analyst, Builder, Consultant.
12471247
Get SQL, Python, MLOps career tips, project roadmap, transition strategies to land
12481248
roles.'
1249+
intro: 'How do you pick the right data science path—and actually make the transition?
1250+
In this episode Danny Ma, a recovering data scientist now focused on ML and data
1251+
engineering, walks through his ABC Framework (Analyst, Builder, Consultant) and
1252+
pragmatic steps for career moves. Danny, who runs the #DataWithDanny community (4,500+
1253+
members) and specializes in analytics, supervised ML, data architecture and digital
1254+
customer experiments, traces his own shift from SQL/SAS/Excel workflows to Python,
1255+
Kaggle projects and production systems. <br><br> We cover the ABC Framework origins
1256+
and definitions: Type A (Analyst) — data exploration, visualization and storytelling;
1257+
Type B (Builder) — ML engineering, MLOps and production mindset; Type C (Consultant/Leader)
1258+
— stakeholder persuasion and strategy. Danny shares transition tactics: build projects
1259+
first, learn theory as needed, core tools (Git, Docker, cloud), practicing engineering
1260+
via mini-projects and mentorship, portfolio and referral strategies, and when advanced
1261+
degrees matter. Tune in to get concrete guidance on skills to prioritize, how to
1262+
gain production experience, and a clear roadmap from SQL → visualization → ML →
1263+
deep learning to advance your data science career.'
12491264
---
12501265

12511266
Links:
