Commit a922514

Merge pull request #101 from DataTalksClub/data-eng-courses
Data eng courses listicle and structured content fixes
2 parents 035ed64 + 84d68b1 commit a922514

File tree

164 files changed: +4371 −2381 lines


_layouts/author.html

Lines changed: 35 additions & 0 deletions
@@ -81,4 +81,39 @@ <h3>Books</h3>
 </div>

 {% include footer.html %}
+
+{%- assign person_description = page.bio_short | default: page.description | default: content | strip_html | strip_newlines | truncate: 200 -%}
+{%- assign same_as_links = "" -%}
+{%- if page.linkedin -%}
+{%- assign same_as_links = same_as_links | append: '"https://www.linkedin.com/in/' | append: page.linkedin | append: '/"' -%}
+{%- endif -%}
+{%- if page.twitter -%}
+{%- if same_as_links != "" -%}
+{%- assign same_as_links = same_as_links | append: ', ' -%}
+{%- endif -%}
+{%- assign same_as_links = same_as_links | append: '"https://twitter.com/' | append: page.twitter | append: '"' -%}
+{%- endif -%}
+{%- if page.github -%}
+{%- if same_as_links != "" -%}
+{%- assign same_as_links = same_as_links | append: ', ' -%}
+{%- endif -%}
+{%- assign same_as_links = same_as_links | append: '"https://github.com/' | append: page.github | append: '"' -%}
+{%- endif -%}
+{%- if page.web -%}
+{%- if same_as_links != "" -%}
+{%- assign same_as_links = same_as_links | append: ', ' -%}
+{%- endif -%}
+{%- assign same_as_links = same_as_links | append: '"' | append: page.web | append: '"' -%}
+{%- endif -%}
+<script type="application/ld+json">
+{
+"@context": "https://schema.org",
+"@type": "Person",
+"name": {{ page.title | jsonify }}{% if page.picture %},
+"image": "{{ site.url }}/{{ page.picture }}"{% endif %},
+"url": "{{ site.url }}{{ page.url }}",
+"description": {{ person_description | jsonify }}{% if same_as_links != "" %},
+"sameAs": [{{ same_as_links }}]{% endif %}
+}
+</script>
 </body>
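The template builds `same_as_links` as a comma-separated string of pre-quoted URLs and splices it into the JSON-LD `sameAs` array. A Python sketch of that join-and-splice step (the handles and the example.com URL are invented for illustration, not values from this commit) confirms the result parses as valid JSON:

```python
import json

# Hypothetical front-matter values standing in for page.linkedin,
# page.twitter, page.github and page.web; not taken from the repo.
page = {"linkedin": "jane-doe", "twitter": None,
        "github": "jane-doe", "web": "https://example.com"}

# Mirror the Liquid logic: append each double-quoted URL,
# inserting ', ' between entries.
parts = []
if page.get("linkedin"):
    parts.append('"https://www.linkedin.com/in/%s/"' % page["linkedin"])
if page.get("twitter"):
    parts.append('"https://twitter.com/%s"' % page["twitter"])
if page.get("github"):
    parts.append('"https://github.com/%s"' % page["github"])
if page.get("web"):
    parts.append('"%s"' % page["web"])
same_as_links = ", ".join(parts)

# Splice the string into the JSON-LD skeleton, as the <script> block does.
doc = '{"@type": "Person", "sameAs": [%s]}' % same_as_links
person = json.loads(doc)  # parses because every entry carries its own quotes
print(person["sameAs"])
```

This also shows the implicit assumption the Liquid code makes: it stays valid JSON only as long as the handles themselves contain no double quotes, since the values are spliced in rather than JSON-encoded.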

_podcast/ab-testing-and-product-experimentation.md

Lines changed: 21 additions & 12 deletions
@@ -1,6 +1,6 @@
 ---
-title: "Product Analytics & A/B Testing: Causality, Metrics, Power Analysis, A/A Tests"
-short: "A/B Testing"
+title: 'Product Analytics & A/B Testing: Causality, Metrics, Power Analysis, A/A Tests'
+short: A/B Testing
 season: 7
 episode: 6
 guests:
@@ -14,16 +14,30 @@ links:
 apple: https://podcasts.apple.com/us/podcast/a-b-testing-jakob-graff/id1541710331?i=1000552243668
 spotify: https://open.spotify.com/episode/3LhBOO1UANCGbOwkntZt4j
 youtube: https://www.youtube.com/watch?v=0Gqx1LtqRZU
-
-description: "Master product analytics, A/B testing & power analysis: design stable metrics, validate randomization with A/A tests, plan sample size to de-risk features."
-intro: "How do you design product experiments that truly establish causality and avoid costly false conclusions? In this episode, Jakob Graff — Director of Data Science and Data Analytics at diconium, with prior analytics leadership at Inkitt, Babbel, King and a background in econometrics — walks through practical product analytics and A/B testing strategies focused on causality and reliable metrics. <br><br> We cover why randomized experiments mirror clinical trials, how experimentation de-risks features and builds organizational learning, and a concrete case study on subscription vs. points revenue metric design. Jakob explains experimentation platform trade-offs (third-party vs. in-house), traffic splitters, assignment tracking, and why A/A tests validate system trust. You’ll hear best practices for first tests (two-group simplicity), metric selection considering noise and seasonality, and how to plan duration with power analysis and sample-size calculations. The discussion also compares z/t/nonparametric tests, p-value intuition from A/A comparisons, frequentist vs Bayesian perspectives, and multi-armed test considerations. <br><br> Listen to learn practical steps for designing randomized experiments, selecting stable metrics, planning sample sizes, and interpreting results so your product analytics and A/B testing produce actionable, causal insights"
+description: 'Master product analytics, A/B testing & power analysis: design stable
+  metrics, validate randomization with A/A tests, plan sample size to de-risk features.'
+intro: How do you design product experiments that truly establish causality and avoid
+  costly false conclusions? In this episode, Jakob Graff — Director of Data Science
+  and Data Analytics at diconium, with prior analytics leadership at Inkitt, Babbel,
+  King and a background in econometrics — walks through practical product analytics
+  and A/B testing strategies focused on causality and reliable metrics. <br><br> We
+  cover why randomized experiments mirror clinical trials, how experimentation de-risks
+  features and builds organizational learning, and a concrete case study on subscription
+  vs. points revenue metric design. Jakob explains experimentation platform trade-offs
+  (third-party vs. in-house), traffic splitters, assignment tracking, and why A/A
+  tests validate system trust. You’ll hear best practices for first tests (two-group
+  simplicity), metric selection considering noise and seasonality, and how to plan
+  duration with power analysis and sample-size calculations. The discussion also compares
+  z/t/nonparametric tests, p-value intuition from A/A comparisons, frequentist vs
+  Bayesian perspectives, and multi-armed test considerations. <br><br> Listen to learn
+  practical steps for designing randomized experiments, selecting stable metrics,
+  planning sample sizes, and interpreting results so your product analytics and A/B
+  testing produce actionable, causal insights
 topics:
 - data science
 - practices
 dateadded: 2022-02-27
-
 duration: PT01H03M37S
-
 quotableClips:
 - name: Podcast Introduction
   startOffset: 0
@@ -105,11 +119,6 @@ quotableClips:
   startOffset: 3839
   url: https://www.youtube.com/watch?v=0Gqx1LtqRZU&t=3839
   endOffset: 3880
-- name: Episode Wrap-up and Key Takeaways
-  startOffset: 3880
-  url: https://www.youtube.com/watch?v=0Gqx1LtqRZU&t=3880
-  endOffset: 3817
-
 transcript:
 - header: Podcast Introduction
 - header: Guest Background & Career Transition to Data Science
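The front-matter rewrites above replace long double-quoted one-liners with wrapped plain or single-quoted scalars. That is safe because of YAML's line-folding rule for flow scalars: a line break inside the value is read back as a single space, so the parsed string is unchanged. A hand-rolled Python sketch of that folding rule (illustration only, not a real YAML parser):

```python
# YAML folds a line break inside a plain or single-quoted scalar into a
# single space, so a wrapped value parses to the same one-line string.
def fold_scalar(lines):
    """Join wrapped scalar lines the way a YAML reader would."""
    return " ".join(line.strip() for line in lines)

# The wrapped single-quoted value from the diff above, minus the key.
wrapped = [
    "'Master product analytics, A/B testing & power analysis: design stable",
    "  metrics, validate randomization with A/A tests, plan sample size to de-risk features.'",
]
value = fold_scalar(wrapped).strip("'")
print(value)
```

A real parser (e.g. PyYAML's `yaml.safe_load`) applies the same folding, which is why a formatter can rewrap these descriptions freely.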

_podcast/ai-for-ecology-biodiversity-and-conservation.md

Lines changed: 22 additions & 8 deletions
@@ -1,6 +1,7 @@
 ---
-title: "AI for Ecology, Biodiversity, and Conservation: Computer Vision, Remote Sensing and Citizen Science"
-short: "AI for Ecology, Biodiversity, and Conservation"
+title: 'AI for Ecology, Biodiversity, and Conservation: Computer Vision, Remote Sensing
+  and Citizen Science'
+short: AI for Ecology, Biodiversity, and Conservation
 season: 18
 episode: 3
 guests:
@@ -14,8 +15,25 @@ links:
 apple: https://podcasts.apple.com/us/podcast/ai-for-ecology-biodiversity-and-conservation-tanya/id1541710331?i=1000653709956
 spotify: https://open.spotify.com/episode/3Hhz5N8ZDvsOPlPP3wxQxq?si=Oz7y_pBrTfeypfYZXubu-g
 youtube: https://www.youtube.com/watch?v=30tTrozbAkg
-description: "Discover AI-driven computer vision and remote sensing strategies to scale biodiversity monitoring, improve species ID, and inform conservation policy."
-intro: "How can AI help close critical data gaps in biodiversity monitoring and turn images and sensor data into actionable conservation decisions? In this episode Tanya Berger-Wolf, a computational ecologist, director of TDAI@OSU, and co-founder of the Wildbook project (Wild Me), walks through practical applications of AI for ecology, biodiversity monitoring, and conservation. <br><br> We cover core techniques—computer vision, machine learning, and remote sensing—and their use in image-based monitoring with camera traps, drones, and species identification. Tanya explains individual identification and longitudinal tracking, habitat mapping and change detection, and the data challenges of labeling, class imbalance, and sparse observations. The conversation addresses integration of heterogeneous datasets, model robustness (domain shift and transfer learning), and ethical considerations including Indigenous knowledge and equity. You’ll also hear about scalable platforms like Wildbook, citizen science workflows for crowdsourcing and quality control, policy relevance, open data and FAIR principles, edge deployment in the field, and building sustainable monitoring programs. <br><br> Listen to gain concrete insights on tools, pitfalls, and next steps for applying AI to conservation—what works now, what remains hard, and resources to explore further."
+description: Discover AI-driven computer vision and remote sensing strategies to scale
+  biodiversity monitoring, improve species ID, and inform conservation policy.
+intro: How can AI help close critical data gaps in biodiversity monitoring and turn
+  images and sensor data into actionable conservation decisions? In this episode Tanya
+  Berger-Wolf, a computational ecologist, director of TDAI@OSU, and co-founder of
+  the Wildbook project (Wild Me), walks through practical applications of AI for ecology,
+  biodiversity monitoring, and conservation. <br><br> We cover core techniques—computer
+  vision, machine learning, and remote sensing—and their use in image-based monitoring
+  with camera traps, drones, and species identification. Tanya explains individual
+  identification and longitudinal tracking, habitat mapping and change detection,
+  and the data challenges of labeling, class imbalance, and sparse observations. The
+  conversation addresses integration of heterogeneous datasets, model robustness (domain
+  shift and transfer learning), and ethical considerations including Indigenous knowledge
+  and equity. You’ll also hear about scalable platforms like Wildbook, citizen science
+  workflows for crowdsourcing and quality control, policy relevance, open data and
+  FAIR principles, edge deployment in the field, and building sustainable monitoring
+  programs. <br><br> Listen to gain concrete insights on tools, pitfalls, and next
+  steps for applying AI to conservation—what works now, what remains hard, and resources
+  to explore further.
 topics:
 - AI
 - computer vision
@@ -116,10 +134,6 @@ quotableClips:
   startOffset: 3630
   url: https://www.youtube.com/watch?v=30tTrozbAkg&t=3630
   endOffset: 3720
-- name: 'Episode Closing: Key Takeaways and Next Steps'
-  startOffset: 3720
-  url: https://www.youtube.com/watch?v=30tTrozbAkg&t=3720
-  endOffset: 3720
 context: 'Context: The episode frames a biodiversity crisis made harder by fragmented,
   sparse data and limited monitoring capacity, then surveys AI tools (computer vision,
   remote sensing, platforms, citizen science), technical challenges, ethical concerns,

_podcast/ai-ml-product-design-and-experimentation.md

Lines changed: 21 additions & 14 deletions
@@ -1,6 +1,6 @@
 ---
-title: "AI Product Design: Algorithm-Ready UX, Rapid Experiments & Data-Driven Roadmaps"
-short: "Innovation and Design for Machine Learning"
+title: 'AI Product Design: Algorithm-Ready UX, Rapid Experiments & Data-Driven Roadmaps'
+short: Innovation and Design for Machine Learning
 season: 8
 episode: 3
 guests:
@@ -14,19 +14,32 @@ links:
 apple: https://podcasts.apple.com/us/podcast/innovation-and-design-for-machine-learning-liesbeth/id1541710331?i=1000556693861
 spotify: https://open.spotify.com/episode/4vhTQJ6Aj9z5VHm9UsHspv
 youtube: https://www.youtube.com/watch?v=tcqBfZw41FM
-
-description: "Master AI product design: build algorithm-ready UX, run rapid experiments and craft data-driven roadmaps to prioritize innovation and ship measurable results."
-intro: "How do you design products that are “algorithm-ready” while running rapid experiments and building data-driven roadmaps? In this episode, Liesbeth Dingemans—strategy and AI leader, founder of Dingemans Consulting, former VP of Revenue at Source.ag and Head of AI Strategy at Prosus—walks through pragmatic approaches to AI product design that bridge vision and execution. <br><br> We cover algorithm-friendly UX and signal collection, a concrete interaction-design case study comparing TikTok and Instagram signals, and the Double Diamond framework for moving from problem framing to solution exploration. Liesbeth explains scoping and prioritization, parallel experiments and proofs of concept, one-week design sprints, appropriate timeframes for research-to-scale, and the role of designers, data scientists, engineers and product managers in shaping AI roadmaps. <br><br> Listeners will learn how to avoid rework by involving data science early, use scoping documents to challenge assumptions, create measurable experiments (the Task Force/“Jet Ski” model), and build data-driven pitches for long-term bets versus quarterly OKRs. Tune in for concrete frameworks and practices to make AI product design, rapid experiments, and data-driven roadmaps work in your organization"
+description: 'Master AI product design: build algorithm-ready UX, run rapid experiments
+  and craft data-driven roadmaps to prioritize innovation and ship measurable results.'
+intro: How do you design products that are “algorithm-ready” while running rapid experiments
+  and building data-driven roadmaps? In this episode, Liesbeth Dingemans—strategy
+  and AI leader, founder of Dingemans Consulting, former VP of Revenue at Source.ag
+  and Head of AI Strategy at Prosus—walks through pragmatic approaches to AI product
+  design that bridge vision and execution. <br><br> We cover algorithm-friendly UX
+  and signal collection, a concrete interaction-design case study comparing TikTok
+  and Instagram signals, and the Double Diamond framework for moving from problem
+  framing to solution exploration. Liesbeth explains scoping and prioritization, parallel
+  experiments and proofs of concept, one-week design sprints, appropriate timeframes
+  for research-to-scale, and the role of designers, data scientists, engineers and
+  product managers in shaping AI roadmaps. <br><br> Listeners will learn how to avoid
+  rework by involving data science early, use scoping documents to challenge assumptions,
+  create measurable experiments (the Task Force/“Jet Ski” model), and build data-driven
+  pitches for long-term bets versus quarterly OKRs. Tune in for concrete frameworks
+  and practices to make AI product design, rapid experiments, and data-driven roadmaps
+  work in your organization
 topics:
 - machine learning
 - design thinking
 - strategy
 - ai
 - practices
 dateadded: 2022-04-10
-
 duration: PT00H59M14S
-
 quotableClips:
 - name: Episode Introduction & Guest Overview
   startOffset: 0
@@ -132,11 +145,6 @@ quotableClips:
   startOffset: 3500
   url: https://www.youtube.com/watch?v=tcqBfZw41FM&t=3500
   endOffset: 3605
-- name: Closing Notes, Resources and Contact Links
-  startOffset: 3605
-  url: https://www.youtube.com/watch?v=tcqBfZw41FM&t=3605
-  endOffset: 3554
-
 transcript:
 - header: Episode Introduction & Guest Overview
 - header: 'Guest Background: Strategy, Product and AI Trajectory'
@@ -688,8 +696,7 @@ transcript:
   sec: 1817
   time: '30:17'
   who: Liesbeth
-- header: 'Scoping Documents: Challenging Assumptions with "Why"
-  '
+- header: 'Scoping Documents: Challenging Assumptions with "Why" '
 - line: 'Let''s imagine we have this situation: a manager comes to me, or to the team,
   or to the product manager and says, “Hey, this is the problem we think we have.
   Let''s solve it with a neural network.” So how do we challenge that person? How

_podcast/algorithmic-trading-with-python-and-machine-learning.md

Lines changed: 20 additions & 8 deletions
@@ -1,6 +1,6 @@
 ---
-title: "Algorithmic Trading with Python: Backtesting, Risk Management and Deployment"
-short: "Stock Market Analysis with Python and Machine Learning"
+title: 'Algorithmic Trading with Python: Backtesting, Risk Management and Deployment'
+short: Stock Market Analysis with Python and Machine Learning
 season: 17
 episode: 3
 guests:
@@ -14,14 +14,30 @@ links:
 apple: https://podcasts.apple.com/us/podcast/stock-market-analysis-with-python-and-machine/id1541710331?i=1000641465239
 spotify: https://open.spotify.com/episode/1ZXAeGr4Kx7F6oLQUip8Cc?si=KJwpYL-3SvuX8nPdc2cyOg
 youtube: https://www.youtube.com/watch?v=NThHAEIazFk
-description: "Master algorithmic trading: backtesting and risk management—learn practical data sources, features, models & execution to build robust strategies."
+description: 'Master algorithmic trading: backtesting and risk management—learn practical
+  data sources, features, models & execution to build robust strategies.'
 topics:
 - machine learning
 - data science
 - MLOps
 - algorithmic trading
 - tools
-intro: "How do you turn a trading idea into a robust, risk-managed algorithm in Python? In this episode Ivan Brigida — analytics lead behind PythonInvest with 10+ years in statistical modeling, forecasting, econometrics and finance — walks through practical steps for algorithmic trading with Python, from data sourcing to deployment (and a clear reminder this is educational, not investment advice). <br><br> We cover where retail traders get market data (Yahoo, Quandl, Polygon), OHLCV and adjusted-close nuances, and a concrete mean-reversion example. Ivan explains backtesting methodology, common pitfalls like time-series data leakage, and walk-forward simulation for realistic validation. He breaks down risk management (stop-loss thresholds, position sizing), execution and trading fees, plus evaluation metrics (ROI, precision) and defining prediction targets (binary growth thresholds such as 5%). <br><br> On the modeling side you’ll hear practical feature engineering (time-window stats, handcrafted indicators), model choices (logistic regression, XGBoost, neural nets), explainability via feature importance, and deployment options (cron, Airflow, APIs, partial automation). Listen to gain actionable guidance for building, validating, and deploying algorithmic trading systems in Python."
+intro: How do you turn a trading idea into a robust, risk-managed algorithm in Python?
+  In this episode Ivan Brigida — analytics lead behind PythonInvest with 10+ years
+  in statistical modeling, forecasting, econometrics and finance — walks through practical
+  steps for algorithmic trading with Python, from data sourcing to deployment (and
+  a clear reminder this is educational, not investment advice). <br><br> We cover
+  where retail traders get market data (Yahoo, Quandl, Polygon), OHLCV and adjusted-close
+  nuances, and a concrete mean-reversion example. Ivan explains backtesting methodology,
+  common pitfalls like time-series data leakage, and walk-forward simulation for realistic
+  validation. He breaks down risk management (stop-loss thresholds, position sizing),
+  execution and trading fees, plus evaluation metrics (ROI, precision) and defining
+  prediction targets (binary growth thresholds such as 5%). <br><br> On the modeling
+  side you’ll hear practical feature engineering (time-window stats, handcrafted indicators),
+  model choices (logistic regression, XGBoost, neural nets), explainability via feature
+  importance, and deployment options (cron, Airflow, APIs, partial automation). Listen
+  to gain actionable guidance for building, validating, and deploying algorithmic
+  trading systems in Python.
 dateadded: 2024-01-24
 duration: PT01H40S
 quotableClips:
@@ -129,10 +145,6 @@ quotableClips:
   startOffset: 3666
   url: https://www.youtube.com/watch?v=NThHAEIazFk&t=3666
   endOffset: 3696
-- name: Episode Wrap-up and final reminder (not financial advice)
-  startOffset: 3696
-  url: https://www.youtube.com/watch?v=NThHAEIazFk&t=3696
-  endOffset: 3640
 transcript:
 - header: Podcast Introduction
 - header: 'Guest Introduction: Ivan Brigida — Analytics Lead & PythonInvest'
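Each podcast file above drops a trailing quotable clip whose offsets are inconsistent, for example a wrap-up clip that starts at 3880 but ends at 3817. A small sanity check of the kind that would flag such entries (a hypothetical helper, not code from this repo; clip names and numbers echo the A/B-testing diff):

```python
# Hypothetical validator for quotableClips entries: a clip's endOffset
# must come strictly after its startOffset, or the range is inconsistent.
clips = [
    {"name": "Preceding Clip", "startOffset": 3839, "endOffset": 3880},
    {"name": "Episode Wrap-up and Key Takeaways",
     "startOffset": 3880, "endOffset": 3817},
]

def invalid_clips(clips):
    """Return the names of clips whose end does not follow their start."""
    return [c["name"] for c in clips if c["endOffset"] <= c["startOffset"]]

print(invalid_clips(clips))  # flags only the wrap-up clip
```

Running such a check in CI would catch malformed clip ranges before they reach the published front matter.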
