Commit 374e229: "All URLs updated"

1 parent 3b5a851 commit 374e229

File tree: 57 files changed (+1793, -2989 lines)


_podcast/to-update/s18e03-ai-for-ecology-biodiversity-and-conservation.md renamed to _podcast/ai-for-ecology-biodiversity-and-conservation.md

Lines changed: 33 additions & 9 deletions

@@ -1,7 +1,6 @@
 ---
-title: "Context: The episode frames a biodiversity crisis made harder by fragmented, sparse data and limited monitoring capacity, then surveys AI tools (computer vision, remote sensing, platforms, citizen science), technical challenges, ethical concerns, and policy needs for conservation.
-
-Core narrative: AI's most important role in conservation is as an integrative, trustworthy infrastructure that turns heterogeneous, messy ecological data into continuous, scalable, and actionable knowledge—bridging camera traps, drones, satellites, citizen science, and field expertise through interoperable standards, robust models, edge deployment, and open platforms. Real impact requires coupling technical advances with ethics, community engagement, capacity building, sustainable funding, and multistakeholder governance so that AI-enabled monitoring directly informs equitable conservation decisions, enforcement, and long-term policy."
+title: 'AI for Ecology, Biodiversity, and Conservation: Computer Vision, Remote Sensing
+  and Citizen Science'
 short: AI for Ecology, Biodiversity, and Conservation
 season: 18
 episode: 3
@@ -16,12 +15,26 @@ links:
   apple: https://podcasts.apple.com/us/podcast/ai-for-ecology-biodiversity-and-conservation-tanya/id1541710331?i=1000653709956
   spotify: https://open.spotify.com/episode/3Hhz5N8ZDvsOPlPP3wxQxq?si=Oz7y_pBrTfeypfYZXubu-g
   youtube: https://www.youtube.com/watch?v=30tTrozbAkg
-
-description: 'Discover AI-driven wildlife conservation: computer vision, remote sensing & citizen science for scalable species ID, habitat maps, alerts and policy impact.'
-intro: How can AI actually scale wildlife conservation in the face of accelerating biodiversity loss and persistent data gaps? In this episode, computational ecologist Tanya Berger-Wolf—director of TDAI@OSU, co‑founder of the Wildbook project, and director of technology at Wild Me—walks us through practical ways computer vision, remote sensing, and citizen science are transforming biodiversity monitoring. <br><br> We explore core AI techniques (machine learning, transfer learning, domain adaptation), image‑based monitoring with camera traps, drones and photo‑ID for individual tracking, and remote sensing for habitat mapping and change detection. Tanya addresses key data challenges—labeling, class imbalance, sparse observations—and the need for interoperable datasets, open standards and FAIR principles. We also cover model robustness, edge deployment in the field, ethics and Indigenous knowledge, scalable platforms like Wildbook, and how citizen science and crowdsourcing support quality control and long‑term monitoring. <br><br> Listeners will come away with a clearer understanding of tools and workflows for wildlife monitoring, practical barriers to scaling AI for conservation, policy and funding considerations, and resources to begin applying computer vision, remote sensing, and citizen science in their own conservation projects
+description: Discover AI-driven computer vision and remote sensing strategies to scale
+  biodiversity monitoring, improve species ID, and inform conservation policy.
+intro: How can AI help close critical data gaps in biodiversity monitoring and turn
+  images and sensor data into actionable conservation decisions? In this episode Tanya
+  Berger‑Wolf, a computational ecologist, director of TDAI@OSU, and co‑founder of
+  the Wildbook project (Wild Me), walks through practical applications of AI for ecology,
+  biodiversity monitoring, and conservation. <br><br> We cover core techniques—computer
+  vision, machine learning, and remote sensing—and their use in image‑based monitoring
+  with camera traps, drones, and species identification. Tanya explains individual
+  identification and longitudinal tracking, habitat mapping and change detection,
+  and the data challenges of labeling, class imbalance, and sparse observations. The
+  conversation addresses integration of heterogeneous datasets, model robustness (domain
+  shift and transfer learning), and ethical considerations including Indigenous knowledge
+  and equity. You’ll also hear about scalable platforms like Wildbook, citizen science
+  workflows for crowdsourcing and quality control, policy relevance, open data and
+  FAIR principles, edge deployment in the field, and building sustainable monitoring
+  programs. <br><br> Listen to gain concrete insights on tools, pitfalls, and next
+  steps for applying AI to conservation—what works now, what remains hard, and resources
+  to explore further.
 dateadded: 2024-04-28
-
-
 quotableClips:
 - name: Podcast Introduction
   startOffset: 0
@@ -119,9 +132,20 @@ quotableClips:
   startOffset: 3720
   url: https://www.youtube.com/watch?v=30tTrozbAkg&t=3720
   endOffset: 3720
+context: 'Context: The episode frames a biodiversity crisis made harder by fragmented,
+  sparse data and limited monitoring capacity, then surveys AI tools (computer vision,
+  remote sensing, platforms, citizen science), technical challenges, ethical concerns,
+  and policy needs for conservation.
 
+  Core narrative: AI''s most important role in conservation is as an integrative,
+  trustworthy infrastructure that turns heterogeneous, messy ecological data into
+  continuous, scalable, and actionable knowledge—bridging camera traps, drones, satellites,
+  citizen science, and field expertise through interoperable standards, robust models,
+  edge deployment, and open platforms. Real impact requires coupling technical advances
+  with ethics, community engagement, capacity building, sustainable funding, and multistakeholder
+  governance so that AI-enabled monitoring directly informs equitable conservation
+  decisions, enforcement, and long-term policy.'
 ---
-
 Links:
 
 * [Biodiversity and Artificial Intelligence pdf](https://www.gpai.ai/projects/responsible-ai/environment/biodiversity-and-AI-opportunities-recommendations-for-action.pdf){:target="_blank"}

_podcast/to-update/s20e01-trends-in-ai-infrastructure.md renamed to _podcast/ai-infrastructure-hybrid-cloud-on-prem-distributed-training.md

Lines changed: 43 additions & 20 deletions

@@ -1,17 +1,6 @@
 ---
-title: "Context: A conversation with an AI-infrastructure practitioner about moving from developer tools to building DStack, exploring real-world trade-offs across hardware, software, deployment, and business models for practical AI adoption.
-
-Core theme (single unifying idea): Practical AI is an infrastructure-first problem — success depends less on chasing the biggest model and more on designing cost-effective, controllable, and efficient stacks (hardware, orchestration, and software) that fit hybrid cloud/on‑prem realities, leverage open-source ecosystems, and optimize distributed training and serving for real-world constraints.
-
-Dominant through-line: Every segment — from cost of ownership and cloud vs on‑prem trade‑offs to open vs proprietary models, decentralization, distributed training bottlenecks, orchestration gaps, and edge/federated use cases — returns to the same tension: how to deliver AI that is scalable, performant, and economically sustainable by choosing the right mix of tooling, deployment model, and optimizations.
-
-Key themes implied by the narrative:
-- Cost and control drive architecture choices more than raw model capability.
-- Hybrid cloud + on‑prem is the pragmatic reality; orchestration must adapt.
-- Open-source ecosystems accelerate feedback, tooling, and business flexibility.
-- Efficient distributed training and communication optimizations trump brute-force scaling.
-- Decentralization (privacy, local control, edge) is often a matter of fit and trade-offs, not ideology.
-- Practical provisioning, automation, and orchestration are the unsolved scaling problems for non–AI‑first organizations."
+title: 'Post-ChatGPT AI Infrastructure: Open Source Orchestration, On-Prem Economics
+  & Distributed Training at Scale'
 short: Trends in AI Infrastructure
 season: 20
 episode: 1
@@ -26,13 +15,26 @@ links:
   apple: https://podcasts.apple.com/us/podcast/redefining-ai-infrastructure-open-source-chips-and/id1541710331?i=1000687565459
   spotify: https://open.spotify.com/episode/5MIc1pAXPxVYSr0E4pndU4
   youtube: https://www.youtube.com/watch?v=1aMuynlLM3o
-
-description: Discover DStack to cut AI infrastructure costs with on‑prem GPU training and MLOps alternatives—optimize distributed training, reduce orchestration overhead
-intro: 'How can engineering teams cut AI infrastructure costs without sacrificing performance or control? In this episode, Andrey Cheptsov — founder and CEO of dstack and former JetBrains engineer — walks through the motivation behind DStack, an open‑source orchestration alternative designed to lower AI infrastructure total cost of ownership. We trace the cloud vs on‑prem economics (including MLOps limitations like SageMaker), the decision to build open‑source developer tooling, and the trade‑offs between open and proprietary models. <br><br> You’ll hear practical discussion of on‑prem GPU training and distributed training challenges: GPU requirements, PyTorch + NCCL communication bottlenecks, optimization strategies such as DeepSpeed, and tips for fine‑tuning and serving models for non–AI‑first companies. The episode also covers orchestration gaps — Kubernetes and SLURM limitations — plus bare‑metal provisioning, hybrid cloud realities, edge computing scope, and federated learning versus distributed compute. <br><br> If you’re evaluating MLOps alternatives, on‑prem GPU coordination, or ways to reduce AI infrastructure cost, this episode offers concrete perspectives on when to choose on‑prem vs cloud, how DStack fits into the stack, and practical trade‑offs for production ML workloads.'
+description: 'Discover AI infrastructure strategies: open source orchestration, on-prem
+  economics and distributed training at scale to cut costs, boost performance and
+  control.'
+intro: How has the rise of ChatGPT reshaped the infrastructure needed to build and
+  run large language models, and when does open source orchestration make sense compared
+  to cloud or proprietary systems? In this episode we speak with Andrey Cheptsov,
+  founder and CEO of dstack — an open-source alternative to Kubernetes and Slurm designed
+  to simplify AI infrastructure orchestration. Drawing on his decade-plus at JetBrains
+  building developer tools, Andrey frames practical trade-offs between on-prem economics
+  and cloud spend, the maturity of open source orchestration tools, and patterns for
+  distributed training at scale. We cover core topics including open source orchestration
+  for AI workloads, cost and operational considerations for on-prem deployments, and
+  strategies to scale distributed training efficiently and reliably. Listen to understand
+  when an open source approach like dstack is appropriate, what to evaluate in orchestration
+  tools, and how to balance performance, cost, and control as you scale AI projects
+  post-ChatGPT. This episode is for engineering leaders and ML infrastructure teams
+  seeking actionable insights on AI infrastructure, orchestration tools, on‑prem economics,
+  and distributed training best practices.
 dateadded: 2025-02-26
-
 duration: PT01H06M04S
-
 quotableClips:
 - name: Episode Kickoff & Guest Introduction
   startOffset: 0
@@ -118,7 +120,6 @@ quotableClips:
   startOffset: 3938
   url: https://www.youtube.com/watch?v=1aMuynlLM3o&t=3938
   endOffset: 3964
-
 transcript:
 - header: Episode Kickoff & Guest Introduction
 - line: This week, we'll talk about AI infrastructure and everything related to it.
@@ -955,8 +956,30 @@ transcript:
   sec: 3964
   time: '1:06:04'
   who: Andrey
----
+context: 'Context: A conversation with an AI-infrastructure practitioner about moving
+  from developer tools to building DStack, exploring real-world trade-offs across
+  hardware, software, deployment, and business models for practical AI adoption.
+
+  Core theme (single unifying idea): Practical AI is an infrastructure-first problem
+  — success depends less on chasing the biggest model and more on designing cost-effective,
+  controllable, and efficient stacks (hardware, orchestration, and software) that
+  fit hybrid cloud/on‑prem realities, leverage open-source ecosystems, and optimize
+  distributed training and serving for real-world constraints.
 
+  Dominant through-line: Every segment — from cost of ownership and cloud vs on‑prem
+  trade‑offs to open vs proprietary models, decentralization, distributed training
+  bottlenecks, orchestration gaps, and edge/federated use cases — returns to the same
+  tension: how to deliver AI that is scalable, performant, and economically sustainable
+  by choosing the right mix of tooling, deployment model, and optimizations.
+
+  Key themes implied by the narrative: - Cost and control drive architecture choices
+  more than raw model capability. - Hybrid cloud + on‑prem is the pragmatic reality;
+  orchestration must adapt. - Open-source ecosystems accelerate feedback, tooling,
+  and business flexibility. - Efficient distributed training and communication optimizations
+  trump brute-force scaling. - Decentralization (privacy, local control, edge) is
+  often a matter of fit and trade-offs, not ideology. - Practical provisioning, automation,
+  and orchestration are the unsolved scaling problems for non–AI‑first organizations.'
+---
 Links:
 
 * [Twitter](https://twitter.com/andrey_cheptsov/){:target="_blank"}

_podcast/to-update/s17e03-stock-market-analysis-with-python-and-machine-learning.md renamed to _podcast/algorithmic-trading-with-python-and-machine-learning.md

Lines changed: 33 additions & 10 deletions

@@ -1,7 +1,5 @@
 ---
-title: "Context: This episode follows Ivan Brigida’s path from finance to analytics and walks listeners step‑by‑step through the practical craft of retail algorithmic investing — covering data sources and quality, time‑series market formats, strategy ideas (like mean reversion), rigorous backtesting and walk‑forward validation, risk management and execution, feature engineering and model choice, explainability, deployment, and learning resources.
-
-Core: The unifying idea is that successful retail algorithmic trading is built like an engineering pipeline — start with clean, well‑understood data; define precise prediction targets; design simple, interpretable models and handcrafted features; validate performance with rigorous, leakage‑free backtests and walk‑forward simulations; embed strict risk controls and disciplined execution; and iterate toward partial automation and reproducible deployment while treating the whole process as a continuous learning project rather than a shortcut to quick profits."
+title: 'Algorithmic Trading with Python: Backtesting, Risk Management and Deployment'
 short: Stock Market Analysis with Python and Machine Learning
 season: 17
 episode: 3
@@ -16,13 +14,26 @@ links:
   apple: https://podcasts.apple.com/us/podcast/stock-market-analysis-with-python-and-machine/id1541710331?i=1000641465239
   spotify: https://open.spotify.com/episode/1ZXAeGr4Kx7F6oLQUip8Cc?si=KJwpYL-3SvuX8nPdc2cyOg
   youtube: https://www.youtube.com/watch?v=NThHAEIazFk
-
-description: 'Discover algorithmic trading & mean reversion: practical backtesting, data APIs, risk management, model choices and trade execution to boost strategy ROI.'
-intro: 'How do you build, backtest, and deploy a robust mean-reversion algorithm without falling prey to bad data or time‑series leakage? In this episode, Ivan Brigida — Analytics Lead and creator of PythonInvest — draws on 10+ years in business intelligence, econometrics, forecasting, machine learning and finance to answer that question. <br><br> We walk through practical steps for algorithmic trading: choosing retail-friendly data APIs (Yahoo, Quandl, Polygon), understanding market data formats like OHLCV and adjusted close, and cleaning for data quality. Ivan explains mean reversion strategy design, risk management fundamentals including stop‑loss and position sizing, and rigorous backtesting methods—covering time‑series leakage and walk‑forward simulation. He also breaks down prediction targets, feature engineering with time‑window statistics, and model choices from logistic regression to XGBoost and neural networks, plus approaches to explainability and evaluation metrics (ROI, precision, trading fees). Finally, deployment options (cron, Airflow, APIs) and learning resources from PythonInvest are discussed. <br><br> Listen to gain actionable guidance on backtesting, data sources, risk controls, and machine learning techniques to move a mean‑reversion idea toward a reproducible algorithmic trading workflow.'
+description: 'Master algorithmic trading: backtesting and risk management—learn practical
+  data sources, features, models & execution to build robust strategies.'
+intro: How do you turn a trading idea into a robust, risk‑managed algorithm in Python?
+  In this episode Ivan Brigida — analytics lead behind PythonInvest with 10+ years
+  in statistical modeling, forecasting, econometrics and finance — walks through practical
+  steps for algorithmic trading with Python, from data sourcing to deployment (and
+  a clear reminder this is educational, not investment advice). <br><br> We cover
+  where retail traders get market data (Yahoo, Quandl, Polygon), OHLCV and adjusted‑close
+  nuances, and a concrete mean‑reversion example. Ivan explains backtesting methodology,
+  common pitfalls like time‑series data leakage, and walk‑forward simulation for realistic
+  validation. He breaks down risk management (stop‑loss thresholds, position sizing),
+  execution and trading fees, plus evaluation metrics (ROI, precision) and defining
+  prediction targets (binary growth thresholds such as 5%). <br><br> On the modeling
+  side you’ll hear practical feature engineering (time‑window stats, handcrafted indicators),
+  model choices (logistic regression, XGBoost, neural nets), explainability via feature
+  importance, and deployment options (cron, Airflow, APIs, partial automation). Listen
+  to gain actionable guidance for building, validating, and deploying algorithmic
+  trading systems in Python.
 dateadded: 2024-01-24
-
 duration: PT01H40S
-
 quotableClips:
 - name: Podcast Introduction
   startOffset: 0
@@ -132,7 +143,6 @@ quotableClips:
   startOffset: 3696
   url: https://www.youtube.com/watch?v=NThHAEIazFk&t=3696
   endOffset: 3640
-
 transcript:
 - header: Podcast Introduction
 - header: 'Guest Introduction: Ivan Brigida — Analytics Lead & PythonInvest'
@@ -1134,8 +1144,21 @@ transcript:
   sec: 3735
   time: '1:02:15'
   who: Ivan
----
+context: 'Context: This episode follows Ivan Brigida’s path from finance to analytics
+  and walks listeners step‑by‑step through the practical craft of retail algorithmic
+  investing — covering data sources and quality, time‑series market formats, strategy
+  ideas (like mean reversion), rigorous backtesting and walk‑forward validation, risk
+  management and execution, feature engineering and model choice, explainability,
+  deployment, and learning resources.
 
+  Core: The unifying idea is that successful retail algorithmic trading is built like
+  an engineering pipeline — start with clean, well‑understood data; define precise
+  prediction targets; design simple, interpretable models and handcrafted features;
+  validate performance with rigorous, leakage‑free backtests and walk‑forward simulations;
+  embed strict risk controls and disciplined execution; and iterate toward partial
+  automation and reproducible deployment while treating the whole process as a continuous
+  learning project rather than a shortcut to quick profits.'
+---
 Links:
 
 * [Exploring Finance APIs](https://pythoninvest.com/long-read/exploring-finance-apis){:target="_blank"}
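The episode intro above mentions walk‑forward simulation and a mean‑reversion example. As a rough Python sketch of those two ideas only (the price series, window sizes, and 2% threshold below are invented for illustration and are not from the episode or this commit):

```python
# Illustrative walk-forward validation for a toy mean-reversion rule.

def walk_forward_splits(n, train_size, test_size):
    """Yield (train_idx, test_idx) windows that move forward in time,
    so test data never precedes the data used for fitting."""
    start = 0
    while start + train_size + test_size <= n:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

def mean_reversion_signal(prices, i, lookback, threshold):
    """Buy signal when the price falls `threshold` below its trailing mean."""
    window = prices[i - lookback:i]
    mean = sum(window) / lookback
    return prices[i] < mean * (1 - threshold)

# Invented toy price series for demonstration.
prices = [100, 98, 97, 101, 96, 99, 103, 95, 98, 102, 97, 100]

for train, test in walk_forward_splits(len(prices), train_size=6, test_size=3):
    # A real workflow would fit strategy parameters on `train` only,
    # then evaluate the frozen rule on `test`.
    signals = [mean_reversion_signal(prices, i, lookback=3, threshold=0.02)
               for i in test]
    print(list(test), signals)
```

Because each test window starts strictly after its training window, no future observations leak into parameter fitting, which is the time‑series leakage pitfall the episode warns about.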
