_podcast/ai-for-ecology-biodiversity-and-conservation.md (33 additions, 9 deletions)

@@ -1,7 +1,6 @@
 ---
-title: "Context: The episode frames a biodiversity crisis made harder by fragmented, sparse data and limited monitoring capacity, then surveys AI tools (computer vision, remote sensing, platforms, citizen science), technical challenges, ethical concerns, and policy needs for conservation.
-
-Core narrative: AI's most important role in conservation is as an integrative, trustworthy infrastructure that turns heterogeneous, messy ecological data into continuous, scalable, and actionable knowledge—bridging camera traps, drones, satellites, citizen science, and field expertise through interoperable standards, robust models, edge deployment, and open platforms. Real impact requires coupling technical advances with ethics, community engagement, capacity building, sustainable funding, and multistakeholder governance so that AI-enabled monitoring directly informs equitable conservation decisions, enforcement, and long-term policy."
+title: 'AI for Ecology, Biodiversity, and Conservation: Computer Vision, Remote Sensing
+  and Citizen Science'
 short: AI for Ecology, Biodiversity, and Conservation
-description: 'Discover AI-driven wildlife conservation: computer vision, remote sensing & citizen science for scalable species ID, habitat maps, alerts and policy impact.'
-intro: How can AI actually scale wildlife conservation in the face of accelerating biodiversity loss and persistent data gaps? In this episode, computational ecologist Tanya Berger-Wolf—director of TDAI@OSU, co‑founder of the Wildbook project, and director of technology at Wild Me—walks us through practical ways computer vision, remote sensing, and citizen science are transforming biodiversity monitoring. <br><br> We explore core AI techniques (machine learning, transfer learning, domain adaptation), image‑based monitoring with camera traps, drones and photo‑ID for individual tracking, and remote sensing for habitat mapping and change detection. Tanya addresses key data challenges—labeling, class imbalance, sparse observations—and the need for interoperable datasets, open standards and FAIR principles. We also cover model robustness, edge deployment in the field, ethics and Indigenous knowledge, scalable platforms like Wildbook, and how citizen science and crowdsourcing support quality control and long‑term monitoring. <br><br> Listeners will come away with a clearer understanding of tools and workflows for wildlife monitoring, practical barriers to scaling AI for conservation, policy and funding considerations, and resources to begin applying computer vision, remote sensing, and citizen science in their own conservation projects
+description: Discover AI-driven computer vision and remote sensing strategies to scale
+  biodiversity monitoring, improve species ID, and inform conservation policy.
+intro: How can AI help close critical data gaps in biodiversity monitoring and turn
+  images and sensor data into actionable conservation decisions? In this episode Tanya
+  Berger‑Wolf, a computational ecologist, director of TDAI@OSU, and co‑founder of
+  the Wildbook project (Wild Me), walks through practical applications of AI for ecology,
+  biodiversity monitoring, and conservation. <br><br> We cover core techniques—computer
+  vision, machine learning, and remote sensing—and their use in image‑based monitoring
+  with camera traps, drones, and species identification. Tanya explains individual
+  identification and longitudinal tracking, habitat mapping and change detection,
+  and the data challenges of labeling, class imbalance, and sparse observations. The
+  conversation addresses integration of heterogeneous datasets, model robustness (domain
+  shift and transfer learning), and ethical considerations including Indigenous knowledge
+  and equity. You’ll also hear about scalable platforms like Wildbook, citizen science
+  workflows for crowdsourcing and quality control, policy relevance, open data and
+  FAIR principles, edge deployment in the field, and building sustainable monitoring
+  programs. <br><br> Listen to gain concrete insights on tools, pitfalls, and next
+  steps for applying AI to conservation—what works now, what remains hard, and resources
⋮
+  Core narrative: AI''s most important role in conservation is as an integrative,
+  trustworthy infrastructure that turns heterogeneous, messy ecological data into
+  continuous, scalable, and actionable knowledge—bridging camera traps, drones, satellites,
+  citizen science, and field expertise through interoperable standards, robust models,
+  edge deployment, and open platforms. Real impact requires coupling technical advances
+  with ethics, community engagement, capacity building, sustainable funding, and multistakeholder
+  governance so that AI-enabled monitoring directly informs equitable conservation
+  decisions, enforcement, and long-term policy.'
 ---
-

 Links:

 *[Biodiversity and Artificial Intelligence pdf](https://www.gpai.ai/projects/responsible-ai/environment/biodiversity-and-AI-opportunities-recommendations-for-action.pdf){:target="_blank"}
_podcast/ai-infrastructure-hybrid-cloud-on-prem-distributed-training.md (43 additions, 20 deletions)

@@ -1,17 +1,6 @@
 ---
-title: "Context: A conversation with an AI-infrastructure practitioner about moving from developer tools to building DStack, exploring real-world trade-offs across hardware, software, deployment, and business models for practical AI adoption.
-
-Core theme (single unifying idea): Practical AI is an infrastructure-first problem — success depends less on chasing the biggest model and more on designing cost-effective, controllable, and efficient stacks (hardware, orchestration, and software) that fit hybrid cloud/on‑prem realities, leverage open-source ecosystems, and optimize distributed training and serving for real-world constraints.
-
-Dominant through-line: Every segment — from cost of ownership and cloud vs on‑prem trade‑offs to open vs proprietary models, decentralization, distributed training bottlenecks, orchestration gaps, and edge/federated use cases — returns to the same tension: how to deliver AI that is scalable, performant, and economically sustainable by choosing the right mix of tooling, deployment model, and optimizations.
-
-Key themes implied by the narrative:
-- Cost and control drive architecture choices more than raw model capability.
-- Hybrid cloud + on‑prem is the pragmatic reality; orchestration must adapt.
-- Open-source ecosystems accelerate feedback, tooling, and business flexibility.
-- Efficient distributed training and communication optimizations trump brute-force scaling.
-- Decentralization (privacy, local control, edge) is often a matter of fit and trade-offs, not ideology.
-- Practical provisioning, automation, and orchestration are the unsolved scaling problems for non–AI‑first organizations."
+title: 'Post-ChatGPT AI Infrastructure: Open Source Orchestration, On-Prem Economics
⋮
-description: Discover DStack to cut AI infrastructure costs with on‑prem GPU training and MLOps alternatives—optimize distributed training, reduce orchestration overhead
-intro: 'How can engineering teams cut AI infrastructure costs without sacrificing performance or control? In this episode, Andrey Cheptsov — founder and CEO of dstack and former JetBrains engineer — walks through the motivation behind DStack, an open‑source orchestration alternative designed to lower AI infrastructure total cost of ownership. We trace the cloud vs on‑prem economics (including MLOps limitations like SageMaker), the decision to build open‑source developer tooling, and the trade‑offs between open and proprietary models. <br><br> You’ll hear practical discussion of on‑prem GPU training and distributed training challenges: GPU requirements, PyTorch + NCCL communication bottlenecks, optimization strategies such as DeepSpeed, and tips for fine‑tuning and serving models for non–AI‑first companies. The episode also covers orchestration gaps — Kubernetes and SLURM limitations — plus bare‑metal provisioning, hybrid cloud realities, edge computing scope, and federated learning versus distributed compute. <br><br> If you’re evaluating MLOps alternatives, on‑prem GPU coordination, or ways to reduce AI infrastructure cost, this episode offers concrete perspectives on when to choose on‑prem vs cloud, how DStack fits into the stack, and practical trade‑offs for production ML workloads.'
+description: 'Discover AI infrastructure strategies: open source orchestration, on-prem
+  economics and distributed training at scale to cut costs, boost performance and
+  control.'
+intro: How has the rise of ChatGPT reshaped the infrastructure needed to build and
+  run large language models, and when does open source orchestration make sense compared
+  to cloud or proprietary systems? In this episode we speak with Andrey Cheptsov,
+  founder and CEO of dstack — an open-source alternative to Kubernetes and Slurm designed
+  to simplify AI infrastructure orchestration. Drawing on his decade-plus at JetBrains
+  building developer tools, Andrey frames practical trade-offs between on-prem economics
+  and cloud spend, the maturity of open source orchestration tools, and patterns for
+  distributed training at scale. We cover core topics including open source orchestration
+  for AI workloads, cost and operational considerations for on-prem deployments, and
+  strategies to scale distributed training efficiently and reliably. Listen to understand
+  when an open source approach like dstack is appropriate, what to evaluate in orchestration
+  tools, and how to balance performance, cost, and control as you scale AI projects
+  post-ChatGPT. This episode is for engineering leaders and ML infrastructure teams
+  seeking actionable insights on AI infrastructure, orchestration tools, on‑prem economics,
_podcast/algorithmic-trading-with-python-and-machine-learning.md (33 additions, 10 deletions)

@@ -1,7 +1,5 @@
 ---
-title: "Context: This episode follows Ivan Brigida’s path from finance to analytics and walks listeners step‑by‑step through the practical craft of retail algorithmic investing — covering data sources and quality, time‑series market formats, strategy ideas (like mean reversion), rigorous backtesting and walk‑forward validation, risk management and execution, feature engineering and model choice, explainability, deployment, and learning resources.
-
-Core: The unifying idea is that successful retail algorithmic trading is built like an engineering pipeline — start with clean, well‑understood data; define precise prediction targets; design simple, interpretable models and handcrafted features; validate performance with rigorous, leakage‑free backtests and walk‑forward simulations; embed strict risk controls and disciplined execution; and iterate toward partial automation and reproducible deployment while treating the whole process as a continuous learning project rather than a shortcut to quick profits."
+title: 'Algorithmic Trading with Python: Backtesting, Risk Management and Deployment'
 short: Stock Market Analysis with Python and Machine Learning
-description: 'Discover algorithmic trading & mean reversion: practical backtesting, data APIs, risk management, model choices and trade execution to boost strategy ROI.'
-intro: 'How do you build, backtest, and deploy a robust mean-reversion algorithm without falling prey to bad data or time‑series leakage? In this episode, Ivan Brigida — Analytics Lead and creator of PythonInvest — draws on 10+ years in business intelligence, econometrics, forecasting, machine learning and finance to answer that question. <br><br> We walk through practical steps for algorithmic trading: choosing retail-friendly data APIs (Yahoo, Quandl, Polygon), understanding market data formats like OHLCV and adjusted close, and cleaning for data quality. Ivan explains mean reversion strategy design, risk management fundamentals including stop‑loss and position sizing, and rigorous backtesting methods—covering time‑series leakage and walk‑forward simulation. He also breaks down prediction targets, feature engineering with time‑window statistics, and model choices from logistic regression to XGBoost and neural networks, plus approaches to explainability and evaluation metrics (ROI, precision, trading fees). Finally, deployment options (cron, Airflow, APIs) and learning resources from PythonInvest are discussed. <br><br> Listen to gain actionable guidance on backtesting, data sources, risk controls, and machine learning techniques to move a mean‑reversion idea toward a reproducible algorithmic trading workflow.'
+description: 'Master algorithmic trading: backtesting and risk management—learn practical
+  data sources, features, models & execution to build robust strategies.'
+intro: How do you turn a trading idea into a robust, risk‑managed algorithm in Python?
+  In this episode Ivan Brigida — analytics lead behind PythonInvest with 10+ years
+  in statistical modeling, forecasting, econometrics and finance — walks through practical
+  steps for algorithmic trading with Python, from data sourcing to deployment (and
+  a clear reminder this is educational, not investment advice). <br><br> We cover
+  where retail traders get market data (Yahoo, Quandl, Polygon), OHLCV and adjusted‑close
+  nuances, and a concrete mean‑reversion example. Ivan explains backtesting methodology,
+  common pitfalls like time‑series data leakage, and walk‑forward simulation for realistic
+  validation. He breaks down risk management (stop‑loss thresholds, position sizing),
+  execution and trading fees, plus evaluation metrics (ROI, precision) and defining
+  prediction targets (binary growth thresholds such as 5%). <br><br> On the modeling
+  side you’ll hear practical feature engineering (time‑window stats, handcrafted indicators),
+  model choices (logistic regression, XGBoost, neural nets), explainability via feature
+  importance, and deployment options (cron, Airflow, APIs, partial automation). Listen
+  to gain actionable guidance for building, validating, and deploying algorithmic