diff --git a/conferences/2025/3BVEKT.html b/conferences/2025/3BVEKT.html new file mode 100644 index 0000000..8225042 --- /dev/null +++ b/conferences/2025/3BVEKT.html @@ -0,0 +1,237 @@ + + + + + +Lightning Talks - PyData Berlin 2025 + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+ + Lightning Talks + +

Lightning Talks

+

Plenary Session [Organizers]

+
+
+ + +
+
+ Lightning Talks - Session Card +
+
+ + +
+
+ +
+ Level: None + + Company/Institute: NA + + + Room: {'en': 'Kuppelsaal'} + + + Time: 2025-09-02T16:45:00+02:00 + +
+ +

Abstract

+

Lightning Talks are short, 5-minute presentations open to all attendees. They’re a fun and fast-paced way to share ideas, showcase projects, spark discussions, or raise awareness about topics you care about — whether technical, community-related, or just inspiring. + +No slides are required, and talks can be spontaneous or prepared. It’s a great chance to speak up and connect with the community!

+ + +

Prerequisites

+

None

+ + +

Description

+
+

⚡ Lightning Talk Rules

+
    +
  • No promotion for products or companies.
  • +
  • No call for 'we are hiring' (but you may name your employer).
  • +
  • One LT per person per conference policy.
  • +
+

Community Event Announcements

+
    +
  • ⏱ You want to announce a community event? You have ONE minute.
  • +
  • All event announcements will be collected in a single slide deck; see instructions at the Lightning Talk desk in the Community Space in the Lounge on Level 1.
  • +
+

All other LTs:

+
    +
  • ⏱ You have exactly 5 minutes. The clock starts when you start — and ends when time’s up. That’s the thrill of Lightning Talks ⚡
  • +
  • 🎯 Be sharp, clear, and fun. Introduce your idea, make your point, give the audience something to remember. No pressure. (Okay, maybe a little.)
  • +
  • 🐍 Keep it relevant to Python, PyData and the community. You can go broad — tools, workflows, stories, experiments — as long as there’s some connection to Python, PyData or the community.
  • +
  • 👏 Keep it respectful. Keep it awesome. Humor is welcome, but please be kind, inclusive, and professional.
  • +
  • 🎤 Be ready when your name is called. We’re running a tight session — speakers go on stage rapid-fire. Stay close and stay hyped.
  • +
+
+ +

Speaker

+ + +
+ View Full Conference Program +
+ +
+
+ + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/conferences/2025/3LDDAB.html b/conferences/2025/3LDDAB.html index 4f1e1db..66e4bab 100644 --- a/conferences/2025/3LDDAB.html +++ b/conferences/2025/3LDDAB.html @@ -166,7 +166,7 @@

From Manual to LLMs: Scaling Product Categorization

Room: {'en': 'B05-B06'} - Time: 2025-09-02T14:00:00+00:00 + Time: 2025-09-02T16:00:00+02:00 @@ -226,7 +226,7 @@

Speakers

- Giampaolo Casolla + Giampaolo Casolla

Giampaolo Casolla

@@ -237,12 +237,23 @@

Giampaolo Casolla

Giampaolo Casolla is a Senior Data Scientist at GetYourGuide, leveraging advanced machine learning and Generative AI to solve complex travel industry challenges. With expertise spanning areas like Safety, Risk, and Security, and strong skills in stats, Python, R, and cloud tech, he brings a diverse background to the role. Prior to GetYourGuide, Giampaolo developed award-winning ML solutions at Amazon and has a background in research with publications and conference presentations. At GetYourGuide, he's focused on integrating LLMs and GenAI into data products to drive innovation in travel technology.

+ +
- Ansgar Grüne + Ansgar Grüne

Ansgar Grüne

@@ -253,6 +264,17 @@

Ansgar Grüne

Ansgar Grüne is a Senior Data Scientist at GetYourGuide in Berlin. His work focuses on ML/AI approaches to improve the users' search and discovery experience on the platform. He holds a Ph.D. in Theoretical Computer Science and has 10 years of experience as a Data Scientist in the travel industry, following several years as a software engineer.

+ +
diff --git a/conferences/2025/3XMJM3.html b/conferences/2025/3XMJM3.html index 5b15e33..8ffb980 100644 --- a/conferences/2025/3XMJM3.html +++ b/conferences/2025/3XMJM3.html @@ -166,7 +166,7 @@

Risk Budget Optimization for Causal Mix Models

Room: {'en': 'B05-B06'} - Time: 2025-09-01T15:00:00+00:00 + Time: 2025-09-01T17:00:00+02:00 @@ -208,6 +208,21 @@

Carlos Trujillo

My long‑term goal is to master the hybrid role of “Marketing Scientist”, blending statistical rigor with business storytelling. If you like statistics, Bayesian models, data‑driven decisions, as well as open‑source cameo, then let’s connect.

+ + diff --git a/conferences/2025/8UJA37.html b/conferences/2025/8UJA37.html index 368f306..704f119 100644 --- a/conferences/2025/8UJA37.html +++ b/conferences/2025/8UJA37.html @@ -166,7 +166,7 @@

Exploring Millions of High-dimensional Datapoints in the Browser for Early D Room: {'en': 'B05-B06'} - Time: 2025-09-01T09:20:00+00:00 + Time: 2025-09-01T11:20:00+02:00 @@ -199,7 +199,7 @@

Speakers

- Tim Tenckhoff + Tim Tenckhoff

Tim Tenckhoff

@@ -210,12 +210,23 @@

Tim Tenckhoff

Tim is a Software Development Consultant at Netlight with a track record of experience in diverse industries, including MedTech, E-Mobility, FinTech, E-Commerce, EdTech and IoT. With a passion for technology and a relentless pursuit of excellence, he is dedicated to continuously pushing the boundaries of innovation while crafting clean, well-architected solutions and streamlining processes for efficiency. Currently, Tim is supporting Bayer in the Research and Development domain by visualising extensive cell painting image data in early drug discovery.

+ +
- Matthias Orlowski + Matthias Orlowski

Matthias Orlowski

@@ -226,6 +237,17 @@

Matthias Orlowski

As a Machine Learning Engineer at Bayer, Matthias Orlowski has contributed to various projects, focusing on natural language processing in pharmacovigilance and medical image processing in radiology and early drug discovery. Matthias studied in Konstanz, Nottingham (UK), Durham (North Carolina, USA), and Berlin, where he earned a PhD from Humboldt University in 2015. Prior to joining Bayer, Matthias gained diverse experience in multiple roles and organizations, tackling projects in consumer targeting, campaigning, and recommender systems.

+ +
diff --git a/conferences/2025/AU8F9U.html b/conferences/2025/AU8F9U.html new file mode 100644 index 0000000..69eaa92 --- /dev/null +++ b/conferences/2025/AU8F9U.html @@ -0,0 +1,253 @@ + + + + + +Automating Content Creation with LLMs: A Journey from Manual to AI-Driven Excellence - PyData Berlin 2025 + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+ + Generative AI + +

Automating Content Creation with LLMs: A Journey from Manual to AI-Driven Excellence

+

Talk [Sponsored]

+
+
+ + +
+
+ Automating Content Creation with LLMs: A Journey from Manual to AI-Driven Excellence - Session Card +
+
+ + +
+
+ +
+ Level: Advanced + + Company/Institute: GetYourGuide Gmbh + + + Room: {'en': 'B07-B08'} + + + Time: 2025-09-01T13:40:00+02:00 + +
+ +

Abstract

+

In the fast-paced realm of travel experiences, GetYourGuide encountered the challenge of maintaining consistent, high-quality content across its global marketplace. Manual content creation by suppliers often resulted in inconsistencies and errors, negatively impacting conversion rates. To address this, we leveraged large language models (LLMs) to automate content generation, ensuring uniformity and accuracy. This talk will explore our innovative approach, including the development of fine-tuned models for generating key text sections and the use of Function Calling GPT API for structured data. A pivotal aspect of our solution was the creation of an LLM evaluator to detect and correct hallucinations, thereby improving factual accuracy. Through A/B testing, we demonstrated that AI-driven content led to fewer defects and increased bookings. Attendees will gain insights into training data refinement, prompt engineering, and deploying AI at scale, offering valuable lessons for automating content creation across industries.

+ + +

Prerequisites

+

OpenAI fine-tuning: https://platform.openai.com/docs/guides/fine-tuning +OpenAI function calling: https://platform.openai.com/docs/guides/function-calling?api-mode=chat +Evaluating model performance with LLMs: https://platform.openai.com/docs/guides/evals

+ + +

Description

+
+

GetYourGuide, a global marketplace for travel experiences, needs to provide structured and inspiring content for every activity in its marketplace.
+Before the release of our AI models, suppliers would create their content fully manually. The manual approach led to several issues in production, such as content inconsistencies, incorrect grammar, non-English language, and poor adherence to our content guidelines.
+These content defects negatively impact the conversion rate of activities.
+At the same time, with the large scale of new activity generation, our internal teams could only review a very small fraction of the submitted content.

+

With our LLM solution, suppliers can now automatically generate optimal content for their activities. Our feature allows users to simply copy-paste any existing raw text of their activity, and our models then prefill most of the content sections. Suppliers then have the opportunity to review and edit the content.
+We chose two different methods to generate free text content and structured information.

+

For free text, we used the OpenAI fine-tuning API to create two different models that generate the relevant sections of our travel activities, i.e. the title, the highlights, and the short and full descriptions.
+For structured information, we used the GPT function calling API to prefill the activity tags and categories that have fixed-value constraints in our database, such as the transport used or the type of guide.

+

To validate our models, as well as for production monitoring, we developed a dedicated LLM evaluator that identifies hallucinations for our specific case, that is, our models generating information that is not factually correct compared to the input supplier text. With this hallucination evaluator, we were able to score the performance of different models and unlock key learnings and iterations. The evaluator also enables our internal team to detect and correct hallucinations in production.

+

After several A/B experiments, the new automated content creation feature is fully released to all our suppliers. The activities with content generated via AI showed significantly fewer content defects and a significant increase in bookings, with only a small fraction of hallucinations that can be reviewed and corrected manually.

+

In this talk, we will share our long journey, consisting of several training data iterations to build our fine-tuned models and the prompt engineering challenges in building our evaluator and our function call model. We will also cover the different experiments and the operational challenges in training the models and deploying the service in production.
+The talk will provide some concrete ideas and tools to automate the generation of optimal content with LLMs, which is a common use case in many industries.

+
+ +

Speaker

+ +
+ +
+ M +
+ +
+

Marco Vene

+ +

Senior Data Scientist

+ + +

With over a decade of experience in data science and analytics, I am a Senior Data Scientist at GetYourGuide, where I lead initiatives in leveraging large language models (LLMs) to enhance content quality and conversion rates. My expertise includes fine-tuning LLMs for custom text generation and classification, developing NLP models for discovering new travel interests, and automating predictive models for global travel demand. I have a robust background in machine learning, natural language processing, and AI-driven content automation, which has significantly improved operational efficiencies and business outcomes. +Prior to moving to Data Science, I was a Senior Data Analyst at GetYourGuide, where I developed key metrics for availability and loyalty, built automated forecasting for our travel activities, performed impact analyses for sales and marketing, and automated data analyses with custom libraries. +Before joining GetYourGuide, I worked as a Data Analyst at Foodpanda, an online food delivery platform, where I optimized restaurant ranking algorithms and developed recommendation systems. +My analytical journey began at Wealth-X in Budapest, where I worked as a Business Analyst, and later as a Research Consultant at Millward Brown Vermeer, where I applied statistical techniques to report insights to external customers. +I hold a Master's degree in Marketing from Rotterdam School of Management, Erasmus University, graduated cum laude, and a Bachelor's degree in Business/Managerial Economics from Università di Pisa. +Driven by a passion for data-driven decision-making, I am committed to advancing AI technologies to solve complex business challenges. At PyData 2025 Berlin, I aim to share insights into deploying AI at scale, refining training data, and mastering prompt engineering to automate content creation across industries.

+ + +
+
+ + +
+ View Full Conference Program +
+ +
+
+ + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/conferences/2025/B3STGX.html b/conferences/2025/B3STGX.html index 57f682d..68d76ef 100644 --- a/conferences/2025/B3STGX.html +++ b/conferences/2025/B3STGX.html @@ -166,7 +166,7 @@

See only what you are allowed to see: Fine-Grained Authorization

Room: {'en': 'B09'} - Time: 2025-09-01T13:40:00+00:00 + Time: 2025-09-03T13:40:00+02:00 diff --git a/conferences/2025/BCGJQB.html b/conferences/2025/BCGJQB.html index 71aae74..e22e822 100644 --- a/conferences/2025/BCGJQB.html +++ b/conferences/2025/BCGJQB.html @@ -139,6 +139,8 @@

PyData Berlin

+ PyData & Scientific Libraries Stack +

Scaling Probabilistic Models with Variational Inference

Talk

@@ -164,7 +166,7 @@

Scaling Probabilistic Models with Variational Inference

Room: {'en': 'B07-B08'} - Time: 2025-09-02T11:40:00+00:00 + Time: 2025-09-02T13:40:00+02:00 diff --git a/conferences/2025/C3MGDN.html b/conferences/2025/C3MGDN.html index 419fccc..d075322 100644 --- a/conferences/2025/C3MGDN.html +++ b/conferences/2025/C3MGDN.html @@ -15,7 +15,7 @@ - + @@ -25,7 +25,7 @@ - + @@ -148,11 +148,17 @@

PyData Berlin

Education, Career & Life

Maintainers of the Future: Code, Culture, and Everything After

-

Talk (long)

+

Keynote

+
+
+ Maintainers of the Future: Code, Culture, and Everything After - Session Card +
+
+
@@ -166,7 +172,7 @@

Maintainers of the Future: Code, Culture, and Everything After

Room: {'en': 'Kuppelsaal'} - Time: 2025-09-03T07:10:00+00:00 + Time: 2025-09-03T09:10:00+02:00
@@ -201,6 +207,19 @@

Jessica Greene

Jessica Greene is a self/community-taught developer who came to tech by way of the film industry and specialty coffee roasting. She is now a Senior Machine Learning Engineer at Ecosia.org, where she explores how ML and generative AI can support climate action. Passionate about ethical, sustainable, and inclusive technology, Jessica co-leads PyLadies Berlin, serves on the board of the Python Software Verband (PySV), and is part of the Python Software Foundation’s Conduct Working Group. In 2024, she was honored with the inaugural Outstanding PyLadies Award and the PSF Community Service Award for her contributions to the Python ecosystem.

+ + diff --git a/conferences/2025/CAUAZY.html b/conferences/2025/CAUAZY.html new file mode 100644 index 0000000..296002d --- /dev/null +++ b/conferences/2025/CAUAZY.html @@ -0,0 +1,258 @@ + + + + + +Most AI Agents Are Useless. Let’s Fix That - PyData Berlin 2025 + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+ + Natural Language Processing & Audio (incl. Generative AI NLP) + +

Most AI Agents Are Useless. Let’s Fix That

+

Talk

+
+
+ + +
+
+ Most AI Agents Are Useless. Let’s Fix That - Session Card +
+
+ + +
+
+ +
+ Level: Novice + + Company/Institute: deepset GmbH + + + Room: {'en': 'B07-B08'} + + + Time: 2025-09-02T11:20:00+02:00 + +
+ +

Abstract

+

AI agents are having a moment, but most of them are little more than fragile prototypes that break under pressure. Together, we’ll explore why so many agentic systems fail in practice, and how to fix that with real engineering principles. In this talk, you’ll learn how to build agents that are modular, observable, and ready for production. If you’re tired of LLM demos that don’t deliver, this talk is your blueprint for building agents that actually work.

+ + +

Prerequisites

+

Familiarity with basic concepts in generative AI like LLMs and prompting +Basic Python knowledge

+ + +

Description

+
+

Let’s face it: most AI agents are glorified demos. They look flashy, but they’re brittle, hard to debug, and rarely make it into real products. Why? Because wiring an LLM to a few tools is easy. Engineering a robust, testable, and scalable system is hard.

+

This talk is for practitioners, data scientists, AI engineers, and developers who want to stop tinkering and start shipping. We’ll take a candid look at the common reasons agent systems fail and introduce practical patterns to fix them using Haystack, an open-source Python framework to build custom AI applications.

+

You’ll learn how to design agents that are:

+
    +
  • Modular, so they’re easy to extend and evolve
  • +
  • Observable, so you can trace failures and understand the behavior
  • +
  • Maintainable, so they don’t become one-off science projects
  • +
+

We’ll also cover advanced topics like multimodal inputs and Model Context Protocol (MCP) to push your agents into more capable territory.

+

Whether you’re just starting to explore agents or trying to tame an unruly prototype, you’ll leave with a clear, actionable blueprint to build something that’s not just smart, but also reliable.

+
+ +

Speaker

+ +
+ + Bilge Yücel + +
+

Bilge Yücel

+ +

Developer Relations Engineer

+ + +

Bilge is a developer relations engineer at deepset, where she helps developers build powerful AI applications and teaches the world how to use Haystack. Passionate about RAG, LLMs, and all things Gen AI, she enjoys making complex AI concepts accessible both online and at real-life events.

+ + + + +
+
+ + +
+ View Full Conference Program +
+ +
+
+ + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/conferences/2025/DBL9PQ.html b/conferences/2025/DBL9PQ.html index b32ede6..dcbd78a 100644 --- a/conferences/2025/DBL9PQ.html +++ b/conferences/2025/DBL9PQ.html @@ -166,7 +166,7 @@

The Importance and Elegance of Polars Expressions

Room: {'en': 'B05-B06'} - Time: 2025-09-02T08:40:00+00:00 + Time: 2025-09-02T10:40:00+02:00 @@ -198,7 +198,7 @@

Speaker

- Jeroen Janssens + Jeroen Janssens

Jeroen Janssens

@@ -211,8 +211,14 @@

Jeroen Janssens

@@ -206,10 +206,10 @@

Speakers

- Kristian Rother + Dr. Kristian Rother
-

Kristian Rother

+

Dr. Kristian Rother

freelance

@@ -219,6 +219,19 @@

Kristian Rother

Kristian has translated 5 Python books and written 2 more himself, in addition to numerous teaching guides. Kristian has collected 364 stars on Advent of Code. His knowledge about async is, unfortunately, miserable. His favorite Python module is 're'. Kristian believes everybody can learn programming.

+ +
diff --git a/conferences/2025/GPZPFP.html b/conferences/2025/GPZPFP.html index e9a1022..73e3c40 100644 --- a/conferences/2025/GPZPFP.html +++ b/conferences/2025/GPZPFP.html @@ -166,7 +166,7 @@

Building Reactive Data Apps with Shinylive and WebAssembly

Room: {'en': 'B05-B06'} - Time: 2025-09-02T10:00:00+00:00 + Time: 2025-09-02T12:00:00+02:00
@@ -199,15 +199,21 @@

Christoph Scheuch

Self-employed

-

Christoph Scheuch is an independent data science and business intelligence expert, currently serving as an external lecturer at Humboldt University of Berlin and as a summer school instructor at the Barcelona School of Economics. He is the co-creator and maintainer of the Tidy Finance project, an open-source initiative promoting transparent and reproducible research in financial economics. +

Christoph Scheuch is an independent data science and business intelligence expert, currently serving as an external lecturer at Humboldt University of Berlin and as a summer school instructor at the Barcelona School of Economics. He is the co-creator and maintainer of the [Tidy Finance](https://www.tidy-finance.org/) project, an open-source initiative promoting transparent and reproducible research in financial economics, and the [EconDataverse](https://www.econdataverse.org/), a universe of open-source packages to work seamlessly with economic data in R and Python. Previously, Christoph held leadership roles at the social trading platform wikifolio.com, including Head of Artificial Intelligence, Director of Product, and Head of BI & Data Science. He has also lectured at the Vienna University of Economics and Business, where he earned his PhD in Finance through the Vienna Graduate School of Finance.

@@ -200,7 +200,7 @@

Speakers

- Michele Dolfi + Michele Dolfi

Michele Dolfi

@@ -211,12 +211,27 @@

Michele Dolfi

Dr. Michele Dolfi is a technical lead in the AI for Knowledge group at IBM Research, focusing on knowledge engineering and understanding. Michele is one of the researchers who created the Deep Search platform and the Docling open source project. His expertise spans from artificial intelligence to high performance computing and quantum systems.

+ +
- Christoph Auer + Christoph Auer

Christoph Auer

@@ -225,6 +240,19 @@

Christoph Auer

+ +
diff --git a/conferences/2025/GRZ3RG.html b/conferences/2025/GRZ3RG.html index bd3f7fd..2090809 100644 --- a/conferences/2025/GRZ3RG.html +++ b/conferences/2025/GRZ3RG.html @@ -4,13 +4,13 @@ A Beginner's Guide to State Space Modeling - PyData Berlin 2025 - + - + @@ -18,7 +18,7 @@ - + @@ -166,7 +166,7 @@

A Beginner's Guide to State Space Modeling

Room: {'en': 'B09'} - Time: 2025-09-01T08:40:00+00:00 + Time: 2025-09-01T10:40:00+02:00
@@ -256,7 +256,23 @@

Speakers

- Alexandre Andorra +
+ J +
+ +
+

Jesse Grabowski

+ + +

Jesse Grabowski is a PhD candidate at Paris 1 Pantheon-Sorbonne. He is also a principal data scientist at PyMC labs, and a core developer of PyMC, Pytensor, and related packages. His area of research includes time series modeling, macroeconomics, and finance.

+ + +
+
+ +
+ + Alexandre Andorra

Alexandre Andorra

@@ -272,8 +288,14 @@

Alexandre Andorra

-
- -
- J -
- -
-

Jesse Grabowski

- - -

Jesse Grabowski is a PhD candidate at Paris 1 Pantheon-Sorbonne. He is also a principal data scientist at PyMC labs, and a core developer of PyMC, Pytensor, and related packages. His area of research includes time series modeling, macroeconomics, and finance.

- - -
-
-
View Full Conference Program diff --git a/conferences/2025/GW9EXL.html b/conferences/2025/GW9EXL.html index ddc32c1..41d75f6 100644 --- a/conferences/2025/GW9EXL.html +++ b/conferences/2025/GW9EXL.html @@ -166,7 +166,7 @@

Consumer Choice Models with PyMC Marketing

Room: {'en': 'B05-B06'} - Time: 2025-09-01T14:20:00+00:00 + Time: 2025-09-01T16:20:00+02:00
@@ -206,6 +206,10 @@

Nathaniel Forde

@@ -221,7 +221,7 @@

Speaker

- Cainã Max Couto da Silva + Cainã Max Couto da Silva

Cainã Max Couto da Silva

@@ -234,8 +234,14 @@

Cainã Max Couto da Silva

@@ -191,7 +191,7 @@

Speaker

- Tobias Lampert + Tobias Lampert

Tobias Lampert

@@ -202,6 +202,17 @@

Tobias Lampert

An accomplished technical leader, Tobias brings over two decades of experience in software development, complemented by profound expertise in Data Science and Data Engineering. His career has focused on the end-to-end design and implementation of complex data-intensive applications, spanning the full lifecycle from data ingestion to deployment. In his current role at Lotum he is tackling a data volume of several hundred million events from mobile games per day.

+ +
diff --git a/conferences/2025/HUNUEB.html b/conferences/2025/HUNUEB.html index fd009cb..376e8bb 100644 --- a/conferences/2025/HUNUEB.html +++ b/conferences/2025/HUNUEB.html @@ -166,7 +166,7 @@

Causal Inference in Network Structures: Lessons learned From Financial Servi Room: {'en': 'B05-B06'} - Time: 2025-09-02T09:20:00+00:00 + Time: 2025-09-02T11:20:00+02:00

@@ -194,9 +194,7 @@

Speaker

-
- D -
+ Danial Senejohnny

Danial Senejohnny

diff --git a/conferences/2025/HYGHBG.html b/conferences/2025/HYGHBG.html index 357ca8a..b7eee3b 100644 --- a/conferences/2025/HYGHBG.html +++ b/conferences/2025/HYGHBG.html @@ -3,23 +3,29 @@ -Keynote 45 Min (Laura and Andy: "PyData 2077: a data science future retrospective") - PyData Berlin 2025 - +PyData 2077: a data science future retrospective - PyData Berlin 2025 + - - - + + + - - - + + + @@ -141,12 +147,18 @@

PyData Berlin

Education, Career & Life -

Keynote 45 Min (Laura and Andy: "PyData 2077: a data science future retrospective")

-

Talk (long)

+

PyData 2077: a data science future retrospective

+

Keynote

+
+
+ PyData 2077: a data science future retrospective - Session Card +
+
+
@@ -160,12 +172,17 @@

Keynote 45 Min (Laura and Andy: "PyData 2077: a data science future retrospe Room: {'en': 'Kuppelsaal'} - Time: 2025-09-01T07:20:00+00:00 + Time: 2025-09-01T09:20:00+02:00

Abstract

-

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut massa fringilla auctor sit amet sed enim. Aenean mollis mi vel orci lacinia, a gravida purus mattis. Fusce nec nisl sit amet mauris hendrerit varius ac in nisi. Curabitur tempor purus tellus, non mattis ante ultricies quis. Curabitur id porttitor metus, et porta purus.

+

From: Chrono-Regulatory Commission, Temporal Enforcement Division +To: PyData Berlin Organising Committee +Subject: Citation #TMP-2077-091 - Unauthorised Spacetime Disturbance + +Dear Committee, +Our temporal monitoring systems have detected an unauthorised chronological anomaly emanating from your facility (Berliner Congress Center, coordinates 52.52068°N, 13.416451°E) scheduled to manifest on September 1st at 9:20 a.m.

Prerequisites

@@ -174,10 +191,84 @@

Prerequisites

Description

-

Sed mattis dui quis porttitor bibendum. Cras laoreet sollicitudin velit quis mollis. Ut vitae eleifend ex, eu suscipit lorem. Pellentesque eget faucibus lorem. Sed porta commodo pellentesque. Sed feugiat lectus sed nisl venenatis vestibulum. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas porttitor fringilla leo, eget finibus orci. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Aliquam auctor urna ut diam hendrerit rutrum et et mi. Nulla tortor sem, venenatis vulputate lacus convallis, pulvinar condimentum eros.

+

VIOLATION DETAILS:
+- Unauthorized temporal incursion detected
+- Speakers identified as: Kitchen, A. & Summers, L. (baseline timeline)
+- Anomalous data signatures suggest retrospective analysis from non-contemporaneous source
+- Evidence of information leakage: late 21st-century technological practices and standards
+- Risk assessment: Moderate timeline contamination potential

+

REGULATORY COMPLIANCE REQUIRED:
+Per Temporal Code Section 2077.3, you are hereby notified that failure to contain this spacetime disturbance will result in fines of up to 50,000 temporal credits. You must ensure adequate attendance at the specified coordinates to properly observe and contain the anomaly as it unfolds.
+WARNING: Preliminary scans indicate the transmission contains advanced analytical frameworks and critical commentary on primitive early-21st-century data science practices. Attendees may experience paradigm shifts, changes to mental models, or sudden clarity regarding field trajectories.

+

Sincerely,
+Compliance Officer Z-7749
+Chrono-Regulatory Commission
+"Keeping Yesterday Safe for Tomorrow"

-

Speaker

+

Speakers

+ +
+ + Laura Summers + +
+

Laura Summers

+ +

Lead Design Engineer

+ + +

Laura is a very technical designer™️, working at Pydantic as Lead Design Engineer. Her side projects include Sweet Summer Child Score (summerchild.dev) and Ethics Litmus Tests (ethical-litmus.site). Laura is passionate about feminism, digital rights and designing for privacy. She speaks, writes and runs workshops at the intersection of design and technology.

+ + + + +
+
+ +
+ + Andy Kitchen + +
+

Andy Kitchen

+ +

AI Researcher

+ + +

Andy Kitchen is a hacker, startup founder and AI/Neuroscience researcher. Let's grab a beer and talk about philosophy, computer science and society (and science fiction while we're at it!)

+ + + + +
+
diff --git a/conferences/2025/JE8YJT.html b/conferences/2025/JE8YJT.html index dc01893..ce8f7d4 100644 --- a/conferences/2025/JE8YJT.html +++ b/conferences/2025/JE8YJT.html @@ -166,7 +166,7 @@

What’s Really Going On in Your Model? A Python Guide to Explainable AI

Room: {'en': 'B05-B06'} - Time: 2025-09-01T12:20:00+00:00 + Time: 2025-09-01T14:20:00+02:00
@@ -194,7 +194,7 @@

Speaker

- Yashasvi Misra + Yashasvi Misra

Yashasvi Misra

@@ -205,6 +205,17 @@

Yashasvi Misra

Yashasvi Misra is a Data Engineer at Pure Storage and Chair of the NumFOCUS Code of Conduct Working Group, where she helps foster inclusive practices across the open-source ecosystem. She has contributed to foundational projects like NumPy and has been an active part of the Python community since her college days. Yashasvi is also a passionate advocate for diversity and inclusion in tech. She introduced a period leave policy at a previous organisation and continues to work toward building more equitable workplaces. She has shared her work and insights at conferences around the world, including PyCon India, PyCon Europe, PyLadiesCon, and PyData Global.

+ +
diff --git a/conferences/2025/JEKYLT.html b/conferences/2025/JEKYLT.html index ca70974..c9e75d3 100644 --- a/conferences/2025/JEKYLT.html +++ b/conferences/2025/JEKYLT.html @@ -4,13 +4,13 @@ Navigating healthcare scientific knowledge:building AI agents for accurate biomedical data retrieval - PyData Berlin 2025 - + - + @@ -18,7 +18,7 @@ - + @@ -166,7 +166,7 @@

Navigating healthcare scientific knowledge:building AI agents for accurate b Room: {'en': 'B05-B06'} - Time: 2025-09-02T13:00:00+00:00 + Time: 2025-09-02T15:00:00+02:00 @@ -225,7 +225,7 @@

Speaker

-

Laura

+

Laura Dumont

Senior Machine learning engineer

@@ -233,6 +233,21 @@

Laura

I have worked in the healthcare industry for more than 10 years and am currently a senior machine learning engineer at Owkin. Committed to open source and open science principles, I aspire to leverage Python and data science for social good, focusing on health, inclusion, and projects that make a meaningful difference in people's lives.

+ +
diff --git a/conferences/2025/JKEHMH.html b/conferences/2025/JKEHMH.html index 153138f..1da16f6 100644 --- a/conferences/2025/JKEHMH.html +++ b/conferences/2025/JKEHMH.html @@ -11,7 +11,7 @@ - + @@ -19,7 +19,7 @@ - + @@ -142,11 +142,17 @@

PyData Berlin

PyData & Scientific Libraries Stack

Narwhals: enabling universal dataframe support

-

Talk (long)

+

Keynote

+
+
+ Narwhals: enabling universal dataframe support - Session Card +
+
+
@@ -160,7 +166,7 @@

Narwhals: enabling universal dataframe support

Room: {'en': 'Kuppelsaal'} - Time: 2025-09-02T07:10:00+00:00 + Time: 2025-09-02T09:10:00+02:00
@@ -208,6 +214,19 @@

Marco Gorelli

He has a background in Mathematics and holds an MSc from the University of Oxford, and was one of the prize winners in the M6 Forecasting Competition (2nd place overall Q1).

+ + diff --git a/conferences/2025/KBEEHS.html b/conferences/2025/KBEEHS.html new file mode 100644 index 0000000..410db38 --- /dev/null +++ b/conferences/2025/KBEEHS.html @@ -0,0 +1,288 @@ + + + + + +Accessible Data Visualizations - PyData Berlin 2025 + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+ + Visualisation & Jupyter + +

Accessible Data Visualizations

+

Talk

+
+
+ + +
+
+ Accessible Data Visualizations - Session Card +
+
+ + +
+
+ +
+ Level: Novice + + Company/Institute: work: cusy GmbH Softwareentwicklung, uni: Berliner Hochschule für Technik (BHT Berlin) + + + Room: {'en': 'B07-B08'} + + + Time: 2025-09-01T12:00:00+02:00 + +
+ +

Abstract

+

Data visualizations often exclude users with visual impairments and temporary or situational constraints. Many regulations (European Accessibility Act, Americans with Disabilities Act) now mandate inclusive digital content. Our research provides practical solutions (optimized color palettes, supplementary patterns, and alternative formats) implemented in popular libraries like Bokeh and Vega-Altair. These techniques, available through our open-source cusy Design System, create visualizations that reach broader audiences while meeting compliance requirements and improving comprehension for all users.

+ + +

Prerequisites

+

There are no strict prerequisites, but the talk will be most interesting if you have basic knowledge of data visualization and an interest in making visualizations accessible.

+ + +

Description

+
+

Introduction

+

Accessible data visualizations extend beyond aesthetics to meet established standards and accommodate diverse visual abilities. This presentation demonstrates how to create visualizations that comply with Web Content Accessibility Guidelines (WCAG) contrast requirements, support users with color vision deficiencies, and convey information through multiple encoding channels. The topics in the presentation explore practical techniques using colors, patterns, SVG accessibility features, and alternative data formats.

+

This presentation is designed for data scientists, visualization specialists, dashboard designers, and accessibility auditors who need to communicate findings effectively to diverse audiences. Attendees will benefit by:

+
    +
  • Learning practical techniques to make visualizations accessible without sacrificing analytical depth
  • +
  • Gaining implementation strategies for common data visualization libraries
  • +
  • Acquiring skills to expand their reach to users with visual impairments
  • +
  • Taking away ready-to-use color palettes and pattern sets for immediate implementation
  • +
+

Topics

+

Color Accessibility

+

Data visualizations must meet WCAG contrast ratios (≥3:1) for distinguishable elements. Our optimized palette features:

+
    +
  • Eight distinct colors plus neutral gray for invalid data
  • +
  • CIEDE2000 perceptual differences >20 between colors
  • +
  • Verified compatibility with various color vision deficiencies
  • +
  • Print-friendly CMYK values (ISO Coated V2 300% or Pantone C)
  • +
  • Contrast ratios >3.0 (WCAG AA-level) against white and black backgrounds
  • +
+

Pattern Implementation

+

Patterns provide critical secondary encoding when color alone is insufficient. We'll present:

+
    +
  • Unique pattern paired with each color
  • +
  • Area fills that maintain distinction at various scales
  • +
  • Sequential pattern densities for quantitative data
  • +
  • Pattern elements adaptable as point markers
  • +
  • Implementation via SVG <pattern> tags
  • +
+

Technical Implementation

+

Practical examples will demonstrate:

+
    +
  • Using color contrast checkers for validation
  • +
  • Implementing SVG <pattern> elements
  • +
  • Creating accessible SVG with proper ARIA attributes
  • +
  • Providing alternative data formats (e.g. HTML tables with semantic descriptions)
  • +
  • Testing with screen readers and accessibility tools
  • +
+

Conclusion

+

Implementing these practices creates data visualizations that are not only compliant with accessibility regulations but also more effective for all users. The cusy Design System offers open-source resources to implement these techniques across various visualization libraries.

+
+ +

Speaker

+ +
+ +
+ M +
+ +
+

Maris Nieuwenhuis

+ +

Working student

+ + +

## Junior Dev +- TS/JS, Python, Java, and a teeny bit o' C++ +- WebDev, DataViz, Backend-Buzz + +#a11y

+ + + + +
+
+ + +
+ View Full Conference Program +
+ +
+
+ + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/conferences/2025/KCPVYN.html b/conferences/2025/KCPVYN.html index d8ada65..983e7ea 100644 --- a/conferences/2025/KCPVYN.html +++ b/conferences/2025/KCPVYN.html @@ -166,7 +166,7 @@

🛰️➡️🧑‍💻: Streamlining Satellite Data for Analysis-Ready Out Room: {'en': 'B05-B06'} - Time: 2025-09-01T08:40:00+00:00 + Time: 2025-09-01T10:40:00+02:00 @@ -188,7 +188,7 @@

Speaker

- Vinayak Nair + Vinayak Nair

Vinayak Nair

@@ -199,6 +199,19 @@

Vinayak Nair

Remote Sensing & Space System Engineer | Innovating AI-Powered Geospatial Solutions | Expert in Satellite Data and Infrastructure Monitoring

+ +
diff --git a/conferences/2025/KEJJSP.html b/conferences/2025/KEJJSP.html index 533a2d4..0b1a1d9 100644 --- a/conferences/2025/KEJJSP.html +++ b/conferences/2025/KEJJSP.html @@ -163,10 +163,10 @@

Template-based web app and deployment pipeline at an enterprise-ready level Company/Institute: NKD Group GmbH - Room: {'en': 'B05-B06'} + Room: {'en': 'B07-B08'} - Time: 2025-09-03T12:20:00+00:00 + Time: 2025-09-02T16:00:00+02:00 @@ -226,7 +226,7 @@

Speaker

- Johannes Schöck + Johannes Schöck

Johannes Schöck

@@ -234,9 +234,22 @@

Johannes Schöck

Senior Data Scientist

-

A studied natural scientist, Johannes tought himself data science skills and now does what he loves: to solve data and tech challenges that generate value.

+

A natural scientist by training and an expert in power semiconductors, Johannes taught himself data science skills and now does what he loves: solving data and tech challenges that generate value.

+ +
diff --git a/conferences/2025/KKWBKK.html b/conferences/2025/KKWBKK.html index b620a25..6e71724 100644 --- a/conferences/2025/KKWBKK.html +++ b/conferences/2025/KKWBKK.html @@ -166,7 +166,7 @@

Edge of Intelligence: The State of AI in Browsers

Room: {'en': 'B05-B06'} - Time: 2025-09-03T08:40:00+00:00 + Time: 2025-09-03T10:40:00+02:00 @@ -209,6 +209,10 @@

Johannes Kolbe

+
+
+ Building Bridges, Not Silos: Lessons from Running a Data & ML/AI Engineering Guild at Vattenfall - Session Card +
+
+
@@ -157,11 +163,15 @@

Building Bridges, Not Silos: Lessons from Running a Data & ML/AI Engineering Company/Institute: Vattenfall + Room: {'en': 'B05-B06'} + + + Time: 2025-09-01T15:40:00+02:00

Abstract

-

In large organizations, data and AI talent often work in fragmented teams, making cross-pollination of ideas, tools, and best practices a challenge. At Vattenfall, we addressed this by founding the “Data & ML/AI Engineering Guild” — a cross-functional community dedicated to sharing knowledge, aligning on technical standards, and accelerating innovation across business units.

+

In large organizations, data and AI talent often work in fragmented teams, making cross-pollination of ideas, tools, and best practices a challenge. At Vattenfall, we addressed this by founding the “Data & ML/AI Engineering Guild”: a cross-functional community dedicated to sharing knowledge, aligning on technical standards, and accelerating innovation across business units.

Prerequisites

@@ -170,7 +180,7 @@

Prerequisites

Description

-

As data professionals, we talk a lot about breaking down data silos — but the reality is, silos exist not just in our data, but in our teams. At Vattenfall, data and ML/AI engineers are spread across different units, projects, and domains. We often face similar challenges, reinvent similar solutions, and learn the hard way: in parallel, and in isolation.

+

As data professionals, we talk a lot about breaking down data silos, but the reality is, silos exist not just in our data, but in our teams. At Vattenfall, data and ML/AI engineers are spread across different units, projects, and domains. We often face similar challenges, reinvent similar solutions, and learn the hard way: in parallel, and in isolation.

To address this, we started something simple but powerful: a community. The "Data & ML/AI Engineering Guild" began as a small group of engineers coming together to share learnings and frustrations. Over time, it evolved into a cross-functional space where we exchange knowledge, run internal talks, and build technical momentum across the organization.

In this talk, I’ll walk through how we built and scaled this guild: what worked, what didn’t, and what we’re still figuring out. I’ll share the formats we use to keep engagement high (even when calendars are packed) and how we balance deep technical discussions with accessibility. I’ll also reflect on how this kind of community work complements our day jobs, and how it helps engineers grow beyond the boundaries of their product teams.

If you’re thinking about starting a tech guild, already running one, or just curious how to create more connection and consistency in your data/ML org, this talk is for you. My goal is to share honest lessons (not just success stories) and hopefully spark ideas you can adapt in your own context.

@@ -180,7 +190,7 @@

Speaker

- Anastasia Karavdina + Anastasia Karavdina

Anastasia Karavdina

@@ -195,6 +205,19 @@

Anastasia Karavdina

During my free time, I like learning new tools and techniques and implementing them in end-to-end AI/ML and IoT projects. My experience has also been very helpful in guiding data analysts, data scientists, and machine learning engineers as a mentor and contributing to the growth of the next generation of elite data scientists.

+ +
diff --git a/conferences/2025/LZYBVH.html b/conferences/2025/LZYBVH.html index 55a77c3..2705e87 100644 --- a/conferences/2025/LZYBVH.html +++ b/conferences/2025/LZYBVH.html @@ -4,13 +4,13 @@ The EU AI Act: Unveiling Lesser-Known Aspects, Implementation Entities, and Exemptions - PyData Berlin 2025 - + - + @@ -18,7 +18,7 @@ - + @@ -166,7 +166,7 @@

The EU AI Act: Unveiling Lesser-Known Aspects, Implementation Entities, and Room: {'en': 'B05-B06'} - Time: 2025-09-01T11:40:00+00:00 + Time: 2025-09-01T13:40:00+02:00

@@ -195,10 +195,10 @@

Speaker

- Adrin + Adrin Jalali
-

Adrin

+

Adrin Jalali

VP Labs

@@ -206,6 +206,19 @@

Adrin

Adrin is VP Labs at probabl.ai and has a PhD in computational biology. He is also a maintainer of open source projects such as scikit-learn and fairlearn. He focuses on developer tools in the statistical machine learning and responsible ML space.

+ +
diff --git a/conferences/2025/NUNXEV.html b/conferences/2025/NUNXEV.html index 42501d4..548d717 100644 --- a/conferences/2025/NUNXEV.html +++ b/conferences/2025/NUNXEV.html @@ -172,7 +172,7 @@

One API to Rule Them All? LiteLLM in Production

Room: {'en': 'B07-B08'} - Time: 2025-09-02T10:00:00+00:00 + Time: 2025-09-02T12:00:00+02:00 @@ -216,6 +216,17 @@

Alina Dallmann

Alina Dallmann is a computer scientist currently working as a Data Scientist at scieneers GmbH. Her enthusiasm for classical software development and data-driven projects has recently come together in various projects focused on building retrieval-augmented generation (RAG) systems.

+ + diff --git a/conferences/2025/PPAYDV.html b/conferences/2025/PPAYDV.html index 55fc097..d643049 100644 --- a/conferences/2025/PPAYDV.html +++ b/conferences/2025/PPAYDV.html @@ -4,22 +4,22 @@ Better docs, happier users: What we learned applying Diataxis to HoloViz libraries - PyData Berlin 2025 - + - - + + - - + + @@ -147,6 +147,12 @@

Better docs, happier users: What we learned applying Diataxis to HoloViz lib

+
+
+ Better docs, happier users: What we learned applying Diataxis to HoloViz libraries - Session Card +
+
+
@@ -157,6 +163,10 @@

Better docs, happier users: What we learned applying Diataxis to HoloViz lib Company/Institute: Anaconda + Room: {'en': 'B07-B08'} + + + Time: 2025-09-03T12:00:00+02:00

@@ -179,10 +189,10 @@

Speaker

- Maxime  Liquet + Maxime Liquet
-

Maxime Liquet

+

Maxime Liquet

Software Engineer

@@ -193,6 +203,10 @@

Maxime Liquet

@@ -213,7 +213,7 @@

Speaker

- Veit Schiele + Veit Schiele

Veit Schiele

@@ -228,8 +228,14 @@

Veit Schiele

diff --git a/conferences/2025/SB88M7.html b/conferences/2025/SB88M7.html index 907d3c1..2108918 100644 --- a/conferences/2025/SB88M7.html +++ b/conferences/2025/SB88M7.html @@ -166,7 +166,7 @@

Beyond the Black Box: Interpreting ML models with SHAP

Room: {'en': 'B07-B08'} - Time: 2025-09-01T14:20:00+00:00 + Time: 2025-09-01T16:20:00+02:00
@@ -251,6 +251,19 @@

Avik Basu

Outside of work, he explores the intersection of machine learning, personal finance, and open-source tools, aiming to build software that is accessible, self-hostable, and privacy-focused. He is driven by a strong belief in community, transparency, and empowering others through education and mentorship.

+ +
diff --git a/conferences/2025/SCQE8H.html b/conferences/2025/SCQE8H.html index 11da1d8..8f3e5d1 100644 --- a/conferences/2025/SCQE8H.html +++ b/conferences/2025/SCQE8H.html @@ -166,7 +166,7 @@

Spot the difference: 🕵️ using foundation models to monitor for change w Room: {'en': 'B07-B08'} - Time: 2025-09-03T11:40:00+00:00 + Time: 2025-09-03T13:40:00+02:00

@@ -204,8 +204,14 @@

Ferdinand Schenck

@@ -206,7 +206,7 @@

Speaker

- Yaseen Esmaeelpour + Yaseen Esmaeelpour

Yaseen Esmaeelpour

@@ -217,6 +217,21 @@

Yaseen Esmaeelpour

I am a data analyst with experience in various sectors including tech and supply chain. I am also a hobby programmer and like to spend my spare time working on cool personal projects.

+ +
diff --git a/conferences/2025/VURY38.html b/conferences/2025/VURY38.html index 924dcb2..3674b1b 100644 --- a/conferences/2025/VURY38.html +++ b/conferences/2025/VURY38.html @@ -166,7 +166,7 @@

Building an A/B Testing Framework with NiceGUI

Room: {'en': 'B07-B08'} - Time: 2025-09-01T15:00:00+00:00 + Time: 2025-09-01T17:00:00+02:00 diff --git a/conferences/2025/W9Q7JY.html b/conferences/2025/W9Q7JY.html index ff48bc1..223d657 100644 --- a/conferences/2025/W9Q7JY.html +++ b/conferences/2025/W9Q7JY.html @@ -166,7 +166,7 @@

Deep Dive into the Synthetic Data SDK

Room: {'en': 'B09'} - Time: 2025-09-02T11:40:00+00:00 + Time: 2025-09-02T13:40:00+02:00 @@ -194,7 +194,7 @@

Speaker

- Tobias Hann + Tobias Hann

Tobias Hann

@@ -205,6 +205,19 @@

Tobias Hann

Tobias is the CEO of MOSTLY AI, the leader in privacy-preserving synthetic data. Originally from Vienna, Austria, he is currently based in Munich, Germany. Before joining MOSTLY AI, Tobias worked as a management consultant with the Boston Consulting Group and in tech start-ups in different leadership roles. He earned a PhD from the Vienna University of Business and Economics and an MBA from the Haas School of Business at UC Berkeley. With his extensive background in strategy and technology, Tobias drives MOSTLY AI’s mission to revolutionize data access and data insights across industries.

+ +
diff --git a/conferences/2025/WGJJQN.html b/conferences/2025/WGJJQN.html index eb0dcb7..25457b5 100644 --- a/conferences/2025/WGJJQN.html +++ b/conferences/2025/WGJJQN.html @@ -4,13 +4,13 @@ How We Automate Chaos: Agentic AI and Community Ops at PyCon DE & PyData - PyData Berlin 2025 - + - + @@ -18,7 +18,7 @@ - + @@ -163,10 +163,10 @@

How We Automate Chaos: Agentic AI and Community Ops at PyCon DE & PyData

Company/Institute: Pioneers Hub - Room: {'en': 'B09'} + Room: {'en': 'B07-B08'} - Time: 2025-09-03T11:40:00+00:00 + Time: 2025-09-02T15:00:00+02:00 @@ -200,7 +200,7 @@

Speaker

-

Alexander C.S. Hendorf

+

Alexander CS Hendorf

Lead Conference Resurrection Engineer

@@ -222,8 +222,14 @@

Alexander C.S. Hendorf

@@ -213,7 +213,7 @@

Speaker

- Elizaveta Zinovyeva + Elizaveta Zinovyeva

Elizaveta Zinovyeva

@@ -224,6 +224,17 @@

Elizaveta Zinovyeva

Liza (Elizaveta) Zinovyeva is an Applied Scientist at AWS Generative AI Innovation Center and is based in Berlin. She helps customers across different industries to integrate Generative AI into their existing applications and workflows. She is passionate about AI/ML, finance and software security topics. In her spare time, she enjoys spending time with her family, sports, learning new technologies, and table quizzes.

+ +
diff --git a/conferences/2025/WXPVCS.html b/conferences/2025/WXPVCS.html index a3a1291..0753bb6 100644 --- a/conferences/2025/WXPVCS.html +++ b/conferences/2025/WXPVCS.html @@ -166,7 +166,7 @@

More than DataFrames: Data Pipelines with the Swiss Army Knife DuckDB

Room: {'en': 'B09'} - Time: 2025-09-01T11:40:00+00:00 + Time: 2025-09-01T13:40:00+02:00
@@ -194,7 +194,7 @@

Speaker

- Mehdi Ouazza + Mehdi Ouazza

Mehdi Ouazza

@@ -209,8 +209,14 @@

Mehdi Ouazza

@@ -195,7 +195,7 @@

Speakers

- Dat Tran + Dat Tran

Dat Tran

@@ -208,8 +208,14 @@

Dat Tran

diff --git a/conferences/2025/XFPTWN.html b/conferences/2025/XFPTWN.html new file mode 100644 index 0000000..a48d4e2 --- /dev/null +++ b/conferences/2025/XFPTWN.html @@ -0,0 +1,274 @@ + + + + + +AI-Ready Data in Action: Powering Smarter Agents - PyData Berlin 2025 + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+ + Data Handling & Engineering + +

AI-Ready Data in Action: Powering Smarter Agents

+

Tutorial

+
+
+ + +
+
+ AI-Ready Data in Action: Powering Smarter Agents - Session Card +
+
+ + +
+
+ +
+ Level: None + + Company/Institute: TBA + + + Room: {'en': 'B09'} + + + Time: 2025-09-01T15:40:00+02:00 + +
+ +

Abstract

+

This hands-on workshop focuses on what AI engineers do most often: making data AI-ready and turning it into production-useful applications. Together with dltHub and LanceDB, you’ll walk through an end-to-end workflow: collecting and preparing real-world data with best practices, managing it in LanceDB, and powering AI applications with search, filters, hybrid retrieval, and lightweight agents. By the end, you’ll know how to move from raw data to functional, production-ready AI setups without the usual friction. We will touch upon multi-modal data and going to production with this end-to-end use case.

+ + +

Prerequisites

+

TBA

+ + +

Description

+
+

Modern AI applications are only as powerful as the data that fuels them. Yet much of the real-world data AI engineers encounter is messy, incomplete, or unoptimized. In this hands-on tutorial, AI-Ready Data in Action: Powering Smarter Agents, participants will walk through the full lifecycle of preparing unstructured data, embedding it into LanceDB, and leveraging it for search and agentic applications. Using a real-world dataset, attendees will incrementally ingest, clean, and vectorize text data, tune hybrid search strategies, and build a lightweight chat agent to surface relevant results. The tutorial concludes by showing how to take a working demo into production. By the end, participants will gain practical experience in bridging the gap between messy raw data and production-ready pipelines for AI applications.

+

Prior knowledge

+
    +
  • Basic Python programming.
  • +
  • Awareness of embeddings, vectors, and AI search concepts (we’ll explain where needed).
  • +
+

The tutorial is designed to be accessible: engineers familiar with Python should be able to follow along step by step.

+

Key Takeaways

+

By the end of the tutorial, participants will:

+
    +
  1. Understand the end-to-end workflow of taking raw, real-world data and preparing it for AI applications.
  2. +
  3. Build and run an incremental dlt pipeline to ingest real data into LanceDB.
  4. +
  5. Apply text preprocessing and generate embeddings for semantic search.
  6. +
  7. Optimize retrieval with vector and hybrid search strategies.
  8. +
  9. Implement a lightweight AI agent capable of surfacing relevant issues from a natural language description.
  10. +
  11. Learn how to transition from a demo project to a production setup using LanceDB Cloud.
  12. +
+

Outline

+
    +
  • Introduce dlt (data load tool) and how it enables schema evolution, incremental loading, and normalization in pipelines.
  • +
  • Introduce LanceDB and explain embeddings, vector search, hybrid retrieval and multi-modal data for AI applications.
  • +
  • Ingest and preprocess a real dataset with dlt, generate embeddings, and load it into LanceDB following best data engineering practices.
  • +
  • Optimize search in LanceDB by tuning parameters, selecting distance metrics, and adding hybrid retrieval.
  • +
  • Build a lightweight AI agent that queries LanceDB and returns the most relevant issues from natural-language prompts.
  • +
  • Demonstrate the path to production using automation, monitoring, and LanceDB Cloud for scaling and reliability.
  • +
  • Conclude with key takeaways and an open Q&A.
  • +
+
+ +

Speakers

+ +
+ +
+ V +
+ +
+

Violetta Mishechkina

+ + + +
+
+ +
+ +
+ C +
+ +
+

Chang

+ + +

Chang is the CEO/Co-founder of LanceDB and has been making data tooling for ML/AI for almost two decades. +One of the original co-authors of the pandas project, Chang started LanceDB to make it easy for AI teams to work with all of the data that doesn't fit neatly into all of those dataframes - from embeddings to images, from audio to video, at petabyte scale.

+ + +
+
+ + + + +
+
+ + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/conferences/2025/XNMYDK.html b/conferences/2025/XNMYDK.html index 8f643e8..ade4db7 100644 --- a/conferences/2025/XNMYDK.html +++ b/conferences/2025/XNMYDK.html @@ -169,7 +169,7 @@

Lane detection in self-driving using only NumPy

Room: {'en': 'B07-B08'} - Time: 2025-09-03T12:20:00+00:00 + Time: 2025-09-03T14:20:00+02:00
@@ -197,7 +197,7 @@

Speaker

- Emma Saroyan + Emma Saroyan

Emma Saroyan

diff --git a/conferences/2025/YKFWKQ.html b/conferences/2025/YKFWKQ.html index eee42d9..e0c7dbb 100644 --- a/conferences/2025/YKFWKQ.html +++ b/conferences/2025/YKFWKQ.html @@ -11,7 +11,7 @@ - + @@ -19,7 +19,7 @@ - + @@ -147,6 +147,12 @@

Beyond Benchmarks: Practical Evaluation Strategies for Compound AI Systems +
+
+ Beyond Benchmarks: Practical Evaluation Strategies for Compound AI Systems - Session Card +
+
+
@@ -157,6 +163,10 @@

Beyond Benchmarks: Practical Evaluation Strategies for Compound AI SystemsCompany/Institute: DataForce Solutions GmbH + Room: {'en': 'B05-B06'} + + + Time: 2025-09-02T14:20:00+02:00

@@ -176,10 +186,10 @@

Description

As large language models become integral to real-world applications, evaluating and improving their performance is a growing challenge. Generic benchmarks and simple metrics fail to adequately assess domain-specific, multi-step reasoning required by compound AI pipelines like retrieval-augmented generation (RAG), multi-tool agents, or knowledge assistants. Moreover, manual evaluation of every step is infeasible at scale, while fully automated LLM-as-a-judge approaches lack critical domain insights.

In this talk, we will present a practical evaluation approach to enable continuous improvement of LLM-powered systems. It incorporates the following stages:
-- Automatic tracing (e.g. using MLFlow, Langfuse, etc.): capturing input/output pairs across the pipeline to build an evaluation dataset.
+- Automatic tracing: capturing input/output pairs across the pipeline to build an evaluation dataset.
- Expert feedback collection: working with subject matter experts and user interactions to assess correctness, and identify failure points.
-- Iterative improvement cycle: tuning the components and/or optimizing prompts (using frameworks like DSPy, TextGrad, etc.).
-- Degradation tests: turning feedback into automated evaluation tests - ranging from exact match checks to LLM-as-a-judge assertions (using approaches like DeepEval) - to guard against regressions.
+- Iterative improvement cycle: tuning the components and/or optimizing prompts.
+- Degradation tests: turning feedback into automated evaluation tests - ranging from exact match checks to LLM-as-a-judge assertions - to guard against regressions.
- Continuous monitoring: using the growing evaluation dataset to validate the system as models, tools, or data sources evolve.

This framework ensures that LLM applications remain reliable and aligned with specific business needs over time.

Target audience: AI practitioners developing and maintaining LLM-based applications.

@@ -194,7 +204,7 @@

Speakers

- Iryna Kondrashchenko + Iryna Kondrashchenko

Iryna Kondrashchenko

@@ -203,6 +213,21 @@

Iryna Kondrashchenko

Iryna is a data scientist and co-founder of DataForce Solutions GmbH, a company specialized in delivering end-to-end data science and AI services. She contributes to several open-source libraries, and strongly believes that open-source products foster a more inclusive tech industry, equipping individuals and organizations with the necessary tools to innovate and compete.

+ +
diff --git a/conferences/2025/ZLJRNN.html b/conferences/2025/ZLJRNN.html new file mode 100644 index 0000000..5f6e8de --- /dev/null +++ b/conferences/2025/ZLJRNN.html @@ -0,0 +1,260 @@ + + + + + +Benchmarking 2000+ Cloud Servers for GBM Model Training and LLM Inference Speed - PyData Berlin 2025 + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+ + Infrastructure - Hardware & Cloud + +

Benchmarking 2000+ Cloud Servers for GBM Model Training and LLM Inference Speed

+

Talk

+
+
+ + +
+
+ Benchmarking 2000+ Cloud Servers for GBM Model Training and LLM Inference Speed - Session Card +
+
+ + +
+
+ +
+ Level: Novice + + Company/Institute: Spare Cores + + + Room: {'en': 'B07-B08'} + + + Time: 2025-09-01T14:20:00+02:00 + +
+ +

Abstract

+

Spare Cores is a Python-based, open-source, and vendor-independent ecosystem collecting, generating, and standardizing comprehensive data on cloud server pricing and performance. In our latest project, we started 2000+ server types across five cloud vendors to evaluate their suitability for serving Large Language Models from 135M to 70B parameters. We tested how efficiently models can be loaded into memory or VRAM, and measured inference speed across varying token lengths for prompt processing and text generation. The published data can help you find the optimal instance type for your LLM serving needs, and we will also share our experiences and challenges with the data collection and insights into general patterns.

+ + +

Prerequisites

+

-

+ + +

Description

+
+

Spare Cores is a vendor-independent, open-source, Python-based ecosystem offering a comprehensive inventory and performance evaluation of servers across cloud providers. We automate the discovery and provisioning of thousands of server types in public using GitHub Actions to run hardware inspection tools and benchmarks for different workloads, including:
+- General performance (GeekBench, PassMark)
+- Memory bandwidth and compression algorithms
+- OpenSSL, Redis, and web serving speed
+- DS/ML-specific benchmarks like GBM training and LLM inference on CPUs and GPUs

+

All results and open-source tools (such as database dumps, APIs, and SDKs) are openly published to help users identify and launch the most cost-efficient instance type for their specific use case in their own cloud environment.

+

This talk introduces the open-source ecosystem, then highlights our latest benchmarking efforts, including the performance evaluation of ~2,000 server types to determine the largest LLM (from 135M to 70B parameters) that can be loaded on each machine and the inference speeds achievable with various token lengths for prompt processing and text generation.

+
+ +

Speaker

+ +
+ +
+ G +
+ +
+

Gergely Daroczi

+ +

Hack of all trades, master of NaN

+ + +

Gergely Daroczi, PhD, has been a passionate R/Python user and package developer for two decades. With over 15 years in the industry, he has expertise in data science, engineering, cloud infrastructure, and data operations across SaaS, fintech, adtech, and healthtech startups in California and Hungary, focusing on building scalable data platforms. Gergely maintains a dozen open-source R and Python projects and organizes a tech meetup with 1,800 members in Hungary, along with other open-source and data conferences.

+ + + + +
+
+ + + + +
+
+ + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/conferences/2025/ZXTLEW.html b/conferences/2025/ZXTLEW.html index 85d315f..005f6f9 100644 --- a/conferences/2025/ZXTLEW.html +++ b/conferences/2025/ZXTLEW.html @@ -166,7 +166,7 @@

Forget the Cloud: Building Lean Batch Pipelines from TCP Streams with Python Room: {'en': 'B09'} - Time: 2025-09-02T14:00:00+00:00 + Time: 2025-09-02T16:00:00+02:00

@@ -209,7 +209,7 @@

Speaker

- Orell Garten + Orell Garten

Orell Garten

@@ -220,6 +220,19 @@

Orell Garten

Software and data engineering consultant. I build data systems that help businesses to answer questions about their business. I like solving problems in a pragmatic way.

+ +
diff --git a/conferences/2025/index.html b/conferences/2025/index.html index ae09f4f..863e222 100644 --- a/conferences/2025/index.html +++ b/conferences/2025/index.html @@ -252,6 +252,8 @@

PyData Berlin 2025 - Sessions

+ + @@ -279,7 +281,7 @@

- By: Maxime Liquet + By: Maxime Liquet
@@ -300,7 +302,7 @@

- Building Bridges, Not Silos: Lessons from Running a Data & ML/AI Engineering community at Vattenfall + Building Bridges, Not Silos: Lessons from Running a Data & ML/AI Engineering Guild at Vattenfall

@@ -312,37 +314,12 @@

-
-
-
- - - Community & Diversity - - - Talk -
-
- -

- Building a Thriving Tech Ecosystem: The Role of PyLadies in Fostering Growth and Inclusion -

- -
- By: Gertrude Abagale -
- -
- The global tech ecosystem continues to grow, yet challenges like limited mentorship, a lack of role models, and fragmented community support hinder progress, especially for underrepresented groups. Py... -
-
- -
+
- - Community & Diversity + + Computer Vision (incl. Generative AI CV) Talk @@ -350,15 +327,16 @@

- Not Just Code: Building Communities That Don’t Burn People Out + Lane detection in self-driving using only NumPy

- By: AISHAT MUIBUDEEN (Maya) + By: Emma Saroyan
- Open source runs on passion, but passion is not a renewable resource. This talk will explore the hidden emotional and social costs of contributing to open-source projects. From burnout to invisibility... + Are you a scientist or a developer looking to understand how to use NumPy to solve computer vision problems? +NumPy is a Python package that provides the multidimensional array object which you can us...
@@ -375,41 +353,40 @@

- Lane detection in self-driving using only NumPy + Spot the difference: 🕵️ using foundation models to monitor for change with satellite imagery 🛰️

- By: Emma Saroyan + By: Ferdinand Schenck
- Are you a scientist or a developer looking to understand how to use NumPy to solve computer vision problems? -NumPy is a Python package that provides the multidimensional array object which you can us... + Energy infrastructure is vulnerable to damage by erosion or third party interference, which often takes the form of unsanctioned construction. In this talk we discuss our experiences using deep learni...
-
+
- - Computer Vision (incl. Generative AI CV) + + Data Handling & Engineering - Talk + Tutorial

- Spot the difference: 🕵️ using foundation models to monitor for change with satellite imagery 🛰️ + AI-Ready Data in Action: Powering Smarter Agents

- By: Ferdinand Schenck + By: Violetta Mishechkina, Chang
- Energy infrastructure is vulnerable to damage by erosion or third party interference, which often takes the form of unsanctioned construction. In this talk we discuss our experiences using deep learni... + This hands-on workshop focuses on what AI engineers do most often: making data AI-ready and turning it into production-useful applications. Together with dltHub and LanceDB, you’ll walk through an end...
@@ -580,7 +557,7 @@

- By: Alexander C.S. Hendorf + By: Alexander CS Hendorf
@@ -651,40 +628,41 @@

- 🛰️➡️🧑‍💻: Streamlining Satellite Data for Analysis-Ready Outputs + When Postgres is enough: solving document storage, pub/sub and distributed queues without more tools

- By: Vinayak Nair + By: Eugen Geist
- I will share how our team built an end-to-end system to transform raw satellite imagery into analysis-ready datasets for use cases like vegetation monitoring, deforestation detection, and identifying ... + When a new requirement appears, whether it's document storage, pub/sub messaging, distributed queues, or even full-text search, Postgres can often handle it without introducing more infrastructure. + ...
-
+
- - Education, Career & Life + + Data Handling & Engineering - Talk (long) + Talk

- Keynote 45 Min (Laura and Andy: "PyData 2077: a data science future retrospective") + 🛰️➡️🧑‍💻: Streamlining Satellite Data for Analysis-Ready Outputs

- By: + By: Vinayak Nair
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... + I will share how our team built an end-to-end system to transform raw satellite imagery into analysis-ready datasets for use cases like vegetation monitoring, deforestation detection, and identifying ...
@@ -696,7 +674,7 @@

Education, Career & Life - Talk (long) + Keynote

@@ -709,8 +687,8 @@

- How we sustain what we build — and why the future of tech depends on care, not only code. - + How we sustain what we build — and why the future of tech depends on care, not only code. + The last five years have reshaped tech — through a pandemic, economic uncertainty, shifting politics, and t...
@@ -732,7 +710,7 @@

- By: Kristian Rother, Shreyaasri Prakash + By: Dr. Kristian Rother, Shreyaasri Prakash
@@ -740,6 +718,36 @@

+
+
+
+ + + Education, Career & Life + + + Keynote +
+
+ +

+ PyData 2077: a data science future retrospective +

+ +
+ By: Laura Summers, Andy Kitchen +
+ +
+ From: Chrono-Regulatory Commission, Temporal Enforcement Division +To: PyData Berlin Organising Committee +Subject: Citation #TMP-2077-091 - Unauthorised Spacetime Disturbance + +Dear Committee, +Our ... +
+
+
@@ -757,7 +765,7 @@

- By: Adrin + By: Adrin Jalali
@@ -798,20 +806,20 @@

Generative AI - Talk + Talk [Sponsored]

- A quarter decade of learnings from scaling RAG to millions of users + Automating Content Creation with LLMs: A Journey from Manual to AI-Driven Excellence

- By: Jakob Pörschmann + By: Marco Vene
- Drawing on experience at Google designing 50+ RAG applications rolled out to millions of users, this talk presents a practical RAG design blueprint. We'll dissect key decision points for building robu... + In the fast-paced realm of travel experiences, GetYourGuide encountered the challenge of maintaining consistent, high-quality content across its global marketplace. Manual content creation by supplier...
@@ -882,7 +890,7 @@

- By: Laura + By: Laura Dumont
@@ -911,8 +919,8 @@

- Using LiteLLM in a Real-World RAG System: What Worked and What Didn’t - + Using LiteLLM in a Real-World RAG System: What Worked and What Didn’t + LiteLLM provides a unified interface to work with multiple LLM providers—but how well does it hold up in practice? In this talk...
@@ -930,15 +938,15 @@

- Edge of Intelligence: The State of AI in Browsers + Benchmarking 2000+ Cloud Servers for GBM Model Training and LLM Inference Speed

- By: Johannes Kolbe + By: Gergely Daroczi
- API calls suck! Okay, not all of them. But building your AI features reliant on third party APIs can bring a lot of trouble. In this talk you'll learn how to use web technologies to become more indepe... + Spare Cores is a Python-based, open-source, and vendor-independent ecosystem collecting, generating, and standardizing comprehensive data on cloud server pricing and performance. In our latest project...
@@ -955,15 +963,15 @@

- Flying Beyond Keywords: Our Aviation Semantic Search Journey + Edge of Intelligence: The State of AI in Browsers

- By: Dat Tran, Dennis Schmidt + By: Johannes Kolbe
- In aviation, search isn’t simple—people use abbreviations, slang, and technical terms that make exact matching tricky. We started with just Postgres, aiming for something that worked. Over time, we up... + API calls suck! Okay, not all of them. But building your AI features reliant on third party APIs can bring a lot of trouble. In this talk you'll learn how to use web technologies to become more indepe...
@@ -975,29 +983,29 @@

Infrastructure - Hardware & Cloud - Talk (long) + Talk

- Template-based web app and deployment pipeline at an enterprise-ready level on Azure + Flying Beyond Keywords: Our Aviation Semantic Search Journey

- By: Johannes Schöck + By: Dat Tran, Dennis Schmidt
- A practical deep-dive into Azure DevOps pipelines, the Azure CLI, and how to combine pipeline, bicep, and python templates to build a fully automated web app deployment system. Deploying a new proof o... + In aviation, search isn’t simple—people use abbreviations, slang, and technical terms that make exact matching tricky. We started with just Postgres, aiming for something that worked. Over time, we up...
-
+
- - Natural Language Processing & Audio (incl. Generative AI NLP) + + Infrastructure - Hardware & Cloud Talk @@ -1005,99 +1013,99 @@

- Beyond Benchmarks: Practical Evaluation Strategies for Compound AI Systems + Scaling Python: An End-to-End ML Pipeline for ISS Anomaly Detection with Kubeflow and MLFlow

- By: Iryna Kondrashchenko, Oleh Kostromin + By: Christian Geier
- Evaluating large language models (LLMs) in real-world applications goes far beyond standard benchmarks. When LLMs are embedded in complex pipelines, choosing the right models, prompts, and parameters ... + Building and deploying scalable, reproducible machine learning pipelines can be challenging, especially when working with orchestration tools like Slurm or Kubernetes. In this talk, we demonstrate how...
-
+
- - Natural Language Processing & Audio (incl. Generative AI NLP) + + Infrastructure - Hardware & Cloud - Talk + Talk (long)

- Bridging Custom Schemas and Wikidata with an LLM-Assisted Interactive Python Tool + Template-based web app and deployment pipeline at an enterprise-ready level on Azure

- By: Sankalp Gilda, Ph.D. + By: Johannes Schöck
- Many projects build knowledge graphs with custom schemas but struggle to align them with standard hubs like Wikidata. Manual mapping is tedious and error-prone, while fully automated methods often lac... + A practical deep-dive into Azure DevOps pipelines, the Azure CLI, and how to combine pipeline, bicep, and python templates to build a fully automated web app deployment system. Deploying a new proof o...
-
+
- - Natural Language Processing & Audio (incl. Generative AI NLP) + + Lightning Talks - Talk + Plenary Session [Organizers]

- From Months to Minutes: Accelerating Compliance Reviews with GenAI + Lightning Talks

- By: Elizaveta Zinovyeva + By:
- Transform time-consuming document compliance reviews into automated workflows with Generative AI. Through live demonstrations, learn how to build systems that extract policies from unstructured data, ... + Lightning Talks are short, 5-minute presentations open to all attendees. They’re a fun and fast-paced way to share ideas, showcase projects, spark discussions, or raise awareness about topics you care...
-
+
- - PyData & Scientific Libraries Stack + + Natural Language Processing & Audio (incl. Generative AI NLP) - Tutorial + Talk

- A Beginner's Guide to State Space Modeling + Beyond Benchmarks: Practical Evaluation Strategies for Compound AI Systems

- By: Alexandre Andorra, Jesse Grabowski + By: Iryna Kondrashchenko, Oleh Kostromin
- **State Space Models** (SSMs) are powerful tools for time series analysis, widely used in finance, economics, ecology, and engineering. They allow researchers to encode structural behavior into time s... + Evaluating large language models (LLMs) in real-world applications goes far beyond standard benchmarks. When LLMs are embedded in complex pipelines, choosing the right models, prompts, and parameters ...
-
+
- - PyData & Scientific Libraries Stack + + Natural Language Processing & Audio (incl. Generative AI NLP) Talk @@ -1105,24 +1113,24 @@

- Building Reactive Data Apps with Shinylive and WebAssembly + From Months to Minutes: Accelerating Compliance Reviews with GenAI

- By: Christoph Scheuch + By: Elizaveta Zinovyeva
- WebAssembly is reshaping how Python applications can be delivered - allowing fully interactive apps that run directly in the browser, without a traditional backend server. In this talk, I’ll demonstra... + Transform time-consuming document compliance reviews into automated workflows with Generative AI. Through live demonstrations, learn how to build systems that extract policies from unstructured data, ...
-
+
- - PyData & Scientific Libraries Stack + + Natural Language Processing & Audio (incl. Generative AI NLP) Talk @@ -1130,40 +1138,40 @@

- Causal Inference in Network Structures: Lessons learned From Financial Services + Most AI Agents Are Useless. Let’s Fix That

- By: Danial Senejohnny + By: Bilge Yücel
- *Causal inference techniques are crucial to understanding the impact of actions on outcomes.* *This talk shares lessons learned from applying these techniques in real-world scenarios where standard me... + AI agents are having a moment, but most of them are little more than fragile prototypes that break under pressure. Together, we’ll explore why so many agentic systems fail in practice, and how to fix ...
-
+
- - PyData & Scientific Libraries Stack + + Natural Language Processing & Audio (incl. Generative AI NLP) - Talk + Talk [Sponsored]

- Consumer Choice Models with PyMC Marketing + Training Specialized Language Models with Less Data: An End-to-End Practical Guide

- By: Nathaniel Forde + By: Jacek Golebiowski
- Consumer choice models are an important part of product innovation and market strategy. In this talk we'll see how they can be used to learn about substitution goods and market shares in competitive m... + Small Language Models (SLMs) offer an efficient and cost-effective alternative to LLMs—especially when latency, privacy, inference costs or deployment constraints matter. However, training them typica...
@@ -1175,20 +1183,20 @@

PyData & Scientific Libraries Stack - Talk (long) + Tutorial

- Narwhals: enabling universal dataframe support + A Beginner's Guide to State Space Modeling

- By: Marco Gorelli + By: Jesse Grabowski, Alexandre Andorra
- Ever tried passing a Polars Dataframe to a data science library and found that it...just works? No errors, no panics, no noticeable overhead, just...results? This is becoming increasingly common in 20... + **State Space Models** (SSMs) are powerful tools for time series analysis, widely used in finance, economics, ecology, and engineering. They allow researchers to encode structural behavior into time s...
@@ -1205,15 +1213,15 @@

- Risk Budget Optimization for Causal Mix Models + Building Reactive Data Apps with Shinylive and WebAssembly

- By: Carlos Trujillo + By: Christoph Scheuch
- Traditional budget planners chase the highest predicted return and hope for the best. Bayesian models take the opposite route: they quantify uncertainty first, then let us optimize budgets with that u... + WebAssembly is reshaping how Python applications can be delivered - allowing fully interactive apps that run directly in the browser, without a traditional backend server. In this talk, I’ll demonstra...
@@ -1230,24 +1238,24 @@

- The Importance and Elegance of Polars Expressions + Causal Inference in Network Structures: Lessons learned From Financial Services

- By: Jeroen Janssens + By: Danial Senejohnny
- Polars is known for its speed, but its elegance comes from its use of expressions. In this talk, we’ll explore how Polars expressions work and why they are key to efficient and elegant data manipulati... + *Causal inference techniques are crucial to understanding the impact of actions on outcomes.* *This talk shares lessons learned from applying these techniques in real-world scenarios where standard me...
-
+
- - Visualisation & Jupyter + + PyData & Scientific Libraries Stack Talk @@ -1255,49 +1263,49 @@

- Beyond Linear Funnels: Visualizing Conditional User Journeys with Python + Consumer Choice Models with PyMC Marketing

- By: Yaseen Esmaeelpour + By: Nathaniel Forde
- Optimizing user funnels is a common task for data analysts and data scientists. Funnels are not always linear in the real world. often, the next step depends on earlier responses or actions. This resu... + Consumer choice models are an important part of product innovation and market strategy. In this talk we'll see how they can be used to learn about substitution goods and market shares in competitive m...
-
+
- - Visualisation & Jupyter + + PyData & Scientific Libraries Stack - Talk + Keynote

- Beyond the Black Box: Interpreting ML models with SHAP + Narwhals: enabling universal dataframe support

- By: Avik Basu + By: Marco Gorelli
- As machine learning models become more accurate and complex, explainability remains essential. Explainability helps not just with trust and transparency but also with generating actionable insights an... + Ever tried passing a Polars Dataframe to a data science library and found that it...just works? No errors, no panics, no noticeable overhead, just...results? This is becoming increasingly common in 20...
-
+
- - Visualisation & Jupyter + + PyData & Scientific Libraries Stack Talk @@ -1305,51 +1313,30 @@

- Building an A/B Testing Framework with NiceGUI + Risk Budget Optimization for Causal Mix Models

- By: Wessel van de Goor + By: Carlos Trujillo
- NiceGUI is a Python-based web UI framework that enables developers to build interactive web applications without using JavaScript. In this talk, I’ll share how my team used NiceGUI to create an intern... + Traditional budget planners chase the highest predicted return and hope for the best. Bayesian models take the opposite route: they quantify uncertainty first, then let us optimize budgets with that u...
-
+
- - Visualisation & Jupyter + + PyData & Scientific Libraries Stack Talk
-

- Democratizing Digital Maps: How Protomaps Changes the Game -

- -
- By: Veit Schiele -
- -
- Digital mapping has long been dominated by commercial providers, creating barriers of cost, complexity, and privacy concerns. This talk introduces Protomaps, an open-source project that reimagines how... -
-
- -
-
-
- - Talk -
-
-

Scaling Probabilistic Models with Variational Inference

@@ -1363,150 +1350,153 @@

-
+
- Talk -
-
- -

- TBA Talk 1 -

- -
- By: -
- -
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... -
-
- -
-
-
+ + PyData & Scientific Libraries Stack + Talk

- TBA Talk 2 + The Importance and Elegance of Polars Expressions

- By: + By: Jeroen Janssens
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... + Polars is known for its speed, but its elegance comes from its use of expressions. In this talk, we’ll explore how Polars expressions work and why they are key to efficient and elegant data manipulati...
-
+
+ + Visualisation & Jupyter + + Talk

- TBA Talk 3 + Accessible Data Visualizations

- By: + By: Maris Nieuwenhuis
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... + Data visualizations often exclude users with visual impairments and temporary or situational constraints. Many regulations (European Accessibility Act, Americans with Disabilities Act) now mandate inclusive...
-
+
+ + Visualisation & Jupyter + + Talk

- TBA Talk 4 + Beyond Linear Funnels: Visualizing Conditional User Journeys with Python

- By: + By: Yaseen Esmaeelpour
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... + Optimizing user funnels is a common task for data analysts and data scientists. Funnels are not always linear in the real world. Often, the next step depends on earlier responses or actions. This resu...
-
+
+ + Visualisation & Jupyter + + Talk

- TBA Talk 5 + Beyond the Black Box: Interpreting ML models with SHAP

- By: + By: Avik Basu
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... + As machine learning models become more accurate and complex, explainability remains essential. Explainability helps not just with trust and transparency but also with generating actionable insights an...
-
+
+ + Visualisation & Jupyter + + Talk

- TBA Talk 6 + Building an A/B Testing Framework with NiceGUI

- By: + By: Wessel van de Goor
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... + NiceGUI is a Python-based web UI framework that enables developers to build interactive web applications without using JavaScript. In this talk, I’ll share how my team used NiceGUI to create an intern...
-
+
- Talk (long) + + Visualisation & Jupyter + + + Talk

- TBA: Keystone + Democratizing Digital Maps: How Protomaps Changes the Game

- By: + By: Veit Schiele
- Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam sit amet varius neque, in condimentum turpis. Praesent maximus turpis lorem, vel pellentesque est rutrum nec. Donec aliquet dolor ut mas... + Digital mapping has long been dominated by commercial providers, creating barriers of cost, complexity, and privacy concerns. This talk introduces Protomaps, an open-source project that reimagines how...
diff --git a/images/social/3BVEKT.png b/images/social/3BVEKT.png new file mode 100644 index 0000000..fef963e Binary files /dev/null and b/images/social/3BVEKT.png differ diff --git a/images/social/3LDDAB.png b/images/social/3LDDAB.png index 9b8c51b..85ca5bf 100644 Binary files a/images/social/3LDDAB.png and b/images/social/3LDDAB.png differ diff --git a/images/social/8UJA37.png b/images/social/8UJA37.png index 2d42c6b..24f1185 100644 Binary files a/images/social/8UJA37.png and b/images/social/8UJA37.png differ diff --git a/images/social/AU8F9U.png b/images/social/AU8F9U.png new file mode 100644 index 0000000..9c4695c Binary files /dev/null and b/images/social/AU8F9U.png differ diff --git a/images/social/CAUAZY.png b/images/social/CAUAZY.png new file mode 100644 index 0000000..ab1e058 Binary files /dev/null and b/images/social/CAUAZY.png differ diff --git a/images/social/DBL9PQ.png b/images/social/DBL9PQ.png index c27dad4..15912ed 100644 Binary files a/images/social/DBL9PQ.png and b/images/social/DBL9PQ.png differ diff --git a/images/social/FDBZSR.png b/images/social/FDBZSR.png new file mode 100644 index 0000000..6345cfd Binary files /dev/null and b/images/social/FDBZSR.png differ diff --git a/images/social/FPDP3E.png b/images/social/FPDP3E.png new file mode 100644 index 0000000..76605c1 Binary files /dev/null and b/images/social/FPDP3E.png differ diff --git a/images/social/GBVFJ8.png b/images/social/GBVFJ8.png index 5b227d5..12436bd 100644 Binary files a/images/social/GBVFJ8.png and b/images/social/GBVFJ8.png differ diff --git a/images/social/GQBX3J.png b/images/social/GQBX3J.png index db3b2d2..09c3e01 100644 Binary files a/images/social/GQBX3J.png and b/images/social/GQBX3J.png differ diff --git a/images/social/GRZ3RG.png b/images/social/GRZ3RG.png index d33d0d0..6971c7c 100644 Binary files a/images/social/GRZ3RG.png and b/images/social/GRZ3RG.png differ diff --git a/images/social/GZUXGZ.png b/images/social/GZUXGZ.png index b307eb7..bf775fb 100644 Binary files 
a/images/social/GZUXGZ.png and b/images/social/GZUXGZ.png differ diff --git a/images/social/HKMYHY.png b/images/social/HKMYHY.png index 344c711..2cc8b25 100644 Binary files a/images/social/HKMYHY.png and b/images/social/HKMYHY.png differ diff --git a/images/social/HUNUEB.png b/images/social/HUNUEB.png index d1b2e8c..32de3b9 100644 Binary files a/images/social/HUNUEB.png and b/images/social/HUNUEB.png differ diff --git a/images/social/HYGHBG.png b/images/social/HYGHBG.png index 8fe73ea..c899577 100644 Binary files a/images/social/HYGHBG.png and b/images/social/HYGHBG.png differ diff --git a/images/social/JE8YJT.png b/images/social/JE8YJT.png index 5883c2c..4daf4d7 100644 Binary files a/images/social/JE8YJT.png and b/images/social/JE8YJT.png differ diff --git a/images/social/JEKYLT.png b/images/social/JEKYLT.png index 4c068d8..69821c5 100644 Binary files a/images/social/JEKYLT.png and b/images/social/JEKYLT.png differ diff --git a/images/social/KBEEHS.png b/images/social/KBEEHS.png new file mode 100644 index 0000000..d272045 Binary files /dev/null and b/images/social/KBEEHS.png differ diff --git a/images/social/KCPVYN.png b/images/social/KCPVYN.png index 517c34c..96fc2f2 100644 Binary files a/images/social/KCPVYN.png and b/images/social/KCPVYN.png differ diff --git a/images/social/KEJJSP.png b/images/social/KEJJSP.png index 41adc02..8c05778 100644 Binary files a/images/social/KEJJSP.png and b/images/social/KEJJSP.png differ diff --git a/images/social/KPHH7H.png b/images/social/KPHH7H.png new file mode 100644 index 0000000..e54fcaa Binary files /dev/null and b/images/social/KPHH7H.png differ diff --git a/images/social/L3PET8.png b/images/social/L3PET8.png index 969cdbc..7d75a06 100644 Binary files a/images/social/L3PET8.png and b/images/social/L3PET8.png differ diff --git a/images/social/LZYBVH.png b/images/social/LZYBVH.png index 2385c42..5faa361 100644 Binary files a/images/social/LZYBVH.png and b/images/social/LZYBVH.png differ diff --git a/images/social/PPAYDV.png 
b/images/social/PPAYDV.png index 7fe6b94..03a9da4 100644 Binary files a/images/social/PPAYDV.png and b/images/social/PPAYDV.png differ diff --git a/images/social/QMPX9V.png b/images/social/QMPX9V.png index 84d06b4..7797fd7 100644 Binary files a/images/social/QMPX9V.png and b/images/social/QMPX9V.png differ diff --git a/images/social/SCQE8H.png b/images/social/SCQE8H.png index 4def6f7..eed1bb3 100644 Binary files a/images/social/SCQE8H.png and b/images/social/SCQE8H.png differ diff --git a/images/social/VBCU9H.png b/images/social/VBCU9H.png index 2def103..3113753 100644 Binary files a/images/social/VBCU9H.png and b/images/social/VBCU9H.png differ diff --git a/images/social/W9Q7JY.png b/images/social/W9Q7JY.png index 61dca54..2a5e90a 100644 Binary files a/images/social/W9Q7JY.png and b/images/social/W9Q7JY.png differ diff --git a/images/social/WGJJQN.png b/images/social/WGJJQN.png index ac29fa3..b6c095c 100644 Binary files a/images/social/WGJJQN.png and b/images/social/WGJJQN.png differ diff --git a/images/social/WMRLUD.png b/images/social/WMRLUD.png index 2d12458..bd840d3 100644 Binary files a/images/social/WMRLUD.png and b/images/social/WMRLUD.png differ diff --git a/images/social/WXPVCS.png b/images/social/WXPVCS.png index 8c54fbe..f6b3a0b 100644 Binary files a/images/social/WXPVCS.png and b/images/social/WXPVCS.png differ diff --git a/images/social/XE9F7X.png b/images/social/XE9F7X.png index 7b5d0c9..90bcd32 100644 Binary files a/images/social/XE9F7X.png and b/images/social/XE9F7X.png differ diff --git a/images/social/XFPTWN.png b/images/social/XFPTWN.png new file mode 100644 index 0000000..78318d6 Binary files /dev/null and b/images/social/XFPTWN.png differ diff --git a/images/social/XNMYDK.png b/images/social/XNMYDK.png index 12fbd68..f6b1864 100644 Binary files a/images/social/XNMYDK.png and b/images/social/XNMYDK.png differ diff --git a/images/social/YKFWKQ.png b/images/social/YKFWKQ.png index dac7c23..4d46d3d 100644 Binary files a/images/social/YKFWKQ.png 
and b/images/social/YKFWKQ.png differ diff --git a/images/social/ZLJRNN.png b/images/social/ZLJRNN.png new file mode 100644 index 0000000..44843a6 Binary files /dev/null and b/images/social/ZLJRNN.png differ diff --git a/images/social/ZXTLEW.png b/images/social/ZXTLEW.png index cd904f0..0fb069c 100644 Binary files a/images/social/ZXTLEW.png and b/images/social/ZXTLEW.png differ diff --git a/scripts/generate_social_cards.py b/scripts/generate_social_cards.py index ea05f4b..b74ac13 100644 --- a/scripts/generate_social_cards.py +++ b/scripts/generate_social_cards.py @@ -54,12 +54,12 @@ def _load_speakers(self) -> dict[str, Speaker]: # nasty bug in pretalx: no pictures sometimes # add missing picture links via local file # format is {speaker_id: {'file': 'filename.jpg', 'name'': 'speaker name for readability'}} - with open("../../_data/berlin2025_speaker_pic_add.json") as f: - extra_pics = json.load(f) - for speaker_id in extra_pics: - speaker_dict[ - speaker_id - ].picture = f"/media/avatars/{extra_pics[speaker_id]['file']}" + # with open("../../_data/berlin2025_speaker_pic_add.json") as f: + # extra_pics = json.load(f) + # for speaker_id in extra_pics: + # speaker_dict[ + # speaker_id + # ].picture = f"/media/avatars/{extra_pics[speaker_id]['file']}" return speaker_dict def _load_fonts(self) -> dict[str, ImageFont.FreeTypeFont]: