Commit 9dbf2dd

Upload blog post 2025-7-21 and update team member list
1 parent 7b24f5e commit 9dbf2dd

43 files changed: +925, -36 lines changed

_config.yml

Lines changed: 5 additions & 0 deletions
@@ -380,6 +380,11 @@ team:
     website: https://flanusse.net/
     bio: Francois Lanusse is an interdisciplinary researcher at the intersection of Deep Learning, Statistical Modeling, and Observational Cosmology. Dr. Lanusse holds a permanent position at the CNRS, and is currently an Associate Research Scientist at the Simons Foundation. He received his PhD in Astrophysics at CEA Paris-Saclay and was subsequently a postdoctoral researcher at Carnegie Mellon University and UC Berkeley.

+  - full_name: Tanya Marwah
+    avatar: tanya_marwah.png
+    website: https://tm157.github.io/
+    bio: Tanya Marwah is a Research Fellow at the Simons Foundation working with Polymathic AI. She is broadly interested in the theoretical and empirical foundations of Machine Learning and its applications to scientific domains. Her current interests center on generative modeling of scientific phenomena, inverse problems, and building scientific agents. Her ultimate goal is to develop ML algorithms and methods that accelerate the scientific process and enable scientific discovery. She recently graduated with a PhD from the Machine Learning Department at Carnegie Mellon University, holds a Master's in Robotics from the Robotics Institute at CMU, and was a Siebel Scholar.
+
   - full_name: Michael McCabe
     avatar: michael_mccabe.jpg
     website: https://mikemccabe210.github.io/

_posts/2025-07-21-latent-space.md

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
---
layout: post
title: "Lost in Latent Space: the Pros and Cons of Latent Physics Emulation"
authors: François Rozet, Ruben Ohana, Michael McCabe, Gilles Louppe, François Lanusse, Shirley Ho
shorttitle: "Lost in Latent Space"
date: 2025-07-21 9:00
smallimage: latent_space_s.jpg
image: latent_space.jpg
blurb: We show that latent diffusion models are robust to compression in the context of physics emulation, reducing computational cost while consistently outperforming non-generative alternatives.
shortblurb: We show that latent diffusion models are robust to compression in the context of physics emulation, reducing computational cost while consistently outperforming non-generative alternatives.
splashimage: /images/blog/latent_space.jpg
link: https://arxiv.org/abs/2507.02608
github_link: https://github.com/PolymathicAI/lola
permalink: /blog/lostinlatentspace/
---

Numerical simulations are fundamental to scientific progress, enabling everything from weather forecasting to plasma control in fusion reactors. However, achieving high-fidelity results often requires significant computational resources, making these simulations a bottleneck for rapid research and development.

At <a href="https://polymathic-ai.org/">Polymathic</a>, we believe that neural network-based emulators are a promising alternative to traditional numerical solvers, enabling simulations that are orders of magnitude faster. Recently, latent diffusion models have been applied successfully to the problem of emulating dynamical systems (<a href="https://arxiv.org/abs/2307.10422">Gao et al., 2023</a>; <a href="https://arxiv.org/abs/2403.05940">Du et al., 2024</a>; <a href="https://arxiv.org/abs/2504.18720">Andry et al., 2025</a>), sometimes even outperforming pixel-space emulation. In this work, we asked ourselves a simple question: *What is the impact of latent-space compression on emulation accuracy?*

The answer surprised us, and we think it will surprise you too.

#### From Pixel Space to Latent Space

The core idea of latent diffusion models (<a href="https://arxiv.org/abs/2112.10752">Rombach et al., 2022</a>), which have proven highly effective for image and video generation, is to perform the generative process not in the high-dimensional pixel space, but in a compressed, low-dimensional latent space learned by an autoencoder. For natural images, compression serves a dual purpose: reducing computational cost and filtering out perceptually irrelevant patterns that might distract the generative model from semantically meaningful information.

In our case, the methodology involves three stages. First, an autoencoder is trained to compress high-dimensional physical states into compact latent representations. Second, a diffusion model is trained to emulate the temporal evolution of the system within this compressed latent space. Third, after training, the diffusion model is used to predict the sequence of latent states, which are then mapped back to pixel space with the autoencoder's decoder. A minimal code sketch of this rollout loop follows the figure below.

<p align="center">
<img src="/images/blog/latent_emulation.svg" alt="Latent emulation" width="95%" style="mix-blend-mode: darken;">
</p>
33+
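To make the three-stage recipe concrete, here is a minimal, hypothetical PyTorch sketch of the inference-time rollout. The `Autoencoder` and `LatentDiffusion` classes are toy stand-ins for the trained models, not the actual implementation in the <a href="https://github.com/PolymathicAI/lola">lola</a> repository; a real latent diffusion emulator would denoise iteratively and operate on spatial latent grids rather than flat vectors.

```python
import torch
import torch.nn as nn


class Autoencoder(nn.Module):
    """Toy stand-in: compresses a flattened physical state into a small latent vector."""

    def __init__(self, state_dim: int = 4096, latent_dim: int = 64):
        super().__init__()
        self.encoder = nn.Linear(state_dim, latent_dim)
        self.decoder = nn.Linear(latent_dim, state_dim)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.decoder(z)


class LatentDiffusion(nn.Module):
    """Toy stand-in: samples the next latent state conditioned on the current one."""

    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Linear(2 * latent_dim, latent_dim)

    @torch.no_grad()
    def sample_next(self, z: torch.Tensor) -> torch.Tensor:
        # A trained diffusion model would iteratively denoise a noise sample
        # conditioned on z; a single conditioned step keeps the sketch short.
        noise = torch.randn_like(z)
        return self.net(torch.cat([z, noise], dim=-1))


autoencoder = Autoencoder()   # stage 1: trained to reconstruct physical states
emulator = LatentDiffusion()  # stage 2: trained on encoded trajectories

# Stage 3: encode once, roll out autoregressively in latent space, decode at the end.
x0 = torch.randn(1, 4096)     # initial physical state (flattened for simplicity)
z = autoencoder.encode(x0)

latents = [z]
for _ in range(10):           # emulate 10 time steps
    z = emulator.sample_next(z)
    latents.append(z)

trajectory = torch.stack([autoencoder.decode(z) for z in latents], dim=1)
print(trajectory.shape)       # torch.Size([1, 11, 4096])
```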
#### Findings
34+
35+
To answer our research question, we trained and evaluated latent-space emulators across a wide range of compression rates – from modest (x48) to extreme (x1280) – on three challenging datasets from <a href="https://polymathic-ai.org/blog/thewell">The Well</a>:
36+
37+
- **Euler Multi-Quadrants**, describing compressible fluids and shock waves.
38+
<p align="center">
39+
<video width="95%" controls>
40+
<source src="/images/blog/latent_space_vid/euler_f32c64.mp4" type="video/mp4">
41+
Your browser does not support the video tag.
42+
</video>
43+
</p>
44+
45+
- **Rayleigh-Bénard**, modeling buoyancy driven convection currents.
46+
<p align="center">
47+
<video width="95%" controls>
48+
<source src="/images/blog/latent_space_vid/rb_f32c64.mp4" type="video/mp4">
49+
Your browser does not support the video tag.
50+
</video>
51+
</p>
52+
53+
- **Turbulence Gravity Cooling**, simulating the formation and radiative cooling of stars in interstellar media.
54+
<p align="center">
55+
<video width="95%" controls>
56+
<source src="/images/blog/latent_space_vid/tgc_f32c64.mp4" type="video/mp4">
57+
Your browser does not support the video tag.
58+
</video>
59+
</p>
60+
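
For a bit of intuition about these numbers: we quote compression as the ratio between the number of values in a pixel-space snapshot and the number of values in its latent representation. The shapes below are purely illustrative and are not the actual field or latent dimensions used in the paper.

```python
from math import prod

def compression_rate(pixel_shape: tuple, latent_shape: tuple) -> float:
    """Ratio of pixel-space values to latent-space values."""
    return prod(pixel_shape) / prod(latent_shape)

# Hypothetical example: a snapshot with 5 fields on a 256x256 grid,
# compressed to a 32x32 latent grid with 4 channels, is an x80 compression.
print(compression_rate((5, 256, 256), (4, 32, 32)))  # 80.0
```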

Our experiments reveal two key findings.

**1. Robustness to Compression**

Our most striking finding is the **remarkable resilience of latent emulation to the compression rate** of the latent space relative to pixel space. While reconstruction quality deteriorates as compression increases, we do not observe any significant degradation in emulation accuracy itself. In all cases, **latent emulators outperform pixel-space baselines**, despite using fewer parameters and less training compute.

Nevertheless, our evaluation reveals potential overfitting issues at extreme compression rates. This makes intuitive sense: as compression increases, the effective size of the dataset in latent space decreases, making overfitting more likely at fixed model capacity. This underscores the importance of efforts like <a href="https://polymathic-ai.org/blog/thewell">The Well</a>, which provides curated, large-scale physics data for training and benchmarking emulators.

**2. Generative Models over Deterministic Solvers**

Across all tasks and compression rates, **diffusion-based emulators are consistently more accurate than deterministic neural solvers**. They not only produce better and more plausible trajectories, but also capture the uncertainty and diversity inherent to turbulent and chaotic dynamical systems.

#### Practical Recommendations for Practitioners

Our findings translate into clear, actionable recommendations for practitioners developing physics emulators. First, **try latent-space approaches**. They reduce computational requirements and deliver comparable or superior performance across a wide range of compression rates. In our case, working in latent space also greatly simplified the development and training of the emulator, as we could rely on widespread transformer architectures with well-known scaling properties. Second, **prefer generative over deterministic emulators**. They yield better accuracy, more plausible dynamics, stable rollouts, and naturally handle uncertainty, as sketched below.
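
As a small illustration of the last point, here is a hypothetical sketch (with a placeholder sampler, not our actual model) of why a generative emulator is convenient: because it is a sampler rather than a point predictor, drawing several rollouts from the same initial condition yields an ensemble whose spread serves as a rough uncertainty estimate, something a deterministic solver cannot provide.

```python
import torch

@torch.no_grad()
def sample_rollout(z0: torch.Tensor, steps: int = 10) -> torch.Tensor:
    """Placeholder stochastic emulator standing in for a trained latent diffusion model."""
    z, out = z0, []
    for _ in range(steps):
        z = 0.9 * z + 0.1 * torch.randn_like(z)  # toy stochastic update
        out.append(z)
    return torch.stack(out)                       # (steps, latent_dim)

z0 = torch.randn(64)                                             # encoded initial state
ensemble = torch.stack([sample_rollout(z0) for _ in range(16)])  # (16, steps, 64)

mean = ensemble.mean(dim=0)   # ensemble-mean latent trajectory
spread = ensemble.std(dim=0)  # per-step spread: a crude uncertainty proxy
print(mean.shape, spread.shape)
```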

For more details, check out the <a href="https://arxiv.org/abs/2507.02608">paper</a>.

---
Image by [JJ Ying](https://unsplash.com/@jjying?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash) via [Unsplash](https://unsplash.com/photos/white-cloth-lot-WmnsGyaFnCQ?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash).

collaborators/david-fouhey.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
layout: collaborator
full_name: "David Fouhey"
---

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
layout: collaborator
full_name: "Francesco Pio Ramunno"
---

collaborators/francois-rozet.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
layout: collaborator
full_name: "Francois Rozet"
---

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
layout: collaborator
full_name: "Ghazal Khalighinejad"
---

collaborators/helen-qu.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
layout: collaborator
full_name: "Helen Qu"
---

collaborators/jake-kovalic.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
layout: collaborator
full_name: "Jake Kovalic"
---

collaborators/jiequn-han.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
layout: collaborator
full_name: "Jiequn Han"
---

collaborators/keiya-hirashima.md

Lines changed: 7 additions & 6 deletions
@@ -1,6 +1,7 @@
----
-layout: collaborator
-full_name: "Keiya Hirashima"
----
+
+---
+layout: collaborator
+full_name: "Keiya Hirashima"
+---
