Computational pathology needs whole-slide image (WSI) foundation models that transfer across diverse clinical tasks, yet current approaches remain largely slide-centric, often depend on private data and expensive paired-report supervision, and do not explicitly model relationships among multiple slides from the same patient. We present MOOZY, a patient-first pathology foundation model in which the patient case, not the individual slide, is the core unit of representation. MOOZY explicitly models dependencies across all slides from the same patient via a case transformer during pretraining, combining multi-stage open self-supervision with scaled low-cost task supervision. In Stage 1, we pretrain a vision-only slide encoder on 77,134 public slide feature grids using masked self-distillation. In Stage 2, we align these representations with clinical semantics using a case transformer and multi-task supervision over 333 tasks from 56 public datasets, including 205 classification and 128 survival tasks across four endpoints. Across eight held-out tasks with five-fold frozen-feature probe evaluation, MOOZY achieves best or tied-best performance on most metrics and improves macro averages over TITAN by +7.37%, +5.50%, and +7.83% and over PRISM by +8.83%, +10.70%, and +9.78% for weighted F1, weighted ROC-AUC, and balanced accuracy, respectively.
</p>
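The abstract's case transformer, which lets all slides from one patient attend to one another before producing a single patient-level representation, can be pictured as a small attention block over per-slide embeddings. The sketch below is an illustrative assumption, not MOOZY's actual architecture: dimensions, random untrained weights, and mean pooling are all stand-ins.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def case_pool(slide_embs, W_q, W_k, W_v):
    """Aggregate the per-slide embeddings of one patient case into a single
    case-level embedding: one self-attention layer, then mean pooling.
    (Toy stand-in for a case transformer; not MOOZY's real block.)"""
    Q, K, V = slide_embs @ W_q, slide_embs @ W_k, slide_embs @ W_v
    d = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d), axis=-1)  # (n_slides, n_slides)
    mixed = attn @ V                               # each slide attends to all others
    return mixed.mean(axis=0)                      # one vector per patient case

rng = np.random.default_rng(0)
d = 8
slides = rng.normal(size=(3, d))  # three slides from the same patient
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
case_emb = case_pool(slides, W_q, W_k, W_v)
print(case_emb.shape)  # (8,)
```

The point of the design is that the case embedding depends jointly on all of a patient's slides, rather than being computed slide-by-slide and averaged afterward.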
</div>
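The evaluation protocol in the abstract, five-fold frozen-feature probing, fits a lightweight classifier on fixed embeddings in each fold and averages the scores. A minimal sketch, using a nearest-centroid classifier and toy synthetic features as assumed stand-ins for the paper's actual probe and data:

```python
import numpy as np

def five_fold_probe(feats, labels, seed=0):
    """Five-fold frozen-feature probing: the features stay fixed, and only a
    lightweight classifier (here: nearest class centroid) is fit per fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(feats)), 5)
    accs = []
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        # one centroid per class, computed on the training folds only
        classes = np.unique(labels[train])
        C = np.stack([feats[train][labels[train] == c].mean(axis=0)
                      for c in classes])
        # predict the class of the nearest centroid
        dists = ((feats[test][:, None] - C[None]) ** 2).sum(-1)
        pred = classes[np.argmin(dists, axis=1)]
        accs.append((pred == labels[test]).mean())
    return float(np.mean(accs))

# toy frozen features: two well-separated classes
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0, 0.1, (50, 16)), rng.normal(3, 0.1, (50, 16))])
y = np.array([0] * 50 + [1] * 50)
acc = five_fold_probe(X, y)
```

Because the encoder is frozen, probe accuracy measures the quality of the pretrained representations themselves rather than the capacity of a fine-tuned head.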
@@ -134,7 +135,7 @@ <h2 class="section-heading">Training Data Scale</h2>
style="width:100%; border-radius:8px;">
</div>
<div class="column is-6">
-<p style="line-height:1.7">
+<p style="line-height:1.8; font-size:1.1rem;">
MOOZY is trained entirely on public data. Stage 1 uses 77,134 slide feature grids
(53,286 at 20× and 23,848 at 40×) extracted from ~1.67 billion patches across ~31.8 TB of raw WSI data.
Stage 2 uses 41,089 supervised cases across 333 tasks from 56 datasets —
@@ -308,7 +309,7 @@ <h2 class="section-heading">Where Does MOOZY Look?</h2>
<span class="dot"></span>
</div>
<!-- Captions per slide -->
-<div class="carousel-caption fig-caption" style="display:block">Lung adenocarcinoma. MOOZY and TITAN: balanced, comprehensive coverage. CHIEF and Madeleine: cancer-biased with semantic gaps.</div>
+<div class="carousel-caption fig-caption" style="display:block">Attention-map comparison on a lung adenocarcinoma slide. MOOZY and TITAN: balanced, comprehensive coverage (shift 3, gap 1). PRISM: balanced shift with moderate gaps (shift 3, gap 2). CHIEF and Madeleine: cancer-biased with frequent semantic gaps (shift 2, gap 3).</div>
<div class="carousel-caption fig-caption" style="display:none">Attention comparison across five encoders on a representative WSI.</div>
<div class="carousel-caption fig-caption" style="display:none">Attention comparison across five encoders on a representative WSI.</div>
<div class="carousel-caption fig-caption" style="display:none">Attention comparison across five encoders on a representative WSI.</div>
@@ -321,7 +322,7 @@ <h2 class="section-heading">Where Does MOOZY Look?</h2>