<span className="block text-orange-500">Creative Writing, Social Simulation, ..., and Your Task!</span>
{/* <RollingText /> */}
</h2>
<div className="text-center mb-8">
<img
@@ -756,25 +757,6 @@ export default function HomePage() {
state names in the pretraining data. The verbalized probability distribution generated by VS, when averaged over 10 trials, closely aligns with this reference pretraining distribution (KL=0.12).
In contrast, direct prompting collapses into a few modes, repeatedly outputting states like California and Texas.
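The averaging-and-comparison step described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the helper names (`average_distributions`, `kl_divergence`) and the toy numbers are assumptions for demonstration only.

```python
import math
from collections import defaultdict

def average_distributions(trials):
    """Average verbalized probability distributions over multiple trials."""
    avg = defaultdict(float)
    for dist in trials:
        for item, p in dist.items():
            avg[item] += p / len(trials)
    return dict(avg)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), summed over the support of p; eps guards missing keys in q."""
    return sum(pi * math.log(pi / q.get(k, eps)) for k, pi in p.items() if pi > 0)

# Toy example with made-up numbers (not the paper's data):
trials = [
    {"California": 0.5, "Texas": 0.5},
    {"California": 0.3, "Texas": 0.7},
]
avg = average_distributions(trials)
print(avg)  # averaged verbalized distribution
```

A lower KL against the reference pretraining distribution indicates closer alignment, which is the quantity behind the KL=0.12 figure reported above.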
We observe an <strong>emergent trend</strong>: larger models benefit more from VS. Figure 5 shows the diversity gain over direct prompting, which suffers from mode collapse.
Across all VS variants, larger models (GPT-4.1, Gemini-2.5-Pro) achieve diversity gains 1.5 to 2 times greater than smaller models (GPT-4.1-Mini, Gemini-2.5-Flash).
</p>
</div>
<div className="mt-8 lg:mt-0">
<img
src="/images/emergent_trend.png"
alt="Emergent Trend: Larger Models Benefit More from VS"
Verbalized Sampling provides a training-free, model-agnostic approach to mitigating mode collapse by prompting the model to generate response distributions with verbalized probability estimates.
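Because the approach is training-free, it reduces to prompt construction plus parsing of the verbalized distribution. The sketch below is an illustrative assumption: the exact prompt wording and output format used by the authors are not shown in this excerpt, so both are hypothetical here.

```python
import json

def make_vs_prompt(task, k=5):
    """Build a VS-style prompt (hypothetical wording, not the authors' template)."""
    return (
        f"{task}\n"
        f"Generate {k} possible responses, each with a probability reflecting "
        f"how likely you would be to produce it. Answer as a JSON list of "
        f'objects: [{{"response": ..., "probability": ...}}, ...]'
    )

def parse_vs_output(text):
    """Parse a verbalized distribution and renormalize so probabilities sum to 1."""
    items = json.loads(text)
    total = sum(it["probability"] for it in items)
    return {it["response"]: it["probability"] / total for it in items}

# Example: a model's (mock) verbalized output for "Name a US state."
mock_output = '[{"response": "California", "probability": 0.6}, {"response": "Vermont", "probability": 0.2}]'
print(parse_vs_output(mock_output))
```

One could then sample a response from the renormalized distribution, or average distributions over several trials as in the Figure 5 analysis.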