Commit b78b43c

update gh-pages
1 parent 790ab8b commit b78b43c

1 file changed

+31
-14
lines changed

index.html

Lines changed: 31 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -295,12 +295,6 @@ <h2 class="subtitle is-3 publication-subtitle"> Think before You Score in Genera
295295
</div>
296296
</div>
297297

298-
<centering>
299-
<div style="text-align: center;">
300-
<img id="teaser" width="85%" src="static/images/teaser.png">
301-
</div>
302-
</centering>
303-
304298
</div>
305299
</div>
306300
</div>
@@ -345,15 +339,24 @@ <h2 class="title is-3">
345339
</div>
346340
</section>
347341

342+
<div class="hero-body has-text-centered">
343+
<h2 class="title is-4">
344+
<span style="vertical-align: middle">Overview</span>
345+
</h2>
346+
</div>
347+
<div style="text-align: center;">
348+
<img id="teaser" width="85%" src="static/images/teaser.png">
349+
</div>
350+
348351
<section class="section">
349352
<div class="container is-max-desktop">
350353
<div class="columns is-centered">
351354
<div class="column is-full-width">
352355
<div class="content has-text-justified">
353356
<p>
354-
VideoScore2 is trained on the VideoFeedback2 dataset containing 27K human-annotated videos with both scores and rationales across three dimensions. We adopt a two-stage pipeline: first, supervised fine-tuning (SFT) on Qwen2.5-VL-7B-Instruct to establish format-following and scoring ability; then, reinforcement learning with Group Relative Policy Optimization (GRPO) to further align model outputs with human judgment and enhance analytical robustness.
357+
VideoScore2 is trained on the VideoFeedback2 dataset containing 27K human-annotated videos with both scores and rationales across three dimensions. We adopt a two-stage pipeline: first, supervised fine-tuning (SFT) on Qwen2.5-VL-7B-Instruct to establish format-following and scoring ability; then, reinforcement learning with Group Relative Policy Optimization (GRPO) to further align model outputs with human judgment and enhance analytical robustness.
355358

356-
Compared to VideoScore (v1), VS2 introduces interpretable scoring for three dimensions (Visual Quality, Text Alignment, Physical/Common-sense Consistency) and CoT-style rationales, achieving stronger generalization on out-of-domain benchmarks while providing transparent and human-aligned video evaluation.
359+
Compared to VideoScore (v1), VS2 introduces interpretable scoring for three dimensions (Visual Quality, Text Alignment, Physical/Common-sense Consistency) and CoT-style rationales, achieving stronger generalization on out-of-domain benchmarks while providing transparent and human-aligned video evaluation.
357360
</p>
358361
</div>
359362
</div>
@@ -362,7 +365,7 @@ <h2 class="title is-3">
362365

363366
<div class="hero-body has-text-centered">
364367
<h2 class="title is-4">
365-
<span style="vertical-align: middle">Evaluation Benchmarks</span>
368+
<span style="vertical-align: middle">Evaluation Results</span>
366369
</h2>
367370
</div>
368371
<div class="container is-max-desktop">
@@ -389,13 +392,27 @@ <h2 class="title is-5"><span style="font-size: 100%;">
389392
</div>
390393
</div>
391394
</div>
392-
393-
<div class="hero-body has-text-centered">
394-
<h2 class="title is-4">
395-
<span style="vertical-align: middle">Best-of-N Sampling</span>
396-
</h2>
395+
396+
<div class="container is-max-desktop">
397+
<div class="columns is-centered">
398+
<div class="column is-full-width">
399+
<h2 class="title is-5"><span style="font-size: 100%;">
400+
Best-of-N Sampling</span></h2>
401+
<p>
402+
We evaluate VIDEOSCORE2 with best-of-n (BoN) sampling (n = 5), where the model selects the
403+
best video among candidates. Six T2V models of moderate or poor quality are used, avoiding very
404+
strong ones to highlight the BoN effect. For each of 500 prompts, every model generates n = 5 candidates (500 × 5 videos).
405+
Comparison on VBench shows BoN consistently outperforms random sampling,
406+
confirming that VIDEOSCORE2 effectively guides higher-quality selection.
407+
</p>
408+
<img id="BoN" width="100%" src="static/images/BoN.png">
409+
</div>
410+
</div>
397411
</div>
398412

413+
414+
415+
399416

400417
</section>
401418
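
Note on the Best-of-N paragraph added in this commit: it describes picking the top-scored video out of n = 5 candidates per prompt and comparing that pick against random sampling. A minimal Python sketch of that selection step follows. It assumes a hypothetical `score_fn` that returns one overall score per video; the page does not say how the three dimension scores are combined, and this is not the authors' code. The file names and scores are placeholders.

```python
import random
from typing import Callable, Dict, List, Sequence


def best_of_n(candidates: Sequence[str], score_fn: Callable[[str], float]) -> str:
    """Return the highest-scoring candidate; ties go to the earliest one."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    return max(candidates, key=score_fn)


def select_for_prompts(
    prompt_to_candidates: Dict[str, List[str]],
    score_fn: Callable[[str], float],
    seed: int = 0,
) -> Dict[str, Dict[str, str]]:
    """For each prompt, record the BoN pick and a random-sampling baseline pick."""
    rng = random.Random(seed)
    picks = {}
    for prompt, candidates in prompt_to_candidates.items():
        picks[prompt] = {
            "bon": best_of_n(candidates, score_fn),
            "random": rng.choice(candidates),
        }
    return picks


# Placeholder scores keyed by hypothetical file names; a real run would call the scorer.
toy_scores = {"a.mp4": 2.0, "b.mp4": 4.5, "c.mp4": 3.0, "d.mp4": 4.5, "e.mp4": 1.0}
picks = select_for_prompts({"a cat surfing": list(toy_scores)}, toy_scores.__getitem__)
print(picks["a cat surfing"]["bon"])
```

Both picks would then be scored by an external benchmark (VBench in the paragraph above), so the comparison does not rely on the same scorer that made the selection.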
