
Commit 8db8cda
Author: xinzhuo20
Commit message: images updated
1 parent: 30f5e87


5 files changed, +11 −9 lines


projects/hallusegbench/index.html

Lines changed: 11 additions & 9 deletions
@@ -126,10 +126,10 @@ <h2 class="title is-3">Abstract</h2>
 Existing evaluations rely almost entirely on text- or label-based perturbations, which check only whether the predicted mask matches the queried label. Such evaluations overlook the spatial footprint and severity of hallucination and therefore fail to reveal vision-driven hallucinations, which are more challenging and more prevalent.
 To address this gap, we formalize the task of <span style="font-style: italic;">Counterfactual Segmentation Reasoning (CSR)</span>, where a model must segment the referenced object in the factual image and abstain in its counterfactual counterpart.
 To support this task, we curate <span class="model-name-gradient">HalluSegBench</span>, the first large-scale benchmark to diagnose referring and reasoning expression segmentation hallucinations using controlled visual counterfactuals, alongside new evaluation metrics that measure hallucination severity and disentangle vision- and language-driven failure modes.
-We further introduce <span class="model-name-gradient">RobustSeg</span>, a segmentation VLM trained with counterfactual fine-tuning (CFT) to learn when to segment and when to abstain. Experimental results confirm <span class="model-name-gradient">RobustSeg</span> reduces hallucinations by 30%, while improving segmentation performance on FP-RefCOCO(+/g).\\
+We further introduce <span class="model-name-gradient">RobustSeg</span>, a segmentation VLM trained with counterfactual fine-tuning (CFT) to learn when to segment and when to abstain. Experimental results confirm <span class="model-name-gradient">RobustSeg</span> reduces hallucinations by 30%, while improving segmentation performance on FP-RefCOCO(+/g).
 </p>
 </div>
-<img src="./static/images/teaser.png" class="interpolation-image"
+<img src="./static/images/teaser_new.jpg" class="interpolation-image"
 alt="Interpolate start reference image." width="85%" />
 </div>
 </div>
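The counterfactual fine-tuning (CFT) recipe mentioned in the abstract above pairs each factual sample with an edited image on which the model should abstain. A minimal sketch of how such training pairs could be assembled; the `CFTExample` layout, field names, and file paths are illustrative assumptions, not the paper's actual data pipeline:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CFTExample:
    """One supervision example (hypothetical layout)."""
    image_path: str
    query: str
    target_mask_path: Optional[str]  # None means the target is "abstain"


def make_cft_pair(factual_image: str, counterfactual_image: str,
                  query: str, mask_path: str) -> list:
    """Pair a factual sample with its counterfactual counterpart.

    In the counterfactual image the referenced object has been swapped
    out, so the supervision target there is abstention (no mask).
    """
    return [
        CFTExample(factual_image, query, mask_path),
        CFTExample(counterfactual_image, query, None),
    ]


pair = make_cft_pair("img_001.jpg", "img_001_cf.jpg",
                     "the full grown sheep", "mask_001.png")
```

Both examples share the same query, which is what forces the model to decide from visual evidence rather than from the text alone.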
@@ -166,7 +166,7 @@ <h2 class="title is-3">✅ Contributions</h2>
 <h2 class="title is-3">Quantitative Results</h2>
 <div class="content has-text-justified">
 <div style="text-align: center; padding: 0 0 20px 0;">
-<img src="./static/images/quat_result.png" class="interpolation-image" alt="HalluSegBench results." width="85%" />
+<img src="./static/images/quat_new.jpg" class="interpolation-image" alt="HalluSegBench results." width="85%" />
 <p class="has-text-justified">
 Comparison of Reasoning Segmentation Models on <span class="model-name">HalluSegBench</span> Metrics, including textual and visual IoU drop for referral and reasoning tasks (<code>&Delta;IoU Referral</code>, <code>&Delta;IoU Reasoning</code>),
 factual and counterfactual Confusion Mask Score (<code>CMS</code>).
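The ΔIoU columns described in the caption above report how much segmentation quality drops when the factual image is swapped for its counterfactual. A minimal sketch of one plausible reading of such an IoU-drop computation; the exact HalluSegBench definitions of ΔIoU and CMS may differ, and the toy masks below are assumptions:

```python
import numpy as np


def iou(pred, gt):
    """Intersection over union of two boolean masks."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    union = np.logical_or(pred, gt).sum()
    return float(np.logical_and(pred, gt).sum() / union) if union else 0.0


# Toy setup: the referenced object occupies a 2x2 patch of a 4x4 image.
gt = np.zeros((4, 4), bool)
gt[1:3, 1:3] = True

pred_factual = gt.copy()                      # correct mask on the factual image
pred_counterfactual = np.zeros((4, 4), bool)  # model abstains on the edited image

# Drop in IoU against the factual object region once the object is swapped out.
delta_iou = iou(pred_factual, gt) - iou(pred_counterfactual, gt)
```

Under this reading, a model that hallucinates the removed object would keep a nonzero counterfactual IoU and thus shrink the drop relative to a model that correctly abstains.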
@@ -200,21 +200,21 @@ <h2 class="title is-3">Qualitative Results</h2>
 </div> -->
 <!-- <div class="qual-example"> -->
 <div>
-<img src="static/images/example3.png" alt="Image 2" style="height: 800px; width: auto;">
+<img src="static/images/example1_new.jpg" alt="Image 2" style="height: 800px; width: auto;">
 </div>
 <p class="caption has-text-centered">
-Here, <i>c</i> = “full grown sheep” and <i>c′</i> = “a cow”.
+Here, <i>c</i> = “giant refrigerator” and <i>c′</i> = “microwave oven”.
 </p>
 <!-- </div> -->
 <!-- <div class="qual-example"> -->
 </br>
 </br>
 </br>
 <div>
-<img src="static/images/example2.png" alt="Image 3" style="height: 800px; width: auto;">
+<img src="static/images/example2_new.jpg" alt="Image 3" style="height: 800px; width: auto;">
 </div>
 <p class="caption has-text-centered">
-Here, <i>c</i> = “front cow” and <i>c′</i> = “front pig”.
+Here, <i>c</i> = “Where in the picture would be suitable for storing wine?” and <i>c′</i> = “Where in the picture would be suitable for resting one's feet?”.
 </p>
 <!-- </div> -->
 

@@ -258,8 +258,10 @@ <h2 class="title">BibTeX</h2>
 Commons Attribution-ShareAlike 4.0 International License</a>. We gratefully acknowledge
 <a href="https://arxiv.org/pdf/2308.00692">LISA</a>,
 <a href="https://arxiv.org/pdf/2311.03356">GLaMM</a>,
-<a href="https://arxiv.org/pdf/2312.02228">PixelLM</a>, and <a
-href="https://arxiv.org/pdf/2312.08366">SESAME</a>
+<a href="https://arxiv.org/pdf/2312.02228">PixelLM</a>,
+<a href="https://arxiv.org/pdf/2503.06520">Seg-Zero</a>,
+<a href="https://arxiv.org/pdf/2505.12081">VisionReasoner</a>,
+and <a href="https://arxiv.org/pdf/2312.08366">SESAME</a>
 for open-sourcing their models.
 </p>
 </div>
