This repository contains code for implementing models and evaluating them on the Visual Storytelling task (using the VWP dataset), as described in:
On the Challenges in Evaluating Visually Grounded Stories, in Proceedings of the Text2Story workshop (ECIR 2025).
Note: Despite being proposed specifically for visual storytelling, this method is generalizable and can be extended to any task involving model-generated outputs with corresponding references.
The VWP dataset is constructed from movie scenes. Compared to the popular VIST dataset:
- Visual sequences in VWP are well-connected and centered around recurring characters
- Stories are longer with diverse entities
In this work, we use v2.1 of the dataset. We discuss the performance of several story-generation models, underline the challenges in evaluating visually grounded stories, and argue for considering more dimensions important for automatic narrative generation.
To generate stories using VLMs, run:

```shell
pip install -r requirements.txt
python -u generate_stories.py --model qwen-vl
```

Run `python generate_stories.py --help` for more options.
For training & generating stories using the TAPM (+LLAMA 2) model and for evaluating stories using the
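As a rough illustration of the reference-based evaluation setting discussed above (this is a minimal sketch, not the repository's evaluation code, which relies on standard metrics such as BLEU, METEOR, or CIDEr), the following computes a clipped n-gram precision between a generated story and a human reference. The function name and example strings are hypothetical.

```python
from collections import Counter

def ngram_precision(hypothesis: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram precision of a generated story against one reference.

    Illustrative only: real visual-storytelling evaluations use dedicated
    metric libraries rather than this toy function.
    """
    def ngrams(tokens, n):
        # Count each n-gram occurring in the token sequence.
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    hyp = ngrams(hypothesis.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    if not hyp:
        return 0.0
    # Clip each hypothesis n-gram count by its count in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return overlap / sum(hyp.values())

generated = "the two friends walked into the old house"
reference = "two friends slowly walked into an old house"
print(round(ngram_precision(generated, reference), 3))  # → 0.75
```

A score like this captures surface overlap only, which is precisely why the paper argues for additional evaluation dimensions for visually grounded stories.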
🔗 If you find this work useful, please consider citing it:
```bibtex
@inproceedings{
}
```