🔥 NEW: We have released the Behavior LLaVA dataset and training scripts! Just replace the data in the train.sh script after processing the data from BLIFT.
- Memorability Project Page
- Behavior LLaVA Project Page
- Data (LAMBDA)
- Data (UltraLAMBDA)
- BLIFT Data (Behavior LLaVA)
- Paper (Memorability)
- Paper (Behavior LLaVA)
Follow the steps below to install the required packages and set up the environment.
Open your terminal and clone the repository using the following command:
```bash
git clone https://github.com/behavior-in-the-wild/ad-memorability.git
```
Create and activate the Conda environment:
```bash
conda create -n admem python=3.10 -y
conda activate admem
pip install --upgrade pip  # Enable PEP 660 support
pip install -e .
pip install ninja
pip install flash-attn --no-build-isolation
pip install opencv-python
pip install numpy==1.26.4
```
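Optionally, you can sanity-check the environment before moving on. This snippet is only a suggestion (it is not part of the repo's scripts):

```python
# Quick import check: fails fast if a package (e.g., flash-attn) did not build.
import numpy
import cv2
import flash_attn

print(numpy.__version__, cv2.__version__, flash_attn.__version__)
```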
Create directories and download the required models:
```bash
mkdir model_zoo
mkdir model_zoo/LAVIS
cd ./model_zoo/LAVIS
wget https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/eva_vit_g.pth
cd path/to/ad-memorability
mkdir work_dirs
cd work_dirs
git lfs install
git clone https://huggingface.co/YanweiLi/llama-vid-13b-full-224-video-fps-1
cd path/to/ad-memorability
mkdir data
cd ./data
```
LAMBDA videos are available on Hugging Face!
LAMBDA sampled frames coming soon!
- Create .npy files of your videos. A sample file is given in the sample folder; see the conversion sketch after this list.
- Store them as ./data/videos/video_scenes/{id}.npy.
- Similarly, store the images as ./data/images/{id}.npy.
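A minimal conversion sketch, assuming the OpenCV and NumPy packages from the setup above. The 1 FPS sampling, RGB channel order, and (num_frames, H, W, 3) shape are assumptions here, so check the sample folder for the exact layout the training code expects:

```python
import cv2
import numpy as np

def video_to_npy(video_path, out_path, target_fps=1):
    """Sample frames at roughly target_fps and save them as a single .npy array."""
    cap = cv2.VideoCapture(video_path)
    native_fps = cap.get(cv2.CAP_PROP_FPS) or target_fps
    step = max(int(round(native_fps / target_fps)), 1)  # keep ~1 frame per second
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            # OpenCV decodes frames as BGR; converting to RGB is an assumption.
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        idx += 1
    cap.release()
    if frames:
        np.save(out_path, np.stack(frames))  # shape: (num_frames, H, W, 3)

video_to_npy("my_ad.mp4", "./data/videos/video_scenes/my_ad.npy")
```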
- Install the desired version of DeepSpeed.
- Update the train.sh script: replace the --data_path argument with one of the following, depending on your training task: lambda_bs_train.json, lambda_combine_train.json, lambda_cs_train.json, or blift.json.
- If you don't have the frames and want to train directly on the video at 1 FPS, reformat your data as given here and replace the --data_path argument with lambda_train.json.
- If you're training on your own dataset, create a train.json file. Each entry should contain an id and a conversation; you can use lambda_bs_train.json as a reference for formatting (a minimal sketch is given below).
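As a rough guide, here is what one entry might look like, modeled on the LLaVA-style conversation format. The "video" key, the <video> token, and the prompt text are assumptions, so diff against lambda_bs_train.json before training:

```python
import json

# Hypothetical entry: only "id" and "conversations" are confirmed fields;
# everything else is modeled on the LLaVA conversation format.
entry = {
    "id": "my_ad",
    "video": "my_ad.npy",
    "conversations": [
        {"from": "human", "value": "<video>\nIs this advertisement memorable?"},
        {"from": "gpt", "value": "Yes. ..."},
    ],
}

with open("./data/train.json", "w") as f:
    json.dump([entry], f, indent=2)
```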
```bash
bash train.sh
```
- For predicting memorability scores:
```bash
bash eval_bs.sh
```
- For generating memorable videos:
```bash
bash eval_cs.sh
```
- For BLIFT: transform the test set similar to this.
  - For videos, follow the same procedure as in eval_cs.sh.
  - For images, replace videos with images in eval_cs.sh, and replace the gt_file_question and gt_file_answers with the corresponding image files.
If you find this repo useful for your research, please consider citing our papers:
```bibtex
@inproceedings{Si_2025_WACV,
  author    = {Si, Harini and Singh, Somesh and Singla, Yaman Kumar and Bhattacharyya, Aanisha and Baths, Veeky and Chen, Changyou and Shah, Rajiv Ratn and Krishnamurthy, Balaji},
  title     = {Long-Term Ad Memorability: Understanding \& Generating Memorable Ads},
  booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
  month     = {February},
  year      = {2025},
  pages     = {5707-5718}
}

@inproceedings{DBLP:conf/iclr/SinghISCSBK25,
  author    = {Somesh Kumar Singh and Harini S. I and Yaman Kumar Singla and Changyou Chen and Rajiv Ratn Shah and Veeky Baths and Balaji Krishnamurthy},
  title     = {Teaching Human Behavior Improves Content Understanding Abilities Of VLMs},
  booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025},
  publisher = {OpenReview.net},
  year      = {2025},
  url       = {https://openreview.net/forum?id=ff2V3UR9sC}
}
```
We would like to thank the following repos for their great work:
- This work is built upon LLaMA-VID, LLaVA, and Vicuna.
- This work utilizes pretrained weights from InstructBLIP.
- We perform video-based evaluation from Video-ChatGPT.
- The Behavior LLaVA dataset is built upon Reddit, YouTube, and AdsOfTheWorld.
The data and checkpoints are intended and licensed for research use only. They are also restricted to uses that follow the license agreements of LLaMA-VID, LLaVA, LLaMA, Vicuna, Reddit, YouTube, and GPT-4.
In case of any queries, please feel free to reach out to [email protected].