ARGUS is a framework for quantifying hallucination and omission in free-form video captions. It reports two metrics:
- ArgusCost‑H (or Hallucination-Cost) — the degree of hallucinated content in the generated caption
- ArgusCost‑O (or Omission-Cost) — the degree of video content omitted from the caption

Lower values indicate better performance.
Repository layout:

```
assets/                  # store the clips, prompts, etc.
code/                    # implementation
    gen_model_caption.py
    nli_eval.py
    compute_metrics.py
    single_caption_inference.py
scripts/                 # exact commands to run
requirements.txt         # default environment
qwen_requirements.txt    # alternate environment for Qwen / SmolVLM
```
```bash
git clone https://github.com/JARVVVIS/argus.git
cd argus

conda create -n argus python=3.11 -y
conda activate argus

## install torch
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0

# (if needed) install cudatoolkit
conda install -c conda-forge cudatoolkit-dev

## install flash-attn
pip install flash-attn --no-build-isolation

## rest of the dependencies
pip install -r requirements.txt
pip install -e .  # installs in‑house video_models for easy evaluations

# note: Qwen / SmolVLM require a dedicated environment:
# create a new env and run the commands above, substituting
# "qwen_requirements.txt" for "requirements.txt":
# pip install -r qwen_requirements.txt
```
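A quick optional sanity check of the environment before running anything (a minimal sketch, assuming a CUDA-capable GPU is available):

```bash
# confirm torch sees the GPU and flash-attn imports cleanly
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import flash_attn; print(flash_attn.__version__)"
```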
To download the evaluation clips and place them under assets/clips/:

```bash
python code/download_argus_videos.py
```
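To confirm the clips landed where the pipeline expects them (this sketch assumes the clips are stored as files directly under assets/clips/, as in the single-caption example below):

```bash
# preview and count the downloaded clips
ls assets/clips/ | head
ls assets/clips/ | wc -l
```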
First, set ROOT_DIR in code/configs.py.
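For example, from the repository root (a sketch assuming ROOT_DIR is a plain string assignment; check configs.py for the actual format before editing):

```bash
# inspect the current setting
grep -n "ROOT_DIR" code/configs.py
# point it at your checkout (assumes an assignment like ROOT_DIR = "...")
sed -i "s|^ROOT_DIR.*|ROOT_DIR = \"$(pwd)\"|" code/configs.py
```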
Broad steps: generate captions → judge with NLI → compute metrics:

```bash
python code/gen_model_caption.py --model_name $model --clip_name $clip_name
python code/nli_eval.py --model_name $model --clip_name $clip_name
python code/compute_metrics.py --model_name $model --clip_name $clip_name
```
NOTE: If "clip_name" is not specified, all clips in assets/clips/ are used. Similarly, if "model_name" is not specified, all models in code/configs.py are evaluated.
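For example, to run the full pipeline for a single model over every clip (the model id below is hypothetical; substitute one listed in code/configs.py):

```bash
model="gpt-4o"  # hypothetical model id; see code/configs.py for the supported list
python code/gen_model_caption.py --model_name $model  # caption all clips in assets/clips/
python code/nli_eval.py --model_name $model           # NLI-judge each caption
python code/compute_metrics.py --model_name $model    # aggregate ArgusCost-H / ArgusCost-O
```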
Single‑caption inference:

```bash
python code/single_caption_inference.py \
    --caption "A panda eats bamboo." \
    --clip_path assets/clips/clip_12345.mp4
```
Ready‑made pipelines are available under scripts/.
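A typical invocation might look like the following (the script name is hypothetical; list scripts/ to see the actual pipelines provided):

```bash
# hypothetical script name; check scripts/ for the exact files
bash scripts/run_pipeline.sh
```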