Issue: LongVideoBench Video-LLaVA fails with empty frame list (ValueError: need at least one array to stack) #964

@omrastogi

Description

Hi,

I’m running lmms-eval on LongVideoBench with the video_llava model and consistently hit a failure at the very start of generation:

Error
ValueError: need at least one array to stack

Stack trace (relevant part)

  • lmms_eval/models/video_llava.py:187 calls read_video_pyav(visuals[0], self.num_frames)

  • lmms_eval/models/model_utils/load_video.py:84:

    return np.stack([x.to_ndarray(format="rgb24") for x in frames])
  • frames is empty → np.stack([]) raises ValueError: need at least one array to stack
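
The final step is easy to reproduce in isolation, independent of any video: np.stack refuses an empty sequence, which is exactly this error:

import numpy as np

frames = []                      # what the loader ends up with for the failing sample
np.stack([x for x in frames])   # ValueError: need at least one array to stack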

Repro command

python -m accelerate.commands.launch --num_processes=8 -m lmms_eval \
  --model video_llava \
  --tasks longvideobench_val_v \
  --batch_size 1 \
  --log_samples \
  --log_samples_suffix video_llava_lvb_v \
  --output_path ./logs/

Environment

  • Python: 3.11.14 (conda env)
  • accelerate: installed (launch works)
  • torch: installed (CUDA device visible; model loads on cuda:0)
  • transformers: installed (emits a deprecation warning about TRANSFORMERS_CACHE)
  • Running on an HPC cluster (Northeastern Explorer), but the failure occurs after the model loads and the task builds contexts successfully.

Observed behavior

  • Task builds contexts successfully: 1337/1337 contexts built.
  • Fails immediately when entering generate_until, at the first read_video_pyav call.

Expected behavior

  • The video loader should return at least one frame for valid videos, or gracefully handle unreadable/empty videos, e.g., retry with a different backend (sketched below), skip the sample, or raise a clearer error that includes the video path / doc_id.
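
To illustrate the fallback option: a wrapper along these lines would keep the run alive when PyAV comes back empty. load_frames_with_fallback is a hypothetical helper, and wiring in decord like this is an assumption on my side, not existing lmms-eval behavior:

import numpy as np
from lmms_eval.models.model_utils.load_video import read_video_pyav

def load_frames_with_fallback(video_path, num_frm=8):
    # Hypothetical helper: try PyAV first; if it raises on an empty
    # frame list, retry with decord as a second backend.
    try:
        return read_video_pyav(video_path, num_frm)
    except ValueError:
        import decord
        vr = decord.VideoReader(video_path)
        indices = np.linspace(0, len(vr) - 1, num_frm).astype(int)
        return vr.get_batch(list(indices)).asnumpy()  # (num_frm, H, W, 3)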

Request
Could you please:

  1. Add a guard in read_video_pyav for the case where frames is empty: raise a clearer exception that includes video_path, or return a placeholder (a sketch follows below), and/or
  2. Add logging that prints the offending video_path / doc_id / split when frames is empty, so users can debug dataset integrity/path issues.
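
For (1), the guard could be as small as the sketch below. It is written against a simplified decode loop, not the actual frame-selection logic in load_video.py, so treat it as a shape, not a patch:

import av
import numpy as np

def read_video_pyav(video_path, num_frm=8):
    # Sketch: decode, then fail loudly with the path instead of
    # letting np.stack([]) raise a bare ValueError later.
    container = av.open(video_path)
    frames = list(container.decode(video=0))  # may be empty for unreadable files
    if not frames:
        raise RuntimeError(
            f"read_video_pyav: no decodable frames in {video_path!r}; "
            "check that the file exists and is a valid video"
        )
    indices = np.linspace(0, len(frames) - 1, num_frm).astype(int)
    return np.stack([frames[i].to_ndarray(format="rgb24") for i in indices])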

If helpful, I can rerun with --verbosity=DEBUG and share the first failing doc_id/video path once I add a print around the callsite.
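
Concretely, I mean something like this at the callsite (variable names are taken from the stack trace; eval_logger stands in for whatever logger the module already uses, and a plain print would do):

# around lmms_eval/models/video_llava.py:187, debug only
try:
    video = read_video_pyav(visuals[0], self.num_frames)
except ValueError:
    eval_logger.error(f"empty frame list for video: {visuals[0]!r}")
    raise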

Thanks!
