
[CVPR2025] VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide

This repository is the official implementation of VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide, led by

Dohun Lee*, Bryan Sangwoo Kim*, Geon Yeong Park, Jong Chul Ye

[Main figure]

Project Website | arXiv


🔥 Summary

VideoGuide 🚀 enhances the temporal quality of video diffusion models without any additional training or fine-tuning by leveraging a pretrained model as a guide. During inference, the guiding model provides a temporally consistent sample, which is interpolated with the sampling model's output to improve consistency; a rough sketch of this step follows the list below. VideoGuide offers the following advantages:

  1. Improved temporal consistency while preserving imaging quality and motion smoothness
  2. Fast inference, since applying guidance only to the early sampling steps proves sufficient
  3. Distillation of the guiding model's prior
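
A minimal sketch of this guided step, in PyTorch-style code, is shown below. It is an illustration under assumptions, not the repository's actual implementation: the sample_denoise and guide_denoise callables, the interp_ratio parameter, and the cutoff_step threshold are hypothetical stand-ins for the two diffusion models and their schedule.

import torch

@torch.no_grad()
def videoguide_step(x_t, t, sample_denoise, guide_denoise,
                    interp_ratio=0.5, cutoff_step=800):
    # sample_denoise / guide_denoise are assumed to map the noisy latent
    # x_t at timestep t to each model's denoised estimate x0.
    x0 = sample_denoise(x_t, t)
    if t > cutoff_step:  # guide only during the early, high-noise steps
        x0_guide = guide_denoise(x_t, t)
        # Blend the guiding model's temporally consistent estimate with
        # the sampling model's estimate.
        x0 = interp_ratio * x0_guide + (1.0 - interp_ratio) * x0
    return x0

Restricting the guidance to large t reflects advantage 2 above: later steps run with the base sampler alone, so the extra cost is limited to a few early steps.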

🗓️ News

  • [8 Oct 2024] Code and paper are uploaded.

🛠️ Setup

First, create your environment. We recommend using the following commands.

git clone https://github.com/DoHunLee1/VideoGuide.git
cd VideoGuide

conda create -n videoguide python=3.10
conda activate videoguide
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
pip install xformers==0.0.22.post4 --index-url https://download.pytorch.org/whl/cu118
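
To sanity-check the environment (an optional verification step, not part of the official setup), confirm that PyTorch imports and sees your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"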

⏳ Models

Model                   Checkpoint
VideoCrafter2           Hugging Face
AnimateDiff             Hugging Face
RealisticVision         Hugging Face
Stable Diffusion v1.5   Hugging Face

Please refer to the official repositories of AnimateDiff and VideoCrafter for detailed explanations and setup guides for each model. We thank them for sharing their impressive work!

🌄 Example

An example of using VideoGuide is provided in the inference.sh script.
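
To run it from the repository root (the prompts, model paths, and sampling options it uses are defined inside the script itself):

bash inference.sh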

📝 Citation

If you find our method useful, please cite as below or leave a star on this repository.

@article{lee2024videoguide,
  title={VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide},
  author={Lee, Dohun and Kim, Bryan S and Park, Geon Yeong and Ye, Jong Chul},
  journal={arXiv preprint arXiv:2410.04364},
  year={2024}
}

🤗 Acknowledgements

We thank the authors of AnimateDiff, VideoCrafter, and Stable Diffusion for sharing their awesome work. We also thank the CivitAI community for sharing their impressive T2I models!

Note

This work is currently in the preprint stage, and there may be some changes to the code.
