diff --git a/README.md b/README.md
index d10c875..882fcd6 100644
--- a/README.md
+++ b/README.md
@@ -34,8 +34,14 @@
 - [2024/10/15] 🔥 Video-XL is released, including model, training and evaluation code.
 
 ## Model weights
-Please download our pre-trained and finetuned model weights from the [link](https://huggingface.co/sy1998/Video_XL/tree/main)
-
+Please download our pre-trained and fine-tuned model weights from this [link](https://huggingface.co/sy1998/Video_XL/tree/main), then update the model paths in the demo code.
+```bash
+git lfs install
+git clone https://huggingface.co/sy1998/Video_XL
+git clone https://huggingface.co/openai/clip-vit-large-patch14-336
+```
+
+
 ## Installation
 ```bash
 conda create -n videoxl python=3.10 -y && conda activate videoxl
diff --git a/demo.py b/demo.py
index 14a62a2..067213e 100644
--- a/demo.py
+++ b/demo.py
@@ -8,13 +8,14 @@
 # fix seed
 torch.manual_seed(0)
 
-
+# Please change the following paths to your own
 model_path = "/share/junjie/shuyan/VideoXL_weight_8"
+clip_path = "/share/junjie/shuyan/clip-vit-large-patch14-336"
 video_path = "/share/junjie/shuyan/test_demo/ad2_watch_15min.mp4"
 max_frames_num = 900  # you can raise this to several thousand, as long as your GPU memory can handle it :)
 gen_kwargs = {"do_sample": True, "temperature": 1, "top_p": None, "num_beams": 1, "use_cache": True, "max_new_tokens": 1024}
-tokenizer, model, image_processor, _ = load_pretrained_model(model_path, None, "llava_qwen", device_map="cuda:0")
+tokenizer, model, image_processor, _ = load_pretrained_model(model_path, None, "llava_qwen", device_map="cuda:0", mm_vision_tower=clip_path)
 model.config.beacon_ratio = [8]  # delete this line to use a random compression ratio from {2,4,8}
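
For orientation, below is a minimal sketch of how the rest of demo.py might use the objects loaded above: sampling frames with decord, preprocessing them with the CLIP image processor, then generating a response. It assumes a LLaVA-style codebase; the prompt template, `tokenizer_image_token`, and the `images`/`modalities` keywords to `generate()` are assumptions here, not lines taken from the repository.

```python
# Hypothetical continuation of demo.py. This sketch follows common
# LLaVA-style demo code; the prompt template and generate() keywords
# are assumptions, not verified Video-XL source.
import numpy as np
import torch
from decord import VideoReader, cpu
from llava.constants import IMAGE_TOKEN_INDEX
from llava.mm_utils import tokenizer_image_token

# Uniformly sample up to max_frames_num frames from the video.
vr = VideoReader(video_path, ctx=cpu(0))
frame_idx = np.linspace(0, len(vr) - 1, max_frames_num, dtype=int).tolist()
frames = vr.get_batch(frame_idx).asnumpy()  # (num_frames, H, W, 3)

# Preprocess the frames with the CLIP image processor loaded above.
video_tensor = image_processor.preprocess(frames, return_tensors="pt")["pixel_values"]
video_tensor = video_tensor.to(model.device, dtype=torch.float16)

# Build a prompt containing the image placeholder and generate.
prompt = "<image>\nDescribe this video in detail."
input_ids = tokenizer_image_token(prompt, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt").unsqueeze(0).to(model.device)
with torch.inference_mode():
    output_ids = model.generate(input_ids, images=[video_tensor], modalities=["video"], **gen_kwargs)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```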