hunarbatra/PiXLLaVA

PiXLLaVA: Interleaved object-centric Vision-Language Alignment for improving MLLMs

Add a .env file with the following keys:

WANDB_API_KEY=YOUR_WANDB_API_KEY
HF_TOKEN=YOUR_HUGGING_FACE_TOKEN # to upload trained models to the Hugging Face Hub
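
Before starting a long run, it can help to confirm that both keys are set and no longer hold the placeholder values. The following is a minimal sketch using only the Python standard library. The helper names `parse_env_text` and `missing_keys` are hypothetical and not part of this repository, and the parser is deliberately simplified: it treats everything after a `#` as a comment, so it would not suit values that contain `#`.

```python
from pathlib import Path

# Keys listed in the .env example above.
REQUIRED_KEYS = ("WANDB_API_KEY", "HF_TOKEN")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments (simplified)."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop full-line and inline comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return required keys that are absent, empty, or still a YOUR_... placeholder."""
    return [k for k in required if not values.get(k) or values[k].startswith("YOUR_")]


if __name__ == "__main__":
    env_path = Path(".env")  # assumption: run from the directory that holds .env
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(values)
    if missing:
        print("Missing or placeholder values in .env:", ", ".join(missing))
    else:
        print(".env looks complete")
```

How the training scripts actually read these variables is not stated here, so treat this only as a pre-flight check.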

Steps to run:

# install packages and load in editable mode
pip install -e .

# download data (96 GB)
python download_data.py pretrain_data
python download_data.py finetune_data # takes 1-2 hours

# init base model
bash scripts/pixllava/get_base_model.sh

# pretrain
bash scripts/pixllava/pretrain.sh

# finetune
bash scripts/pixllava/finetune.sh

Alternatively, run bash run.sh to execute all of the scripts above.
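
The order of the steps matters: each stage depends on the output of the one before it. As an illustration only, the sequence listed above can be sketched as a small Python driver that stops at the first failing command. This is not the contents of run.sh, which is not shown here. The `run_steps` helper and its `dry_run` and `runner` parameters are hypothetical names introduced for this example.

```python
import subprocess

# The commands from the steps above, in the order they are listed.
STEPS = [
    ["pip", "install", "-e", "."],
    ["python", "download_data.py", "pretrain_data"],
    ["python", "download_data.py", "finetune_data"],
    ["bash", "scripts/pixllava/get_base_model.sh"],
    ["bash", "scripts/pixllava/pretrain.sh"],
    ["bash", "scripts/pixllava/finetune.sh"],
]


def run_steps(steps, dry_run=False, runner=subprocess.run):
    """Run each command in order; a non-zero exit raises and halts the sequence."""
    completed = []
    for cmd in steps:
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True makes subprocess.run raise CalledProcessError on failure
            runner(cmd, check=True)
        completed.append(cmd)
    return completed


if __name__ == "__main__":
    # Dry run by default: print the commands without executing them.
    run_steps(STEPS, dry_run=True)
```

Calling `run_steps(STEPS)` without `dry_run` would execute the commands for real, including the 96 GB download, so the sketch defaults to printing them.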
