Before starting training, ensure you have completed:
- Environment Setup: Install Being-VL and dependencies (see README.md)
- Data Preparation: Complete all steps in Data.md
  - VQ token extraction
  - vBPE tokenizer training
  - Dataset tokenization for all stages
- Model Initialization: Convert LLaMA to Being-VL base model
Note: We use accelerate for multi-node training. You may modify the training scripts to use your own framework, such as slurm or deepspeed.
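Before moving on, it can help to confirm that the prerequisite files are actually in place. The following is a minimal sketch, not part of the Being-VL repository: the `check_paths` helper and the `WORKSPACE` variable are hypothetical, and the paths are simply the placeholder paths used in this guide, so adjust them to your own layout.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not shipped with Being-VL).
# WORKSPACE and the listed paths are assumptions taken from the
# placeholder paths in this guide.

WORKSPACE="${WORKSPACE:-/path/to/your/workspace}"

# Report every argument that does not exist on disk.
# Returns 0 if all paths exist, 1 if at least one is missing.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_paths \
    "$WORKSPACE/models/Llama-3.1-8B" \
    "$WORKSPACE/models/being-tokenizer" \
    "$WORKSPACE/models/BeingVL-VQ-8K" \
    "$WORKSPACE/data/tokenized/pt/pretrain_data_vbpe.jsonl"; then
    echo "All prerequisite paths found."
else
    echo "Some prerequisites are missing; see the messages above."
fi
```

The check only tests for existence; it does not validate the contents of the tokenizer, VQ model, or tokenized data.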
Initialize the base model from LLaMA-3.1-8B for training:
python beingvl/utils/convert_llama_to_being.py \
    --llama_path /path/to/your/workspace/models/Llama-3.1-8B \
    --being_tokenizer_config_path /path/to/your/workspace/models/being-tokenizer \
    --being_vq_path /path/to/your/workspace/models/BeingVL-VQ-8K \
    --output_path /path/to/your/workspace/models/beingvl/base \
    --verify_loading

Expected Output:
- Initialized Being-VL base model in /path/to/your/workspace/models/beingvl/base/
- Model with extended vocabulary for VQ and vBPE tokens
- Verification logs confirming successful model loading
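As a quick sanity check after conversion, you could confirm that the output directory exists and is not empty. This is only a sketch: `dir_has_files` and `BASE_MODEL` are hypothetical names, and the check makes no assumption about which files the conversion script writes.

```shell
#!/usr/bin/env bash
# Hypothetical post-conversion check (not shipped with Being-VL).

# Returns 0 if $1 is a directory containing at least one entry.
dir_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Assumed location: the --output_path placeholder used in this guide.
BASE_MODEL="${BASE_MODEL:-/path/to/your/workspace/models/beingvl/base}"

if dir_has_files "$BASE_MODEL"; then
    echo "Base model directory looks populated: $BASE_MODEL"
else
    echo "Base model directory is missing or empty: $BASE_MODEL" >&2
fi
```

A populated directory does not prove the model loads correctly; the verification logs from `--verify_loading` remain the authoritative signal.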
Being-VL employs a 3-stage training methodology that combines curriculum-based data composition with progressive parameter unfreezing.
# Edit beingvl/scripts/train-stage-1.sh with your paths:
MODEL_PATH="/path/to/your/workspace/models/beingvl/base"
DATA_PATH="/path/to/your/workspace/data/tokenized/pt/pretrain_data_vbpe.jsonl"
OUTPUT_DIR="/path/to/your/workspace/models/beingvl/stage-1"
LOG_DIR="/path/to/your/workspace/logs/stage-1"
# Run Stage 1 training
bash beingvl/scripts/train-stage-1.sh <master_node_id>

# Edit beingvl/scripts/train-stage-2.sh with your paths:
DATA_PATH="/path/to/your/workspace/data/tokenized/sft_stage2/sft_stage2_data_vbpe.jsonl"
OUTPUT_DIR="/path/to/your/workspace/models/beingvl/stage-2"
LOG_DIR="/path/to/your/workspace/logs/stage-2"
# Run Stage 2 training (auto-detects Stage 1 output)
bash beingvl/scripts/train-stage-2.sh <master_node_id>

# Edit beingvl/scripts/train-stage-3.sh with your paths:
DATA_PATH="/path/to/your/workspace/data/tokenized/sft_stage3/sft_stage3_data_vbpe.jsonl"
OUTPUT_DIR="/path/to/your/workspace/models/beingvl/stage-3"
LOG_DIR="/path/to/your/workspace/logs/stage-3"
# Run Stage 3 training (auto-detects Stage 2 output)
bash beingvl/scripts/train-stage-3.sh <master_node_id>
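Since each stage picks up the previous stage's output, the three commands above can be chained so that a failure stops the pipeline early. The wrapper below is a sketch under that assumption; `run_stages` is a hypothetical helper, not part of the repository, and it relies only on the script names and the positional master node ID argument shown above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with Being-VL): run the three stage
# scripts in order and stop at the first one that fails.
#   $1: directory containing train-stage-{1,2,3}.sh
#   $2: master node ID, passed through to each stage script
run_stages() {
    script_dir="$1"
    master_node_id="$2"
    for stage in 1 2 3; do
        script="$script_dir/train-stage-$stage.sh"
        echo "=== Stage $stage: $script ==="
        if ! bash "$script" "$master_node_id"; then
            echo "Stage $stage failed; stopping." >&2
            return 1
        fi
    done
    echo "All stages finished."
}

# Usage (from the repository root), with your own master node ID:
#   run_stages beingvl/scripts <master_node_id>
```

Edit the paths inside each stage script before using the wrapper; it does not set `DATA_PATH`, `OUTPUT_DIR`, or `LOG_DIR` for you.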