Part of Tiny Aya Expedition - Tiny Aya Vision
Requires Python 3.12+ and uv.

```bash
uv sync
```

For development (includes pytest, ruff):

```bash
uv sync --group dev
```

The default configuration pulls PyTorch wheels for CUDA 12.4. To use a different CUDA version or a CPU-only build:
```bash
# CPU-only
UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cpu uv sync

# CUDA 12.1
UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cu121 uv sync
```

Download the dataset (~13 GB):
```bash
python scripts/download_llava_pretrain.py --output-dir data/llava-pretrain
```

Train Alignment

```bash
python pipeline/train_alignment.py --vision-encoder siglip --llm global --models-dir outputs/checkpoints --data-dir data/llava-pretrain
```

We use Hydra for configuration management. Training can run locally or on Modal.
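With Hydra, bare keys like `vision=siglip` and `llm=global` select config groups, while dotted keys such as `training.batch_size=16` override individual values. A hypothetical config layout illustrating this (the file paths, group names, and defaults here are assumptions, not taken from the repo):

```yaml
# conf/config.yaml (hypothetical layout)
defaults:
  - vision: clip      # swapped at the CLI with vision=siglip
  - llm: global       # selects conf/llm/global.yaml
  - _self_

training:
  batch_size: 8       # overridden with training.batch_size=16
```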
```bash
# Run with defaults
python pipeline/train_alignment.py

# Switch the vision encoder to siglip and customize parameters inline
python pipeline/train_alignment.py vision=siglip training.batch_size=16 llm=global

# Resume an existing run
python pipeline/train_alignment.py resume="my-previous-uuid"
```

Run the alignment training on Modal without touching code; Hydra overrides are passed through directly:
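Conceptually, each Hydra-style override such as `training.batch_size=16` rewrites one node in a nested config tree. A minimal stdlib sketch of that behavior (an illustration only, not Hydra's actual implementation):

```python
def apply_override(config: dict, override: str) -> None:
    """Apply a single 'a.b.c=value' override to a nested dict."""
    key_path, _, raw_value = override.partition("=")
    *parents, leaf = key_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    # Best-effort parsing: ints stay ints, everything else is a string.
    try:
        value = int(raw_value)
    except ValueError:
        value = raw_value.strip('"')
    node[leaf] = value

config = {"vision": "clip", "training": {"batch_size": 8}}
for override in ["vision=siglip", "training.batch_size=16"]:
    apply_override(config, override)
print(config)  # {'vision': 'siglip', 'training': {'batch_size': 16}}
```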
```bash
# Run on Modal with defaults
modal run scripts/modal_train_alignment.py

# Or with Hydra overrides
modal run scripts/modal_train_alignment.py vision=siglip training.batch_size=32
```