Official implementation of Value Gradient Guidance for Flow Matching Alignment (VGG-Flow), NeurIPS 2025.
VGG-Flow is an efficient and robust RL finetuning method for flow-matching models.
This repository currently provides:
- SD3 LoRA finetuning with VGG-Flow.
- Multiple reward models (`aesthetic_score`, `pickscore`, `imagereward`, `hpscore`).
- Config-driven training with command-line overrides.
Repository layout:
- `train_vggflow.py`: main training entrypoint.
- `config/default_config.py`: base config values.
- `config/*.py`: reward-specific experiment configs.
- `lib/`: model, flow matching, reward, and training modules.
- `run.sh`: example launch command with many overrides.
- Python >= 3.8
- CUDA (tested with 12.4)
- PyTorch + TorchVision
Install dependencies:
```bash
pip install -r requirements.txt
```

By default, training loads `stabilityai/stable-diffusion-3-medium-diffusers` via Diffusers.
Make sure your environment has access to the required Hugging Face model weights.
Before training, review config/default_config.py and update values as needed.
Important fields:
- `config.logging.wandb_key`: replace `"PLACEHOLDER"` if using Weights & Biases.
- `config.logging.use_wandb`: set to `False` to disable W&B logging.
- `config.logging.wandb_dir`: local W&B output directory.
- `config.saving.output_dir`: checkpoint/output directory.
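The training entrypoint reads these values from a nested config object. A minimal sketch of the shape `config/default_config.py` suggests (field names come from this README; the use of `SimpleNamespace`, and the default values shown, are assumptions for illustration — the real file may use `ml_collections.ConfigDict`):

```python
# Hypothetical sketch of the config layout; check config/default_config.py
# for the actual structure and defaults.
from types import SimpleNamespace

def get_config():
    config = SimpleNamespace()
    config.logging = SimpleNamespace(
        wandb_key="PLACEHOLDER",  # replace if using Weights & Biases
        use_wandb=True,           # set to False to disable W&B logging
        wandb_dir="./wandb",      # assumed default; verify in the real file
    )
    config.saving = SimpleNamespace(
        output_dir="./outputs",   # assumed default; verify in the real file
    )
    return config

cfg = get_config()
cfg.logging.use_wandb = False  # disable W&B, as described above
```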
Reward presets:
- `config/aesthetic.py`
- `config/pickscore.py`
- `config/imagereward.py`
- `config/hpsv2.py`
Example: 2-GPU single-node training with the aesthetic reward preset:
```bash
torchrun --standalone --nproc_per_node=2 train_vggflow.py \
    --config=config/aesthetic.py \
    --seed=1 \
    --exp_name=exp_aesthetic
```

Single-GPU run:
```bash
torchrun --standalone --nproc_per_node=1 train_vggflow.py \
    --config=config/aesthetic.py \
    --seed=1 \
    --exp_name=exp_aesthetic
```

Override any config value from the command line, for example:
```bash
torchrun --standalone --nproc_per_node=2 train_vggflow.py \
    --config=config/aesthetic.py \
    --config.model.reward_scale=1e4 \
    --config.sampling.num_steps=20 \
    --config.training.lr=1e-3 \
    --seed=1 \
    --exp_name=exp_custom
```

Key config fields:
- `config.model.reward_scale`: reward strength; larger values push harder toward the reward objective.
- `config.model.timestep_fraction`: fraction of trajectory transitions used for updates.
- `config.sampling.num_steps`: number of Euler sampling steps.
- `config.training.lr`: optimizer learning rate.
- `config.training.batch_size` + `config.training.gradient_accumulation_steps`: effective optimization batch size.
- `config.model.unet_reg_scale`: regularization strength for preserving base behavior.
- Checkpoints are saved under `config.saving.output_dir/<reward>_vggflow_<exp_name>_seed<seed>/checkpoint_epoch*`.
- Training stats are written to `.../result.json` (compressed pickle format).
- If enabled, metrics and sample images are also logged to W&B.
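Since `result.json` is described as a compressed pickle rather than plain JSON, something along these lines should read it back — gzip is an assumption here, and `load_results` / `latest_checkpoint` are hypothetical helpers, not functions from this repo:

```python
import glob
import gzip
import os
import pickle

def load_results(run_dir: str):
    """Load training stats, assuming gzip-compressed pickle (unverified)."""
    with gzip.open(os.path.join(run_dir, "result.json"), "rb") as f:
        return pickle.load(f)

def latest_checkpoint(run_dir: str):
    """Return the newest checkpoint_epoch* path by modification time, or None."""
    ckpts = glob.glob(os.path.join(run_dir, "checkpoint_epoch*"))
    return max(ckpts, key=os.path.getmtime) if ckpts else None
```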
If you find this work useful, please cite:
```bibtex
@inproceedings{liu2025vggflow,
  title={Value Gradient Guidance for Flow Matching Alignment},
  author={Liu, Zhen and Xiao, Tim Z. and Liu, Weiyang and Domingo-Enrich, Carles and Zhang, Dinghuai},
  booktitle={NeurIPS},
  year={2025},
}
```