This repository contains the implementation code for the paper:
Follow-Your-Preference: Towards Preference-Aligned Image Inpainting
International Conference on Learning Representations (ICLR 2026)
Yutao Shen, Junkun Yuan, Toru Aonishi, Hideki Nakayama, Yue Ma
We study image inpainting with preference alignment, providing insights into its effectiveness, scalability, and challenges.
Our method is implemented with Python 3.12 and CUDA 12.1. You need to prepare two environments:
- One for inference and training
- One for scoring and evaluation
- Download BrushData
  - Hugging Face dataset: https://huggingface.co/datasets/random123123/BrushData
  - Each item in BrushData contains fields like:
    - `000279227.aesthetic_score` (unused)
    - `000279227.caption` (string)
    - `000279227.height` (int)
    - `000279227.image` (binary; can be read as PNG; we highly recommend saving images locally as `.png`)
    - `000279227.original_key` (unused)
    - `000279227.segmentation` (dict)
    - `000279227.url` (unused)
    - `000279227.width` (int)
- Extract and build JSON
  - Untar all archives from BrushData, gather the useful fields, and create a JSON list like the following (a minimal assembly sketch is shown after this step):

    ```json
    [
      {
        "gt_image_path": "/your/path/to/000279227.png",
        "caption": "000279227.caption",
        "height": 000279227.height,
        "width": 000279227.width,
        "segmentation": 000279227.segmentation,
        "image_id": "00027_000279227"
      }
      // ...
    ]
    ```
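A minimal, hypothetical sketch of assembling this JSON list, assuming each image has already been saved locally as `<key>.png`. The helper `make_entry`, the output file name `brushdata_annotations.json`, and the iteration over samples are illustrative, not part of the repo:

```python
import json

def make_entry(shard_id, key, caption, height, width, segmentation, image_root):
    # shard_id like "00027", key like "000279227"
    return {
        "gt_image_path": f"{image_root}/{key}.png",
        "caption": caption,
        "height": height,
        "width": width,
        "segmentation": segmentation,
        "image_id": f"{shard_id}_{key}",
    }

entries = []
# ... append one make_entry(...) per sample after untarring BrushData
#     and saving each decoded image under image_root ...
with open("brushdata_annotations.json", "w") as f:
    json.dump(entries, f, indent=2)
```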
```bash
# Environment for inference and training
git clone --recursive https://github.com/shenytzzz/Follow-Your-Preference.git
conda create -n train python=3.12
conda activate train  # activate the env before installing
cd ./flux
pip install -e ".[all]"
cd ../diffusers
pip install -e ".[torch]"
pip install omegaconf transformers==4.52.0 accelerate==1.6.0 datasets==3.5.0 deepspeed==0.17.1
```

```bash
# Environment for scoring and evaluation
git clone --recursive https://github.com/shenytzzz/Follow-Your-Preference.git
conda create -n eval python=3.12
conda activate eval  # activate the env before installing
pip install image-reward
pip install hpsv2==1.2.0
pip install open-clip-torch==2.32.0
pip install clip
pip install torchmetrics==1.7.4
cd ./t2v_metrics
conda install pip -y
conda install ffmpeg -c conda-forge
pip install -e .
cd ..
pip install transformers==4.45.2
pip install hpsv3
pip install tensorboard
pip install wandb
```

Note
- Place `extra/bpe_simple_vocab_16e6.txt.gz` into `~/miniconda3/envs/test_eval/lib/python3.12/site-packages/hpsv2/src/open_clip/`
- We modified the score functions in `~/miniconda3/envs/test_eval/lib/python3.12/site-packages/hpsv2/__init__.py` and `~/miniconda3/envs/test_eval/lib/python3.12/site-packages/hpsv2/img_score.py` for multi-GPU evaluation. Our modified versions are also provided under `extra/` (a copy sketch is shown after this list).
- Our code is built upon `diffusers 0.33.1`. Following BrushNet, we:
  - Added the directory `brushnet` under `diffusers/src/diffusers/pipelines/`
  - Added the file `brushnet.py` under `diffusers/src/diffusers/models/`
  - Updated `__init__.py` accordingly.
  - Replaced the entire `unet` directory under `diffusers/src/diffusers/models/` with the one from `diffusers 0.27.0` to ensure compatibility with BrushNet. (Other models that rely on the `unet` from `diffusers 0.33.1` may not work as expected.)
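A hedged sketch of the two hpsv2-related notes above. The environment path follows the note (adjust it to your setup), and the file names `extra/__init__.py` and `extra/img_score.py` are assumptions about how the modified versions are laid out under `extra/`:

```bash
# Adjust ENV_SITE to your own environment; file names under extra/ are assumptions.
ENV_SITE=~/miniconda3/envs/test_eval/lib/python3.12/site-packages
cp extra/bpe_simple_vocab_16e6.txt.gz "$ENV_SITE/hpsv2/src/open_clip/"
cp extra/__init__.py extra/img_score.py "$ENV_SITE/hpsv2/"
```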
We have released our models on Hugging Face; feel free to give them a try.
- BruPA

  ```python
  from diffusers import StableDiffusionBrushNetPipeline, BrushNetModel, UniPCMultistepScheduler
  import torch
  import cv2
  import numpy as np
  from PIL import Image

  brushnet = BrushNetModel.from_pretrained(
      "shenyt/BruPA", subfolder="brushnet", torch_dtype=torch.float16
  ).to("cuda")
  pipe = StableDiffusionBrushNetPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", brushnet=brushnet, torch_dtype=torch.float16
  ).to("cuda")
  pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

  # Load the source image and mask, then blank out the masked region
  init_image = cv2.imread(...)[:, :, ::-1]
  mask_image = 1. * (cv2.imread(...).sum(-1) > 255)[:, :, np.newaxis]
  init_image = init_image * (1 - mask_image)
  init_image = Image.fromarray(init_image.astype(np.uint8)).convert("RGB")
  mask_image = Image.fromarray(mask_image.astype(np.uint8).repeat(3, -1) * 255).convert("RGB")

  caption = "..."  # text prompt describing the masked region
  generator = torch.Generator("cuda").manual_seed(0)  # seed for reproducibility
  brushnet_conditioning_scale = 1.0  # strength of the BrushNet conditioning

  image = pipe(
      caption,
      init_image,
      mask_image,
      num_inference_steps=50,
      generator=generator,
      brushnet_conditioning_scale=brushnet_conditioning_scale
  ).images[0]
  image.save("output.png")
  ```
- FluPA

  ```python
  import torch
  from diffusers import FluxFillPipeline, FluxTransformer2DModel
  from PIL import Image

  # Load the source image and mask
  image = Image.open(...).convert("RGB")
  mask = Image.open(...).convert("RGB")

  transformer = FluxTransformer2DModel.from_pretrained(
      "shenyt/FluPA-fill", subfolder="transformer", torch_dtype=torch.bfloat16
  ).to("cuda")
  pipe = FluxFillPipeline.from_pretrained(
      "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16, transformer=transformer
  ).to("cuda")

  image = pipe(
      prompt="...",
      image=image,
      mask_image=mask,
      height=512,
      width=512,
      guidance_scale=30,
      num_inference_steps=20,
      generator=torch.Generator("cpu").manual_seed(0)
  ).images[0]
  image.save("output.png")
  ```
- Configure paths and seed
  - Set the JSON path from Installation and the seed `generator_seed` in either:
    - the Sample part of `/path/to/configs_flux.yaml`
    - the Sample part of `/path/to/configs_brushnet.yaml`
- Run the sampler
  - Use one of:
    - `scripts/sample_flux.py`
    - `scripts/sample_brushnet.py`
  - For sampling, we use config files to manage everything. Specify `generator_seed` in `configs/configs_<model>.yaml` to control diversity.

    ```bash
    # BrushNet
    accelerate launch \
      --config_file configs/accelerate_default.yaml \
      scripts/sample_brushnet.py

    # FLUX.1 Fill
    accelerate launch \
      --config_file configs/accelerate_default.yaml \
      scripts/sample_flux.py
    ```
- Create annotation list
  - Follow "Merge the annotations for sampled images" in `scripts/merge_score_jsons.ipynb` to produce a list-style JSON for scoring.
- Start the vLLM server
  - Follow UnifiedReward to create the environment for `CodeGoat24/UnifiedReward-qwen-7b`, then launch:

    ```bash
    conda activate vllm
    bash scripts/vllm_server.sh
    ```
- Run scoring
  - General metrics:

    ```bash
    conda activate eval
    accelerate launch --config_file configs/accelerate_default.yaml \
      scripts/score.py \
      --metric <ensemble> \
      --annotation_path /path/to/annotation_list \
      --output_dir /path/to/output
    ```

  - UnifiedReward only:

    ```bash
    python scripts/score_unifiedreward.py \
      --annotation_path /path/to/annotation_list \
      --output_dir /path/to/output \
      --seed 0 \
      --port <port_you_specify_when_launching_vllm>
    ```
- Merge scores into one JSON file
  - Continue in `scripts/merge_score_jsons.ipynb` to attach scores to annotations for training (a hypothetical sketch of this merge is shown below).
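A hypothetical sketch of what this merge does; the actual logic lives in `scripts/merge_score_jsons.ipynb`, and the file paths and the `image_path` / `score` keys below are illustrative assumptions. Each sampled image's score is attached to its annotation entry so the DPO training scripts can later form preferred / rejected pairs:

```python
import json

with open("/path/to/annotation_list.json") as f:
    annotations = json.load(f)   # list of entries, one per sampled image
with open("/path/to/output/scores.json") as f:
    scores = json.load(f)        # assumed mapping: image_path -> score

for entry in annotations:
    # "image_path" and "score" are assumed key names, not the notebook's actual schema
    entry["score"] = scores.get(entry["image_path"])

with open("/path/to/annotation_list_with_scores.json", "w") as f:
    json.dump(annotations, f, indent=2)
```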
- For training, we use config files to manage everything. You can find them in `configs/configs_<model>.yaml`.
- FLUX.1 Fill

  ```bash
  conda activate train
  accelerate launch \
    --config_file configs/accelerate_flux.yaml \
    scripts/train_flux_dpo.py
  ```

- BrushNet

  ```bash
  conda activate train
  accelerate launch \
    --config_file configs/accelerate_default.yaml \
    scripts/train_brushnet_dpo.py
  ```
- Generate images for testing

  ```bash
  conda activate train
  accelerate launch \
    --num_processes 8 \
    --mixed_precision bf16 \
    scripts/gen.py \
    --ckpt_path /path/to/ckpt \
    --base_model_path [black-forest-labs/FLUX.1-Fill-dev|runwayml/stable-diffusion-v1-5] \
    --image_save_path /path/to/save/images \
    --mapping_file /path/to/BrushBench/mapping_file_list.json \
    --base_dir /path/to/BrushBench \
    --use_blended \
    --model flux \
    --benchmark brushbench \
    --num_steps 50  # To reproduce our results, use 20 for FLUX.1 Fill and 10 for BrushNet
  ```
- Evaluate images (remember to launch the vLLM server first)

  ```bash
  conda activate eval
  accelerate launch --num_processes 8 \
    --mixed_precision bf16 \
    scripts/eval.py \
    --image_save_path /path/to/save/images \
    --benchmark brushbench \
    --mapping_file /path/to/BrushBench/mapping_file_list.json \
    --base_dir /path/to/BrushBench \
    --use_blend
  ```

- Evaluate images with GPT-4

  ```bash
  python scripts/eval_gpt4.py \
    --save_path /path/to/save/results \
    --mapping_file /path/to/BrushBench/mapping_file_list.json \
    --mask_key inpainting_mask \
    --base_dir /path/to/bench \
    --image_dir /path/to/images/to/eval
  ```
```bibtex
@article{fyp2026,
  title={Follow-Your-Preference: Towards Preference-Aligned Image Inpainting},
  author={Yutao Shen and Junkun Yuan and Toru Aonishi and Hideki Nakayama and Yue Ma},
  journal={International Conference on Learning Representations},
  year={2026}
}
```
