DIAMOND: Directed Inference for Artifact Mitigation in Flow Matching Models


DIAMOND is a training-free, inference-time guidance framework that tackles one of the most persistent challenges in modern text-to-image generation: visual and anatomical artifacts.

While recent models such as FLUX achieve impressive realism, they still frequently produce distorted structures, malformed anatomy, and visual inconsistencies. Unlike existing post-hoc or weight-modifying approaches, DIAMOND intervenes directly during the generative process by reconstructing a clean sample estimate at each step and steering the sampling trajectory away from artifact-prone latent states.

The method requires no additional training, fine-tuning, or weight modification, and it applies to both flow matching models and standard diffusion models, enabling robust, zero-shot, high-fidelity image synthesis with substantially fewer artifacts.
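
To make the idea concrete, below is a minimal, purely illustrative PyTorch sketch of one clean-sample guidance step. The names (velocity_model, artifact_loss, lam), the time convention, and the update rule are placeholder assumptions and do not reflect this repository's actual API.

import torch

# Illustrative sketch only (not the DIAMOND implementation). Assumes a flow
# matching convention x_t = (1 - t) * x0 + t * noise with v ≈ noise - x0;
# velocity_model, artifact_loss, and lam are hypothetical placeholders.
def guided_step(x_t, t, dt, velocity_model, artifact_loss, lam):
    x_t = x_t.detach().requires_grad_(True)
    v = velocity_model(x_t, t)          # predicted velocity at the current step
    x0_hat = x_t - t * v                # one-step estimate of the clean sample
    loss = artifact_loss(x0_hat)        # scalar score of how artifact-prone the estimate is
    grad = torch.autograd.grad(loss, x_t)[0]
    with torch.no_grad():
        # Standard Euler update of the flow, nudged away from high-loss latent states.
        x_next = x_t - dt * v - lam * grad
    return x_next.detach()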


📰 News

  • Feb. 2026: Initial codebase released with support for FLUX models (FLUX.1 [dev], FLUX.1 [schnell], FLUX.2 [dev]).
  • Feb. 2026: The paper is available on arXiv.
  • Coming Soon: SDXL code will be added to the repository.

⚙️ Environment Setup

We provide two separate environment configurations depending on the model variant.

🔹 Option A — FLUX.1 [dev], FLUX.1 [schnell], SDXL


Create and activate the Conda environment:

conda create -n diamond python=3.11 -y
conda activate diamond

Install PyTorch and remaining dependencies:

pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt

🔹 Option B — FLUX.2 [dev]

Requires a newer version of diffusers installed directly from GitHub.


conda create -n diamond-flux2 python=3.10 -y
conda activate diamond-flux2

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 \
  --index-url https://download.pytorch.org/whl/cu118

pip uninstall diffusers -y
pip install git+https://github.com/huggingface/diffusers.git -U

pip install -r requirements2.txt
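
Optionally, run a quick sanity check that the expected packages import and CUDA is visible (not part of the official setup):

python -c "import torch, torchvision, diffusers; print(torch.__version__, torchvision.__version__, diffusers.__version__, torch.cuda.is_available())"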

📦 SOTA Method Weights

We release the model weights we trained for several state-of-the-art (SOTA) artifact mitigation methods.

Base Model       | DiffDoctor  | HPSv2       | HandsXL
FLUX.1 [dev]     | Coming Soon | Coming Soon | Coming Soon
FLUX.1 [schnell] | Coming Soon | Coming Soon |
SDXL             | Coming Soon |             |
FLUX.2 [dev]     |             |             |

Full evaluation datasets (CSV files with prompts and corresponding random seeds) are provided in the datasets/ directory.
For SDXL, a shortened dataset variant is released, as no random seeds producing artifact-containing images could be found for some prompts.
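
For illustration, each dataset CSV pairs a prompt with the seed used to reproduce its image; the column names and rows below are hypothetical, so check the files in datasets/ for the exact layout:

prompt,seed
"A golden retriever puppy playing in autumn leaves",100285
"A violinist performing on a rainy city street",100391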

DIAMOND

🚀 Generate a Single Image

Move to the repository root:

cd DIAMOND

You can select the base model using model=dev (FLUX.1 [dev]) or model=schnell (FLUX.1 [schnell]). Setting guidance.enabled=true enables DIAMOND guidance during sampling. To run without DIAMOND (baseline), set guidance.enabled=false. You can also modify the loss type and the lambda_schedule to explore different guidance behaviors.

Run Generation

python src/generate_single_image.py \
  model=dev \
  'prompt="Luxury crystal blue diamond, premium brand mark, vector style, simple and iconic, 4k resolution"' \
  seed=100285 \
  guidance.enabled=false \
  loss=power \
  lambda_schedule=power \
  lambda_schedule.start=25 \
  lambda_schedule.end=1 \
  lambda_schedule.power=2 \
  output.run_name=example_run
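
As an aside, one plausible reading of lambda_schedule.start, lambda_schedule.end, and lambda_schedule.power is a power-law interpolation of the guidance weight over the sampling steps; the sketch below is written under that assumption and is not the exact schedule implemented in the repository.

def power_lambda(step, num_steps, start=25.0, end=1.0, power=2.0):
    # Illustrative only: guidance weight decays from `start` at the first step
    # to `end` at the last, along a power-law curve shaped by `power`.
    frac = step / max(num_steps - 1, 1)
    return end + (start - end) * (1.0 - frac) ** power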

For FLUX.2 [dev], use the separate script:

python src/generate_single_image_flux2.py \
  model=flux2dev \
  'prompt="Luxury crystal blue diamond, premium brand mark, vector style, simple and iconic, 4k resolution"' \
  seed=100285 \
  output.run_name=example_run

Important

Activate the correct Conda environment before running (see Environment Setup). Outputs are saved to the outputs/ directory.

LoRA-based SOTA Methods

See the 📦 SOTA Method Weights table for model support. Enable LoRA and set the appropriate checkpoint in lora.path.

Example (HandsXL)

python src/generate_single_image.py \
  model=dev \
  'prompt="A South Asian man, 35 years old, with a visual impairment, reading braille books in a library."' \
  seed=100283 \
  lora=enabled \
  lora.path="checkpoints/lora/people_handv1.safetensors" \
  guidance.enabled=false \
  output.run_name=lora_example

Important

When using LoRA-based SOTA methods, always set guidance.enabled=false.

🚀 Generate Multiple Images

The generation setup is identical to single-image generation. DIAMOND can be enabled or disabled using guidance.enabled=true/false.
LoRA-based SOTA methods can be used by setting lora=enabled and specifying lora.path.

For FLUX.1 [dev], FLUX.1 [schnell], use:

python src/generate_images_csv.py \
  model=schnell \
  csv_path=/path/to/prompts.csv \
  loss=power \
  lambda_schedule=power \
  lambda_schedule.start=25 \
  lambda_schedule.end=1 \
  lambda_schedule.power=2 \
  output.run_name=example_run

For FLUX.2 [dev], use:

python src/generate_csv_flux2.py \
  model=flux2dev \
  csv_path=/path/to/prompts.csv \
  loss=power \
  lambda_schedule=power \
  lambda_schedule.start=25 \
  lambda_schedule.end=1 \
  lambda_schedule.power=2 \
  output.run_name=example_run

📊 Evaluation / Metrics

The generate_metrics.py script computes quantitative evaluation metrics for generated images.
Results are saved to outputs/metrics/results.txt by default; the output location can be customized if needed.

The following metrics are computed: CLIP-T, MeanArtifactFreq (%), ArtifactPixelRatio (%), MAE, MAE(A), MAE(NA).

Run metric computation:

python src/generate_metrics.py \
  metrics.generated_dir=/path/to/generated/images \
  metrics.reference_dir=/path/to/reference/images \
  metrics.prompts_csv=/path/to/prompts.csv 

For computing ImageReward, please refer to the official repository: https://github.com/zai-org/ImageReward

Note

Prompt CSV files used for evaluation are provided in the datasets/ directory.

🗂 Generate Custom Evaluation Dataset

Generate a dataset by searching for valid seeds and saving prompts + seeds into a CSV file.
Prompts are provided as .txt files (one prompt per line); example files are in prompts/. The script also saves the generated images and the corresponding artifact masks. The seed parameter specifies the starting seed from which the search begins.
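
For illustration, a prompts file is plain text with one prompt per line (hypothetical contents, not the actual prompts/animals.txt):

A red panda climbing a mossy tree branch
A horse galloping across a misty meadow at dawn
A close-up portrait of a snow leopard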

python src/generate_dataset.py \
  model=dev \
  seed=100000 \
  dataset.prompts_file=prompts/animals.txt \
  dataset.name=my_dataset \
  output.run_name=dataset_gen

Note

Dataset generation is supported for FLUX.1 [dev], FLUX.1 [schnell], FLUX.2 [dev], and SDXL.
To switch models, only the script name and the model value need to be changed:

  • generate_dataset.py → dev/schnell
  • generate_dataset_flux2.py → flux2dev
  • generate_dataset_sdxl.py → sdxl
