🌐 Project Page | 📄 arXiv
Alicja Polowczyk*, Agnieszka Polowczyk*, Piotr Borycki, Joanna Waczyńska, Jacek Tabor, Przemysław Spurek
(*equal contribution)
DIAMOND is a training-free, inference-time guidance framework that tackles one of the most persistent challenges in modern text-to-image generation: visual and anatomical artifacts.
While recent models such as FLUX achieve impressive realism, they still frequently produce distorted structures, malformed anatomy, and visual inconsistencies. Unlike existing post-hoc or weight-modifying approaches, DIAMOND intervenes directly during the generative process by reconstructing a clean sample estimate at each step and steering the sampling trajectory away from artifact-prone latent states.
The method requires no additional training, no finetuning, and no weight modification, and can be applied to both flow matching models and standard diffusion models, enabling robust, zero-shot, high-fidelity image synthesis with substantially reduced artifacts.
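To make the idea concrete, below is a minimal sketch of this kind of inference-time guidance (an illustration only, assuming one common flow-matching convention; the function and variable names are placeholders, not the actual DIAMOND API):

```python
import torch

def guided_step(latent, sigma, velocity_fn, artifact_loss_fn, lam):
    """One guided sampling step: estimate the clean sample, score it with an
    artifact loss, and nudge the latent away from artifact-prone states.
    velocity_fn and artifact_loss_fn are hypothetical stand-ins for the model
    forward pass and the artifact objective."""
    latent = latent.detach().requires_grad_(True)
    v = velocity_fn(latent, sigma)           # model prediction at noise level sigma
    x0_hat = latent - sigma * v              # clean-sample estimate at this step
    loss = artifact_loss_fn(x0_hat)          # scalar "artifact-proneness" score
    grad = torch.autograd.grad(loss, latent)[0]
    return (latent - lam * grad).detach()    # steer the trajectory before the next solver step
```

The guided latent is then handed back to the sampler as usual, which is why no training, finetuning, or weight modification is needed.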
- Feb. 2026: Initial codebase released with support for FLUX models (FLUX.1-dev, FLUX-schnell, FLUX-2-dev).
- Feb. 2026: Paper is available on arXiv.
- Coming Soon: SDXL code will be added to the repository.
We provide two separate environment configurations depending on the model variant.
Create and activate the Conda environment:
conda create -n diamond python=3.11 -y
conda activate diamond
Install PyTorch and remaining dependencies:
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt
The FLUX.2 [dev] environment requires a newer version of diffusers, installed directly from GitHub:
conda create -n diamond-flux2 python=3.10 -y
conda activate diamond-flux2
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 \
--index-url https://download.pytorch.org/whl/cu118
pip uninstall diffusers -y
pip install git+https://github.com/huggingface/diffusers.git -U
pip install -r requirements2.txt
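An optional sanity check before running anything (a generic snippet, not part of the repository); in the FLUX.2 environment, diffusers should report a recent version since it is installed directly from GitHub:

```python
import torch
import diffusers

# Confirm CUDA is visible and print installed library versions.
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("diffusers:", diffusers.__version__)
```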
We release our trained model weights for several state-of-the-art artifact mitigation methods.
| Base Model | DiffDoctor | HPSv2 | HandsXL |
|---|---|---|---|
| FLUX.1 [dev] | Coming Soon | Coming Soon | Coming Soon |
| FLUX.1 [schnell] | Coming Soon | Coming Soon | — |
| SDXL | — | — | Coming Soon |
| FLUX.2 [dev] | — | — | — |
Full evaluation datasets (CSV files with prompts and corresponding random seeds) are provided in the datasets/ directory.
For SDXL, a shortened dataset variant is released, as no random seeds producing artifact-containing images could be found for some prompts.
Move to the repository root:
cd DIAMOND
You can select the base model using model=dev (FLUX.1 [dev]) or model=schnell (FLUX.1 [schnell]).
Setting guidance.enabled=true enables DIAMOND guidance during sampling. To run without DIAMOND (baseline), set guidance.enabled=false.
You can also modify the loss type and the lambda_schedule to explore different guidance behaviors.
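As a rough intuition for the power schedule options used below, here is one plausible reading (an assumption about the config semantics, not the repository's implementation): the guidance weight decays from lambda_schedule.start to lambda_schedule.end over the sampling steps, following a power-law curve with exponent lambda_schedule.power.

```python
def power_lambda(step, num_steps, start=25.0, end=1.0, power=2.0):
    """Hypothetical power-law decay of the guidance weight across sampling steps."""
    frac = step / max(num_steps - 1, 1)                 # progress in [0, 1]
    return end + (start - end) * (1.0 - frac) ** power

# Example: with 28 steps the weight decays from 25.0 at the first step to 1.0 at the last.
print([round(power_lambda(i, 28), 2) for i in (0, 14, 27)])  # [25.0, 6.56, 1.0]
```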
python src/generate_single_image.py \
model=dev \
'prompt="Luxury crystal blue diamond, premium brand mark, vector style, simple and iconic, 4k resolution"' \
seed=100285 \
guidance.enabled=false \
loss=power \
lambda_schedule=power \
lambda_schedule.start=25 \
lambda_schedule.end=1 \
lambda_schedule.power=2 \
output.run_name=example_run
For FLUX.2 [dev], use the separate script:
python src/generate_single_image_flux2.py \
model=flux2dev \
'prompt="Luxury crystal blue diamond, premium brand mark, vector style, simple and iconic, 4k resolution"' \
seed=100285 \
output.run_name=example_run
Important
Activate the correct Conda environment before running (see Environment Setup).
Outputs are saved to the outputs/ directory.
See the 📦 SOTA Method Weights table for model support. Enable LoRA and set the appropriate checkpoint in lora.path.
python src/generate_single_image.py \
model=dev \
'prompt="A South Asian man, 35 years old, with a visual impairment, reading braille books in a library."' \
seed=100283 \
lora=enabled \
lora.path="checkpoints/lora/people_handv1.safetensors" \
guidance.enabled=false \
output.run_name=lora_example
Important
When using LoRA-based SOTA methods, always set guidance.enabled=false.
The generation setup is identical to single-image generation. DIAMOND can be enabled or disabled using guidance.enabled=true/false.
LoRA-based SOTA methods can be used by setting lora=enabled and specifying lora.path.
For FLUX.1 [dev] and FLUX.1 [schnell], use:
python src/generate_images_csv.py \
model=schnell \
csv_path=/path/to/prompts.csv \
loss=power \
lambda_schedule=power \
lambda_schedule.start=25 \
lambda_schedule.end=1 \
lambda_schedule.power=2 \
output.run_name=example_run
For FLUX.2 [dev], use:
python src/generate_csv_flux2.py \
model=flux2dev \
csv_path=/path/to/prompts.csv \
loss=power \
lambda_schedule=power \
lambda_schedule.start=25 \
lambda_schedule.end=1 \
lambda_schedule.power=2 \
output.run_name=example_run
The generate_metrics.py script computes quantitative evaluation metrics for generated images.
Results are saved to outputs/metrics/results.txt by default and can be customized if needed.
The following metrics are computed: CLIP-T, MeanArtifactFreq (%), ArtifactPixelRatio (%), MAE, MAE(A), MAE(NA).
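For reference, CLIP-T is the usual CLIP text-image similarity. A minimal sketch of how such a score is typically computed (illustrative only; the repository's implementation may use a different CLIP backbone and aggregation):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_t(image_path: str, prompt: str) -> float:
    """Cosine similarity between CLIP image and text embeddings."""
    inputs = processor(text=[prompt], images=Image.open(image_path),
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float((img * txt).sum(dim=-1))
```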
python src/generate_metrics.py \
metrics.generated_dir=/path/to/generated/images \
metrics.reference_dir=/path/to/reference/images \
metrics.prompts_csv=/path/to/prompts.csv
For computing ImageReward, please refer to the official repository: https://github.com/zai-org/ImageReward
Note
Prompt CSV files used for evaluation are provided in the datasets/ directory.
Generate a dataset by searching for valid seeds and saving prompts + seeds into a CSV file.
Prompts are provided as .txt files (one per line). Example files are in prompts/.
The script also saves generated images and corresponding artifact masks.
The seed parameter specifies the starting seed from which the search begins.
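Conceptually, the search loop looks roughly like the sketch below (hypothetical helper functions, not the repository's code): starting from seed, each prompt is retried with increasing seeds until the generated image contains detectable artifacts, and the resulting (prompt, seed) pair is written to the CSV.

```python
import csv

def build_dataset(prompts, start_seed, out_csv, generate, detect_artifact_mask, max_tries=50):
    """Hypothetical seed search. generate(prompt, seed) returns an image and
    detect_artifact_mask(image) returns a boolean mask that is non-empty when
    artifacts are present; both stand in for the repository's actual components."""
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prompt", "seed"])
        for prompt in prompts:
            for seed in range(start_seed, start_seed + max_tries):
                image = generate(prompt, seed)
                if detect_artifact_mask(image).any():   # keep seeds that yield artifacts
                    writer.writerow([prompt, seed])
                    break                                # next prompt
```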
python src/generate_dataset.py \
model=dev \
seed=100000 \
dataset.prompts_file=prompts/animals.txt \
dataset.name=my_dataset \
output.run_name=dataset_gen
Note
Dataset generation is supported for FLUX.1 [dev], FLUX.1 [schnell], FLUX.2 [dev], and SDXL.
To switch models, only the script name and the model value need to be changed:
- generate_dataset.py → dev / schnell
- generate_dataset_flux2.py → flux2dev
- generate_dataset_sdxl.py → sdxl
