Follow-up works

The following downstream works build on ADM:

Paper	Task	Date	Conference/Journal
CycleDiff	Cycle Generation	2026-01-17	TIP 2026
DISCO	Combinatorial optimization	2025-08-13	CAD/Graphics 2025 Best Paper >>Graphical Models
LaDi-WM	World modeling	2025-08-02	CoRL 2025
EA6D	6D Pose estimation	2025-06-26	ICCV 2025
PASG	Shape generation	2025-04-18	IEEE TVCG
DiffMOT	Multiple object tracking	2024-02-27	CVPR 2024
DiffusionEdge	Edge detection	2023-12-09	AAAI 2024

ADM

Simultaneous Image to Zero and Zero to Noise: Diffusion Models with Analytical Image Attenuation

Framework

News

Updated ddm_const_2: the noise scheduler \sqrt{t} is replaced with t.
Training for text-2-img is now supported; please refer to text-2-img.
The two-branch UNet has been modified into a single-decoder UNet architecture.
You can use the single-decoder UNet in uncond-unet-sd and cond-unet-sd.
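To make the first news item concrete, the sketch below contrasts the two noise scales. It assumes a constant-attenuation forward process in which the image term decays linearly as (1 - t) * x0 while the noise term is scaled by s(t); this form is an assumption based on the paper title, not code taken from this repository. The names `forward_sample` and `noise_scale` are hypothetical.

```python
import math
import random


def forward_sample(x0, t, eps, noise_scale="linear"):
    """Return x_t = (1 - t) * x0 + s(t) * eps for t in [0, 1].

    s(t) = sqrt(t) when noise_scale == "sqrt" (the earlier scheduler),
    s(t) = t       when noise_scale == "linear" (the updated scheduler).
    This closed form is an assumption made for illustration.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if noise_scale == "sqrt":
        s = math.sqrt(t)
    elif noise_scale == "linear":
        s = t
    else:
        raise ValueError("noise_scale must be 'sqrt' or 'linear'")
    return [(1.0 - t) * x + s * e for x, e in zip(x0, eps)]


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [0.5, -0.25, 1.0]                  # toy "image" with three pixels
    eps = [rng.gauss(0.0, 1.0) for _ in x0]  # standard Gaussian noise
    for t in (0.0, 0.25, 1.0):
        print(t, forward_sample(x0, t, eps, "sqrt"), forward_sample(x0, t, eps, "linear"))
```

Under this assumption both variants agree at t = 0 (pure image) and t = 1 (pure noise); they differ only in how quickly noise is injected in between.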

I. Before Starting.

Install torch:

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

Install the other packages:

pip install -r requirement.txt

Prepare the accelerate config:

accelerate config
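After the three steps above, a quick import check can catch a missing package before a long training run starts. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the list of packages checked are illustrative assumptions, not part of this repository.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Packages named in the install commands above (assumed minimum set).
    required = ["torch", "torchvision", "torchaudio", "accelerate"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```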

II. Prepare Data.

The file structure should look like:
(a) unconditional cifar10:

cifar-10-python
|-- cifar-10-batches-py
|   |-- data_batch_1
|   |-- data_batch_2
|   |-- XXX

(b) unconditional CelebA-HQ:

celebahq
|-- celeba_hq_256
|   |-- 00000.jpg
|   |-- 00001.jpg
|   |-- XXXXX.jpg

(c) conditional DIV2K:

DIV2K
|-- DIV2K_train_HR
|   |-- 0001.png
|   |-- 0002.png
|   |-- XXXX.png
|-- DIV2K_valid_HR
|   |-- 0801.png
|   |-- 0802.png
|   |-- XXXX.png

(d) conditional DUTS:

DUTS
|-- DUTS-TR
|   |-- DUTS-TR-Image
|   |   |-- XXX.jpg
|   |-- DUTS-TR-Mask
|   |   |-- XXX.png
|-- DUTS-TE
|   |-- DUTS-TE-Image
|   |   |-- XXX.jpg
|   |-- DUTS-TE-Mask
|   |   |-- XXX.png
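The layouts above can be verified before training. The sketch below lists which expected sub-directories are absent under a dataset root; the directory names are copied from the trees above, while the helper `missing_dirs`, the dataset keys, and the command-line usage are hypothetical.

```python
from pathlib import Path

# Sub-directories expected under each dataset root, taken from the trees above.
EXPECTED = {
    "cifar10": ["cifar-10-batches-py"],
    "celebahq": ["celeba_hq_256"],
    "div2k": ["DIV2K_train_HR", "DIV2K_valid_HR"],
    "duts": [
        "DUTS-TR/DUTS-TR-Image",
        "DUTS-TR/DUTS-TR-Mask",
        "DUTS-TE/DUTS-TE-Image",
        "DUTS-TE/DUTS-TE-Mask",
    ],
}


def missing_dirs(root, dataset):
    """Return the expected sub-directories of `dataset` that do not exist under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED[dataset] if not (root / sub).is_dir()]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_layout.py duts /path/to/DUTS
    if len(sys.argv) == 3:
        dataset, root = sys.argv[1], sys.argv[2]
        missing = missing_dirs(root, dataset)
        print("Layout OK" if not missing else f"Missing: {missing}")
```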

III. Unconditional training on image space for Cifar10 dataset.

accelerate launch train_uncond_dpm.py --cfg ./configs/cifar10/ddm_uncond_const_uncond_unet.yaml

IV. Unconditional training on latent space for CelebAHQ256 dataset.

Train the auto-encoder:

accelerate launch train_vae.py --cfg ./configs/celebahq/celeb_ae_kl_256x256_d4.yaml

Add the path of the auto-encoder weights obtained in the first step to the config file ./configs/celebahq/celeb_uncond_ddm_const_uncond_unet_ldm.yaml (line 41), then train the latent diffusion model:

accelerate launch train_uncond_ldm.py --cfg ./configs/celebahq/celeb_uncond_ddm_const_uncond_unet_ldm.yaml
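Since the instruction above refers to a line number rather than a key name, it may help to print that line with a little context and confirm it is the checkpoint entry before editing. This is a minimal sketch; the helper `lines_around` is hypothetical, and line 41 may shift if the config file has changed since this README was written.

```python
from pathlib import Path


def lines_around(path, lineno, context=2):
    """Return (number, text) pairs for lines lineno-context..lineno+context (1-indexed)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    start = max(1, lineno - context)
    end = min(len(lines), lineno + context)
    return [(i, lines[i - 1]) for i in range(start, end + 1)]


if __name__ == "__main__":
    cfg = Path("./configs/celebahq/celeb_uncond_ddm_const_uncond_unet_ldm.yaml")
    if cfg.exists():
        for number, text in lines_around(cfg, 41):
            marker = ">" if number == 41 else " "
            print(f"{marker}{number:4d}  {text}")
    else:
        print(f"{cfg} not found; run this from the repository root.")
```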

V. Conditional training on latent space for DIV2K dataset. (super-resolution task for example.)

Train the auto-encoder:

accelerate launch train_vae.py --cfg ./configs/super-resolution/div2k_ae_kl_512x512_d4.yaml

Train the latent diffusion model:

accelerate launch train_cond_ldm.py --cfg ./configs/super-resolution/div2k_cond_ddm_const_ldm.yaml

VI. Conditional training on image space. (saliency detection task for example.)

accelerate launch train_cond_dpm.py --cfg ./configs/saliency/DUTS_ddm_const_dpm_114.yaml
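The training commands in sections III to VI all follow the same pattern: `accelerate launch <script> --cfg <config>`. As a convenience, the sketch below collects them in one table and rebuilds each command line. The script and config paths are copied from the commands above; the run names and the helper `launch_command` are hypothetical.

```python
import shlex

# run name -> (training script, config file), copied from sections III-VI.
TRAINING_RUNS = {
    "cifar10_uncond": ("train_uncond_dpm.py", "./configs/cifar10/ddm_uncond_const_uncond_unet.yaml"),
    "celebahq_vae": ("train_vae.py", "./configs/celebahq/celeb_ae_kl_256x256_d4.yaml"),
    "celebahq_ldm": ("train_uncond_ldm.py", "./configs/celebahq/celeb_uncond_ddm_const_uncond_unet_ldm.yaml"),
    "div2k_vae": ("train_vae.py", "./configs/super-resolution/div2k_ae_kl_512x512_d4.yaml"),
    "div2k_ldm": ("train_cond_ldm.py", "./configs/super-resolution/div2k_cond_ddm_const_ldm.yaml"),
    "duts_dpm": ("train_cond_dpm.py", "./configs/saliency/DUTS_ddm_const_dpm_114.yaml"),
}


def launch_command(run):
    """Return the argv list for the given run name."""
    script, cfg = TRAINING_RUNS[run]
    return ["accelerate", "launch", script, "--cfg", cfg]


if __name__ == "__main__":
    for name in TRAINING_RUNS:
        print(f"{name}: {shlex.join(launch_command(name))}")
```

The argv list can be passed to `subprocess.run` if the runs are to be scripted; the auto-encoder runs must finish before the corresponding latent diffusion runs.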

VII. Faster Sampling

To sample faster, reduce the number of sampling steps by changing "sampling_timesteps" in the config file.
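The edit can also be scripted. The sketch below rewrites the value of `sampling_timesteps` in the text of a config file; it assumes the key appears on its own line as `sampling_timesteps: <value>`, which has not been checked against every config in this repository. The helper `set_sampling_timesteps` is hypothetical.

```python
import re

# Matches "<indent>sampling_timesteps: <value>" and captures everything before the value.
_PATTERN = re.compile(r"^([ \t]*sampling_timesteps[ \t]*:[ \t]*)\S+", re.MULTILINE)


def set_sampling_timesteps(config_text, steps):
    """Return config_text with every sampling_timesteps value replaced by `steps`."""
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    new_text, count = _PATTERN.subn(lambda m: f"{m.group(1)}{steps}", config_text)
    if count == 0:
        raise KeyError("sampling_timesteps not found in config text")
    return new_text


if __name__ == "__main__":
    example = "model:\n  sampling_timesteps: 1000\n  other: 3\n"
    print(set_sampling_timesteps(example, 10))
```

Working on the raw text rather than a parsed YAML object keeps comments and key order in the file unchanged.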

unconditional generation:

python sample_uncond.py --cfg ./configs/cifar10/ddm_uncond_const_uncond_unet.yaml
python sample_uncond.py --cfg ./configs/celebahq/celeb_uncond_ddm_const_uncond_unet_ldm.yaml

conditional generation (Latent space model):

Super-resolution:

python ./eval_downstream/eval_sr.py --cfg ./configs/super-resolution/div2k_sample.yaml

Inpainting:

python ./eval_downstream/sample_inpainting.py --cfg ./configs/celebahq/celeb_uncond_ddm_const_uncond_unet_ldm_sample.yaml

Saliency:

python ./eval_downstream/eval_saliency.py --cfg ./configs/saliency/DUTS_sample_114.yaml

VIII. Training for Text-2-Image

Download the LAION data from laion.
Download the metadata using img2dataset; please refer to here.
Install CLIP:

pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git

The final data structure looks like:

|-- laion
|   |-- 00000.tar
|   |-- 00001.tar
|   |-- XXXXX.tar
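Before training, the shards can be inspected with the standard library. The sketch below lists the `.tar` files under the data root and groups the members of one shard by sample key. It assumes the webdataset-style convention in which the files of one sample share a basename (for example an image plus a caption file); the helper names and the `./laion` path are hypothetical.

```python
import tarfile
from collections import defaultdict
from pathlib import Path


def list_shards(root):
    """Return the .tar shards directly under `root`, sorted by name."""
    return sorted(Path(root).glob("*.tar"))


def samples_in_shard(shard_path):
    """Map each sample key to the sorted list of file extensions stored for it."""
    samples = defaultdict(list)
    with tarfile.open(shard_path) as tar:
        for member in tar:
            if member.isfile():
                # Key is the basename up to the first dot; the rest is the extension.
                key, _, ext = Path(member.name).name.partition(".")
                samples[key].append(ext)
    return {key: sorted(exts) for key, exts in samples.items()}


if __name__ == "__main__":
    shards = list_shards("./laion")
    print(f"Found {len(shards)} shard(s).")
    if shards:
        first = samples_in_shard(shards[0])
        print(f"{shards[0].name}: {len(first)} sample(s)")
```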

Train with the text-2-img config file:

accelerate launch train_cond_ldm.py --cfg ./configs/text2img/ddm_uncond_const.yaml

Note that the pretrained AutoEncoder weights can be downloaded from here; unzip the file before use.

Pretrained Weight (the weights are not correct, please wait for updates.)

Task	Weight	Config

Contact

If you have any questions, please contact huangai@nudt.edu.cn.

Thanks

Thanks to the public repos DDPM and LDM for providing the base code.

Citation

@article{huang2023decoupled,
  title={Simultaneous Image to Zero and Zero to Noise: Diffusion Models with Analytical Image Attenuation},
  author={Huang, Yuhang and Qin, Zheng and Liu, Xinwang and Xu, Kai},
  journal={arXiv preprint arXiv:2306.13720},
  year={2023}
}
