Commit 372e46c

Update README for transformer models
1 parent a88cf30 commit 372e46c

1 file changed: +7 -9 lines changed
README.md

Lines changed: 7 additions & 9 deletions
@@ -1,12 +1,12 @@
 # k-diffusion

-An implementation of [Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364) (Karras et al., 2022) for PyTorch.
+An implementation of [Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364) (Karras et al., 2022) for PyTorch, with enhancements and additional features, such as improved sampling algorithms and transformer-based diffusion models.

 ## Installation

 `k-diffusion` can be installed via PyPI (`pip install k-diffusion`) but it will not include training and inference scripts, only library code that others can depend on. To run the training and inference scripts, clone this repository and run `pip install -e <path to repository>`.

-## Training:
+## Training

 To train models:

@@ -17,7 +17,7 @@ $ ./train.py --config CONFIG_FILE --name RUN_NAME
 For instance, to train a model on MNIST:

 ```sh
-$ ./train.py --config configs/config_mnist.json --name RUN_NAME
+$ ./train.py --config configs/config_mnist_transformer.json --name RUN_NAME
 ```

 The configuration file allows you to specify the dataset type. Currently supported types are `"imagefolder"` (finds all images in that folder and its subfolders, recursively), `"cifar10"` (CIFAR-10), and `"mnist"` (MNIST). `"huggingface"` [Hugging Face Datasets](https://huggingface.co/docs/datasets/index) is also supported.
@@ -28,15 +28,15 @@ Multi-GPU and multi-node training is supported with [Hugging Face Accelerate](ht
 $ accelerate config
 ```

-on all nodes, then running:
+then running:

 ```sh
 $ accelerate launch train.py --config CONFIG_FILE --name RUN_NAME
 ```

-on all nodes.
+## Enhancements/additional features

-## Enhancements/additional features:
+- k-diffusion has support for training transformer-based diffusion models (like [DiT](https://arxiv.org/abs/2212.09748) but improved).

 - k-diffusion supports a soft version of [Min-SNR loss weighting](https://arxiv.org/abs/2303.09556) for improved training at high resolutions with less hyperparameters than the loss weighting used in Karras et al. (2022).

@@ -52,8 +52,6 @@ on all nodes.

 - k-diffusion can calculate, during training, the gradient noise scale (1 / SNR), from _An Empirical Model of Large-Batch Training_, https://arxiv.org/abs/1812.06162).

-## To do:
-
-- Anything except unconditional image diffusion models
+## To do

 - Latent diffusion
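
Note on the "soft version of Min-SNR loss weighting" bullet in the diff above: the README does not spell out the formula, so the following is only an illustrative sketch of the general idea, not k-diffusion's actual implementation. It assumes the signal-to-noise ratio at noise level `sigma` is `(sigma_data / sigma) ** 2`, and it uses one common smooth stand-in for `min(snr, gamma)`. The function names, the `sigma_data = 0.5` default, and `gamma = 5.0` are all hypothetical choices made for this example.

```python
# Illustrative only: a hard Min-SNR clamp next to one possible smooth variant.
# None of these names come from k-diffusion; they are assumptions for this sketch.

def snr_from_sigma(sigma: float, sigma_data: float = 0.5) -> float:
    """Assumed SNR for data with std sigma_data corrupted by noise with std sigma."""
    return (sigma_data / sigma) ** 2


def min_snr_weight(snr: float, gamma: float = 5.0) -> float:
    """Hard clamp: the weight follows the SNR but never exceeds gamma."""
    return min(snr, gamma)


def soft_min_snr_weight(snr: float, gamma: float = 5.0) -> float:
    """Smooth stand-in for min(snr, gamma).

    snr * gamma / (snr + gamma) tends to snr when snr << gamma and to gamma
    when snr >> gamma, with no kink at snr == gamma.
    """
    return snr * gamma / (snr + gamma)


if __name__ == "__main__":
    for sigma in (0.01, 0.1, 0.5, 1.0, 10.0, 80.0):
        snr = snr_from_sigma(sigma)
        print(f"sigma={sigma:<6} snr={snr:<12.6g} "
              f"hard={min_snr_weight(snr):<10.6g} soft={soft_min_snr_weight(snr):.6g}")
```

The smooth form never exceeds the hard clamp, so it keeps the same upper bound on the weight given to low-noise steps. Whether it matches the weighting this commit's README refers to would need to be checked against the library source.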
