
Commit d9c449e

patrickvonplaten, pcuenca, and apolinario authored

Custom Pipelines (#744)

* [Custom Pipelines]
* uP
* make style
* finish
* finish
* remove ipdb
* upload
* fix
* finish docs
* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: apolinario <[email protected]>

* finish
* final uploads
* remove unnecessary test

Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: apolinario <[email protected]>
1 parent f3128c8 commit d9c449e

10 files changed: +577 -18 lines

docs/source/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -12,6 +12,8 @@
       title: "Loading Pipelines, Models, and Schedulers"
     - local: using-diffusers/configuration
       title: "Configuring Pipelines, Models, and Schedulers"
+    - local: using-diffusers/custom_pipelines
+      title: "Loading and Creating Custom Pipelines"
     title: "Loading"
   - sections:
     - local: using-diffusers/unconditional_image_generation
docs/source/using-diffusers/custom_pipelines.mdx

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Custom Pipelines

Diffusers allows you to conveniently load any custom pipeline from the Hugging Face Hub, as well as any [official community pipeline](https://github.com/huggingface/diffusers/tree/main/examples/community), via the [`DiffusionPipeline`] class.

## Loading custom pipelines from the Hub

Custom pipelines can be easily loaded from any model repository on the Hub that defines a diffusion pipeline in a `pipeline.py` file.
Let's load a dummy pipeline from [hf-internal-testing/diffusers-dummy-pipeline](https://huggingface.co/hf-internal-testing/diffusers-dummy-pipeline).

All you need to do is pass the repo id of the custom pipeline via the `custom_pipeline` argument, alongside the repo from which you wish to load the pipeline modules:

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="hf-internal-testing/diffusers-dummy-pipeline"
)
```

This will load the custom pipeline as defined in the [model repository](https://huggingface.co/hf-internal-testing/diffusers-dummy-pipeline/blob/main/pipeline.py).

<Tip warning={true}>

By loading a custom pipeline from the Hugging Face Hub, you are trusting that the code you are loading
is safe 🔒. Make sure to inspect the code online before loading and running it automatically.

</Tip>
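
Once loaded, the custom pipeline can be called like any other pipeline. As a minimal sketch, assuming the repository's `pipeline.py` matches the `CustomPipeline` class added later in this commit (which returns an `ImagePipelineOutput` together with a test string):

```python
# The dummy pipeline returns an ImagePipelineOutput plus a marker string.
output, test_message = pipeline(batch_size=1, num_inference_steps=2)
print(test_message)  # "This is a test"
```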
## Loading official community pipelines

Community pipelines are summarized in the [community examples folder](https://github.com/huggingface/diffusers/tree/main/examples/community).

Similarly, you need to pass both the *repo id* from which you wish to load the weights and the `custom_pipeline` argument. Here, the `custom_pipeline` argument should simply be the filename of the community pipeline without the `.py` suffix, *e.g.* `clip_guided_stable_diffusion`.

Since community pipelines are often more complex, one can mix loading weights from an official *repo id*
with passing pipeline modules directly:

```python
from diffusers import DiffusionPipeline
from transformers import CLIPFeatureExtractor, CLIPModel

clip_model_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"

feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
clip_model = CLIPModel.from_pretrained(clip_model_id)

pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
)
```
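
A hedged usage sketch; the argument names below (`clip_guidance_scale` in particular) follow the community pipeline's own example and may change between versions of the script:

```python
pipeline = pipeline.to("cuda")
image = pipeline(
    "fantasy book cover, full moon, fantasy forest landscape",
    clip_guidance_scale=100,
    num_inference_steps=50,
).images[0]
image.save("clip_guided.png")
```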
## Adding custom pipelines to the Hub

To add a custom pipeline to the Hub, all you need to do is define a pipeline class that inherits
from [`DiffusionPipeline`] in a `pipeline.py` file.
Make sure that the whole pipeline is encapsulated within a single class and that the `pipeline.py` file
contains only one such class.

Let's quickly define an example pipeline:

```python
import torch
from diffusers import DiffusionPipeline


class MyPipeline(DiffusionPipeline):
    def __init__(self, unet, scheduler):
        super().__init__()

        self.register_modules(unet=unet, scheduler=scheduler)

    @torch.no_grad()
    def __call__(self, batch_size: int = 1, num_inference_steps: int = 50, eta: float = 0.0):
        # Sample gaussian noise to begin loop
        image = torch.randn((batch_size, self.unet.in_channels, self.unet.sample_size, self.unet.sample_size))

        image = image.to(self.device)

        # set step values
        self.scheduler.set_timesteps(num_inference_steps)

        for t in self.progress_bar(self.scheduler.timesteps):
            # 1. predict noise model_output
            model_output = self.unet(image, t).sample

            # 2. predict previous mean of image x_t-1 and add variance depending on eta
            # eta corresponds to η in paper and should be between [0, 1]
            # do x_t -> x_t-1
            image = self.scheduler.step(model_output, t, image, eta).prev_sample

        image = (image / 2 + 0.5).clamp(0, 1)
        image = image.cpu().permute(0, 2, 3, 1).numpy()

        return image
```
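
Before uploading, you can sanity-check the class locally. A minimal smoke-test sketch, assuming a small randomly initialized `UNet2DModel` and a `DDIMScheduler` (whose `step` method accepts the positional `eta` used above):

```python
from diffusers import DDIMScheduler, UNet2DModel

# Tiny, randomly initialized components; the sizes are illustrative only.
unet = UNet2DModel(sample_size=32, in_channels=3, out_channels=3)
scheduler = DDIMScheduler()

pipe = MyPipeline(unet=unet, scheduler=scheduler)
images = pipe(batch_size=1, num_inference_steps=2)
print(images.shape)  # (1, 32, 32, 3)
```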

Now you can upload this short file under the name `pipeline.py` to your preferred [model repository](https://huggingface.co/docs/hub/models-uploading). For Stable Diffusion pipelines, you may also [join the community organization for shared pipelines](https://huggingface.co/organizations/sd-diffusers-pipelines-library/share/BUPyDUuHcciGTOKaExlqtfFcyCZsVFdrjr) to upload yours.
Finally, we can load the custom pipeline by passing the name of the repository containing `pipeline.py`, *e.g.* `sd-diffusers-pipelines-library/my_custom_pipeline`, alongside the model repository from which we want to load the `unet` and `scheduler` components:

```python
my_pipeline = DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="patrickvonplaten/my_custom_pipeline"
)
```
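
The loaded object behaves like the locally defined class, so it can be called the same way; for example, sticking to the `__call__` signature defined above:

```python
images = my_pipeline(batch_size=4, num_inference_steps=50)
```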
@@ -0,0 +1,102 @@
# Copyright 2022 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from typing import Optional, Tuple, Union

import torch

from diffusers.pipeline_utils import DiffusionPipeline, ImagePipelineOutput


class CustomPipeline(DiffusionPipeline):
    r"""
    This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the
    library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)

    Parameters:
        unet ([`UNet2DModel`]): U-Net architecture to denoise the encoded image.
        scheduler ([`SchedulerMixin`]):
            A scheduler to be used in combination with `unet` to denoise the encoded image. Can be one of
            [`DDPMScheduler`] or [`DDIMScheduler`].
    """

    def __init__(self, unet, scheduler):
        super().__init__()
        self.register_modules(unet=unet, scheduler=scheduler)

    @torch.no_grad()
    def __call__(
        self,
        batch_size: int = 1,
        generator: Optional[torch.Generator] = None,
        eta: float = 0.0,
        num_inference_steps: int = 50,
        output_type: Optional[str] = "pil",
        return_dict: bool = True,
        **kwargs,
    ) -> Union[ImagePipelineOutput, Tuple]:
        r"""
        Args:
            batch_size (`int`, *optional*, defaults to 1):
                The number of images to generate.
            generator (`torch.Generator`, *optional*):
                A [torch generator](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make generation
                deterministic.
            eta (`float`, *optional*, defaults to 0.0):
                The eta parameter which controls the scale of the variance (0 is DDIM and 1 is one type of DDPM).
            num_inference_steps (`int`, *optional*, defaults to 50):
                The number of denoising steps. More denoising steps usually lead to a higher quality image at the
                expense of slower inference.
            output_type (`str`, *optional*, defaults to `"pil"`):
                The output format of the generated image. Choose between
                [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `np.array`.
            return_dict (`bool`, *optional*, defaults to `True`):
                Whether or not to return a [`~pipeline_utils.ImagePipelineOutput`] instead of a plain tuple.

        Returns:
            [`~pipeline_utils.ImagePipelineOutput`] or `tuple`: [`~pipeline_utils.ImagePipelineOutput`] if
            `return_dict` is True, otherwise a `tuple`. When returning a tuple, the first element is a list with the
            generated images.
        """

        # Sample gaussian noise to begin loop
        image = torch.randn(
            (batch_size, self.unet.in_channels, self.unet.sample_size, self.unet.sample_size),
            generator=generator,
        )
        image = image.to(self.device)

        # set step values
        self.scheduler.set_timesteps(num_inference_steps)

        for t in self.progress_bar(self.scheduler.timesteps):
            # 1. predict noise model_output
            model_output = self.unet(image, t).sample

            # 2. predict previous mean of image x_t-1 and add variance depending on eta
            # eta corresponds to η in paper and should be between [0, 1]
            # do x_t -> x_t-1
            image = self.scheduler.step(model_output, t, image, eta).prev_sample

        image = (image / 2 + 0.5).clamp(0, 1)
        image = image.cpu().permute(0, 2, 3, 1).numpy()
        if output_type == "pil":
            image = self.numpy_to_pil(image)

        if not return_dict:
            return (image,)

        # The extra string is deliberate: it lets tests verify that this custom pipeline was actually loaded.
        return ImagePipelineOutput(images=image), "This is a test"
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
b8fa12635e53eebebc22f95ee863e7af4fc2fb07
@@ -0,0 +1 @@
../../blobs/bbbcb9f65616524d6199fa3bc16dc0500fb2cbbb
