
Commit c1e8bdf

DN6 and sayakpaul authored
Move ControlNetXS into Community Folder (#6316)
* update
* update
* update
* update
* update
* make style
* remove docs
* update
* move to research folder.
* fix-copies
* remove _toctree entry.

---------

Co-authored-by: Sayak Paul <[email protected]>
1 parent 78b87dc commit c1e8bdf

File tree

17 files changed: +153 -985 lines

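In practical terms, the move takes ControlNet-XS out of the installable `diffusers` package: after this commit, `ControlNetXSModel` and `StableDiffusionControlNetXSPipeline` are no longer importable from `diffusers` and instead live as standalone modules under `examples/research_projects/controlnetxs`. A minimal sketch of the import change, assuming the research folder is your working directory (as the new inference scripts below assume):

```python
# Before this commit (imports from the installed package):
# from diffusers import StableDiffusionControlNetXSPipeline, ControlNetXSModel

# After this commit (standalone modules, imported from the research folder):
from controlnetxs import ControlNetXSModel
from pipeline_controlnet_xs import StableDiffusionControlNetXSPipeline
```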

docs/source/en/_toctree.yml

Lines changed: 0 additions & 4 deletions
```diff
@@ -266,10 +266,6 @@
   title: ControlNet
 - local: api/pipelines/controlnet_sdxl
   title: ControlNet with Stable Diffusion XL
-- local: api/pipelines/controlnetxs
-  title: ControlNet-XS
-- local: api/pipelines/controlnetxs_sdxl
-  title: ControlNet-XS with Stable Diffusion XL
 - local: api/pipelines/dance_diffusion
   title: Dance Diffusion
 - local: api/pipelines/ddim
```
docs/source/en/api/pipelines/controlnetxs.md

Lines changed: 1 addition & 24 deletions

```diff
@@ -1,15 +1,3 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
-
-Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
-the License. You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
-an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
-specific language governing permissions and limitations under the License.
--->
-
 # ControlNet-XS
 
 ControlNet-XS was introduced in [ControlNet-XS](https://vislearn.github.io/ControlNet-XS/) by Denis Zavadski and Carsten Rother. It is based on the observation that the control model in the [original ControlNet](https://huggingface.co/papers/2302.05543) can be made much smaller and still produce good results.
@@ -24,16 +12,5 @@ Here's the overview from the [project page](https://vislearn.github.io/ControlNe
 
 This model was contributed by [UmerHA](https://twitter.com/UmerHAdil). ❤️
 
-<Tip>
-
-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
-
-</Tip>
-
-## StableDiffusionControlNetXSPipeline
-[[autodoc]] StableDiffusionControlNetXSPipeline
-  - all
-  - __call__
 
-## StableDiffusionPipelineOutput
-[[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput
+> 🧠 Make sure to check out the Schedulers [guide](https://huggingface.co/docs/diffusers/main/en/using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
```
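The scheduler tip kept above points at a concrete speed/quality knob. A minimal sketch of swapping the scheduler on the relocated pipeline, assuming it keeps the standard `scheduler` attribute and `from_config` API that diffusers pipelines share (`UniPCMultistepScheduler` is one choice among several):

```python
import torch
from controlnetxs import ControlNetXSModel
from pipeline_controlnet_xs import StableDiffusionControlNetXSPipeline

from diffusers import UniPCMultistepScheduler

controlnet = ControlNetXSModel.from_pretrained("UmerHA/ConrolNetXS-SD2.1-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetXSPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", controlnet=controlnet, torch_dtype=torch.float16
)
# Faster schedulers usually reach comparable quality in fewer steps
# (e.g. roughly 20-30 with UniPC instead of the default 50).
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```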
docs/source/en/api/pipelines/controlnetxs_sdxl.md

Lines changed: 1 addition & 31 deletions

```diff
@@ -1,15 +1,3 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
-
-Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
-the License. You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
-an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
-specific language governing permissions and limitations under the License.
--->
-
 # ControlNet-XS with Stable Diffusion XL
 
 ControlNet-XS was introduced in [ControlNet-XS](https://vislearn.github.io/ControlNet-XS/) by Denis Zavadski and Carsten Rother. It is based on the observation that the control model in the [original ControlNet](https://huggingface.co/papers/2302.05543) can be made much smaller and still produce good results.
@@ -24,22 +12,4 @@ Here's the overview from the [project page](https://vislearn.github.io/ControlNe
 
 This model was contributed by [UmerHA](https://twitter.com/UmerHAdil). ❤️
 
-<Tip warning={true}>
-
-🧪 Many of the SDXL ControlNet checkpoints are experimental, and there is a lot of room for improvement. Feel free to open an [Issue](https://github.com/huggingface/diffusers/issues/new/choose) and leave us feedback on how we can improve!
-
-</Tip>
-
-<Tip>
-
-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
-
-</Tip>
-
-## StableDiffusionXLControlNetXSPipeline
-[[autodoc]] StableDiffusionXLControlNetXSPipeline
-  - all
-  - __call__
-
-## StableDiffusionPipelineOutput
-[[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput
+> 🧠 Make sure to check out the Schedulers [guide](https://huggingface.co/docs/diffusers/main/en/using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
```
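The "reuse components across pipelines" link kept above corresponds to a pattern like the following. A hedged sketch, assuming the relocated pipeline keeps the standard `DiffusionPipeline.components` property (it still subclasses `DiffusionPipeline`, as the pipeline diff further down shows) and that its component set matches the plain SDXL constructor once the controlnet is dropped:

```python
from diffusers import StableDiffusionXLPipeline

# `pipe` is an already-loaded ControlNet-XS SDXL pipeline (see the inference
# scripts below). Its sub-models can back a plain SDXL pipeline without
# loading a second copy of the weights into memory.
components = pipe.components
components.pop("controlnet", None)  # a plain SDXL pipeline takes no controlnet
sdxl_pipe = StableDiffusionXLPipeline(**components)
```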

src/diffusers/models/controlnetxs.py renamed to examples/research_projects/controlnetxs/controlnetxs.py

Lines changed: 8 additions & 8 deletions

```diff
@@ -21,13 +21,12 @@
 from torch.nn import functional as F
 from torch.nn.modules.normalization import GroupNorm
 
-from ..configuration_utils import ConfigMixin, register_to_config
-from ..utils import BaseOutput, logging
-from .attention_processor import USE_PEFT_BACKEND, AttentionProcessor
-from .autoencoders import AutoencoderKL
-from .lora import LoRACompatibleConv
-from .modeling_utils import ModelMixin
-from .unet_2d_blocks import (
+from diffusers.configuration_utils import ConfigMixin, register_to_config
+from diffusers.models.attention_processor import USE_PEFT_BACKEND, AttentionProcessor
+from diffusers.models.autoencoders import AutoencoderKL
+from diffusers.models.lora import LoRACompatibleConv
+from diffusers.models.modeling_utils import ModelMixin
+from diffusers.models.unet_2d_blocks import (
     CrossAttnDownBlock2D,
     CrossAttnUpBlock2D,
     DownBlock2D,
@@ -37,7 +36,8 @@
     UpBlock2D,
     Upsample2D,
 )
-from .unet_2d_condition import UNet2DConditionModel
+from diffusers.models.unet_2d_condition import UNet2DConditionModel
+from diffusers.utils import BaseOutput, logging
 
 
 logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
```
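The import rewrite above is what makes the file work outside the package: relative imports like `from ..configuration_utils import ...` only resolve when the module is loaded as part of `diffusers`, whereas the absolute `from diffusers... import ...` forms work from any standalone file as long as diffusers is installed. To import the relocated module from somewhere other than its own folder, one option (a sketch, not part of this commit; the path is hypothetical) is to extend `sys.path` first:

```python
import sys

# Hypothetical path: point this at your local checkout of the research folder.
sys.path.append("path/to/diffusers/examples/research_projects/controlnetxs")

from controlnetxs import ControlNetXSModel  # now resolves as a top-level module
```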
New file: a standalone inference script for ControlNet-XS with Stable Diffusion 2.1 (58 additions & 0 deletions)

```python
# !pip install opencv-python transformers accelerate
import argparse

import cv2
import numpy as np
import torch
from controlnetxs import ControlNetXSModel
from PIL import Image
from pipeline_controlnet_xs import StableDiffusionControlNetXSPipeline

from diffusers.utils import load_image


parser = argparse.ArgumentParser()
parser.add_argument(
    "--prompt", type=str, default="aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
)
parser.add_argument("--negative_prompt", type=str, default="low quality, bad quality, sketches")
parser.add_argument("--controlnet_conditioning_scale", type=float, default=0.7)
parser.add_argument(
    "--image_path",
    type=str,
    default="https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png",
)
parser.add_argument("--num_inference_steps", type=int, default=50)

args = parser.parse_args()

prompt = args.prompt
negative_prompt = args.negative_prompt
# download an image
image = load_image(args.image_path)

# initialize the models and pipeline
controlnet_conditioning_scale = args.controlnet_conditioning_scale
controlnet = ControlNetXSModel.from_pretrained("UmerHA/ConrolNetXS-SD2.1-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetXSPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()

# get canny image
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

num_inference_steps = args.num_inference_steps

# generate image
image = pipe(
    prompt,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    image=canny_image,
    num_inference_steps=num_inference_steps,
).images[0]
image.save("cnxs_sd.canny.png")
```
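One thing to note in the script above: `--negative_prompt` is parsed and assigned but never forwarded to the pipeline call. If you want the flag to take effect, a minimal tweak (assuming the pipeline keeps the usual Stable Diffusion `negative_prompt` keyword) would be:

```python
image = pipe(
    prompt,
    negative_prompt=negative_prompt,  # actually forward the parsed flag
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    image=canny_image,
    num_inference_steps=num_inference_steps,
).images[0]
```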
New file: the matching inference script for ControlNet-XS with Stable Diffusion XL (57 additions & 0 deletions)

```python
# !pip install opencv-python transformers accelerate
import argparse

import cv2
import numpy as np
import torch
from controlnetxs import ControlNetXSModel
from PIL import Image
from pipeline_controlnet_xs_sd_xl import StableDiffusionXLControlNetXSPipeline

from diffusers.utils import load_image


parser = argparse.ArgumentParser()
parser.add_argument(
    "--prompt", type=str, default="aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
)
parser.add_argument("--negative_prompt", type=str, default="low quality, bad quality, sketches")
parser.add_argument("--controlnet_conditioning_scale", type=float, default=0.7)
parser.add_argument(
    "--image_path",
    type=str,
    default="https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png",
)
parser.add_argument("--num_inference_steps", type=int, default=50)

args = parser.parse_args()

prompt = args.prompt
negative_prompt = args.negative_prompt
# download an image
image = load_image(args.image_path)
# initialize the models and pipeline
controlnet_conditioning_scale = args.controlnet_conditioning_scale
controlnet = ControlNetXSModel.from_pretrained("UmerHA/ConrolNetXS-SDXL-canny", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetXSPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()

# get canny image
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

num_inference_steps = args.num_inference_steps

# generate image
image = pipe(
    prompt,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    image=canny_image,
    num_inference_steps=num_inference_steps,
).images[0]
image.save("cnxs_sdxl.canny.png")
```
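Both scripts are plain argparse programs, so every flag has a default and they can be driven from the command line. A usage sketch (the file names are assumptions, since this page does not show the paths of the new files):

```python
# Run from inside examples/research_projects/controlnetxs so the local
# imports (controlnetxs and the pipeline modules) resolve. File names assumed:
#
#   python infer_sd_controlnetxs.py --prompt "a colorful bird" --num_inference_steps 30
#   python infer_sdxl_controlnetxs.py --controlnet_conditioning_scale 0.5
```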

src/diffusers/pipelines/controlnet_xs/pipeline_controlnet_xs.py renamed to examples/research_projects/controlnetxs/pipeline_controlnet_xs.py

Lines changed: 11 additions & 56 deletions

````diff
@@ -19,74 +19,30 @@
 import PIL.Image
 import torch
 import torch.nn.functional as F
+from controlnetxs import ControlNetXSModel
 from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 
-from ...image_processor import PipelineImageInput, VaeImageProcessor
-from ...loaders import FromSingleFileMixin, LoraLoaderMixin, TextualInversionLoaderMixin
-from ...models import AutoencoderKL, ControlNetXSModel, UNet2DConditionModel
-from ...models.lora import adjust_lora_scale_text_encoder
-from ...schedulers import KarrasDiffusionSchedulers
-from ...utils import (
+from diffusers.image_processor import PipelineImageInput, VaeImageProcessor
+from diffusers.loaders import FromSingleFileMixin, LoraLoaderMixin, TextualInversionLoaderMixin
+from diffusers.models import AutoencoderKL, UNet2DConditionModel
+from diffusers.models.lora import adjust_lora_scale_text_encoder
+from diffusers.pipelines.pipeline_utils import DiffusionPipeline
+from diffusers.pipelines.stable_diffusion.pipeline_output import StableDiffusionPipelineOutput
+from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
+from diffusers.schedulers import KarrasDiffusionSchedulers
+from diffusers.utils import (
     USE_PEFT_BACKEND,
     deprecate,
     logging,
-    replace_example_docstring,
     scale_lora_layers,
     unscale_lora_layers,
 )
-from ...utils.torch_utils import is_compiled_module, is_torch_version, randn_tensor
-from ..pipeline_utils import DiffusionPipeline
-from ..stable_diffusion.pipeline_output import StableDiffusionPipelineOutput
-from ..stable_diffusion.safety_checker import StableDiffusionSafetyChecker
+from diffusers.utils.torch_utils import is_compiled_module, is_torch_version, randn_tensor
 
 
 logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
 
 
-EXAMPLE_DOC_STRING = """
-    Examples:
-        ```py
-        >>> # !pip install opencv-python transformers accelerate
-        >>> from diffusers import StableDiffusionControlNetXSPipeline, ControlNetXSModel
-        >>> from diffusers.utils import load_image
-        >>> import numpy as np
-        >>> import torch
-
-        >>> import cv2
-        >>> from PIL import Image
-
-        >>> prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
-        >>> negative_prompt = "low quality, bad quality, sketches"
-
-        >>> # download an image
-        >>> image = load_image(
-        ...     "https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png"
-        ... )
-
-        >>> # initialize the models and pipeline
-        >>> controlnet_conditioning_scale = 0.5
-        >>> controlnet = ControlNetXSModel.from_pretrained(
-        ...     "UmerHA/ConrolNetXS-SD2.1-canny", torch_dtype=torch.float16
-        ... )
-        >>> pipe = StableDiffusionControlNetXSPipeline.from_pretrained(
-        ...     "stabilityai/stable-diffusion-2-1", controlnet=controlnet, torch_dtype=torch.float16
-        ... )
-        >>> pipe.enable_model_cpu_offload()
-
-        >>> # get canny image
-        >>> image = np.array(image)
-        >>> image = cv2.Canny(image, 100, 200)
-        >>> image = image[:, :, None]
-        >>> image = np.concatenate([image, image, image], axis=2)
-        >>> canny_image = Image.fromarray(image)
-        >>> # generate image
-        >>> image = pipe(
-        ...     prompt, controlnet_conditioning_scale=controlnet_conditioning_scale, image=canny_image
-        ... ).images[0]
-        ```
-"""
-
-
 class StableDiffusionControlNetXSPipeline(
     DiffusionPipeline, TextualInversionLoaderMixin, LoraLoaderMixin, FromSingleFileMixin
 ):
@@ -669,7 +625,6 @@ def disable_freeu(self):
         self.unet.disable_freeu()
 
     @torch.no_grad()
-    @replace_example_docstring(EXAMPLE_DOC_STRING)
     def __call__(
         self,
         prompt: Union[str, List[str]] = None,
````
