
Commit 12fca1c

Merge branch 'main' of github.com:huggingface/diffusers into Add-AnyText

2 parents: a8dbbe2 + 4c6152c

File tree: 54 files changed, +676 / −184 lines


docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions

@@ -237,6 +237,8 @@
       title: AutoencoderKL
     - local: api/models/asymmetricautoencoderkl
       title: AsymmetricAutoencoderKL
+    - local: api/models/stable_cascade_unet
+      title: StableCascadeUNet
     - local: api/models/autoencoder_tiny
       title: Tiny AutoEncoder
     - local: api/models/autoencoder_oobleck

docs/source/en/api/loaders/single_file.md

Lines changed: 1 addition & 0 deletions

@@ -51,6 +51,7 @@ The [`~loaders.FromSingleFileMixin.from_single_file`] method allows you to load:
 - [`AutoencoderKL`]
 - [`ControlNetModel`]
 - [`SD3Transformer2DModel`]
+- [`FluxTransformer2DModel`]

 ## FromSingleFileMixin
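For context, a minimal sketch of what the new entry enables: loading the Flux transformer directly from a single checkpoint file. The local path is a placeholder and the dtype choice is an assumption, neither is part of this diff:

```python
import torch
from diffusers import FluxTransformer2DModel

# Load the Flux transformer from one .safetensors checkpoint rather than
# a full model repo; "path/to/flux1-dev.safetensors" is a placeholder.
transformer = FluxTransformer2DModel.from_single_file(
    "path/to/flux1-dev.safetensors",
    torch_dtype=torch.bfloat16,
)
```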

docs/source/en/api/models/stable_cascade_unet.md

Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# StableCascadeUNet
+
+A UNet model from the [Stable Cascade pipeline](../pipelines/stable_cascade.md).
+
+## StableCascadeUNet
+
+[[autodoc]] models.unets.unet_stable_cascade.StableCascadeUNet
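As a quick orientation for the new page, a hedged sketch of loading the documented class; the repo id and subfolder follow the official Stable Cascade checkpoints but are assumptions here, not part of the diff:

```python
import torch
from diffusers import StableCascadeUNet

# Load the prior-stage UNet; "stabilityai/stable-cascade-prior" and the
# "prior" subfolder are assumed from the official checkpoints.
prior_unet = StableCascadeUNet.from_pretrained(
    "stabilityai/stable-cascade-prior",
    subfolder="prior",
    torch_dtype=torch.bfloat16,
)
```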

docs/source/en/training/instructpix2pix.md

Lines changed: 3 additions & 3 deletions

@@ -14,7 +14,7 @@ specific language governing permissions and limitations under the License.

 [InstructPix2Pix](https://hf.co/papers/2211.09800) is a Stable Diffusion model trained to edit images from human-provided instructions. For example, your prompt can be "turn the clouds rainy" and the model will edit the input image accordingly. This model is conditioned on the text prompt (or editing instruction) and the input image.

-This guide will explore the [train_instruct_pix2pix.py](https://github.com/huggingface/diffusers/blob/main/examples/instruct_pix2pix/train_instruct_pix2pix.py) training script to help you become familiar with it, and how you can adapt it for your own use-case.
+This guide will explore the [train_instruct_pix2pix.py](https://github.com/huggingface/diffusers/blob/main/examples/instruct_pix2pix/train_instruct_pix2pix.py) training script to help you become familiar with it, and how you can adapt it for your own use case.

 Before running the script, make sure you install the library from source:

@@ -117,7 +117,7 @@ optimizer = optimizer_cls(
 )
 ```

-Next, the edited images and and edit instructions are [preprocessed](https://github.com/huggingface/diffusers/blob/64603389da01082055a901f2883c4810d1144edb/examples/instruct_pix2pix/train_instruct_pix2pix.py#L624) and [tokenized](https://github.com/huggingface/diffusers/blob/64603389da01082055a901f2883c4810d1144edb/examples/instruct_pix2pix/train_instruct_pix2pix.py#L610C24-L610C24). It is important the same image transformations are applied to the original and edited images.
+Next, the edited images and edit instructions are [preprocessed](https://github.com/huggingface/diffusers/blob/64603389da01082055a901f2883c4810d1144edb/examples/instruct_pix2pix/train_instruct_pix2pix.py#L624) and [tokenized](https://github.com/huggingface/diffusers/blob/64603389da01082055a901f2883c4810d1144edb/examples/instruct_pix2pix/train_instruct_pix2pix.py#L610C24-L610C24). It is important the same image transformations are applied to the original and edited images.

 ```py
 def preprocess_train(examples):

@@ -249,4 +249,4 @@ The SDXL training script is discussed in more detail in the [SDXL training](sdxl

 Congratulations on training your own InstructPix2Pix model! 🥳 To learn more about the model, it may be helpful to:

-- Read the [Instruction-tuning Stable Diffusion with InstructPix2Pix](https://huggingface.co/blog/instruction-tuning-sd) blog post to learn more about some experiments we've done with InstructPix2Pix, dataset preparation, and results for different instructions.
+- Read the [Instruction-tuning Stable Diffusion with InstructPix2Pix](https://huggingface.co/blog/instruction-tuning-sd) blog post to learn more about some experiments we've done with InstructPix2Pix, dataset preparation, and results for different instructions.
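The point about identical transformations in the second hunk can be illustrated with a short sketch: concatenating the original and edited images before transforming guarantees both receive the same random crop and flip. The names and sizes below are illustrative, not the training script's exact code:

```python
import numpy as np
import torch
from torchvision import transforms

# One transform pipeline applied to a single stacked tensor, so the random
# crop/flip is identical for both images.
train_transforms = transforms.Compose(
    [transforms.RandomCrop(256), transforms.RandomHorizontalFlip()]
)

def preprocess_pair(original, edited):
    # Stack along the channel axis: (H, W, 6) for two RGB images.
    images = np.concatenate([np.array(original), np.array(edited)], axis=2)
    images = torch.tensor(images).permute(2, 0, 1).float() / 127.5 - 1.0
    images = train_transforms(images)
    return images.chunk(2)  # split back into (original, edited)
```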

docs/source/en/tutorials/using_peft_for_inference.md

Lines changed: 1 addition & 1 deletion

@@ -34,7 +34,7 @@ pipe_id = "stabilityai/stable-diffusion-xl-base-1.0"
 pipe = DiffusionPipeline.from_pretrained(pipe_id, torch_dtype=torch.float16).to("cuda")
 ```

-Next, load a [CiroN2022/toy-face](https://huggingface.co/CiroN2022/toy-face) adapter with the [`~diffusers.loaders.StableDiffusionXLLoraLoaderMixin.load_lora_weights`] method. With the 🤗 PEFT integration, you can assign a specific `adapter_name` to the checkpoint, which let's you easily switch between different LoRA checkpoints. Let's call this adapter `"toy"`.
+Next, load a [CiroN2022/toy-face](https://huggingface.co/CiroN2022/toy-face) adapter with the [`~diffusers.loaders.StableDiffusionXLLoraLoaderMixin.load_lora_weights`] method. With the 🤗 PEFT integration, you can assign a specific `adapter_name` to the checkpoint, which lets you easily switch between different LoRA checkpoints. Let's call this adapter `"toy"`.

 ```python
 pipe.load_lora_weights("CiroN2022/toy-face", weight_name="toy_face_sdxl.safetensors", adapter_name="toy")
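A brief sketch of why `adapter_name` matters, continuing from the hunk above; the second LoRA repo is an assumption taken from the same tutorial series:

```python
# Load a second LoRA under its own name, then switch or blend by name.
pipe.load_lora_weights(
    "nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel"
)

pipe.set_adapters("toy")  # use only the toy-face adapter
pipe.set_adapters(["toy", "pixel"], adapter_weights=[0.8, 0.6])  # blend both
```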

docs/source/en/using-diffusers/callback.md

Lines changed: 2 additions & 2 deletions

@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.

 # Pipeline callbacks

-The denoising loop of a pipeline can be modified with custom defined functions using the `callback_on_step_end` parameter. The callback function is executed at the end of each step, and modifies the pipeline attributes and variables for the next step. This is really useful for *dynamically* adjusting certain pipeline attributes or modifying tensor variables. This versatility allows for interesting use-cases such as changing the prompt embeddings at each timestep, assigning different weights to the prompt embeddings, and editing the guidance scale. With callbacks, you can implement new features without modifying the underlying code!
+The denoising loop of a pipeline can be modified with custom defined functions using the `callback_on_step_end` parameter. The callback function is executed at the end of each step, and modifies the pipeline attributes and variables for the next step. This is really useful for *dynamically* adjusting certain pipeline attributes or modifying tensor variables. This versatility allows for interesting use cases such as changing the prompt embeddings at each timestep, assigning different weights to the prompt embeddings, and editing the guidance scale. With callbacks, you can implement new features without modifying the underlying code!

 > [!TIP]
 > 🤗 Diffusers currently only supports `callback_on_step_end`, but feel free to open a [feature request](https://github.com/huggingface/diffusers/issues/new/choose) if you have a cool use-case and require a callback function with a different execution point!

@@ -75,7 +75,7 @@ out.images[0].save("official_callback.png")
   <figcaption class="mt-2 text-center text-sm text-gray-500">without SDXLCFGCutoffCallback</figcaption>
 </div>
 <div>
-  <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/with_cfg_callback.png" alt="generated image of a a sports car at the road with cfg callback" />
+  <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/with_cfg_callback.png" alt="generated image of a sports car at the road with cfg callback" />
   <figcaption class="mt-2 text-center text-sm text-gray-500">with SDXLCFGCutoffCallback</figcaption>
 </div>
 </div>
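For readers skimming the diff, a minimal sketch of the callback signature this page describes; the cutoff step and prompt are arbitrary assumptions:

```python
# Disable classifier-free guidance partway through denoising by keeping
# only the conditional half of the prompt embeddings.
def cfg_cutoff_callback(pipeline, step_index, timestep, callback_kwargs):
    if step_index == 10:  # arbitrary cutoff step
        prompt_embeds = callback_kwargs["prompt_embeds"]
        callback_kwargs["prompt_embeds"] = prompt_embeds.chunk(2)[-1]
        pipeline._guidance_scale = 0.0
    return callback_kwargs

image = pipe(
    "a sports car on the road",
    callback_on_step_end=cfg_cutoff_callback,
    callback_on_step_end_tensor_inputs=["prompt_embeds"],
).images[0]
```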

docs/source/en/using-diffusers/custom_pipeline_overview.md

Lines changed: 2 additions & 2 deletions

@@ -289,9 +289,9 @@ scheduler = DPMSolverMultistepScheduler.from_pretrained(pipe_id, subfolder="sche
 3. Load an image processor:

 ```python
-from transformers import CLIPFeatureExtractor
+from transformers import CLIPImageProcessor

-feature_extractor = CLIPFeatureExtractor.from_pretrained(pipe_id, subfolder="feature_extractor")
+feature_extractor = CLIPImageProcessor.from_pretrained(pipe_id, subfolder="feature_extractor")
 ```

 <Tip warning={true}>
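Background on this rename, as a hedged note: in recent `transformers` releases, `CLIPFeatureExtractor` is kept only as a deprecated alias of `CLIPImageProcessor`, so the two currently load the same processor:

```python
from transformers import CLIPFeatureExtractor, CLIPImageProcessor

# CLIPFeatureExtractor subclasses CLIPImageProcessor and emits a
# deprecation warning on construction (behavior as of recent releases).
assert issubclass(CLIPFeatureExtractor, CLIPImageProcessor)
```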

docs/source/en/using-diffusers/inference_with_tcd_lora.md

Lines changed: 2 additions & 2 deletions

@@ -212,14 +212,14 @@ TCD-LoRA is very versatile, and it can be combined with other adapter types like
 import torch
 import numpy as np
 from PIL import Image
-from transformers import DPTFeatureExtractor, DPTForDepthEstimation
+from transformers import DPTImageProcessor, DPTForDepthEstimation
 from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
 from diffusers.utils import load_image, make_image_grid
 from scheduling_tcd import TCDScheduler

 device = "cuda"
 depth_estimator = DPTForDepthEstimation.from_pretrained("Intel/dpt-hybrid-midas").to(device)
-feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-hybrid-midas")
+feature_extractor = DPTImageProcessor.from_pretrained("Intel/dpt-hybrid-midas")

 def get_depth_map(image):
     image = feature_extractor(images=image, return_tensors="pt").pixel_values.to(device)
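The hunk cuts off inside `get_depth_map`; a hedged sketch of how such a helper is usually finished (the interpolation size and normalization are common DPT-example choices, not necessarily this guide's exact code):

```python
def get_depth_map(image):
    # Preprocess, predict depth, then upsample and normalize to [0, 1].
    inputs = feature_extractor(images=image, return_tensors="pt").pixel_values.to(device)
    with torch.no_grad():
        depth = depth_estimator(inputs).predicted_depth
    depth = torch.nn.functional.interpolate(
        depth.unsqueeze(1), size=(1024, 1024), mode="bicubic", align_corners=False
    )
    depth = (depth - depth.min()) / (depth.max() - depth.min())
    return depth
```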

docs/source/ko/using-diffusers/loading.md

Lines changed: 1 addition & 1 deletion

@@ -307,7 +307,7 @@ print(pipeline)

 Checking the output of the code above, you can see that `pipeline` is an instance of [`StableDiffusionPipeline`] and consists of the following seven components:

-- `"feature_extractor"`: an instance of [`~transformers.CLIPFeatureExtractor`]
+- `"feature_extractor"`: an instance of [`~transformers.CLIPImageProcessor`]
 - `"safety_checker"`: a [component](https://github.com/huggingface/diffusers/blob/e55687e1e15407f60f32242027b7bb8170e58266/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L32) for screening harmful content
 - `"scheduler"`: an instance of [`PNDMScheduler`]
 - `"text_encoder"`: an instance of [`~transformers.CLIPTextModel`]

docs/source/ko/using-diffusers/textual_inversion_inference.md

Lines changed: 1 addition & 1 deletion

@@ -24,7 +24,7 @@ import PIL
 from PIL import Image

 from diffusers import StableDiffusionPipeline
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer


 def image_grid(imgs, rows, cols):
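The hunk ends at the `image_grid` signature; the usual body of this helper in diffusers docs looks like the following sketch:

```python
def image_grid(imgs, rows, cols):
    # Paste the images into a rows x cols grid on a new canvas.
    assert len(imgs) == rows * cols
    w, h = imgs[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid
```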
