Commit ef44689

Merge branch 'main' into remove-extra-validation
2 parents 5baa965 + 7a2b78b
File tree: 74 files changed, +1726 -156 lines
docs/source/en/api/pipelines/qwenimage.md

Lines changed: 10 additions & 0 deletions
@@ -14,6 +14,10 @@

 # QwenImage

+<div class="flex flex-wrap space-x-1">
+<img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+</div>
+
 Qwen-Image from the Qwen team is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing. Experiments show strong general capabilities in both image generation and editing, with exceptional performance in text rendering, especially for Chinese.

 Qwen-Image comes in the following variants:
@@ -86,6 +90,12 @@ image.save("qwen_fewsteps.png")

 </details>

+<Tip>
+
+The `guidance_scale` parameter in the pipeline exists to support future guidance-distilled models; passing it to the current pipeline has no effect. To enable classifier-free guidance, pass `true_cfg_scale` along with a `negative_prompt` (even an empty negative prompt like " " enables the classifier-free guidance computation).
+
+</Tip>
+
 ## QwenImagePipeline

 [[autodoc]] QwenImagePipeline
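
A minimal sketch of how this could look in practice, assuming the `Qwen/Qwen-Image` checkpoint and a CUDA device (the prompt and scale values are illustrative):

```py
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = pipe(
    prompt="A coffee shop sign that reads 'Qwen Coffee'",
    negative_prompt=" ",   # even an empty negative prompt enables CFG
    true_cfg_scale=4.0,    # this, not `guidance_scale`, controls CFG here
    num_inference_steps=50,
).images[0]
image.save("qwen_cfg.png")
```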

docs/source/en/api/pipelines/wan.md

Lines changed: 2 additions & 0 deletions
@@ -333,6 +333,8 @@ The general rule of thumb to keep in mind when preparing inputs for the VACE pip

 - Wan 2.1 and 2.2 support using [LightX2V LoRAs](https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Lightx2v) to speed up inference. Using them on Wan 2.2 is slightly more involved. Refer to [this code snippet](https://github.com/huggingface/diffusers/pull/12040#issuecomment-3144185272) to learn more.

+- Wan 2.2 has two denoisers. By default, LoRAs are only loaded into the first denoiser. Set `load_into_transformer_2=True` to load LoRAs into the second denoiser. Refer to [this example](https://github.com/huggingface/diffusers/pull/12074#issue-3292620048) and [this one](https://github.com/huggingface/diffusers/pull/12074#issuecomment-3155896144) to learn more.
+
 ## WanPipeline

 [[autodoc]] WanPipeline
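
A hedged sketch of what that might look like; the checkpoint ID is the Wan 2.2 Diffusers repository, while the LoRA path and adapter names are placeholders:

```py
import torch
from diffusers import WanPipeline

# Wan 2.2 exposes two denoisers: `transformer` and `transformer_2`.
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B-Diffusers", torch_dtype=torch.bfloat16
)

# Default: the LoRA only goes into the first denoiser (`transformer`).
pipe.load_lora_weights("path/to/lora.safetensors", adapter_name="speedup")

# Load the same weights into the second denoiser (`transformer_2`) as well.
pipe.load_lora_weights(
    "path/to/lora.safetensors",
    adapter_name="speedup_2",
    load_into_transformer_2=True,
)
```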

docs/source/zh/_toctree.yml

Lines changed: 21 additions & 0 deletions
@@ -25,6 +25,25 @@
 - local: optimization/xformers
   title: xFormers

+- title: Modular Diffusers
+  isExpanded: false
+  sections:
+  - local: modular_diffusers/overview
+    title: Overview
+  - local: modular_diffusers/quickstart
+    title: Quickstart
+  - local: modular_diffusers/modular_diffusers_states
+    title: States
+  - local: modular_diffusers/pipeline_block
+    title: ModularPipelineBlocks
+  - local: modular_diffusers/sequential_pipeline_blocks
+    title: SequentialPipelineBlocks
+  - local: modular_diffusers/loop_sequential_pipeline_blocks
+    title: LoopSequentialPipelineBlocks
+  - local: modular_diffusers/auto_pipeline_blocks
+    title: AutoPipelineBlocks
+  - local: modular_diffusers/modular_pipeline
+    title: ModularPipeline

 - title: Training
   isExpanded: false

@@ -63,6 +82,8 @@
   sections:
   - title: Task recipes
     sections:
+- local: community_projects
+  title: Projects built with Diffusers
 - local: conceptual/philosophy
   title: Philosophy
 - local: conceptual/contribution
docs/source/zh/community_projects.md

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
-->

# Community Projects

Welcome to Community Projects. This space is dedicated to showcasing the incredible work and innovative applications built by our vibrant community with the `diffusers` library.

This section aims to:

- Highlight diverse and inspiring projects built with `diffusers`
- Promote knowledge sharing within our community
- Provide practical examples of how to leverage `diffusers`

Happy exploring, and thank you for being part of the Diffusers community!

<table>
  <tr>
    <th>Project Name</th>
    <th>Description</th>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/carson-katri/dream-textures"> dream-textures </a></td>
    <td>Stable Diffusion built into Blender</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/megvii-research/HiDiffusion"> HiDiffusion </a></td>
    <td>Increases the resolution and speed of diffusion models by adding only a single line of code</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/lllyasviel/IC-Light"> IC-Light </a></td>
    <td>IC-Light is a project for manipulating the illumination of images</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/InstantID/InstantID"> InstantID </a></td>
    <td>InstantID: Zero-shot identity-preserving generation in seconds</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/Sanster/IOPaint"> IOPaint </a></td>
    <td>Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by stable_diffusion) anything on them.</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/bmaltais/kohya_ss"> Kohya </a></td>
    <td>Gradio GUI for Kohya's Stable Diffusion trainers</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/magic-research/magic-animate"> MagicAnimate </a></td>
    <td>MagicAnimate: Temporally consistent human image animation using diffusion models</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/levihsu/OOTDiffusion"> OOTDiffusion </a></td>
    <td>Controllable virtual try-on based on latent diffusion</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/vladmandic/automatic"> SD.Next </a></td>
    <td>SD.Next: Advanced implementation of Stable Diffusion and other diffusion-based generative image models</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/ashawkey/stable-dreamfusion"> stable-dreamfusion </a></td>
    <td>Text-to-3D, image-to-3D, and mesh export with NeRF + Diffusion</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/HVision-NKU/StoryDiffusion"> StoryDiffusion </a></td>
    <td>StoryDiffusion can create a magic story by generating consistent images and videos.</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/cumulo-autumn/StreamDiffusion"> StreamDiffusion </a></td>
    <td>A pipeline-level solution for real-time interactive generation</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/Netwrck/stable-diffusion-server"> Stable Diffusion Server </a></td>
    <td>A server configured for inpainting/generation/img2img with one stable diffusion model</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/suzukimain/auto_diffusers"> Model Search </a></td>
    <td>Search for models on Civitai and Hugging Face</td>
  </tr>
  <tr style="border-top: 2px solid black">
    <td><a href="https://github.com/beinsezii/skrample"> Skrample </a></td>
    <td>Fully modular scheduler functions with first-class diffusers integration.</td>
  </tr>
</table>
docs/source/zh/modular_diffusers/auto_pipeline_blocks.md

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
-->

# AutoPipelineBlocks

[`~modular_pipelines.AutoPipelineBlocks`] is a multi-block type that contains blocks supporting different workflows. It automatically selects which sub-block to run based on the inputs provided at runtime. It is typically used to package multiple workflows (text-to-image, image-to-image, inpainting) into a single pipeline for convenience.

This guide shows how to create an [`~modular_pipelines.AutoPipelineBlocks`].

Create three [`~modular_pipelines.ModularPipelineBlocks`] for text-to-image, image-to-image, and inpainting. These represent the different workflows available in the pipeline.

<hfoptions id="auto">
<hfoption id="text-to-image">

```py
import torch
from diffusers.modular_pipelines import ModularPipelineBlocks, InputParam, OutputParam

class TextToImageBlock(ModularPipelineBlocks):
    model_name = "text2img"

    @property
    def inputs(self):
        return [InputParam(name="prompt")]

    @property
    def intermediate_outputs(self):
        return []

    @property
    def description(self):
        return "I'm a text-to-image workflow!"

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        print("running the text-to-image workflow")
        # add your text-to-image logic here
        # e.g. generate an image from the prompt
        self.set_block_state(state, block_state)
        return components, state
```

</hfoption>
<hfoption id="image-to-image">

```py
class ImageToImageBlock(ModularPipelineBlocks):
    model_name = "img2img"

    @property
    def inputs(self):
        return [InputParam(name="prompt"), InputParam(name="image")]

    @property
    def intermediate_outputs(self):
        return []

    @property
    def description(self):
        return "I'm an image-to-image workflow!"

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        print("running the image-to-image workflow")
        # add your image-to-image logic here
        # e.g. transform the input image based on the prompt
        self.set_block_state(state, block_state)
        return components, state
```

</hfoption>
<hfoption id="inpaint">

```py
class InpaintBlock(ModularPipelineBlocks):
    model_name = "inpaint"

    @property
    def inputs(self):
        return [InputParam(name="prompt"), InputParam(name="image"), InputParam(name="mask")]

    @property
    def intermediate_outputs(self):
        return []

    @property
    def description(self):
        return "I'm an inpainting workflow!"

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        print("running the inpainting workflow")
        # add your inpainting logic here
        # e.g. fill the masked region based on the prompt
        self.set_block_state(state, block_state)
        return components, state
```

</hfoption>
</hfoptions>
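
Before composing them, you can sanity-check a block in isolation. A small sketch (the comments describing the printed values are assumptions, not verified output):

```py
t2i_block = TextToImageBlock()
print(t2i_block.description)  # I'm a text-to-image workflow!
print(t2i_block.inputs)       # a list with a single InputParam for "prompt"
```
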
Create an [`~modular_pipelines.AutoPipelineBlocks`] class that contains the sub-block classes and the corresponding list of block names.

You also need to include `block_trigger_inputs`, a list of input names that trigger the corresponding block. If a trigger input is provided at runtime, that block is selected to run. Use `None` to specify the default block to run when no trigger input is detected.

Lastly, it is important to include a `description` that clearly explains which inputs trigger which workflows. This helps a user understand how to run a specific workflow.

```py
from diffusers.modular_pipelines import AutoPipelineBlocks

class AutoImageBlocks(AutoPipelineBlocks):
    # list of sub-block classes to select from (the classes defined above)
    block_classes = [InpaintBlock, ImageToImageBlock, TextToImageBlock]
    # names for each block, in the same order
    block_names = ["inpaint", "img2img", "text2img"]
    # trigger inputs that decide which block to run
    # - "mask" triggers the inpainting workflow
    # - "image" triggers the img2img workflow (but only if no mask is provided)
    # - if neither is provided, run the text2img workflow (default)
    block_trigger_inputs = ["mask", "image", None]

    # a description is extremely important for AutoPipelineBlocks
    @property
    def description(self):
        return (
            "Pipeline generates images given different types of conditions!\n"
            + "This is an auto pipeline block that works for text2img, img2img and inpainting tasks.\n"
            + " - inpaint workflow is run when `mask` is provided.\n"
            + " - img2img workflow is run when `image` is provided (but only when `mask` is not provided).\n"
            + " - text2img workflow is run when neither `image` nor `mask` is provided.\n"
        )
```

It is **very** important to include a `description` to avoid any confusion about how to run a block and what inputs are required. While [`~modular_pipelines.AutoPipelineBlocks`] is convenient, its conditional logic can be difficult to understand if it isn't properly explained.

Create an instance of `AutoImageBlocks`.

```py
auto_blocks = AutoImageBlocks()
```
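
To see the runtime selection in action, a sketch follows. It assumes `init_pipeline` can be called without a components repo, since these toy blocks load no models; `init_image` and `mask_image` are placeholder inputs:

```py
from PIL import Image

# placeholder inputs for illustration
init_image = Image.new("RGB", (512, 512))
mask_image = Image.new("L", (512, 512))

pipe = auto_blocks.init_pipeline()

pipe(prompt="a cat")                                     # running the text-to-image workflow
pipe(prompt="a cat", image=init_image)                   # running the image-to-image workflow
pipe(prompt="a cat", image=init_image, mask=mask_image)  # running the inpainting workflow
```
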
For more complex compositions, such as a nested [`~modular_pipelines.AutoPipelineBlocks`] used as a sub-block inside a larger pipeline, use the [`~modular_pipelines.SequentialPipelineBlocks.get_execution_blocks`] method to extract the blocks that actually run for a given set of inputs. For example, passing "mask" here resolves to the inpainting sub-block, since "mask" is its trigger input.

```py
auto_blocks.get_execution_blocks("mask")
```
