
Commit 70331b9

Merge branch 'main' into main
2 parents 74e2c0d + 0d1c5b0 · commit 70331b9

184 files changed: +2319 −1210 lines


docs/source/en/_toctree.yml

Lines changed: 4 additions & 4 deletions

```diff
@@ -9,15 +9,15 @@
   - local: stable_diffusion
     title: Basic performance
 
-- title: DiffusionPipeline
+- title: Pipelines
   isExpanded: false
   sections:
   - local: using-diffusers/loading
-    title: Load pipelines
+    title: DiffusionPipeline
   - local: tutorials/autopipeline
     title: AutoPipeline
   - local: using-diffusers/custom_pipeline_overview
-    title: Load community pipelines and components
+    title: Community pipelines and components
   - local: using-diffusers/callback
     title: Pipeline callbacks
   - local: using-diffusers/reusing_seeds
@@ -77,7 +77,7 @@
   - local: optimization/memory
     title: Reduce memory usage
   - local: optimization/speed-memory-optims
-    title: Compile and offloading quantized models
+    title: Compiling and offloading quantized models
 - title: Community optimizations
   sections:
   - local: optimization/pruna
```

docs/source/en/api/pipelines/flux.md

Lines changed: 73 additions & 0 deletions

````diff
@@ -316,6 +316,67 @@ if integrity_checker.test_image(image_):
     raise ValueError("Your image has been flagged. Choose another prompt/image or try again.")
 ```
 
+### Kontext Inpainting
+`FluxKontextInpaintPipeline` enables image modification within a fixed mask region. It currently supports both text-based conditioning and image-reference conditioning.
+<hfoptions id="kontext-inpaint">
+<hfoption id="text-only">
+
+
+```python
+import torch
+from diffusers import FluxKontextInpaintPipeline
+from diffusers.utils import load_image
+
+prompt = "Change the yellow dinosaur to green one"
+img_url = (
+    "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_input.jpeg?raw=true"
+)
+mask_url = (
+    "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_mask.png?raw=true"
+)
+
+source = load_image(img_url)
+mask = load_image(mask_url)
+
+pipe = FluxKontextInpaintPipeline.from_pretrained(
+    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
+)
+pipe.to("cuda")
+
+image = pipe(prompt=prompt, image=source, mask_image=mask, strength=1.0).images[0]
+image.save("kontext_inpainting_normal.png")
+```
+</hfoption>
+<hfoption id="image conditioning">
+
+```python
+import torch
+from diffusers import FluxKontextInpaintPipeline
+from diffusers.utils import load_image
+
+pipe = FluxKontextInpaintPipeline.from_pretrained(
+    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
+)
+pipe.to("cuda")
+
+prompt = "Replace this ball"
+img_url = "https://images.pexels.com/photos/39362/the-ball-stadion-football-the-pitch-39362.jpeg?auto=compress&cs=tinysrgb&dpr=1&w=500"
+mask_url = "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/ball_mask.png?raw=true"
+image_reference_url = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTah3x6OL_ECMBaZ5ZlJJhNsyC-OSMLWAI-xw&s"
+
+source = load_image(img_url)
+mask = load_image(mask_url)
+image_reference = load_image(image_reference_url)
+
+mask = pipe.mask_processor.blur(mask, blur_factor=12)
+image = pipe(
+    prompt=prompt, image=source, mask_image=mask, image_reference=image_reference, strength=1.0
+).images[0]
+image.save("kontext_inpainting_ref.png")
+```
+</hfoption>
+</hfoptions>
+
 ## Combining Flux Turbo LoRAs with Flux Control, Fill, and Redux
 
 We can combine Flux Turbo LoRAs with Flux Control and other pipelines like Fill and Redux to enable few-step inference. The example below shows how to do that for Flux Control LoRA for depth and turbo LoRA from [`ByteDance/Hyper-SD`](https://hf.co/ByteDance/Hyper-SD).
````
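
The example the context line refers to is cut off in this hunk. As a hedged sketch of the combination it describes — a depth Control LoRA paired with a Hyper-SD turbo LoRA via the standard `load_lora_weights`/`set_adapters` APIs — where the adapter weights, step count, and depth map path are illustrative, not taken from the commit:

```py
import torch
from diffusers import FluxControlPipeline
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download

pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Load the depth Control LoRA and the Hyper-SD turbo LoRA as named adapters
pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora", adapter_name="depth")
pipe.load_lora_weights(
    hf_hub_download("ByteDance/Hyper-SD", "Hyper-FLUX.1-dev-8steps-lora.safetensors"),
    adapter_name="hyper-sd",
)
# Illustrative scales: turbo LoRAs are typically applied at a low weight
pipe.set_adapters(["depth", "hyper-sd"], adapter_weights=[0.85, 0.125])

control_image = load_image("depth_map.png")  # hypothetical local depth map
image = pipe(
    prompt="a photo of a forest with mist",
    control_image=control_image,
    num_inference_steps=8,  # few-step inference enabled by the turbo LoRA
    guidance_scale=10.0,
).images[0]
image.save("flux-control-turbo.png")
```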

```diff
@@ -646,3 +707,15 @@ image.save("flux-fp8-dev.png")
 [[autodoc]] FluxFillPipeline
   - all
   - __call__
+
+## FluxKontextPipeline
+
+[[autodoc]] FluxKontextPipeline
+  - all
+  - __call__
+
+## FluxKontextInpaintPipeline
+
+[[autodoc]] FluxKontextInpaintPipeline
+  - all
+  - __call__
```
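
For orientation, `FluxKontextPipeline` (added to the API reference above) performs instruction-based image editing. A minimal usage sketch, assuming the `FLUX.1-Kontext-dev` checkpoint; the input image URL, edit prompt, and `guidance_scale` value are illustrative:

```py
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# Illustrative input; any RGB image works
input_image = load_image(
    "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_input.jpeg?raw=true"
)

# Kontext edits the input image according to the instruction prompt
image = pipe(
    image=input_image,
    prompt="Make the scene look like a watercolor painting",
    guidance_scale=2.5,
).images[0]
image.save("kontext_edit.png")
```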

docs/source/en/api/pipelines/overview.md

Lines changed: 14 additions & 0 deletions

```diff
@@ -113,3 +113,17 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
 ## PushToHubMixin
 
 [[autodoc]] utils.PushToHubMixin
+
+## Callbacks
+
+[[autodoc]] callbacks.PipelineCallback
+
+[[autodoc]] callbacks.SDCFGCutoffCallback
+
+[[autodoc]] callbacks.SDXLCFGCutoffCallback
+
+[[autodoc]] callbacks.SDXLControlnetCFGCutoffCallback
+
+[[autodoc]] callbacks.IPAdapterScaleCutoffCallback
+
+[[autodoc]] callbacks.SD3CFGCutoffCallback
```
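
These callback classes plug into a pipeline's `callback_on_step_end` hook. A minimal usage sketch, assuming SDXL and the `SDXLCFGCutoffCallback` listed above; the `cutoff_step_ratio` value and prompt are illustrative:

```py
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.callbacks import SDXLCFGCutoffCallback

# Disable classifier-free guidance after 40% of the denoising steps
callback = SDXLCFGCutoffCallback(cutoff_step_ratio=0.4)

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipeline(
    "an astronaut riding a horse on the moon",
    num_inference_steps=25,
    guidance_scale=6.5,
    callback_on_step_end=callback,
).images[0]
```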

docs/source/en/api/pipelines/qwenimage.md

Lines changed: 5 additions & 0 deletions

```diff
@@ -120,6 +120,11 @@ The `guidance_scale` parameter in the pipeline is there to support future guidan
   - all
   - __call__
 
+## QwenImageControlNetPipeline
+[[autodoc]] QwenImageControlNetPipeline
+  - all
+  - __call__
+
 ## QwenImagePipelineOutput
 
 [[autodoc]] pipelines.qwenimage.pipeline_output.QwenImagePipelineOutput
```

(The committed hunk misspelled the heading as `QwenImaggeControlNetPipeline` and omitted the `[[autodoc]]` directive that the `- all`/`- __call__` options require; both are corrected here to match the pattern used elsewhere in these docs.)

docs/source/en/api/pipelines/wan.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -20,7 +20,7 @@
   </div>
 </div>
 
-# Wan2.1
+# Wan
 
 [Wan-2.1](https://huggingface.co/papers/2503.20314) by the Wan Team.
 
@@ -42,7 +42,7 @@ The following Wan models are supported in Diffusers:
 - [Wan 2.2 TI2V 5B](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers)
 
 > [!TIP]
-> Click on the Wan2.1 models in the right sidebar for more examples of video generation.
+> Click on the Wan models in the right sidebar for more examples of video generation.
 
 ### Text-to-Video Generation
```

docs/source/en/optimization/fp16.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -209,7 +209,7 @@ There is also a [compile_regions](https://github.com/huggingface/accelerate/blob
 # pip install -U accelerate
 import torch
 from diffusers import StableDiffusionXLPipeline
-from accelerate.utils import compile regions
+from accelerate.utils import compile_regions
 
 pipeline = StableDiffusionXLPipeline.from_pretrained(
     "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
```

docs/source/en/optimization/speed-memory-optims.md

Lines changed: 3 additions & 2 deletions

```diff
@@ -10,7 +10,7 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->
 
-# Compile and offloading quantized models
+# Compiling and offloading quantized models
 
 Optimizing models often involves trade-offs between [inference speed](./fp16) and [memory-usage](./memory). For instance, while [caching](./cache) can boost inference speed, it also increases memory consumption since it needs to store the outputs of intermediate attention layers. A more balanced optimization strategy combines quantizing a model, [torch.compile](./fp16#torchcompile) and various [offloading methods](./memory#offloading).
 
@@ -28,7 +28,8 @@ The table below provides a comparison of optimization strategy combinations and
 | quantization | 32.602 | 14.9453 |
 | quantization, torch.compile | 25.847 | 14.9448 |
 | quantization, torch.compile, model CPU offloading | 32.312 | 12.2369 |
-<small>These results are benchmarked on Flux with a RTX 4090. The transformer and text_encoder components are quantized. Refer to the [benchmarking script](https://gist.github.com/sayakpaul/0db9d8eeeb3d2a0e5ed7cf0d9ca19b7d) if you're interested in evaluating your own model.</small>
+
+<small>These results are benchmarked on Flux with a RTX 4090. The transformer and text_encoder components are quantized. Refer to the <a href="https://gist.github.com/sayakpaul/0db9d8eeeb3d2a0e5ed7cf0d9ca19b7d">benchmarking script</a> if you're interested in evaluating your own model.</small>
 
 This guide will show you how to compile and offload a quantized model with [bitsandbytes](../quantization/bitsandbytes#torchcompile). Make sure you are using [PyTorch nightly](https://pytorch.org/get-started/locally/) and the latest version of bitsandbytes.
```
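
As a concrete illustration of the strategy this guide describes, here is a minimal sketch combining 4-bit bitsandbytes quantization, `torch.compile`, and model CPU offloading on Flux; the model choice, prompt, and step count are illustrative, not taken from the commit:

```py
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# Quantize the largest component (the transformer) to 4-bit with bitsandbytes
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
)

# Offload whole models to the CPU while idle, then compile the denoiser
pipeline.enable_model_cpu_offload()
pipeline.transformer = torch.compile(pipeline.transformer)

image = pipeline(
    "a photo of a cat holding a sign that says hello", num_inference_steps=28
).images[0]
image.save("flux-quantized-compiled.png")
```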

docs/source/en/quicktour.md

Lines changed: 3 additions & 0 deletions

```diff
@@ -162,6 +162,9 @@ Take a look at the [Quantization](./quantization/overview) section for more deta
 
 ## Optimizations
 
+> [!TIP]
+> Optimization is dependent on hardware specs such as memory. Use this [Space](https://huggingface.co/spaces/diffusers/optimized-diffusers-code) to generate code examples that include all of Diffusers' available memory and speed optimization techniques for any model you're using.
+
 Modern diffusion models are very large and have billions of parameters. The iterative denoising process is also computationally intensive and slow. Diffusers provides techniques for reducing memory usage and boosting inference speed. These techniques can be combined with quantization to optimize for both memory usage and inference speed.
 
 ### Memory usage
```
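
As one example of the memory techniques the quicktour paragraph refers to, a minimal sketch assuming model CPU offloading; the pipeline checkpoint and prompt are illustrative:

```py
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
# Move each whole model (UNet, text encoders, VAE) to the GPU only while it runs
pipeline.enable_model_cpu_offload()

image = pipeline("a serene mountain lake at sunrise").images[0]
```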

docs/source/en/tutorials/using_peft_for_inference.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -94,7 +94,7 @@ pipeline = AutoPipelineForText2Image.from_pretrained(
 pipeline.unet.load_lora_adapter(
     "jbilcke-hf/sdxl-cinematic-1",
     weight_name="pytorch_lora_weights.safetensors",
-    adapter_name="cinematic"
+    adapter_name="cinematic",
     prefix="unet"
 )
 # use cnmt in the prompt to trigger the LoRA
@@ -688,4 +688,4 @@ Browse the [LoRA Studio](https://lorastudio.co/models) for different LoRAs to us
 You can find additional LoRAs in the [FLUX LoRA the Explorer](https://huggingface.co/spaces/multimodalart/flux-lora-the-explorer) and [LoRA the Explorer](https://huggingface.co/spaces/multimodalart/LoraTheExplorer) Spaces.
 
-Check out the [Fast LoRA inference for Flux with Diffusers and PEFT](https://huggingface.co/blog/lora-fast) blog post to learn how to optimize LoRA inference with methods like FlashAttention-3 and fp8 quantization.
+Check out the [Fast LoRA inference for Flux with Diffusers and PEFT](https://huggingface.co/blog/lora-fast) blog post to learn how to optimize LoRA inference with methods like FlashAttention-3 and fp8 quantization.
```

The first hunk adds the comma that was missing between the `adapter_name` and `prefix` keyword arguments (previously a Python syntax error). In the second hunk the visible text is unchanged; it appears to record only a trailing-newline fix at the end of the file.
