
Commit a551ddf

[docs] mask_blur and padding_mask_crop (#6498)
new inpaint features
1 parent 1d57892 commit a551ddf


docs/source/en/using-diffusers/inpaint.md

Lines changed: 68 additions & 5 deletions
@@ -77,12 +77,42 @@ Throughout this guide, the mask image is provided in all of the code examples fo
 Upload a base image to inpaint on and use the sketch tool to draw a mask. Once you're done, click **Run** to generate and download the mask image.
 
 <iframe
-src="https://stevhliu-inpaint-mask-maker.hf.space"
-frameborder="0"
-width="850"
-height="450"
+src="https://stevhliu-inpaint-mask-maker.hf.space"
+frameborder="0"
+width="850"
+height="450"
 ></iframe>
 
+### Mask blur
+
+The [`~VaeImageProcessor.blur`] method provides an option for how to blend the original image and inpaint area. The amount of blur is determined by the `blur_factor` parameter. Increasing the `blur_factor` increases the amount of blur applied to the mask edges, softening the transition between the original image and inpaint area. A low or zero `blur_factor` preserves the sharper edges of the mask.
+
+To use this, create a blurred mask with the image processor.
+
+```py
+import torch
+from diffusers import AutoPipelineForInpainting
+from diffusers.utils import load_image
+from PIL import Image
+
+pipeline = AutoPipelineForInpainting.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to('cuda')
+
+mask = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore_mask.png")
+blurred_mask = pipeline.mask_processor.blur(mask, blur_factor=33)
+blurred_mask
+```
+
+<div class="flex gap-4">
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore_mask.png"/>
+    <figcaption class="mt-2 text-center text-sm text-gray-500">mask with no blur</figcaption>
+  </div>
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/mask_blurred.png"/>
+    <figcaption class="mt-2 text-center text-sm text-gray-500">mask with blur applied</figcaption>
+  </div>
+</div>
+
 ## Popular models
 
 [Stable Diffusion Inpainting](https://huggingface.co/runwayml/stable-diffusion-inpainting), [Stable Diffusion XL (SDXL) Inpainting](https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1), and [Kandinsky 2.2 Inpainting](https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder-inpaint) are among the most popular models for inpainting. SDXL typically produces higher resolution images than Stable Diffusion v1.5, and Kandinsky 2.2 is also capable of generating high-quality images.
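
The new mask-blur example stops at displaying `blurred_mask`. To see the blurred mask actually used, here is a minimal follow-on sketch (not part of the commit): it continues from the code above and passes the blurred mask as `mask_image`; the base image and prompt are placeholders borrowed from the padding-mask-crop example later in this diff.

```py
# Hypothetical continuation of the mask-blur example above (not in the commit).
# The blurred mask is passed as mask_image; base image and prompt are placeholders.
base = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore.png")
image = pipeline("boat", image=base, mask_image=blurred_mask).images[0]
image
```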
@@ -318,7 +348,7 @@ make_image_grid([init_image, image], rows=1, cols=2)
 
 The trade-off of using a non-inpaint specific checkpoint is that the overall image quality may be lower, but it generally tends to preserve the mask area (that is why you can see the mask outline). The inpaint specific checkpoints are intentionally trained to generate higher quality inpainted images, and that includes creating a more natural transition between the masked and unmasked areas. As a result, these checkpoints are more likely to change your unmasked area.
 
-If preserving the unmasked area is important for your task, you can use the `apply_overlay` method of [`VaeImageProcessor`] to force the unmasked area of an image to remain the same at the expense of some more unnatural transitions between the masked and unmasked areas.
+If preserving the unmasked area is important for your task, you can use the [`VaeImageProcessor.apply_overlay`] method to force the unmasked area of an image to remain the same at the expense of some more unnatural transitions between the masked and unmasked areas.
 
 ```py
 import PIL
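
The hunk cuts off after the first lines of the surrounding example. For a self-contained picture of what the renamed method does, here is a minimal sketch of calling `apply_overlay` (not the commit's own example; the checkpoint is the inpainting model named above, the images are reused from elsewhere in this guide, and the prompt is a placeholder):

```py
import torch
from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image, make_image_grid

pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Placeholder inputs: any RGB image and a same-sized mask work here
init_image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore.png")
mask_image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore_mask.png")

repainted = pipeline("boat", image=init_image, mask_image=mask_image).images[0]

# Paste the original pixels back over everything outside the mask
unmasked_unchanged = pipeline.image_processor.apply_overlay(mask_image, init_image, repainted)
make_image_grid([init_image, mask_image, repainted, unmasked_unchanged], rows=2, cols=2)
```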
@@ -475,6 +505,39 @@ make_image_grid([init_image, mask_image, image], rows=1, cols=3)
 </figure>
 </div>
 
+### Padding mask crop
+
+A method for increasing the inpainting image quality is to use the [`padding_mask_crop`](https://huggingface.co/docs/diffusers/v0.25.0/en/api/pipelines/stable_diffusion/inpaint#diffusers.StableDiffusionInpaintPipeline.__call__.padding_mask_crop) parameter. When enabled, this option crops the masked area with some user-specified padding and it'll also crop the same area from the original image. Both the image and mask are upscaled to a higher resolution for inpainting, and then overlaid on the original image. This is a quick and easy way to improve image quality without using a separate pipeline like [`StableDiffusionUpscalePipeline`].
+
+Add the `padding_mask_crop` parameter to the pipeline call and set it to the desired padding value.
+
+```py
+import torch
+from diffusers import AutoPipelineForInpainting
+from diffusers.utils import load_image
+from PIL import Image
+
+generator = torch.Generator(device='cuda').manual_seed(0)
+pipeline = AutoPipelineForInpainting.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to('cuda')
+
+base = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore.png")
+mask = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore_mask.png")
+
+image = pipeline("boat", image=base, mask_image=mask, strength=0.75, generator=generator, padding_mask_crop=32).images[0]
+image
+```
+
+<div class="flex gap-4">
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/baseline_inpaint.png"/>
+    <figcaption class="mt-2 text-center text-sm text-gray-500">default inpaint image</figcaption>
+  </div>
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/padding_mask_crop_inpaint.png"/>
+    <figcaption class="mt-2 text-center text-sm text-gray-500">inpaint image with `padding_mask_crop` enabled</figcaption>
+  </div>
+</div>
+
 ## Chained inpainting pipelines
 
 [`AutoPipelineForInpainting`] can be chained with other 🤗 Diffusers pipelines to edit their outputs. This is often useful for improving the output quality from your other diffusion pipelines, and if you're using multiple pipelines, it can be more memory-efficient to chain them together to keep the outputs in latent space and reuse the same pipeline components.
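
As a rough illustration of that pattern, the sketch below builds an inpainting pipeline from an already-loaded text-to-image pipeline with `from_pipe`, so the model components are shared rather than reloaded; the prompts are placeholders and the mask is reused from the examples above (this sketch is not part of the commit):

```py
import torch
from diffusers import AutoPipelineForText2Image, AutoPipelineForInpainting
from diffusers.utils import load_image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
text2image = pipeline("a sunny beach with a small boat").images[0]  # placeholder prompt

# Reuse the already-loaded components instead of loading a second checkpoint
inpaint = AutoPipelineForInpainting.from_pipe(pipeline)
mask = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore_mask.png").resize(text2image.size)
image = inpaint("boat", image=text2image, mask_image=mask).images[0]  # placeholder prompt
```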
