Commit 5c9b971

Merge branch 'main' into fix_flax_use_memory_efficient_attention
2 parents 5959518 + 0f111ab commit 5c9b971

50 files changed: 4,388 additions and 282 deletions

.github/workflows/build_docker_images.yml

Lines changed: 5 additions & 18 deletions
@@ -90,24 +90,11 @@ jobs:

       - name: Post to a Slack channel
         id: slack
-        uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
+        uses: huggingface/hf-workflows/.github/actions/post-slack@main
         with:
           # Slack channel id, channel name, or user id to post message.
           # See also: https://api.slack.com/methods/chat.postMessage#channels
-          channel-id: ${{ env.CI_SLACK_CHANNEL }}
-          # For posting a rich message using Block Kit
-          payload: |
-            {
-              "text": "${{ matrix.image-name }} Docker Image build result: ${{ job.status }}\n${{ github.event.head_commit.url }}",
-              "blocks": [
-                {
-                  "type": "section",
-                  "text": {
-                    "type": "mrkdwn",
-                    "text": "${{ matrix.image-name }} Docker Image build result: ${{ job.status }}\n${{ github.event.head_commit.url }}"
-                  }
-                }
-              ]
-            }
-        env:
-          SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
+          slack_channel: ${{ env.CI_SLACK_CHANNEL }}
+          title: "🤗 Results of the ${{ matrix.image-name }} Docker Image build"
+          status: ${{ job.status }}
+          slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}

.github/workflows/pr_tests.yml

Lines changed: 1 addition & 1 deletion
@@ -156,7 +156,7 @@ jobs:
       if: ${{ matrix.config.framework == 'pytorch_examples' }}
       run: |
         python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-        python -m uv pip install peft
+        python -m uv pip install peft timm
         python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
           --make-reports=tests_${{ matrix.config.report }} \
           examples

.github/workflows/push_tests.yml

Lines changed: 1 addition & 0 deletions
@@ -426,6 +426,7 @@ jobs:
         HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
       run: |
         python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+        python -m uv pip install timm
         python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v --make-reports=examples_torch_cuda examples/

     - name: Failure short reports

.github/workflows/push_tests_fast.yml

Lines changed: 1 addition & 1 deletion
@@ -107,7 +107,7 @@ jobs:
       if: ${{ matrix.config.framework == 'pytorch_examples' }}
       run: |
         python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-        python -m uv pip install peft
+        python -m uv pip install peft timm
         python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
           --make-reports=tests_${{ matrix.config.report }} \
           examples

.github/workflows/push_tests_mps.yml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ concurrency:
 jobs:
   run_fast_tests_apple_m1:
     name: Fast PyTorch MPS tests on MacOS
-    runs-on: [ self-hosted, apple-m1 ]
+    runs-on: macos-13-xlarge

     steps:
     - name: Checkout diffusers
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+name: Run (SLOW) desired tests on our runner from a PR (applicable to GPUs only at the moment)
+
+on:
+  workflow_dispatch:
+    inputs:
+      pr_number:
+        description: 'PR number'
+        required: true
+      docker_image:
+        default: 'diffusers/diffusers-pytorch-cuda'
+        description: 'Name of the Docker image'
+        required: true
+      test_command:
+        description: 'Test command to run (e.g.: `pytest tests/pipelines/dit/`). Any valid pytest command can be provided.'
+        required: true
+
+env:
+  IS_GITHUB_CI: "1"
+  GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+  HF_HOME: /mnt/cache
+  DIFFUSERS_IS_CI: yes
+  OMP_NUM_THREADS: 8
+  MKL_NUM_THREADS: 8
+  RUN_SLOW: yes
+
+jobs:
+  run_tests:
+    name: "Run a test on our runner from a PR"
+    runs-on: [single-gpu, nvidia-gpu, "t4", ci]
+    container:
+      image: ${{ github.event.inputs.docker_image }}
+      options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
+
+    steps:
+      - name: NVIDIA-SMI
+        run: |
+          nvidia-smi
+
+      - uses: actions/checkout@v3
+      - name: Install `gh`
+        run: |
+          : # see https://github.com/cli/cli/blob/trunk/docs/install_linux.md#debian-ubuntu-linux-raspberry-pi-os-apt
+          (type -p wget >/dev/null || (apt update && apt-get install wget -y)) \
+          && mkdir -p -m 755 /etc/apt/keyrings \
+          && wget -qO- https://cli.github.com/packages/githubcli-archive-keyring.gpg | tee /etc/apt/keyrings/githubcli-archive-keyring.gpg > /dev/null \
+          && chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \
+          && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
+          && apt update \
+          && apt install gh -y
+
+      - name: Checkout the PR branch
+        run: |
+          gh pr checkout ${{ github.event.inputs.pr_number }}
+
+      - name: Run tests
+        run: ${{ github.event.inputs.test_command }}
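
The workflow above is triggered manually and takes three `workflow_dispatch` inputs: `pr_number`, `docker_image`, and `test_command`. As a rough sketch of how a maintainer might trigger it from a script, the Python helper below builds a `gh workflow run` command that passes each input with `-f key=value`. The helper name `build_dispatch_command`, the `<workflow-file>.yml` placeholder (the file name is not shown in this view), and the PR number are assumptions for illustration only.

```python
import shlex


def build_dispatch_command(workflow, pr_number, test_command,
                           docker_image="diffusers/diffusers-pytorch-cuda"):
    """Build the argv list for `gh workflow run` with the three inputs.

    Hypothetical helper for illustration: it only builds the command,
    it does not execute it.
    """
    if int(pr_number) <= 0:
        raise ValueError("pr_number must be a positive integer")
    return [
        "gh", "workflow", "run", workflow,
        "-f", f"pr_number={int(pr_number)}",
        "-f", f"docker_image={docker_image}",
        "-f", f"test_command={test_command}",
    ]


# Placeholder workflow file name and PR number, for illustration only.
cmd = build_dispatch_command(
    "<workflow-file>.yml", 1234, "pytest tests/pipelines/dit/"
)
print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

Building the command as a list keeps each input a separate argument, so a `test_command` containing spaces is passed through intact.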

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -355,7 +355,7 @@ You will need basic `git` proficiency to be able to contribute to
 manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro
 Git](https://git-scm.com/book/en/v2) is a very good reference.

-Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/main/setup.py#L265)):
+Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/42f25d601a910dceadaee6c44345896b4cfa9928/setup.py#L270)):

 1. Fork the [repository](https://github.com/huggingface/diffusers) by
    clicking on the 'Fork' button on the repository's page. This creates a copy of the code

docs/source/en/using-diffusers/callback.md

Lines changed: 64 additions & 3 deletions
@@ -19,13 +19,74 @@ The denoising loop of a pipeline can be modified with custom defined functions u

 This guide will demonstrate how callbacks work by a few features you can implement with them.

+## Official callbacks
+
+We provide a list of callbacks you can plug into an existing pipeline and modify the denoising loop. This is the current list of official callbacks:
+
+- `SDCFGCutoffCallback`: Disables the CFG after a certain number of steps for all SD 1.5 pipelines, including text-to-image, image-to-image, inpaint, and controlnet.
+- `SDXLCFGCutoffCallback`: Disables the CFG after a certain number of steps for all SDXL pipelines, including text-to-image, image-to-image, inpaint, and controlnet.
+- `IPAdapterScaleCutoffCallback`: Disables the IP Adapter after a certain number of steps for all pipelines supporting IP-Adapter.
+
+> [!TIP]
+> If you want to add a new official callback, feel free to open a [feature request](https://github.com/huggingface/diffusers/issues/new/choose) or [submit a PR](https://huggingface.co/docs/diffusers/main/en/conceptual/contribution#how-to-open-a-pr).
+
+To set up a callback, you need to specify the number of denoising steps after which the callback comes into effect. You can do so with either of these two arguments:
+
+- `cutoff_step_ratio`: Float number with the ratio of the steps.
+- `cutoff_step_index`: Integer number with the exact number of the step.
+
+```python
+import torch
+
+from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline
+from diffusers.callbacks import SDXLCFGCutoffCallback
+
+
+callback = SDXLCFGCutoffCallback(cutoff_step_ratio=0.4)
+# can also be used with cutoff_step_index
+# callback = SDXLCFGCutoffCallback(cutoff_step_ratio=None, cutoff_step_index=10)
+
+pipeline = StableDiffusionXLPipeline.from_pretrained(
+    "stabilityai/stable-diffusion-xl-base-1.0",
+    torch_dtype=torch.float16,
+    variant="fp16",
+).to("cuda")
+pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=True)
+
+prompt = "a sports car at the road, best quality, high quality, high detail, 8k resolution"
+
+generator = torch.Generator(device="cpu").manual_seed(2628670641)
+
+out = pipeline(
+    prompt=prompt,
+    negative_prompt="",
+    guidance_scale=6.5,
+    num_inference_steps=25,
+    generator=generator,
+    callback_on_step_end=callback,
+)
+
+out.images[0].save("official_callback.png")
+```
+
+<div class="flex gap-4">
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/without_cfg_callback.png" alt="generated image of a sports car at the road" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">without SDXLCFGCutoffCallback</figcaption>
+  </div>
+  <div>
+    <img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/with_cfg_callback.png" alt="generated image of a sports car at the road with cfg callback" />
+    <figcaption class="mt-2 text-center text-sm text-gray-500">with SDXLCFGCutoffCallback</figcaption>
+  </div>
+</div>
+
 ## Dynamic classifier-free guidance

 Dynamic classifier-free guidance (CFG) is a feature that allows you to disable CFG after a certain number of inference steps which can help you save compute with minimal cost to performance. The callback function for this should have the following arguments:

-* `pipeline` (or the pipeline instance) provides access to important properties such as `num_timesteps` and `guidance_scale`. You can modify these properties by updating the underlying attributes. For this example, you'll disable CFG by setting `pipeline._guidance_scale=0.0`.
-* `step_index` and `timestep` tell you where you are in the denoising loop. Use `step_index` to turn off CFG after reaching 40% of `num_timesteps`.
-* `callback_kwargs` is a dict that contains tensor variables you can modify during the denoising loop. It only includes variables specified in the `callback_on_step_end_tensor_inputs` argument, which is passed to the pipeline's `__call__` method. Different pipelines may use different sets of variables, so please check a pipeline's `_callback_tensor_inputs` attribute for the list of variables you can modify. Some common variables include `latents` and `prompt_embeds`. For this function, change the batch size of `prompt_embeds` after setting `guidance_scale=0.0` in order for it to work properly.
+- `pipeline` (or the pipeline instance) provides access to important properties such as `num_timesteps` and `guidance_scale`. You can modify these properties by updating the underlying attributes. For this example, you'll disable CFG by setting `pipeline._guidance_scale=0.0`.
+- `step_index` and `timestep` tell you where you are in the denoising loop. Use `step_index` to turn off CFG after reaching 40% of `num_timesteps`.
+- `callback_kwargs` is a dict that contains tensor variables you can modify during the denoising loop. It only includes variables specified in the `callback_on_step_end_tensor_inputs` argument, which is passed to the pipeline's `__call__` method. Different pipelines may use different sets of variables, so please check a pipeline's `_callback_tensor_inputs` attribute for the list of variables you can modify. Some common variables include `latents` and `prompt_embeds`. For this function, change the batch size of `prompt_embeds` after setting `guidance_scale=0.0` in order for it to work properly.

 Your callback function should look something like this:

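
To make the two cutoff arguments and the dynamic CFG callback described in this file concrete, here is a minimal, framework-free sketch. It assumes the cutoff step is `int(num_timesteps * cutoff_step_ratio)` when a ratio is given, in line with the "40% of `num_timesteps`" wording above. `resolve_cutoff_step`, `make_cfg_cutoff_callback`, and the `SimpleNamespace` stand-in for a pipeline are hypothetical names for illustration, not part of `diffusers`, and a plain list stands in for the `prompt_embeds` tensor.

```python
from types import SimpleNamespace


def resolve_cutoff_step(num_timesteps, cutoff_step_ratio=None, cutoff_step_index=None):
    """Return the step index at which the cutoff takes effect (assumed rule)."""
    if (cutoff_step_ratio is None) == (cutoff_step_index is None):
        raise ValueError("Pass exactly one of cutoff_step_ratio or cutoff_step_index.")
    if cutoff_step_index is not None:
        return int(cutoff_step_index)
    if not 0.0 <= cutoff_step_ratio <= 1.0:
        raise ValueError("cutoff_step_ratio must lie in [0.0, 1.0].")
    return int(num_timesteps * cutoff_step_ratio)


def make_cfg_cutoff_callback(cutoff_step_ratio=None, cutoff_step_index=None):
    """Build a callback taking (pipeline, step_index, timestep, callback_kwargs)."""

    def callback(pipeline, step_index, timestep, callback_kwargs):
        cutoff = resolve_cutoff_step(
            pipeline.num_timesteps, cutoff_step_ratio, cutoff_step_index
        )
        if step_index == cutoff:
            embeds = callback_kwargs["prompt_embeds"]
            # Assumption: the batch is [unconditional, conditional]. Keep only
            # the second half so the batch size matches a run without CFG.
            callback_kwargs["prompt_embeds"] = embeds[len(embeds) // 2:]
            pipeline._guidance_scale = 0.0
        return callback_kwargs

    return callback


# Stand-ins: a fake pipeline and a two-item "batch" of embeddings.
fake_pipeline = SimpleNamespace(num_timesteps=25, _guidance_scale=6.5)
kwargs = {"prompt_embeds": ["negative", "positive"]}
callback = make_cfg_cutoff_callback(cutoff_step_ratio=0.4)

for step in range(fake_pipeline.num_timesteps):
    kwargs = callback(fake_pipeline, step, None, kwargs)

print(fake_pipeline._guidance_scale, kwargs["prompt_embeds"])
```

With 25 steps, a ratio of 0.4 and an index of 10 resolve to the same step under this rule, which matches the two alternatives shown in the official-callback example.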

examples/community/README.md

Lines changed: 63 additions & 0 deletions
@@ -68,6 +68,7 @@ Please also check out our [Community Scripts](https://github.com/huggingface/dif
 | InstantID Pipeline | Stable Diffusion XL Pipeline that supports InstantID | [InstantID Pipeline](#instantid-pipeline) | [![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow)](https://huggingface.co/spaces/InstantX/InstantID) | [Haofan Wang](https://github.com/haofanwang) |
 | UFOGen Scheduler | Scheduler for UFOGen Model (compatible with Stable Diffusion pipelines) | [UFOGen Scheduler](#ufogen-scheduler) | - | [dg845](https://github.com/dg845) |
 | Stable Diffusion XL IPEX Pipeline | Accelerate Stable Diffusion XL inference pipeline with BF16/FP32 precision on Intel Xeon CPUs with [IPEX](https://github.com/intel/intel-extension-for-pytorch) | [Stable Diffusion XL on IPEX](#stable-diffusion-xl-on-ipex) | - | [Dan Li](https://github.com/ustcuna/) |
+| Stable Diffusion BoxDiff Pipeline | Training-free controlled generation with bounding boxes using [BoxDiff](https://github.com/showlab/BoxDiff) | [Stable Diffusion BoxDiff Pipeline](#stable-diffusion-boxdiff) | - | [Jingyang Zhang](https://github.com/zjysteven/) |

 To load a custom pipeline you just need to pass the `custom_pipeline` argument to `DiffusionPipeline`, as one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines, we will merge them quickly.

@@ -1676,6 +1677,68 @@ image = pipe(prompt, image=input_image, strength=0.75,).images[0]
 image.save('tensorrt_img2img_new_zealand_hills.png')
 ```

+### Stable Diffusion BoxDiff
+BoxDiff is a training-free method for controlled generation with bounding box coordinates. It should work with any Stable Diffusion model. The example below uses `stable-diffusion-2-1-base`.
+```py
+import torch
+from PIL import Image, ImageDraw
+from copy import deepcopy
+
+from examples.community.pipeline_stable_diffusion_boxdiff import StableDiffusionBoxDiffPipeline
+
+def draw_box_with_text(img, boxes, names):
+    colors = ["red", "olive", "blue", "green", "orange", "brown", "cyan", "purple"]
+    img_new = deepcopy(img)
+    draw = ImageDraw.Draw(img_new)
+
+    W, H = img.size
+    for bid, box in enumerate(boxes):
+        draw.rectangle([box[0] * W, box[1] * H, box[2] * W, box[3] * H], outline=colors[bid % len(colors)], width=4)
+        draw.text((box[0] * W, box[1] * H), names[bid], fill=colors[bid % len(colors)])
+    return img_new
+
+pipe = StableDiffusionBoxDiffPipeline.from_pretrained(
+    "stabilityai/stable-diffusion-2-1-base",
+    torch_dtype=torch.float16,
+)
+pipe.to("cuda")
+
+# example 1
+prompt = "as the aurora lights up the sky, a herd of reindeer leisurely wanders on the grassy meadow, admiring the breathtaking view, a serene lake quietly reflects the magnificent display, and in the distance, a snow-capped mountain stands majestically, fantasy, 8k, highly detailed"
+phrases = [
+    "aurora",
+    "reindeer",
+    "meadow",
+    "lake",
+    "mountain"
+]
+boxes = [[1,3,512,202], [75,344,421,495], [1,327,508,507], [2,217,507,341], [1,135,509,242]]
+
+# example 2
+# prompt = "A rabbit wearing sunglasses looks very proud"
+# phrases = ["rabbit", "sunglasses"]
+# boxes = [[67,87,366,512], [66,130,364,262]]
+
+boxes = [[x / 512 for x in box] for box in boxes]
+
+images = pipe(
+    prompt,
+    boxdiff_phrases=phrases,
+    boxdiff_boxes=boxes,
+    boxdiff_kwargs={
+        "attention_res": 16,
+        "normalize_eot": True
+    },
+    num_inference_steps=50,
+    guidance_scale=7.5,
+    generator=torch.manual_seed(42),
+    safety_checker=None
+).images
+
+draw_box_with_text(images[0], boxes, phrases).save("output.png")
+```
+
+
 ### Stable Diffusion Reference

 This pipeline uses the Reference Control. Refer to the [sd-webui-controlnet discussion: Reference-only Control](https://github.com/Mikubill/sd-webui-controlnet/discussions/1236)[sd-webui-controlnet discussion: Reference-adain Control](https://github.com/Mikubill/sd-webui-controlnet/discussions/1280).
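
The BoxDiff example added in this file divides pixel coordinates by 512 before calling the pipeline, then multiplies by the image width and height when drawing. A minimal sketch of that round trip follows, assuming boxes are `[x0, y0, x1, y1]` in pixels (as the drawing helper implies). `normalize_boxes` and `denormalize_box` are hypothetical helper names, not part of the pipeline.

```python
def normalize_boxes(boxes, width=512, height=512):
    """Convert [x0, y0, x1, y1] pixel boxes to fractions of the image size."""
    normalized = []
    for x0, y0, x1, y1 in boxes:
        # Reject boxes that are empty, inverted, or outside the image.
        if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
            raise ValueError(
                f"box {[x0, y0, x1, y1]} does not fit a {width}x{height} image"
            )
        normalized.append([x0 / width, y0 / height, x1 / width, y1 / height])
    return normalized


def denormalize_box(box, width, height):
    """Map a fractional box back to pixel coordinates, e.g. for drawing."""
    x0, y0, x1, y1 = box
    return [x0 * width, y0 * height, x1 * width, y1 * height]


# The "example 2" boxes from the README snippet, in pixels on a 512x512 canvas.
pixel_boxes = [[67, 87, 366, 512], [66, 130, 364, 262]]
fractional = normalize_boxes(pixel_boxes)

print(fractional[0])
print(denormalize_box(fractional[0], 512, 512))
```

Validating before dividing catches boxes that fall outside the canvas, which the one-line list comprehension in the example would silently pass through.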
