
Conversation

@linoytsaban
Collaborator

@linoytsaban linoytsaban commented Nov 29, 2024

Add the following to the flux redux prior pipeline:

  • prompt inputs
  • image interpolation


inference example:

# assumes `pipe_prior_redux` and `pipe` are already loaded (see the full setup in the review comment below)
from PIL import Image
import torch

image = Image.open("Self-portrait-oil-canvas-Thorn-Necklace-Hummingbird-Frida.jpg").convert("RGB")
image2 = Image.open("Mona_Lisa.jpg").convert("RGB")

pipe_prior_output = pipe_prior_redux(
    [image, image2],
    prompt=["self portrait by frida kahlo", "mona lisa"],
    prompt_embeds_scale=[0.9, 0.75],
    pooled_prompt_embeds_scale=[0.6, 1.25],
)
pipe(
    guidance_scale=2.5,
    height=1024,
    width=1024,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
    **pipe_prior_output,
).images[0]
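The per-image scales weight each image's Redux embeddings before they are combined into one conditioning. A rough sketch of the interpolation idea on plain tensors (illustrative only, not the pipeline's actual code):

```python
import torch

def combine_embeds(embeds_list, scales):
    # Scale each image's embedding, then average: this is roughly how
    # weighted interpolation between several Redux input images behaves.
    scaled = [e * s for e, s in zip(embeds_list, scales)]
    return torch.stack(scaled).sum(dim=0) / len(scaled)

e1 = torch.ones(1, 4)        # stand-in for image 1's embeddings
e2 = torch.ones(1, 4) * 2.0  # stand-in for image 2's embeddings
combined = combine_embeds([e1, e2], [0.9, 0.75])
```

Raising one image's scale relative to the other pulls the combined conditioning toward that image.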


@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu
Collaborator

yiyixuxu commented Dec 1, 2024

let us know when it is ready for a review!
cc @asomoza

@linoytsaban linoytsaban marked this pull request as ready for review December 2, 2024 11:02
@linoytsaban
Collaborator Author

@yiyixuxu @asomoza I think it's ready for review :)

Contributor

@hlky hlky left a comment


Works well, thanks! 🤗

Code
import torch
from diffusers import FluxPriorReduxPipeline, FluxPipeline
from diffusers.utils import load_image

device = "cuda"
dtype = torch.bfloat16
repo_redux = "black-forest-labs/FLUX.1-Redux-dev"
repo_base = "black-forest-labs/FLUX.1-dev"
pipe_prior_redux = FluxPriorReduxPipeline.from_pretrained(
  repo_redux, torch_dtype=dtype
).to(device)
pipe = FluxPipeline.from_pretrained(
  repo_base, text_encoder=None, text_encoder_2=None, torch_dtype=dtype
).to(device)

image = load_image(
  "https://www.arthistoryproject.com/site/assets/files/19982/frida-kahlo-self-portrait-with-thorn-necklace-and-hummingbird-1940-trivium-art-history.jpg"
)
image2 = load_image(
  "https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/Mona_Lisa.jpg/1354px-Mona_Lisa.jpg"
)

pipe_prior_output = pipe_prior_redux(
  [image, image2],
  prompt=["self portrait by frida khalo", "mona lisa"],
  prompt_embeds_scale=[0.9, 0.75],
  pooled_prompt_embeds_scale=[0.6, 1.25],
)
image_out = pipe(
  guidance_scale=2.5,
  height=1024,
  width=1024,
  num_inference_steps=50,
  max_sequence_length=512,
  generator=torch.Generator("cpu").manual_seed(0),
  **pipe_prior_output,
).images[0]
image_out.save("flux-redux.png")
Output

(generated image, saved as flux-redux.png)

Collaborator

@yiyixuxu yiyixuxu left a comment


thanks! this is really cool - would love to have it in the docs somewhere too

prompt_2=None,
prompt_embeds=None,
pooled_prompt_embeds=None,
prompt=prompt,
Collaborator


let's throw a warning here: if prompt inputs are passed but there is no text_encoder/tokenizer, the text inputs will be ignored

this is a bit different from our regular pipelines: normally, if you pass a prompt and do not have a text_encoder, you get an error from encode_prompt; here we just use zero prompt embeds instead, so let's make an explicit warning about that
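A minimal sketch of the suggested warning path (hypothetical helper name and message, not the actual diffusers code):

```python
import logging

logger = logging.getLogger("flux_redux_sketch")

def should_ignore_prompt(prompt, text_encoder, tokenizer):
    # If a prompt is passed but no text encoder/tokenizer is loaded,
    # warn and signal that zero prompt embeddings should be used
    # instead of raising an error, as described in the review comment.
    if prompt is not None and (text_encoder is None or tokenizer is None):
        logger.warning(
            "prompt was passed but text_encoder/tokenizer is not loaded; "
            "the prompt will be ignored and zero prompt embeddings used."
        )
        return True
    return False
```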

Collaborator Author


agree, added one now

numpy array and pytorch tensor, the expected value range is between `[0, 1]`. If it's a tensor or a list
of tensors, the expected shape should be `(B, C, H, W)` or `(C, H, W)`. If it is a numpy array or a
list of arrays, the expected shape should be `(B, H, W, C)` or `(H, W, C)`.
prompt (`str` or `List[str]`, *optional*):
Collaborator


let's make it clear that it is an experimental feature, and if you pass prompt, you will need to load text_encoders explicitly

Collaborator Author


done

@sayakpaul
Member

THIS NEEDS TO BE IN THE DOCS.

@stevhliu any ideas about the location?

@stevhliu
Member

stevhliu commented Dec 3, 2024

Super cool! 🤩

We can add it to the "Specific Pipeline Examples" section and then build out the Flux doc there as discussed here.

Collaborator

@yiyixuxu yiyixuxu left a comment


thanks! I think we can add the doc in a separate PR
but this needs make style

@linoytsaban
Collaborator Author

@yiyixuxu I'm not sure what the issue is; when I run make fixup it doesn't make any changes or flag any issues 🤔

@hlky
Contributor

hlky commented Dec 4, 2024

@linoytsaban It's the doc-builder check. Try doc-builder style src/diffusers docs/source --max_len 119 on its own, or make style; it's not covered by make fixup

@linoytsaban
Collaborator Author

thanks @hlky!

@yiyixuxu yiyixuxu merged commit 04bba38 into huggingface:main Dec 4, 2024
15 checks passed
@sayakpaul
Member

@linoytsaban let's add some docs (#10056 (comment)) and communicate?

@Thekey756

The results aren't very good; they don't match the quality of the official example.

@hlky
Contributor

hlky commented Dec 5, 2024

@Thekey756 do you have an example comparison to the original?

@linoytsaban
Collaborator Author

linoytsaban commented Dec 5, 2024

@Thekey756 I think to achieve the original effect you're referring to, we also need to take #10025 into consideration

@linoytsaban linoytsaban deleted the redux branch December 5, 2024 14:16
@linoytsaban
Collaborator Author

Example using prompts and attention masking for improved prompt adherence:

import torch

# assumes `pipe_prior_redux`, `pipe`, and `image` from the example above
pipe_prior_output = pipe_prior_redux(
    [image],
    prompt=["anime illustration"],
    prompt_embeds_scale=[1.0],
    pooled_prompt_embeds_scale=[1.0],
)
cond_size = 729
hidden_size = 4096
max_sequence_length = 512
full_attention_size = max_sequence_length + hidden_size + cond_size
attention_mask = torch.zeros(
    (full_attention_size, full_attention_size), device="cuda", dtype=torch.bfloat16
)
reference_scale: float = 0.04  # example value
bias = torch.log(
    torch.tensor(reference_scale, dtype=torch.bfloat16, device="cuda").clamp(min=1e-5, max=1)
)
attention_mask[:, max_sequence_length : max_sequence_length + cond_size] = bias
joint_attention_kwargs = dict(attention_mask=attention_mask)

pipe(
    guidance_scale=2.5,
    height=1024,
    width=1024,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
    joint_attention_kwargs=joint_attention_kwargs,
    **pipe_prior_output,
).images[0]
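For intuition: the mask adds log(reference_scale) to every query's attention score on the image-conditioning token columns, which downweights the reference image when the scale is below 1. A quick CPU sanity check of that layout, using the same sizes as above:

```python
import math
import torch

max_sequence_length, hidden_size, cond_size = 512, 4096, 729
full = max_sequence_length + hidden_size + cond_size
mask = torch.zeros((full, full))
# log of a scale < 1 is negative, so attention to the conditioning
# tokens is reduced; all other positions keep a bias of zero.
bias = math.log(min(max(0.04, 1e-5), 1.0))
mask[:, max_sequence_length : max_sequence_length + cond_size] = bias
```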

@hlky hlky mentioned this pull request Dec 12, 2024
sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
* add multiple prompts to flux redux

---------

Co-authored-by: hlky <[email protected]>
@lhjlhj11

> (quoting the PR description above)

Can you give an example of your Fast FLUX.1 Redux in a Hugging Face Space? Thanks! And what is the masking scale?

@MikeHanKK

how should I set prompt_embeds_scale, pooled_prompt_embeds_scale, and guidance_scale in general? @linoytsaban
