
Conversation

@shauray8 (Contributor) commented Aug 3, 2024

What does this PR do?

Adds image-to-image support to the FLUX pipeline.


Who can review?

@asomoza @a-r-r-o-w @sayakpaul

@sayakpaul (Member) commented

@shauray8 could you also show us examples?

@sayakpaul requested a review from @asomoza, August 4, 2024 10:15
@deforum-art commented Aug 4, 2024

This does not work correctly: it's missing the PipelineImageInput and does not add scaled noise to the init latents.


The ordering of steps 4 and 5 needs to be swapped, because correctly noising the latents requires the timestep to be defined first.
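For context, "adding scaled noise to the init" in a flow-matching scheduler means interpolating between the clean image latents and noise at the start sigma, which is why the timesteps must be set first. A minimal sketch (names here are illustrative, not the pipeline's actual API; diffusers' FlowMatchEulerDiscreteScheduler exposes a scale_noise helper along these lines):

import torch

def scale_noise(image_latents: torch.Tensor, noise: torch.Tensor, sigma: float) -> torch.Tensor:
    # Flow-matching forward process: linearly interpolate between the clean
    # init latents and pure noise. sigma is read from the scheduler at the
    # first retained timestep (which depends on strength), so the timesteps
    # must be computed before the init latents can be noised.
    return sigma * noise + (1.0 - sigma) * image_latents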

@tin2tin mentioned this pull request Aug 5, 2024
@shauray8 (Contributor, Author) commented Aug 5, 2024

Hey @sayakpaul @deforum-art, I'm just trying things out here; in theory img2img should work with the FLUX family of models, so as soon as I get somewhere I'll post some results.
I'll keep you guys posted :)

@deforum-art commented
I have built a "working" pipeline; however, the results are poor/different from what I would expect. I am unsure if it is related to differences in guidance, etc.

https://github.com/deforum-studio/flux/blob/main/flux_pipeline.py

@asomoza (Member) commented Aug 5, 2024

What I found is that it needs a lot more strength than other models:

prompt: "high quality photo of a capybara"

[side-by-side: original | img2img result]

@deforum-art commented
I think there is an issue with the noise schedule; what would make this model need more strength?

@asomoza (Member) commented Aug 6, 2024

Probably because it's a distilled model (I'm using dev). I've seen people report the same issue, and I get the same results with ComfyUI.

I just did a quick test because I need it for diff-diff, so I didn't dig into it much; I'm more interested in how it works with inpainting, and img2img is already taken by @shauray8 and this PR.

@sayakpaul (Member) commented

@shauray8 we would like to ship the img2img pipeline soon (preferably next week) because of its demand. Would it be possible for you to provide your commit address so that we can honor your contributions by adding you as a co-author? In that case, we can close this PR.

If there's no activity, we will close it next week and open a PR ourselves. But in any case, please let us know your commit address. I hope this is okay :)

@SoftologyPro commented Aug 9, 2024

The img2img pipeline is faster in some ways but overall slower than the original "on potato" script.
For example, the following code takes 3m10s overall on a 4090 to create a single image. The initial pipeline creation is fast, but the image generation is slow and takes most of the time.

import torch
from diffusers import FluxImg2ImgPipeline
from diffusers.utils import load_image

pipe = FluxImg2ImgPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe.to("cuda")
url = None
init_image = None  # load_image(url) -- init image elided in this timing test
prompt = args2.prompt  # args2 comes from the surrounding script's argument parser
image = pipe(prompt, image=init_image, num_inference_steps=4, guidance_scale=3.5).images[0]
image.save('blah.png')

The original "on potato" script is slower to set up the encoders, tokenizers, etc., but the image generation takes only seconds.
Overall it still takes around 2m45s to finish.

import torch
from diffusers import (
    AutoencoderKL,
    FlowMatchEulerDiscreteScheduler,
    FluxPipeline,
    FluxTransformer2DModel,
)
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5TokenizerFast
from optimum.quanto import freeze, qfloat8, quantize

dtype = torch.bfloat16
bfl_repo = "black-forest-labs/FLUX.1-schnell"
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(bfl_repo, subfolder="scheduler")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14", torch_dtype=dtype)
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14", torch_dtype=dtype)
text_encoder_2 = T5EncoderModel.from_pretrained(bfl_repo, subfolder="text_encoder_2", torch_dtype=dtype)
tokenizer_2 = T5TokenizerFast.from_pretrained(bfl_repo, subfolder="tokenizer_2", torch_dtype=dtype)
vae = AutoencoderKL.from_pretrained(bfl_repo, subfolder="vae", torch_dtype=dtype)
transformer = FluxTransformer2DModel.from_pretrained(bfl_repo, subfolder="transformer", torch_dtype=dtype)
# Quantize the two heaviest modules to 8-bit and freeze their weights.
quantize(transformer, weights=qfloat8)
freeze(transformer)
quantize(text_encoder_2, weights=qfloat8)
freeze(text_encoder_2)
# Build the pipeline without the two quantized modules, then attach them
# afterwards (the "on potato" pattern), so the constructor leaves the
# quantized weights untouched.
pipe = FluxPipeline(
    scheduler=scheduler,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    text_encoder_2=None,
    tokenizer_2=tokenizer_2,
    vae=vae,
    transformer=None,
)
pipe.text_encoder_2 = text_encoder_2
pipe.transformer = transformer
generator = torch.Generator().manual_seed(args2.seed)
image = pipe(
    prompt=args2.prompt, 
    width=1024,
    height=1024,
    num_inference_steps=4, 
    generator=generator,
    guidance_scale=3.5,
).images[0]
image.save('blah.png')

If there were a way to combine the fast pipeline setup of img2img with the fast image generation of on_potato, Flux performance would be closer to other recent text-to-image systems (for example, Playground v2.5 takes 15 seconds total to generate a 1024x1024 image, including loading the models and setting up the pipeline). If anyone can speed these up in any way, it would be very helpful. I have added support for Flux to Visions of Chaos, and the general consensus is "great image quality, but slow".
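One possible direction (an untested sketch, assuming FluxImg2ImgPipeline exposes the same transformer and text_encoder_2 components as FluxPipeline) is to apply the same optimum-quanto quantization after a single from_pretrained call, keeping the fast one-line setup:

import torch
from diffusers import FluxImg2ImgPipeline
from optimum.quanto import freeze, qfloat8, quantize

pipe = FluxImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
# Quantize the two heaviest modules, as the "on potato" script does, so the
# denoising loop runs on 8-bit weights while setup remains a single call.
quantize(pipe.transformer, weights=qfloat8)
freeze(pipe.transformer)
quantize(pipe.text_encoder_2, weights=qfloat8)
freeze(pipe.text_encoder_2)
pipe.to("cuda")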

@shauray8 (Contributor, Author) commented

> @shauray8 we would like to ship the img2img pipeline soon (preferably next week) because of its demand. Would it be possible for you to provide your commit address so that we can honor your contributions by adding you as a co-author? In that case, we can close this PR.
>
> If there's no activity, we will close it next week and open a PR ourselves. But in any case, please let us know your commit address. I hope this is okay :)

Sure @sayakpaul. I don't seem to get good results, possibly due to how I was passing things through the scheduler, so yes. I mean, I didn't really contribute anything, but here you go: 141bd6bbfa5a3bb096eaa8056e540a3f9e559e2b

@smthemex commented

> I have built a "working" pipeline; however, the results are poor/different from what I would expect. I am unsure if it is related to differences in guidance, etc.
>
> https://github.com/deforum-studio/flux/blob/main/flux_pipeline.py

Using your code, changing strength to 0.7~0.6 gives a decent image...
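For reference, a minimal sketch of that kind of call (parameter names follow the diffusers img2img convention; the checkpoint, input image, and step count are illustrative assumptions):

import torch
from diffusers import FluxImg2ImgPipeline
from diffusers.utils import load_image

pipe = FluxImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

init_image = load_image("input.png").resize((1024, 1024))
# Higher strength adds more noise to the init image, so more of it gets
# repainted; the thread suggests FLUX needs more strength than other models.
image = pipe(
    prompt="high quality photo of a capybara",
    image=init_image,
    strength=0.7,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("capybara_img2img.png")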

@yiyixuxu (Collaborator) commented

Ohhh @shauray8, I'm really sorry I missed this PR!! We merged #9135 instead, even though it comes after your PR.

Would it be OK if I close this one now? If you see any improvements you can make to the Flux img2img and inpaint pipelines, please let us know! We can make a PR and credit you as an author, or you are welcome to make a PR too.

Sorry again!

@shauray8 (Contributor, Author) commented

> Ohhh @shauray8, I'm really sorry I missed this PR!! We merged #9135 instead, even though it comes after your PR.
>
> Would it be OK if I close this one now? If you see any improvements you can make to the Flux img2img and inpaint pipelines, please let us know! We can make a PR and credit you as an author, or you are welcome to make a PR too.
>
> Sorry again!

@yiyixuxu no worries, my code wasn't giving good results anyway. Let's see what I can improve on 🫡

@shauray8 closed this Sep 12, 2024