[examples] add controlnet sd3 example #9249
| My implementation is based on the official examples for ControlNet and SD3 DreamBooth | 
| Great. Are you going to write controlnet train code for Flux? I really need it. | 
| 
 Hi, a nice implementation of controlnet flux with training scripts can be found at https://github.com/XLabs-AI/x-flux. | 
| 
 I have been using this library for 1 week and multi gpu is not working. I need multi gpu support to train large datasets. | 
| ohh thanks for your PR! @haofanwang @wangqixun would you be able to give this PR a review too? | 
| The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. | 
Thank you for working on this! I have left a couple of comments.
Could you also share some results from your experiments?
And as @yiyixuxu mentioned, it'd be great to have this PR reviewed by @haofanwang @wangqixun as they were the first ones to have come up with SD3 ControlNets.
| 
 Hi, it can be fixed by following the review. | 
Force-pushed from 3360355 to d9bf0d2.
| @sayakpaul @xduzhangjiayu Thank you very much for your kind reviews. I have updated the code according to your comments and suggestions. I have also added two experimental images as results in the README. Would you mind having another look? | 
| 
 Please find the result images at the bottom of README_sd3.md :) | 
| Thanks for the changes @DavyMorgan! Let's also add a test similar to https://github.com/huggingface/diffusers/blob/main/examples/controlnet/test_controlnet.py? | 
| 
 @sayakpaul Yeah sure! I have added a similar test to https://github.com/huggingface/diffusers/blob/main/examples/controlnet/test_controlnet.py. | 
| Could you run  | 
| 
 @sayakpaul Yeah. I have run  | 
| In the Fast tests for PRs / PyTorch Example CPU tests (pull_request): @sayakpaul It seems that the  | 
| We need to use a smaller ControlNet model. We should be able to initialize that from the  | 
| 
 @sayakpaul Thanks. I have updated the test to leverage the smaller SD3 model used in the official test script in 
 | 
…trolnet-sd3-example
| I see. I have also added a tiny controlnet model based on the official test script of controlnet-sd3. Now the example test passes on my local machine. @sayakpaul | 
| @sayakpaul It seems that the failure in fast pipeline test is unrelated to this PR. WDYT? | 
Thanks for your contributions!
    import torch

    base_model_path = "stabilityai/stable-diffusion-3-medium-diffusers"
    controlnet_path = "sd3-controlnet-out/checkpoint-6500/controlnet"
This seems like a local path. Can we update this to a checkpoint on the Hub?
Yeah sure, I will upload my checkpoint to the hub.
[Result table from the README under review; images not preserved. Prompt: "pale golden rod circle with old lace background"]
Seems like there are artifacts in the output image but it could also be because of overfitting. Any comments?
I guess I can include more sample images for validation.
    def image_grid(imgs, rows, cols):
        assert len(imgs) == rows * cols

        w, h = imgs[0].size
        grid = Image.new("RGB", size=(cols * w, rows * h))

        for i, img in enumerate(imgs):
            grid.paste(img, box=(i % cols * w, i // cols * h))
        return grid
We can use the make_image_grid() utility function from diffusers.utils.
Got it! Thanks.
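For readers following the exchange above, the layout arithmetic in the hand-written `image_grid` helper can be checked without any image library. The sketch below is only an illustration of that row-major placement; `paste_boxes` is a hypothetical name, and it is not the implementation of `make_image_grid()` in `diffusers.utils`, which the reviewer suggests using instead.

```python
def paste_boxes(n_images, rows, cols, width, height):
    """Return the top-left (x, y) paste offset for each image in a row-major grid.

    Hypothetical helper: it mirrors the `i % cols * w, i // cols * h`
    expression from the snippet under review, nothing more.
    """
    if n_images != rows * cols:
        raise ValueError("number of images must equal rows * cols")
    # Column index selects the x offset, row index selects the y offset.
    return [((i % cols) * width, (i // cols) * height) for i in range(n_images)]


# Six 512x512 images laid out in 2 rows of 3.
boxes = paste_boxes(6, rows=2, cols=3, width=512, height=512)
for index, box in enumerate(boxes):
    print(index, box)
```

The first three images fill the top row from left to right, and the fourth wraps to the start of the second row.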
| Yes, I'm sure it is the text encoders that occupy the memory. After computing the embeddings, the text encoders are still in GPU memory (which means clear_objs_and_retain_memory doesn't work). I think it may be related to my accelerate config; I will check later. Have you tried training with a large dataset (at least 1000k images)? I think it may need a lot of CPU RAM when pre-computing the text embeddings. ---- Replied Message ----
| From | Yu ***@***.***> |
| Date | 09/13/2024 17:43 |
| To | ***@***.***> |
| Cc | ***@***.***>***@***.***> |
| Subject | Re: [huggingface/diffusers] [examples] add controlnet sd3 example (PR #9249) |
@DavyMorgan commented on this pull request.
In examples/controlnet/train_controlnet_sd3.py:  +    ) +
+    train_dataset = make_train_dataset(args, tokenizer_one, tokenizer_two, tokenizer_three, accelerator)
+
+    train_dataloader = torch.utils.data.DataLoader(
+        train_dataset,
+        shuffle=True,
+        collate_fn=collate_fn,
+        batch_size=args.train_batch_size,
+        num_workers=args.dataloader_num_workers,
+    )
+
+    tokenizers = [tokenizer_one, tokenizer_two, tokenizer_three]
+    text_encoders = [text_encoder_one, text_encoder_two, text_encoder_three]
+
+    def compute_text_embeddings(prompt, text_encoders, tokenizers):
Are you sure it is the text encoders that occupy the memory? The GPU memory may be held by other models, as the VAE, transformer, and ControlNet models are still in memory. Besides, as we periodically run validation, the text encoders are also loaded every validation_steps steps. In my experiments, I previously needed to separate training and validation onto two distinct GPUs; after the above update I only need one GPU to run the script.
During training, the text embeddings from the text encoders are in memory, though a cached copy on disk means they will not be recomputed in your next run as long as the configs are the same. | 
| 
 I only tested the fill50k dataset. As we use the  | 
| 
 OK, once again thank you very much for your reply ! | 
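The thread above mentions that the pre-computed text embeddings are cached on disk and reused on the next run as long as the configs are unchanged. A minimal, standard-library sketch of that general idea follows. It is not the training script's actual mechanism: `config_fingerprint`, `load_or_compute`, `fake_encode`, and the config keys are all hypothetical names chosen for illustration.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def config_fingerprint(config):
    """Stable short hash of the settings that affect the embeddings."""
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


def load_or_compute(config, prompts, encode, cache_dir):
    """Return (embeddings, cache_hit); compute only if no matching cache file exists."""
    cache_file = cache_dir / f"embeds-{config_fingerprint(config)}.pkl"
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f), True
    embeddings = [encode(p) for p in prompts]
    with cache_file.open("wb") as f:
        pickle.dump(embeddings, f)
    return embeddings, False


def fake_encode(prompt):
    # Stand-in for the real text encoders; returns a toy "embedding".
    return [len(prompt), sum(map(ord, prompt)) % 97]


cache_dir = Path(tempfile.mkdtemp())
config = {"model": "hypothetical-model", "max_sequence_length": 77}
prompts = ["pale golden rod circle with old lace background"]

first, hit_first = load_or_compute(config, prompts, fake_encode, cache_dir)
second, hit_second = load_or_compute(config, prompts, fake_encode, cache_dir)
# Changing any config value changes the fingerprint, so the cache is bypassed.
changed_config = {**config, "max_sequence_length": 128}
third, hit_changed = load_or_compute(changed_config, prompts, fake_encode, cache_dir)
```

Keying the cache file on a hash of the config is what makes the reuse safe in this sketch: an identical config finds the existing file, while any changed setting produces a new fingerprint and forces recomputation.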
* add controlnet sd3 example
* add controlnet sd3 example
* update controlnet sd3 example
* add controlnet sd3 example test
* fix quality and style
* update test
* update test
---------
Co-authored-by: Sayak Paul <[email protected]>
What does this PR do?
Fixes #8834