Add AniMemoryPipeline #10083
base: main

Conversation
Just highlighting differences compared to scheduling_euler_ancestral_discrete for reference in the upcoming scheduling refactor.
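For reference, the ancestral ("churn") part of the stock Euler ancestral step splits the move from one noise level to the next into a deterministic target plus freshly injected noise. A minimal sketch of that split (illustrative only, not the diffusers source; the function name is hypothetical):

```python
def ancestral_sigmas(sigma_from: float, sigma_to: float):
    """Split the step sigma_from -> sigma_to into a deterministic target
    (sigma_down) and the std-dev of fresh noise to inject (sigma_up),
    chosen so that sigma_down**2 + sigma_up**2 == sigma_to**2."""
    sigma_up = (sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2) ** 0.5
    sigma_down = (sigma_to**2 - sigma_up**2) ** 0.5
    return sigma_down, sigma_up

# Example: stepping from sigma=2.0 down to sigma=1.0
sigma_down, sigma_up = ancestral_sigmas(2.0, 1.0)
# → sigma_down = 0.5, sigma_up ≈ 0.866; variances recombine to sigma_to**2
```

The x-prediction variant only changes how the model output is converted to a predicted clean sample; this variance split stays the same.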
src/diffusers/schedulers/scheduling_euler_ancestral_discrete_x_pred.py (outdated review comments, resolved)
Thanks for the comparison! I have finished the scheduler refactor and tested that the output is the same as before the modification.

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @animemory 🤗
The comparison was just for our reference as we're planning to refactor scheduling, thanks for making those changes though!
It would be great to see some example outputs etc. in this PR, and you can add more information in the docs. See the files under docs in this PR as an example. cc @stevhliu for docs
from transformers.models.t5.modeling_t5 import T5Stack

class AniMemoryT5(torch.nn.Module):
Would AniMemoryT5 and AniMemoryAltCLip be better added to transformers directly?
These two models are mostly similar to the original T5 and AltCLIP, with just some minor architecture modifications and a tokenizer replacement. The design is specific to this model's bilingual alignment.
Why can't we roll the scheduler changes into the existing EulerAncestralDiscreteScheduler? It already has v-prediction and x-prediction, but is there something wrong with that implementation? Maybe it can be fixed.
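For context on the prediction types being discussed: the usual conversions from a model output at noise level sigma to a predicted clean sample look roughly like this. This is an illustrative sketch of the standard formulas, not the diffusers implementation itself, and the helper name is hypothetical; the `sample` (x-prediction) branch is the one the new scheduler relies on:

```python
def to_pred_original_sample(model_output, sample, sigma, prediction_type):
    """Convert a model output into a predicted clean sample (x0).

    Illustrative helper using the standard conversions; not the
    EulerAncestralDiscreteScheduler source.
    """
    if prediction_type == "epsilon":        # model predicts the added noise
        return sample - sigma * model_output
    if prediction_type == "v_prediction":   # model predicts velocity
        return model_output * (-sigma / (sigma**2 + 1) ** 0.5) + sample / (sigma**2 + 1)
    if prediction_type == "sample":         # model predicts x0 directly (x-prediction)
        return model_output
    raise ValueError(f"unknown prediction_type: {prediction_type}")
```

Under this framing, supporting x-prediction is mostly a matter of which branch the step function takes before the ancestral update.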
self.model_hf.gradient_checkpointing_enable()

def forward(self, text, attention_mask):
    hidden_states = self.model_hf.text_model.embeddings(input_ids=text, position_ids=None)
We don't want to pass them in? This will prevent e.g. precaching the inputs.
Hi, which parameter do you mean? The caching function is just for training and should be easily adapted, I suppose.
Hi, thanks for the contribution! Feel free to let me know if you need any help with the docs 🙂
…to animemory merge
Hi, I did the following things:

Some example outputs and more details can be found in the docs. Please review and comment, thanks!
Thanks for adding, docs look good to me 🤗
### Usage

Review suggestion: change the `### Usage` heading to `## Usage`, and tag the opening code fence with `py` for syntax highlighting.
images.save("output.png")

Use pipe.enable_sequential_cpu_offload() to offload the model into CPU for less GPU memory cost (about 14.25 G, compared to 25.67 G if CPU offload is not enabled), but the inference time will increase significantly (5.18s vs. 17.74s on A100 40G).
Suggested change:

Use [`~DiffusionPipeline.enable_sequential_cpu_offload`] to offload the model onto the CPU and reduce GPU memory cost (about 14.25GB, compared to 25.67GB if CPU offload is not enabled). However, the inference time will increase significantly, from 5.18s to 17.74s on an A100 40GB.
# EulerAncestralDiscreteXPredScheduler
An improved scheduler(SingDiffusion) that addresses the sampling challenge at the initial singular time step. To know more about SingDiffusion, check out the original [blog post](https://pangzecheung.github.io/SingDiffusion/). Our original paper can be found [here](https://arxiv.org/abs/2403.08381).
Suggested change:

An improved scheduler (SingDiffusion) that addresses the sampling challenge at the initial singular time step. To learn more about SingDiffusion, check out the original [blog post](https://pangzecheung.github.io/SingDiffusion/). Our original paper can be found [here](https://arxiv.org/abs/2403.08381).
Hi @animemory! We think this model would be a great new example for custom components, which we can use to rework the documentation. The benefit of custom components for model authors is friction-free day-one support in Diffusers. While I'm testing this with custom components, would you mind taking a quick look at the current documentation and providing some feedback? For example, is the process clear, is anything missing, etc.
@animemory I've created a PR on the Hub to add remote code.

Yes, I think the process is clear. It'd be helpful to add or link some tips on how to upload code and checkpoints to the pipeline repo, especially for beginners.
Thanks for the feedback @animemory, and thank you for working with us to use custom components for this model. We can leave this PR open for now, as we can revisit integration at some point. As a note for anyone else viewing, we can now use this model/pipeline in Diffusers with remote code:

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "animEEEmpire/AniMemory-alpha",
    trust_remote_code=True,
    revision="fad02d2",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Prompt (Chinese): "A ferocious wolf with scarlet eyes, howling at midnight under a bright moon"
prompt = "一只凶恶的狼,猩红的眼神,在午夜咆哮,月光皎洁"
negative_prompt = "nsfw, worst quality, low quality, normal quality, low resolution, monochrome, blurry, wrong, Mutated hands and fingers, text, ugly faces, twisted, jpeg artifacts, watermark, low contrast, realistic"
images = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=40,
    height=1024,
    width=1024,
    guidance_scale=7,
)[0]
images.save("output.png")
I think any remote code examples should provide a pinned revision.
Thanks @bghira, I've added a revision to the example here.
@hlky Is this good to merge?
@a-r-r-o-w Not yet. I think we will revisit when the model gains more traction; it's supported with remote code for now. See modeling_movq.py and modeling_text_encoder.py for why.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
This PR does the following things:
Usage:
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
We will update the docs soon.
Thank you so much! I'll be there and help with everything.
@yiyixuxu @asomoza