Conversation

@animemory commented Dec 2, 2024

What does this PR do?

This PR does the following:

  1. Created AniMemoryPipeline in src/diffusers/pipelines/animemory/
  2. Created EulerAncestralDiscreteXPredScheduler in src/diffusers/schedulers/
  3. Uploaded the safetensors model to the Hugging Face Hub: animEEEmpire/AniMemory-alpha
  4. Tested the pipeline and the outputs are as expected.

Usage:

import torch
from diffusers import AniMemoryPipeline
pipe = AniMemoryPipeline.from_pretrained(
    "animEEEmpire/AniMemory-alpha",
    torch_dtype=torch.bfloat16
)
pipe = pipe.to("cuda")

prompt = '一只凶恶的狼,猩红的眼神,在午夜咆哮,月光皎洁'  # "A ferocious wolf with scarlet eyes, howling at midnight under bright, clear moonlight"
negative_prompt = 'nsfw, worst quality, low quality, normal quality, low resolution, monochrome, blurry, wrong, Mutated hands and fingers, text, ugly faces, twisted, jpeg artifacts, watermark, low contrast, realistic'
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=40,
    height=1024, width=1024,
    guidance_scale=7.0
).images[0]
image.save("output.png")

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

We will update the docs soon.
Thank you so much! I'll be here to help with everything.

@yiyixuxu @asomoza

@hlky (Contributor) left a comment:
Just highlighting differences compared to scheduling_euler_ancestral_discrete for reference in the upcoming scheduling refactor.

@animemory (Author) replied:

> Just highlighting differences compared to scheduling_euler_ancestral_discrete for reference in the upcoming scheduling refactor.

Thanks for the comparison! I have finished the scheduler refactor and verified that the output is the same as before the modification.
Now this PR is ready for review!

@animemory marked this pull request as ready for review December 3, 2024 09:38
@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@hlky (Contributor) left a comment:

Hi @animemory 🤗

The comparison was just for our reference, as we're planning to refactor scheduling. Thanks for making those changes though!

It would be great to see some example outputs etc. in this PR, and you can add more information in the docs. See the files under docs in this PR as an example. cc @stevhliu for docs

from transformers.models.t5.modeling_t5 import T5Stack


class AniMemoryT5(torch.nn.Module):
Contributor comment:

Would AniMemoryT5 and AniMemoryAltCLip be better added to transformers directly?

@animemory (Author) replied:

These two models are mostly similar to the original T5 and AltCLIP, with just some small architectural modifications and a tokenizer replacement. This design is specific to the model's bilingual alignment.
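For readers skimming the thread without the full diff, here is a minimal sketch of the kind of thin wrapper being discussed. It is an illustration only, not the PR's actual AniMemoryT5: the class name AniMemoryT5Sketch and all of its details are assumptions based on nothing more than the T5Stack import shown above.

import copy
import torch
from transformers import T5Config
from transformers.models.t5.modeling_t5 import T5Stack

class AniMemoryT5Sketch(torch.nn.Module):
    # Hypothetical: an encoder-only wrapper around T5Stack, leaving room for
    # the "small architectural modifications" the author mentions.
    def __init__(self, config: T5Config):
        super().__init__()
        encoder_config = copy.deepcopy(config)
        encoder_config.is_decoder = False  # encoder-only usage
        encoder_config.use_cache = False
        encoder_config.is_encoder_decoder = False
        shared = torch.nn.Embedding(config.vocab_size, config.d_model)
        self.encoder = T5Stack(encoder_config, embed_tokens=shared)

    def forward(self, input_ids, attention_mask=None):
        # The final hidden states serve as text-conditioning features.
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return out.last_hidden_state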

@bghira (Contributor) commented Dec 3, 2024

Why can't we roll the scheduler changes into the existing EulerAncestralDiscreteScheduler? It already has v-prediction and x-prediction; is there something wrong with that implementation? Maybe it can be fixed.
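For context, the configuration switch bghira refers to looks like this on the existing scheduler; whether its x-prediction ("sample") branch is complete is exactly the question being raised.

from diffusers import EulerAncestralDiscreteScheduler

# The existing scheduler selects its parameterization via `prediction_type`
# ("epsilon", "v_prediction", or "sample" for x-prediction) rather than
# through a separate class.
scheduler = EulerAncestralDiscreteScheduler(prediction_type="v_prediction")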

        self.model_hf.gradient_checkpointing_enable()

    def forward(self, text, attention_mask):
        hidden_states = self.model_hf.text_model.embeddings(input_ids=text, position_ids=None)
Contributor comment:

We don't want to pass them in? That will prevent, e.g., precaching the inputs.

@animemory (Author) replied:

Hi, which parameter do you mean? The caching function is just for training and should be easy to adapt, I suppose.
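To make the reviewer's point concrete, here is a hypothetical variant of the forward above with position_ids exposed as an optional argument, which is what would allow callers to precompute and cache inputs. The class name AniMemoryAltCLIPSketch and the model_hf attribute are assumptions based on the excerpt, not the PR's code.

import torch

class AniMemoryAltCLIPSketch(torch.nn.Module):
    # Hypothetical sketch: `model_hf` is assumed to be a transformers
    # AltCLIP-style model with a `text_model.embeddings` submodule,
    # as in the diff above.
    def __init__(self, model_hf):
        super().__init__()
        self.model_hf = model_hf

    def forward(self, text, attention_mask, position_ids=None):
        # Exposing position_ids (default None keeps the current behaviour)
        # lets callers pass precomputed values instead of hardcoding None.
        hidden_states = self.model_hf.text_model.embeddings(
            input_ids=text, position_ids=position_ids
        )
        return hidden_states, attention_mask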

@hlky (Contributor) commented Dec 3, 2024

@bghira Some of the changes are currently unique, so they would need a branch if merged into EulerAncestralDiscreteScheduler. They will be covered later in the planned scheduling refactor, so it's OK for now. The priority should be the rest of the implementation for this new model.

@stevhliu (Member) commented Dec 3, 2024

Hi, thanks for the contribution! Feel free to let me know if you need any help with the docs 🙂

@animemory (Author) commented:

Hi, I did the following things:

  1. Added two docs in docs/source/en/api:
     • docs/source/en/api/pipelines/animemory.md
     • docs/source/en/api/schedulers/euler_ancestral_x_pred.md
  2. Modified the docs/source/en/_toctree.yml file.

Some example outputs and more details can be found in the docs.

Please review and comment, thanks!

@hlky @stevhliu

@stevhliu (Member) left a comment:

Thanks for adding, docs look good to me 🤗

Comment on lines +24 to +25

Suggested change:

-### Usage
-```
+## Usage
+```py
images.save("output.png")
```

Use pipe.enable_sequential_cpu_offload() to offload the model into CPU for less GPU memory cost (about 14.25 G, compared to 25.67 G if CPU offload is not enabled), but the inference time will increase significantly(5.18s v.s. 17.74s on A100 40G).
Suggested change:

-Use pipe.enable_sequential_cpu_offload() to offload the model into CPU for less GPU memory cost (about 14.25 G, compared to 25.67 G if CPU offload is not enabled), but the inference time will increase significantly(5.18s v.s. 17.74s on A100 40G).
+Use [`~DiffusionPipeline.enable_sequential_cpu_offload`] to offload the model to the CPU and reduce GPU memory cost (about 14.25GB, compared to 25.67GB without CPU offload). However, inference time increases significantly, from 5.18s to 17.74s on an A100 40GB.
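For reference, a minimal sketch of the offloading setup that sentence describes, using the standard DiffusionPipeline API (assuming AniMemoryPipeline inherits it, as this PR intends):

import torch
from diffusers import AniMemoryPipeline

pipe = AniMemoryPipeline.from_pretrained(
    "animEEEmpire/AniMemory-alpha", torch_dtype=torch.bfloat16
)
# With sequential CPU offload enabled, skip pipe.to("cuda"); submodules are
# moved to the GPU one at a time during inference and back to CPU afterwards.
pipe.enable_sequential_cpu_offload()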


# EulerAncestralDiscreteXPredScheduler

An improved scheduler(SingDiffusion) that addresses the sampling challenge at the initial singular time step. To know more about SingDiffusion, check out the original [blog post](https://pangzecheung.github.io/SingDiffusion/). Our original paper can be found [here](https://arxiv.org/abs/2403.08381).
Suggested change:

-An improved scheduler(SingDiffusion) that addresses the sampling challenge at the initial singular time step. To know more about SingDiffusion, check out the original [blog post](https://pangzecheung.github.io/SingDiffusion/). Our original paper can be found [here](https://arxiv.org/abs/2403.08381).
+An improved scheduler (SingDiffusion) that addresses the sampling challenge at the initial singular time step. To learn more about SingDiffusion, check out the original [blog post](https://pangzecheung.github.io/SingDiffusion/). Our original paper can be found [here](https://arxiv.org/abs/2403.08381).
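As a usage illustration, swapping in the new scheduler would presumably follow the standard diffusers from_config pattern; this is a sketch under that assumption, including the assumption that the PR exports the class at the top level.

import torch
from diffusers import AniMemoryPipeline, EulerAncestralDiscreteXPredScheduler

pipe = AniMemoryPipeline.from_pretrained(
    "animEEEmpire/AniMemory-alpha", torch_dtype=torch.bfloat16
)
# Standard diffusers pattern: rebuild the scheduler from the existing config.
pipe.scheduler = EulerAncestralDiscreteXPredScheduler.from_config(pipe.scheduler.config)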

@hlky (Contributor) commented Dec 5, 2024

Hi @animemory! We think this model would be a great new example for custom components which we can use to rework the documentation. The benefit of custom components for model authors is friction-free day-1 support for Diffusers. While I'm testing this with custom components, would you mind taking a quick look at the current documentation and providing some feedback? For example, is the process clear, is anything missing, etc.

@hlky (Contributor) commented Dec 5, 2024

@animemory I've created a PR on the Hub to add remote code.

@animemory (Author) replied:

> Hi @animemory! We think this model would be a great new example for custom components which we can use to rework the documentation. The benefit of custom components for model authors is friction-free day-1 support for Diffusers. While I'm testing this with custom components, would you mind taking a quick look at the current documentation and providing some feedback? For example, is the process clear, is anything missing, etc.

Yes, I think the process is clear. It'd be helpful to add or link some tips on how to upload code and checkpoints to the pipeline repo, especially for beginners.

@hlky (Contributor) commented Dec 6, 2024

Thanks for the feedback @animemory and thank you for working with us to use custom components for this model. We can leave this PR open for now as we can revisit integration at some point.

As a note for anyone else viewing, we can now use this model/pipeline in Diffusers with remote code:

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "animEEEmpire/AniMemory-alpha",
    trust_remote_code=True,
    revision="fad02d2",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

prompt = "一只凶恶的狼,猩红的眼神,在午夜咆哮,月光皎洁"  # "A ferocious wolf with scarlet eyes, howling at midnight under bright, clear moonlight"
negative_prompt = "nsfw, worst quality, low quality, normal quality, low resolution, monochrome, blurry, wrong, Mutated hands and fingers, text, ugly faces, twisted, jpeg artifacts, watermark, low contrast, realistic"

images = pipe(prompt=prompt,
              negative_prompt=negative_prompt,
              num_inference_steps=40,
              height=1024, width=1024,
              guidance_scale=7,
              )[0]
images[0].save("output.png")  # pipe(...)[0] is the list of generated images

@bghira (Contributor) commented Dec 6, 2024

I think any remote code examples should provide a pinned revision.

@hlky (Contributor) commented Dec 6, 2024

Thanks @bghira, I've added a revision to the example here.

@a-r-r-o-w (Contributor) commented:

@hlky Is this good to merge?

@a-r-r-o-w requested a review from hlky December 11, 2024 23:05
@hlky (Contributor) commented Dec 11, 2024

@a-r-r-o-w Not yet. I think we will revisit when the model gains more traction; it's supported with remote code for now. See modeling_movq.py and modeling_text_encoder.py for why.

@github-actions (bot) commented Jan 5, 2025

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions bot added the stale label (Issues that haven't received updates) Jan 5, 2025