Why are the characters the same when generated on one image, but so different when generated sequentially with the same settings? #14925
Replies: 3 comments 3 replies
-
I think you already spotted the problems: ControlNet won't get the poses right (1), and the style is always going to change between generations. It's only doable with brute force and a lot of tricks, and I wish you luck with that. (1) As you can see, the wizard lady who is supposed to be running looks more like she's posing for the camera.
-
AnimateDiff does fine with small animations. I'd recommend exporting a 3D model animation instead of OpenPose and using a different ControlNet like lineart, softedge, or canny. You can combine ControlNets too; see https://stable-diffusion-art.com/animatediff/ for examples. You might also be able to do something with a loopback to the ControlNet, but I haven't experimented with that feature yet.
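Not part of the original reply, just a hedged sketch of the "render in Blender, then feed edges to a canny ControlNet" idea above: the snippet below uses OpenCV to turn a folder of rendered frames into Canny edge maps that can be loaded into a canny ControlNet unit instead of OpenPose skeletons. The folder names and thresholds are assumptions, not anything from this thread.

```python
"""Convert rendered pose frames into Canny edge maps for ControlNet.

Minimal sketch: assumes Blender renders live in ./frames as PNGs and that a
canny ControlNet unit will read the results from ./canny_frames.
"""
from pathlib import Path

import cv2

FRAMES_DIR = Path("frames")        # rendered Blender frames (assumed location)
EDGES_DIR = Path("canny_frames")   # edge maps to feed the canny ControlNet unit
EDGES_DIR.mkdir(exist_ok=True)

for frame_path in sorted(FRAMES_DIR.glob("*.png")):
    img = cv2.imread(str(frame_path))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # The two thresholds control how much edge detail survives; tune per render.
    edges = cv2.Canny(gray, 100, 200)
    cv2.imwrite(str(EDGES_DIR / frame_path.name), edges)
```

Lineart or softedge maps can instead be produced by the corresponding preprocessor inside the ControlNet extension itself, so offline preprocessing like this is optional.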
-
I don't know if it can help you, but ControlNet has a selectable controlnet m2m script; you can use it to process video.
-
Hi. I'm a bit of a newbie here. I had the idea of using Stable Diffusion to generate animations for my game, but no matter what I try, I can't get good results.
I'm using CetusMix_Coda2 + M_Pixel 像素人人 (LoRA), and I really like the results of solo generation. It's 100% perfect. For example:

(masterpiece, top quality, best quality), pixel,pixel art,1girl,full body,witch,witch hat,stockings,black dress,green eyes,red hair, <lora:pixel_f2:0.5> Negative prompt: (worst quality, low quality:2), Steps: 30, Sampler: DPM++ SDE Karras, CFG scale: 7, Seed: 197475931, Size: 512x768, Model hash: 68c0a27380, Model: cetusMix_Coda2, VAE hash: df3c506e51, VAE: pastel-waifu-diffusion.vae.pt, Denoising strength: 0.5, Clip skip: 2, Hires upscale: 1.5, Hires steps: 30, Hires upscaler: Latent, Lora hashes: "pixel_f2: 9d49ca20880d", Version: 1.7.0
I'm trying to generate the same character in different poses, but when I generate the frames sequentially (first generate image 1, then generate image 2 with the next ControlNet frame), the results are too different from each other:

I tried using AnimateDiff, but from what I understand it's better suited to vid2vid than to simple animations of 10-20 frames. One generation can take an hour, and the end result looks terrible.
As I understand it, the best option is to place all the ControlNet frames on one image. In that case, the result is 200% completely perfect! I really like how it looks, and it's exactly what I want. Example:


But how can I place 16 or more frames on one image? In that case, the image becomes too large, like this example at 2048x3072:

(I create the frames in Blender; the animations are taken from Mixamo.)
So the question is: can I generate all the frames on one image, maybe with a batch? (Something like generating one image in segments. I once used a ControlNet extension that let me divide an image into several segments with different prompts.) Why are the characters exactly the same when generated on one image, but so different when generated sequentially with the same settings? Am I doing something wrong?
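For the "all frames on one image" route, here is a minimal sketch, assuming a Pillow-based workflow: tile the rendered ControlNet frames into a single grid before generation, then cut the generated sheet back into individual frames afterwards. The folder names, the 4x4 layout, and the 512x768 frame size are assumptions (4x4 at 512x768 is exactly the 2048x3072 sheet mentioned above); shrinking the per-frame size before tiling is one way to keep the total resolution manageable.

```python
"""Tile pose frames into one ControlNet grid and slice the generated sheet back apart.

Minimal sketch: assumes 16 rendered frames in ./frames and that the generated
image keeps the same grid layout. Paths and sizes are placeholders.
"""
from pathlib import Path

from PIL import Image

COLS, ROWS = 4, 4
FRAME_W, FRAME_H = 512, 768   # 4x4 at 512x768 gives a 2048x3072 sheet


def tile_frames(frames_dir: str, out_path: str) -> None:
    """Paste the first COLS*ROWS frames into one grid image for ControlNet."""
    frames = sorted(Path(frames_dir).glob("*.png"))[: COLS * ROWS]
    sheet = Image.new("RGB", (COLS * FRAME_W, ROWS * FRAME_H))
    for i, frame_path in enumerate(frames):
        frame = Image.open(frame_path).convert("RGB").resize((FRAME_W, FRAME_H))
        sheet.paste(frame, ((i % COLS) * FRAME_W, (i // COLS) * FRAME_H))
    sheet.save(out_path)


def slice_sheet(sheet_path: str, out_dir: str) -> None:
    """Cut a generated sheet back into COLS*ROWS individual animation frames."""
    sheet = Image.open(sheet_path)
    cell_w, cell_h = sheet.width // COLS, sheet.height // ROWS
    Path(out_dir).mkdir(exist_ok=True)
    for row in range(ROWS):
        for col in range(COLS):
            box = (col * cell_w, row * cell_h, (col + 1) * cell_w, (row + 1) * cell_h)
            sheet.crop(box).save(Path(out_dir) / f"frame_{row * COLS + col:02d}.png")


if __name__ == "__main__":
    tile_frames("frames", "controlnet_sheet.png")          # feed this to ControlNet
    # After generation: slice_sheet("generated_sheet.png", "out_frames")
```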
I'm sorry if my questions are stupid, but I've been trying to figure this out on my own for about a month now. I tried ComfyUI + AnimateDiff and it's really good for vid2vid, but I couldn't get the same results as in A1111. (For some reason, even solo generations come out completely different with the same settings and seed, and the M_Pixel 像素人人 LoRA doesn't work at all.) I've already read every article I can find and would appreciate any advice. 🙏 The idea sounds simple to me, but maybe I'm too stupid to figure out how to make it a reality :')