Why are the characters the same when generated on one image, but so different when generated sequentially with the same settings? #14925
Replies: 3 comments 3 replies
-
I think you already spotted the problems: ControlNet won't get the poses right (1), and the style is always going to change between generations. It's only doable with brute force and a lot of tricks, and I wish you luck with that. (1) As you can see, the wizard lady who is supposed to be running looks more like she's posing for the camera.
-
AnimateDiff does fine with small animations. I'd recommend exporting a 3D model animation instead of OpenPose and using a different ControlNet like lineart, softedge, or canny. You can combine ControlNets too; see https://stable-diffusion-art.com/animatediff/ for examples. You might also be able to do something with a loopback to the ControlNet, but I haven't experimented with that feature yet.
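Not part of the original reply, just a hedged sketch of the "render in Blender, then feed edges to a canny ControlNet" idea above: the snippet below uses OpenCV to turn a folder of rendered frames into Canny edge maps that can be loaded into a canny ControlNet unit instead of OpenPose skeletons. The folder names and thresholds are assumptions, not anything from this thread.

```python
"""Convert rendered pose frames into Canny edge maps for ControlNet.

Minimal sketch: assumes Blender renders live in ./frames as PNGs and that a
canny ControlNet unit will read the results from ./canny_frames.
"""
from pathlib import Path

import cv2

FRAMES_DIR = Path("frames")        # rendered Blender frames (assumed location)
EDGES_DIR = Path("canny_frames")   # edge maps to feed the canny ControlNet unit
EDGES_DIR.mkdir(exist_ok=True)

for frame_path in sorted(FRAMES_DIR.glob("*.png")):
    img = cv2.imread(str(frame_path))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # The two thresholds control how much edge detail survives; tune per render.
    edges = cv2.Canny(gray, 100, 200)
    cv2.imwrite(str(EDGES_DIR / frame_path.name), edges)
```

Lineart or softedge maps can instead be produced by the corresponding preprocessor inside the ControlNet extension itself, so offline preprocessing like this is optional.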
-
I don't know if it can help you, but ControlNet has a selectable controlnet m2m script; you can use it to process video.
-
Hi. I'm a bit of a newbie here. I had the idea of using Stable Diffusion to generate animations for my game, but no matter what I try, I can't get good results.
I'm using CetusMix_Coda2 + M_Pixel 像素人人 (LoRA), and I really like the results of solo generation. It's 100% perfect. For example:

(masterpiece, top quality, best quality), pixel,pixel art,1girl,full body,witch,witch hat,stockings,black dress,green eyes,red hair, <lora:pixel_f2:0.5> Negative prompt: (worst quality, low quality:2), Steps: 30, Sampler: DPM++ SDE Karras, CFG scale: 7, Seed: 197475931, Size: 512x768, Model hash: 68c0a27380, Model: cetusMix_Coda2, VAE hash: df3c506e51, VAE: pastel-waifu-diffusion.vae.pt, Denoising strength: 0.5, Clip skip: 2, Hires upscale: 1.5, Hires steps: 30, Hires upscaler: Latent, Lora hashes: "pixel_f2: 9d49ca20880d", Version: 1.7.0
I'm trying to generate the same character in different poses, but when I generate the frames sequentially (first generate image 1, then generate image 2 with the next ControlNet frame), the results are too different from each other:

I tried using AnimateDiff, but from what I understand it's better suited to vid2vid than to simple animations of 10-20 frames. One generation can take an hour, and the end result looks terrible.
As I understand it, the best option is to place all the ControlNet frames on one image. In that case, the result is 200% completely perfect! I really like how it looks, and it's exactly what I want. Example:


But how can I place 16 or more frames on one image? In that case, the image becomes too large, like this example at 2048x3072:

(I create the frames in Blender; the animations are taken from Mixamo.)
So the question is: can I generate all the frames on one image, maybe with a batch? (Something like generating one image in segments. I once used a ControlNet extension that let me divide an image into several segments with different prompts.) Why are the characters exactly the same when generated on one image, but so different when generated sequentially with the same settings? Am I doing something wrong?
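For the "all frames on one image" route, here is a minimal sketch, assuming a Pillow-based workflow: tile the rendered ControlNet frames into a single grid before generation, then cut the generated sheet back into individual frames afterwards. The folder names, the 4x4 layout, and the 512x768 frame size are assumptions (4x4 at 512x768 is exactly the 2048x3072 sheet mentioned above); shrinking the per-frame size before tiling is one way to keep the total resolution manageable.

```python
"""Tile pose frames into one ControlNet grid and slice the generated sheet back apart.

Minimal sketch: assumes 16 rendered frames in ./frames and that the generated
image keeps the same grid layout. Paths and sizes are placeholders.
"""
from pathlib import Path

from PIL import Image

COLS, ROWS = 4, 4
FRAME_W, FRAME_H = 512, 768   # 4x4 at 512x768 gives a 2048x3072 sheet


def tile_frames(frames_dir: str, out_path: str) -> None:
    """Paste the first COLS*ROWS frames into one grid image for ControlNet."""
    frames = sorted(Path(frames_dir).glob("*.png"))[: COLS * ROWS]
    sheet = Image.new("RGB", (COLS * FRAME_W, ROWS * FRAME_H))
    for i, frame_path in enumerate(frames):
        frame = Image.open(frame_path).convert("RGB").resize((FRAME_W, FRAME_H))
        sheet.paste(frame, ((i % COLS) * FRAME_W, (i // COLS) * FRAME_H))
    sheet.save(out_path)


def slice_sheet(sheet_path: str, out_dir: str) -> None:
    """Cut a generated sheet back into COLS*ROWS individual animation frames."""
    sheet = Image.open(sheet_path)
    cell_w, cell_h = sheet.width // COLS, sheet.height // ROWS
    Path(out_dir).mkdir(exist_ok=True)
    for row in range(ROWS):
        for col in range(COLS):
            box = (col * cell_w, row * cell_h, (col + 1) * cell_w, (row + 1) * cell_h)
            sheet.crop(box).save(Path(out_dir) / f"frame_{row * COLS + col:02d}.png")


if __name__ == "__main__":
    tile_frames("frames", "controlnet_sheet.png")          # feed this to ControlNet
    # After generation: slice_sheet("generated_sheet.png", "out_frames")
```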
I'm sorry if my questions are stupid, but I've been trying to figure this out on my own for about a month now. I tried ComfyUI + AnimateDiff and it's really good for vid2vid, but I couldn't get the same results as in A1111. (For some reason, even solo generations come out completely different with the same settings and seed, and the M_Pixel 像素人人 LoRA doesn't work at all.) I've already read every article I can find and would appreciate any advice. 🙏 The idea sounds simple to me, but maybe I'm too stupid to figure out how to make it a reality :')