-
Hi, and thanks! Use the latest notebook: https://colab.research.google.com/github/TheLastBen/fast-stable-diffusion/blob/main/fast-DreamBooth.ipynb
-
Howdy. Just sent you a tip on the Ko-Fi thing. Have a good coffee or two! ;-)

So, back to the issue. I tried what you suggested but am not having much luck. As I mentioned, around Oct. 29 I trained a .ckpt on photos of my friend Ed and it worked incredibly well. Last night I trained on photos of his wife Danette as you suggested, and I'm not having nearly the same success. The source image files were named 'ouedbkr (1).png' and 'dntbkr (1).png' and so forth, respectively. I think I had 22 images for Ed and 24 for Danette. For the photos of Ed, I followed the basic instructions as listed on that build's notebook. For last night's training, I followed the settings you mentioned above.

Comparison below. Staying relatively close to "photo of x" produces more or less the expected results, but once you stray away from that, it quickly falls apart. I used the same settings and seed for both images: sampler Euler, resolution 512x512, steps 100, CFG 7.5, seed 424242. Other than the subject, 'ouedbkr' or 'dntbkr', the prompts are exactly the same:

- Prompt: color photo of x in New York City
- Prompt: a black and white newspaper photo of x from the 1950s
- Prompt: a pen and ink drawing of x
- Prompt: a fantasy painting of x dressed as a warrior by John Howe

As you can see, the results from the Oct. 29 build are profoundly better than my current results with the settings you mentioned. Any thoughts or ideas? Would it be possible to re-post the Oct. 29 build as a separate Colab notebook in the meantime? Thanks again for your great work! Much appreciated!!
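For anyone wanting to reproduce this side-by-side comparison outside the notebook, here is a minimal sketch of the generation settings above using the diffusers library. This is not the notebook's own code: the model path is a placeholder, the 'dntbkr' token is just the instance word from this thread, and it assumes the trained .ckpt has been converted to the diffusers format.

```python
# Minimal sketch, assuming a diffusers-format model converted from the trained .ckpt.
# Settings mirror the comparison above: Euler sampler, 512x512, 100 steps, CFG 7.5, seed 424242.
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/converted-dreambooth-model",  # placeholder path, not from the notebook
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

prompts = [
    "color photo of dntbkr in New York City",
    "a black and white newspaper photo of dntbkr from the 1950s",
    "a pen and ink drawing of dntbkr",
    "a fantasy painting of dntbkr dressed as a warrior by John Howe",
]

for i, prompt in enumerate(prompts):
    # Re-seed for every prompt so each image uses the same 424242 seed.
    generator = torch.Generator("cuda").manual_seed(424242)
    image = pipe(
        prompt,
        height=512,
        width=512,
        num_inference_steps=100,
        guidance_scale=7.5,
        generator=generator,
    ).images[0]
    image.save(f"comparison_{i}.png")
```

Swapping the subject token ('dntbkr' vs. 'ouedbkr') while keeping everything else fixed should isolate the difference between the two trained models, which is the comparison being made above.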
-
Hi, and thank you for the tip!
-
I had a similar issue, which is why I switched to Shiv's settings for now. There is something wrong with the new training method with both the suggested and the pre-defined settings (20 images, 3k steps, default text encoder; I also tried 100% text encoder, and 4k steps as suggested).
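For reference, here is a rough sketch of what those pre-defined settings look like when expressed with the stock diffusers DreamBooth example script rather than the notebook's own cells (the fast-DreamBooth notebook splits UNet and text-encoder training into separate phases with its own parameter names, which this does not replicate). The paths, learning rate, and 'dntbkr' token are placeholder assumptions, not values from the notebook.

```python
# Rough equivalent of "20 images, 3k steps, train the text encoder" using the
# diffusers examples/dreambooth/train_dreambooth.py script. Not the notebook's code.
import subprocess

args = [
    "accelerate", "launch", "examples/dreambooth/train_dreambooth.py",
    "--pretrained_model_name_or_path", "runwayml/stable-diffusion-v1-5",
    "--instance_data_dir", "./instance_images",   # the ~20 training photos
    "--instance_prompt", "photo of dntbkr",       # placeholder instance token
    "--output_dir", "./dreambooth-output",
    "--resolution", "512",
    "--train_batch_size", "1",
    "--learning_rate", "2e-6",                    # illustrative value only
    "--lr_scheduler", "constant",
    "--lr_warmup_steps", "0",
    "--max_train_steps", "3000",                  # the 3k-step default mentioned above
    "--train_text_encoder",                       # all-or-nothing here, unlike the notebook's percentage
]
subprocess.run(args, check=True)
```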
-
Yeah, I would say I got the training thing worked out!! :-) Thanks again for the great work!! |
-
Howdy!! First off, I want to thank you for the work you're putting into this! I appreciate it greatly, and I'm sure the community at large does as well!
So, here's my issue: I trained a couple of things prior to Halloween and it worked brilliantly. October 29 would've been the actual date I ran them. One was trained on 22 photos of an old friend of mine. The other was 24 various images of a Fender Stratocaster guitar. I typed in my prompts, and just about any style I tried worked wonderfully! I followed the tips in the instructions, which at that time were 100 steps per image and so forth. I'm attaching a couple of examples below.
I've done a couple of training sessions on some of the more recent builds that have the "text encoder training" thing. I ran it as you suggested, using 200 steps per photo on a different friend (24 photos) with the text encoder set to 10%, in hopes that it would work more or less as it did before... successfully! ;-)
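As a sanity check on the step counts (assuming "set to 10" means 10% text-encoder training, as in the percentages mentioned in the other replies), the arithmetic for the two runs works out like this:

```python
# Minimal sketch of the step arithmetic discussed in this thread; the per-image
# step counts and the 10% figure come from the comments above, not from the notebook.
def dreambooth_step_plan(num_images, steps_per_image, text_encoder_pct):
    """Return (unet_steps, text_encoder_steps) for a DreamBooth run."""
    unet_steps = num_images * steps_per_image
    text_encoder_steps = unet_steps * text_encoder_pct // 100
    return unet_steps, text_encoder_steps

print(dreambooth_step_plan(24, 200, 10))  # new run:      (4800, 480)
print(dreambooth_step_plan(22, 100, 0))   # Oct. 29 run:  (2200, 0)
```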
If I keep my prompt close to the model (dntbkr), it creates a pretty good likeness of her, e.g. prompts like "photo of dntbkr" or just "dntbkr" and so forth. But if I stray much from that, it's not even close. For example, this is something like "a black and white photo of dntbkr dressed as a hippy":
She doesn't have a beard... and if she did... she still wouldn't look anything like that, lol. However, a very similar prompt on my friend's model, something like "color photo of rklcke dressed as a hippy at Woodstock", turned out like this when trained with whatever build was up around Oct. 29.
So... I don't know if something changed in the build that's no longer giving me wonderful results, or if I'm not using the "text encoder training" settings correctly, but either way, I sure as hell would like to get back to the awesome results I was getting from your notebook back around Oct. 29!!! I've tried doing additional training on the model, to no avail. Same sort of deal... any sort of style just doesn't take.
Thanks again for the awesome work, though! I mostly do animation stuff using Deforum, and folks like you and them have opened up incredible new methods of creativity for me. This was my most recent animation endeavor, set to a Beats Antique song for Halloween:
[Devil Dance by Beats Antique on YouTube](https://youtu.be/Ltvt3zDPvD0)
Cheers!