Replies: 5 comments 2 replies
-
Someone on Reddit yesterday used textual inversion or DreamBooth to train proper hands into a model or embedding.
-
Looks interesting. I wonder how flexibly it will adapt, and you'll have to refer to those hands in your prompt. Maybe it's enough to refer to them with a low weight [[]] though.
-
Another way might be to train the model on a hands dataset, or to use pose estimators to guide the diffusion process toward better hands.
-
I had an idea a long time back that I posted in... that place. Make some pictures with multiple hands, too many fingers, body horror, etc. Train an embedding on them. Put the embedding into the negative prompt.
-
We can't fix the model itself to draw hands correctly, but we need proper hands, arms, legs, toes, etc.
Sometimes it works fine, but how many fingers a hand ends up with seems to be mostly luck.
I watched hundreds of hands begin to form and then end up badly. During the forming process you could see that it can go either way; it just happened to go bad. What that area needs is a tiny nudge at the right time to form a proper hand.
I am proposing a temporary solution that hooks in between the sampling steps during generation and analyzes the intermediate image.
Using object recognition, or a compact model trained specifically to detect an area with wrong fingers, we should be able to catch most of the bad outcomes early on.
When a bad area is detected we could go back y steps, modify the noise/image in the problem area in a semi-randomized fashion (using the seed for reproducibility), and let it process again.
That could repeat up to "n" times before it gives up. When giving up, it chooses the attempt where the error was smallest.
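To make the loop above a bit more concrete, here is a rough sketch of the control flow only. Everything in it is an assumption for illustration: `step_fn` stands in for one denoising step of whatever sampler is used, `error_fn` stands in for the hand detector (0 means no defect found), and the "latent" is just a list of floats rather than a real tensor. It is not tied to any actual Stable Diffusion API.

```python
import random
from typing import Callable, List, Tuple

Latent = List[float]  # toy stand-in for a latent image (a real one would be a tensor)


def generate_with_retries(
    latent: Latent,
    step_fn: Callable[[Latent, int], Latent],  # hypothetical: one denoising step
    error_fn: Callable[[Latent], float],       # hypothetical: hand-defect score, 0 = fine
    region: range,       # indices of the detected problem area
    total_steps: int,
    check_at: int,       # number of steps to run before the detector is consulted
    rewind: int,         # "y": how many steps to go back
    max_retries: int,    # "n": attempts before giving up
    threshold: float,    # error at or below this counts as a good hand
    seed: int,
    noise_scale: float = 0.1,
) -> Tuple[Latent, float]:
    rng = random.Random(seed)  # seeded, so retries are reproducible

    # Run up to the checkpoint, keeping every intermediate state so we can rewind.
    history = [list(latent)]
    for t in range(check_at):
        history.append(step_fn(history[-1], t))

    best, best_err = history[-1], error_fn(history[-1])
    restart = max(0, check_at - rewind)

    attempts = 0
    while best_err > threshold and attempts < max_retries:
        attempts += 1
        # Go back y steps and nudge only the problem area.
        cand = list(history[restart])
        for i in region:
            cand[i] += rng.gauss(0.0, noise_scale)
        # Replay the rewound steps on the perturbed state.
        for t in range(restart, check_at):
            cand = step_fn(cand, t)
        err = error_fn(cand)
        if err < best_err:  # remember the attempt with the smallest error
            best, best_err = cand, err

    # Finish the remaining steps from the best attempt found.
    for t in range(check_at, total_steps):
        best = step_fn(best, t)
    return best, error_fn(best)
```

In a real sampler, rewinding would also need the matching noise level and sampler state for the earlier step, and keeping every intermediate latent costs memory, so this only shows the detect / rewind / perturb / retry / keep-best structure.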
An alternative would be inpainting as a post-processing step, using the same method to detect and retry hands, but an error of that sort often gets worse with more steps. So stopping it at its root might work best.
It's just an idea. Maybe someone who has worked in that area could tell whether this sounds doable?