Improving prompt understanding in Stable Diffusion (e.g. LLM integration) #13321
CalculonPrime started this conversation in General
Even the latest SD version (SDXL, I assume?) doesn't seem capable of keeping separate objects distinct and assigning each its own attributes, and it doesn't appear to respect positional relationships described in the prompt.
However, most of us have used ChatGPT and know that this should be a solvable problem. I see from the LLM-grounded Diffusion paper (arXiv/Cornell) that research is already underway toward this goal; a rough sketch of that idea is below.
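Purely as an illustration (this is not an existing SD or webui feature), here is a minimal Python sketch of the general approach from that line of research: ask an LLM to decompose the prompt into per-object captions and bounding boxes, then hand that layout to a region-aware generator. The LLM call is stubbed with a hardcoded reply, and `decompose_prompt_with_llm`, the JSON schema, and the example boxes are all made up for the sketch.

```python
# Sketch only: LLM decomposes a prompt into a layout before generation.
# Any chat model that can return JSON could fill the stubbed step.
import json

# Instruction one might send to the LLM alongside the user's prompt (assumed wording).
LLM_INSTRUCTION = (
    "Split the scene description into objects. For each object return its "
    "caption (including attributes) and a bounding box [x0, y0, x1, y1] in "
    "0-1 image coordinates. Reply with JSON only."
)

def decompose_prompt_with_llm(prompt: str) -> str:
    """Placeholder for a real LLM call that would send LLM_INSTRUCTION plus
    `prompt` to a chat model; returns a hardcoded example reply instead."""
    return json.dumps({
        "objects": [
            {"caption": "a red cube", "box": [0.05, 0.55, 0.45, 0.95]},
            {"caption": "a blue sphere to the right of the cube", "box": [0.55, 0.50, 0.95, 0.95]},
        ],
        "background": "a plain gray tabletop, studio lighting",
    })

prompt = "a red cube with a blue sphere to its right, on a gray tabletop"
layout = json.loads(decompose_prompt_with_llm(prompt))

# Per-object phrases and boxes recovered from the LLM's structured reply.
phrases = [obj["caption"] for obj in layout["objects"]]
boxes = [obj["box"] for obj in layout["objects"]]

# These would then drive a layout-conditioned generation step (the region
# control described in the LLM-grounded Diffusion paper), so attributes stay
# bound to the right object and positions in the prompt are respected.
print(phrases)
print(boxes)
```

The point of the two-stage split is that the language model handles the part SD struggles with (parsing which attribute belongs to which object, and where things go), while the diffusion model only has to fill in each region.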
Let's have this thread be a place for a high-level discussion of such potential improvements to SD. Are the authors of SD already attempting a similar integration with LLM software?