Replies: 13 comments 21 replies
-
@AUTOMATIC1111 I can make the pull request for this feature as I would also be trialing it with my own animation plugin, but I'll need to know first if we're all on board with this new architecture!
-
Great idea, an extension ecosystem is always good for an open-source project.
-
A way to control the execution order of plugins that are activated on the same trigger would probably be a good idea; possibly even for individual events that may need to be interleaved with other plugins or may need a different order than the rest of the script.
-
Excellent proposal! A middleware-style system of addons is what first came to mind when I saw how scripts are currently implemented.
-
I have edited this proposal with an update; things have evolved quite a bit more drastically. I am suggesting a brand new approach to this whole thing, refined to an amazing core architecture which is efficient for UIs, coding, and CLI use alike. There are certainly challenges; I think the plugin installation code will be tricky. Let me know how this feels to you guys.
-
I like your proposal, but given the scope, another approach to implementing it could be:
-
I'm working on a modular refactor/rewrite called "Instability". Big changes include (almost) no global variables, easy usage in other scripts, and simple installation with
-
I think it would be more valuable for the pre- and post-processing to be converted into a pipeline-type architecture, similar to what you see in data processing tools like scrapy, jina.ai or bonobo. This would in fact support "plugins" for processing steps as python modules. An added bonus would be to allow parallel processing on certain steps, and directing which should happen on CPU and which on GPU, or even remotely on things like huggingface.
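A minimal sketch of what such a pipeline could look like, assuming a plain sequential runner and hypothetical step functions (none of these names exist in the codebase; scrapy/jina/bonobo would bring their own, richer abstractions):

```python
# Hypothetical sketch of a pipeline-style processing chain; names are illustrative only.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Step:
    fn: Callable          # processing function, e.g. generate, upscale, extract depth
    device: str = "cuda"  # hint for where the step should run: "cpu", "cuda", or "remote"

@dataclass
class Pipeline:
    steps: List[Step] = field(default_factory=list)

    def run(self, payload):
        # A real pipeline framework would schedule steps in parallel and move data
        # between devices/hosts; this sketch just runs them in order.
        for step in self.steps:
            payload = step.fn(payload)
        return payload

# e.g. Pipeline([Step(txt2img), Step(upscale), Step(save_image, device="cpu")]).run(params)
```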
-
It seems there are a few projects currently attempting something like this, but none of them has its priorities in the right place, which is the plugin ecosystem. Everyone is wasting time here; the plugin architecture is the only thing that matters, and once we have it we can all improve parts of the project as a community. Some are warning that the repository is a mess and can't be saved, but I looked at it quickly and I disagree. Essentially there's no architecture; it's a classic singleton problem grown out of control. All of the code is still good, it just needs to be called in the right order:
Well, with that being said, I'm about halfway through a refactor at https://github.com/oxysoft/stable-diffusion-webui; check the README.md to see progress and how you can help. I'm out of free time, but I suspect another day or two will yield results. I will implement the shittiest UI that can be made around this architecture and then relay it to the community so we can all build on it. I will need urgent help (even just comments and recommendations) with these:
The sooner we have the core/plugin architecture, the sooner we can have everyone contributing to build the most powerful AI core for both coders and artists. @AUTOMATIC1111 I hope you will accept this development; it will unlock the full power of AI. We must continue moving forward as a community and with a permissive license.
-
Interesting concept. I've often thought about how there should be a modifier stack similar to Blender's to line up processes (not just to stack custom scripts, but also for pipelines into img2img or upscaling), but this is on a much larger scale. Honestly, I think even the input should be a plugin: I dream of an advanced text editor with highlighting, autocomplete and snippets. Alternatively, I think there could be other means of creating prompts, far more intuitive than writing plain text, that we can't even imagine yet.
-
They have announced their animation API for release next week. https://twitter.com/Plinz/status/1582200052801359872 But they don't have an open-source plugin ecosystem, so ours will be better. The same mistakes every single time. Corporate software cannot beat open-source movements. Ours will be a hundred times more feature-rich thanks to the community.
-
Major Progress
https://github.com/oxysoft/stable-diffusion-webui
After this I am making a simple UI to demo how this all works, and once that's working I will port the remaining plugin code (upscalers, face restoration, etc.).
DreamStudio Pro
We've seen it on the big screen now and I'm not convinced: too many fancy bells and whistles like nodes and a 3D scene view (wtf was that even for???) and not enough actual software being shown. I'm not optimistic about it. Their existing DreamStudio web interface already sucks, but the real meat is in embedding these features into existing software!! Blender, Photoshop, Kdenlive; no actual artists or animators want to work in some crappy web interface. The only reason I use DreamStudio at all is to iterate on a prompt idea without any resource constraints. I will make a minimal UI in Dear ImGui (powerful though) only so that 1. devs can test plugins easily, and 2. people who don't know any real art software can also have fun and enjoy a powerful workflow. The focus should be on existing software.
Well, it occurred to me that we can bake cloud deployment into the core. E.g. you rent an instance on Vast.ai, then you run a deploy command in the stable-core shell and enter your SSH details. Automatically, it clones
Still no comment from @AUTOMATIC1111. Having seen and refactored the code behind this webui, and the way the issues are piling up, I think the project won't be maintainable for much longer. This could be the greatest learning experience of your entire life: the entire codebase is refactored and I left comments everywhere detailing my confusion, suggestions, even frustration. I hope you'll endorse this core and transition to it; we could use your expertise in maintaining the StableDiffusion plugin and continuing to support new research papers, techniques and optimizations.
-
Hi everyone, this is the last time I will update this discussion. Development is now well under way at https://github.com/distable/core. The plugin system is functional, we have an interactive shell to run SD from the command line, and I've begun work on a GUI. The API is settling down, so I feel comfortable taking contributions from people now if you wish to help. Let's make this the best AI art ecosystem, much bigger than Stable Diffusion.
-
Plugin Proposal
With this project's buzzing community and very active team of collaborators, always on top of the latest optimizations and new techniques to implement, it is quickly becoming the de-facto implementation for Stable Diffusion.
I am noticing that most colab notebooks currently in use can be ported to a custom script and harness the power of the webui, even notebooks like Deforum. Custom scripts are great because they allow sub-communities to establish themselves and contribute to the project in their own spaces. This gives me the idea, along with noticing that many features are currently unrelated to one another and use different models, that we can restructure the project around external plugins instead of a single bloated repository. Then we massively gain from the combined power of open source, the same power we've seen optimizing the shit out of SD in the first weeks following its open release.
We rename the project to stable-core: everyone's first stop for AI art, usable out of the box with official plugins, or even better with external UI implementations.
Development Fork
https://github.com/oxysoft/stable-diffusion-webui/
From stable-diffusion-webui to stable-core ...
We split the project down to a very solid backend core, not even StableDiffusion specific. Plugins can be just some functions, a technique, or a library, or they can load models and implement a well-defined job pipeline (jobs like `txt2img`, `img2img`, `img2txt`, etc.).
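As a rough illustration of how the core could stay StableDiffusion-agnostic (everything here is a hypothetical sketch, not existing code), plugins might simply declare which jobs they provide and the core dispatches by name:

```python
# Hypothetical sketch: plugins advertise named jobs and the core dispatches by name.
class StableDiffusionPlugin:
    def jobs(self):
        # Which jobs this plugin can handle; the core is unaware of what they do.
        return {"txt2img": self.txt2img, "img2img": self.img2img}

    def txt2img(self, params):
        ...  # run the sampler, return images

    def img2img(self, params):
        ...

def run_job(plugins, name, params):
    # The core looks up the job by name; no StableDiffusion-specific knowledge needed.
    for plugin in plugins:
        handler = plugin.jobs().get(name)
        if handler is not None:
            return handler(params)
    raise KeyError(f"no plugin provides job '{name}'")
```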
Workflow
The entire UI is designed like a dashboard of features, each of which exposes its own UI: a workflow where you're jumping between generation plugins (stablediffusion, vqgan+clip, guided diffusion, soon txt2vid like phenaki) and postprocessing plugins (upscalers, 2D and 3D transforms, MiDaS depth-map extraction, img2img) to get work done and string features together. Each plugin implements models and discrete features which can be used in a manual workflow, and which larger plugins like Deforum can depend on and invoke as an API.
Eventually, someone will try to make a fancy timeline sequencer or some node-based editor. Or they'll integrate it into major software like Blender, Kdenlive, Premiere, etc.
Community Decentralization
We can automatically collect plugin repositories from GitHub and enable them much like you would with Vim-Plug and similar tools. Thus, we unlock the full power of open-source contributions and condense all manpower into a buzzing ecosystem of UIs and plugins, much like the marketplaces for VS Code and Sublime Text once upon a time. Anyone can create a plugin to bring some image synthesis or transformation technique into the ecosystem, even ebsynth or imagemagick, supporting them in all supported UIs automatically and handling installation for each platform. This becomes the npm of the creative community.
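Purely as an illustration, fetching a plugin could be as simple as cloning its repository into a local plugins directory and then running its install hook; the directory layout and function below are assumptions, not existing code:

```python
# Hypothetical Vim-Plug-style plugin fetch; paths and names are illustrative only.
import subprocess
from pathlib import Path

PLUGINS_DIR = Path("plugins")

def fetch_plugin(repo_url: str) -> Path:
    """Clone a plugin repository from GitHub into the local plugins directory."""
    PLUGINS_DIR.mkdir(exist_ok=True)
    name = repo_url.rstrip("/").split("/")[-1].removesuffix(".git")
    target = PLUGINS_DIR / name
    if not target.exists():
        subprocess.run(["git", "clone", "--depth", "1", repo_url, str(target)], check=True)
    return target

# e.g. fetch_plugin("https://github.com/<user>/<plugin-repo>")
# The core would then import the package and call its install() hook so the
# plugin can pull in its own pip/apt requirements for the current platform.
```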
Concrete use-cases
Plugin Ideas
With the macro editor, you could easily reconstruct a Deforum frame by stringing together just a few plugins, making animation available out-of-the-box without any special animation plugin necessary.
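To make the idea concrete, here is a sketch, with made-up plugin names and signatures, of how a single Deforum-style animation frame could be expressed as a short chain of plugin jobs:

```python
# Hypothetical sketch: one Deforum-style frame as a chain of plugin jobs.
# The job names and the core.run() call are illustrative, not an existing API.
def deforum_frame(core, prev_image, p):
    img = core.run("transform2d", image=prev_image,   # pan/zoom/rotate the previous frame
                   zoom=p.zoom, angle=p.angle)
    img = core.run("img2img", image=img,              # re-diffuse with low denoising strength
                   prompt=p.prompt, strength=0.4)
    img = core.run("upscale", image=img, scale=2)     # optional upscaler plugin
    return img
```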
API Specs
Plugin API
- `title()`
- `describe(page)`
- `ui()`
- `install()`: code to run during the core's startup to install the requirements to run it (through pip, apt, etc.), the same way you would on colab.
- `init()`: runs once per script upon first startup; allows loading resources required for enabling the script, e.g. reading from files into memory if it's required to display in the UI for selections.
- `load()`: instantiate resources/models required for processing, e.g. lpips, midas, ...
- `unload()`: frees resources from VRAM.
- `generate_cost()`: attempt to return the estimated VRAM cost to run this generator.
- `postprocess_cost()`: attempt to return the estimated VRAM cost to run this postprocess.
- `generate(params)`
- `postprocess(img)`: takes in the current image as input, cv2 RGB. An extensive utility API provides conversions like `cv2_to_pil`, `pil_to_cv2`, `pil_to_latent`, `latent_to_pil`, `cv2_to_latent`, `latent_to_cv2`, etc., painless to use no matter what.
- `postprocess_prompt(prompt)`: any generator prompt will be passed through; this is where we can implement wildcards, etc.

Notes:
`load()`/`unload()` are managed by the core; it will usually call load before processing and unload at the end. Users can configure certain models to load on startup or to remain loaded, to tailor to their performance needs.
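To give a feel for how these hooks fit together, here is a rough sketch of a plugin subclass; the `Plugin` base class, the signatures, and the `load_esrgan_model` helper are my assumptions about how the spec above could look in code, not a finished implementation:

```python
# Rough sketch of the lifecycle hooks described above; everything here is assumed, not final.
class Plugin:
    def title(self): ...
    def describe(self, page): ...
    def ui(self): ...
    def install(self): ...           # core startup: install requirements (pip, apt, ...)
    def init(self): ...              # once per startup: cheap resources needed for the UI
    def load(self): ...              # heavy models, called by the core before processing
    def unload(self): ...            # free VRAM, called by the core afterwards
    def generate_cost(self): ...     # estimated VRAM cost of generate()
    def postprocess_cost(self): ...  # estimated VRAM cost of postprocess()
    def generate(self, params): ...
    def postprocess(self, img): ...
    def postprocess_prompt(self, prompt): return prompt

class ExampleUpscalerPlugin(Plugin):
    def title(self):
        return "Example Upscaler"

    def load(self):
        self.model = load_esrgan_model()  # hypothetical helper, provided elsewhere

    def unload(self):
        del self.model

    def postprocess(self, img):
        # img is a cv2 RGB array per the spec; return the upscaled image.
        return self.model.upscale(img)
```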
Generators emit...
- `onGenStart(parameters)`
- `onGenEnd(outputs)`
- `onGenInterrupted()`: when the user manually interrupts the generator, or a plugin requests it.
`StableDiffusion:Plugin` emits...
- `onPostprocessParameters(params)`: take in the full dictionary of render parameters and return the modified parameters. (I actually have not really looked at the codebase yet, I'm assuming there must be one.) Runs only once before the first batch, but not subsequent ones.
- `onStepStart(latent)`: on the very first call this would be the raw noise, no denoising yet.
- `onStepCondFn()`: allow implementing new loss terms to guide diffusion. I'm not sure how it's been done in this repository, but this would be the place in k-diffusion. Uses for this include CLIP conditioning, lpips to preserve perceptual similarity (as in Disco Diffusion), or preserving shapes as in PyTTI with convolutions ("edge stabilization"). (A sketch follows after this list.)
- `onStepEnd(latent)`

Extend specific plugin features...
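As an example of what `onStepCondFn` could enable, here is a sketch of a plugin adding a CLIP-style guidance term during sampling. It builds on the `Plugin` base class sketched earlier; the returned `cond_fn` signature and the `decode_latent`/`clip_similarity` stubs are placeholders for whatever utilities the core would actually expose:

```python
# Hypothetical sketch of a plugin hooking onStepCondFn to add a guidance loss term.
import torch

# Placeholder stubs: a real core would provide these through its utility/CLIP APIs.
def decode_latent(latent): ...
def clip_similarity(image, text): ...

class ClipGuidancePlugin(Plugin):  # Plugin base class as sketched above
    def __init__(self, target_text, scale=500.0):
        self.target_text = target_text
        self.scale = scale

    def onStepCondFn(self):
        # Return a cond_fn that nudges the latent toward the target text at each step,
        # in the spirit of Disco Diffusion-style CLIP guidance.
        def cond_fn(latent, sigma):
            with torch.enable_grad():
                latent = latent.detach().requires_grad_()
                image = decode_latent(latent)
                loss = -clip_similarity(image, self.target_text) * self.scale
                return torch.autograd.grad(loss, latent)[0]
        return cond_fn
```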
API design thoughts:
- The `on` prefix establishes a clear boundary between plugin event handlers and any extra functions written by the plugin developer.
- `start` events are positioned before running any of the code relevant to that event's description, and `end` events come after. Otherwise, we specify with the past tense to avoid ambiguity, e.g. `onImageSaved`, `onRunInterrupted`.