A1111 Stable Diffusion webui - a bird's eye view - self study #4821

Ehplodor · 2022-11-18T13:20:19Z

I am trying my best to understand the current code and translate it into something I can finally make sense of. If someone actually reads all this and finds errors in my "translation", please correct me in the comments. TY in advance.

starting the webui

webui-user.bat

reads user arguments (COMMANDLINE_ARGS)
call webui.bat
- run launch.py
  - prepare_environment()
    ...
  - start()
    if --nowebui : webui.api_only() (a function in webui.py file)
    else : webui.webui() (a function in webui.py file) ---> go to "### webui.webui() in webui.py" section
    endif
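
The branching in start() can be pictured with the minimal sketch below. It is only a stand-in and not the actual launch.py code: api_only() and webui() are stubs that return a label, and passing the flags as a plain list is an assumption made for illustration.

```python
def api_only():
    # Stub standing in for webui.api_only(): serve the API without the gradio UI.
    return "api only"


def webui():
    # Stub standing in for webui.webui(): build and launch the gradio UI.
    return "full ui"


def start(commandline_args):
    # Hypothetical simplification: choose the entry point from the flags.
    if "--nowebui" in commandline_args:
        return api_only()
    return webui()


print(start(["--nowebui"]))
print(start([]))
```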

webui.webui() in webui.py

First

initialize() : init a bunch of things
- model with VAE : load_model() -> load_model() definition in sd_models.py -> load_model_weights() -> load VAE (calls load_vae() from sd_vae.py after resolving which VAE to use)
- fixers
- scalers
- ...

Then

create the gradio UI via modules.ui.create_ui(...) as the "demo" object, then launch it with demo.launch(...) --> go to "### modules.ui.create_ui() in modules/ui.py" section

At last

launch api if user requested it
sets samplers
sets extensions
sets localizations
sets custom scripts
sets modules
sets model list

modules.ui.create_ui() in modules/ui.py

Import main SD functionalities :

import modules/img2img.py script
import modules/txt2img.py script

Then sets up the interface for txt2img ("txt2img_interface")

top row (where the "submit" button is located - modules.ui.create_toprow(...) creates the top row)
progress row
main parameters of txt2img as radio buttons or sliders (sampler, steps, cfg...)
output panel
defines txt2img arguments as a dict variable : txt2img_args = dict(...), which also holds the reference to the modules.txt2img.txt2img function
defines the passing of this dict to the submit button's click action (uses python's **kwargs unpacking) so as to call txt2img(...) from modules/txt2img.py, and generate an image from prompt only --> go to "### txt2img() definition in modules/txt2img.py" section (todo)
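
The dict-plus-** pattern described in this list can be sketched as follows. FakeButton is a hypothetical stand-in for a gradio button (it only records what click() receives), txt2img here is a stub, and the list of inputs holds plain values where the real UI would hold gradio components.

```python
class FakeButton:
    """Hypothetical stand-in for a gradio button; it only records the handler."""

    def click(self, fn, inputs, outputs):
        self.fn = fn
        self.inputs = inputs
        self.outputs = outputs


def txt2img(prompt, steps):
    # Stub for modules.txt2img.txt2img: returns a description instead of an image.
    return f"image for {prompt!r} in {steps} steps"


# The arguments of the click action are collected in a dict first...
txt2img_args = dict(
    fn=txt2img,
    inputs=["a red fox", 20],  # plain values here; gradio components in the real UI
    outputs=["gallery"],
)

submit = FakeButton()
# ...and unpacked with ** so that each key becomes a keyword argument of click().
submit.click(**txt2img_args)

# Simulate the user pressing the button.
result = submit.fn(*submit.inputs)
print(result)
```

Keeping the arguments in a named dict makes it possible to reuse the same wiring for more than one event without repeating the long list of components.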

Then sets up the interface for img2img ("img2img_interface")

top row (where the "submit" button is located - modules.ui.create_toprow(...) creates the top row)
... similar gradio ui interface definitions ...
defines img2img arguments as a dict variable : img2img_args = dict(...), which also holds the reference to the modules.img2img.img2img function
defines the passing of this dict to the submit button's click action (uses python's **kwargs unpacking) so as to call img2img(...) from modules/img2img.py, and generate an image from a prompt and an initial image --> go to "### img2img() definition in modules/img2img.py" section (todo)

txt2img() definition in modules/txt2img.py

instantiates object "p" from class "StableDiffusionProcessingTxt2Img" defined in modules/processing.py
attaches installed scripts to "p"
attaches any scripts' custom inputs to "p"
run the script(s?), apparently recursively (???), so maybe more than one script could actually be executed, until no script is left to run, then return None... (question : is the "p" object something that can be updated without being returned ?)
run THE MAIN THING i.e. process_images(p) from "modules/processing.py", which loops over the batches and calls p.sample() once per batch; the step-by-step denoising that gradually produces the image happens inside the sampler.
erase the "p" object : p.close()
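
A toy version of this sequence is sketched below. Everything in it is a stand-in: Processing, run_scripts() and double_steps() are hypothetical names, and the rule "fall back to process_images() when the script runner returns None" is my reading of the flow rather than a quote of the code. It also answers the question about "p": Python passes the object itself, so a script can change "p" in place and the caller sees the change without anything being returned.

```python
class Processing:
    """Hypothetical stand-in for StableDiffusionProcessingTxt2Img."""

    def __init__(self, prompt, steps):
        self.prompt = prompt
        self.steps = steps
        self.closed = False

    def close(self):
        self.closed = True


def process_images(p):
    # Stub for modules.processing.process_images.
    return f"{p.prompt} ({p.steps} steps)"


def run_scripts(p, selected_script):
    # Stub for the script runner: returns None when no script is selected.
    if selected_script is None:
        return None
    return selected_script(p)


def double_steps(p):
    # A toy "script": it mutates p in place, then does the processing itself.
    p.steps *= 2
    return process_images(p)


def txt2img(prompt, steps, selected_script=None):
    p = Processing(prompt, steps)
    processed = run_scripts(p, selected_script)
    if processed is None:  # no script took over, so use the default path
        processed = process_images(p)
    p.close()
    return processed, p


print(txt2img("cat", 10)[0])
print(txt2img("cat", 10, double_steps)[0])
```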

process_images() from "modules/processing.py"

inputs : "p" (object of class StableDiffusionProcessing)
outputs : "res" (object of class Processed)
First, process_images() saves some shared options in a safe place
Then it overrides the options with those defined in "p"
Then it actually processes "p" into "res", via the process_images_inner(p) function --> go to "### process_images_inner() definition in modules/processing.py"
Finally it restores shared options back
and returns "res"
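
The save / override / restore sequence above can be sketched like this. The option names, the "override_settings" key and the dict-based "p" are assumptions for illustration, and the try/finally is simply the idiomatic way to guarantee the restore step even if processing fails.

```python
# Hypothetical shared options; the real ones live in the webui's shared state.
shared_opts = {"clip_skip": 1, "eta": 0.0}


def process_images_inner(p):
    # Stub: report which options were in effect while processing.
    return {"prompt": p["prompt"], "seen_options": dict(shared_opts)}


def process_images(p):
    overrides = p["override_settings"]
    # 1. save the current values of the options about to be overridden
    stored = {key: shared_opts[key] for key in overrides}
    try:
        # 2. override them with the values carried by p
        shared_opts.update(overrides)
        # 3. do the actual work
        res = process_images_inner(p)
    finally:
        # 4. restore the shared options, even if step 3 raised
        shared_opts.update(stored)
    return res


p = {"prompt": "a red fox", "override_settings": {"clip_skip": 2}}
res = process_images(p)
print(res["seen_options"], shared_opts)
```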

process_images_inner() definition in modules/processing.py :

the magic happens inside a loop over the batches (the "batch count" set in the ui), at

stable-diffusion-webui/modules/processing.py, line 519 in 98947d1 :

    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)

through the p.sample(...) function. Here "p" is an object of class "StableDiffusionProcessingTxt2Img", so the sample() that runs is the one defined at

stable-diffusion-webui/modules/processing.py, line 647 in 98947d1 :

    def sample(self, conditioning, unconditional_conditioning, seeds, subseeds, subseed_strength, prompts):

inside the class declared at

stable-diffusion-webui/modules/processing.py, line 602 in 98947d1 :

    class StableDiffusionProcessingTxt2Img(StableDiffusionProcessing):

(see the part above about the instantiation of "p")
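
Why the txt2img version of sample() is the one that runs can be shown with a small sketch of method dispatch. The class names mirror the ones quoted above, but the bodies and the reduced signatures are toy assumptions, and n_iter is used here as a hypothetical loop count.

```python
class StableDiffusionProcessing:
    def sample(self, conditioning):
        # The base class leaves sampling to its subclasses.
        raise NotImplementedError


class StableDiffusionProcessingTxt2Img(StableDiffusionProcessing):
    def sample(self, conditioning):
        return f"txt2img sample for {conditioning}"


class StableDiffusionProcessingImg2Img(StableDiffusionProcessing):
    def sample(self, conditioning):
        return f"img2img sample for {conditioning}"


def process_images_inner(p, n_iter):
    # The caller never checks the type of p: Python looks up sample()
    # on the object's own class at call time.
    return [p.sample(f"batch {n}") for n in range(n_iter)]


print(process_images_inner(StableDiffusionProcessingTxt2Img(), 2))
print(process_images_inner(StableDiffusionProcessingImg2Img(), 1))
```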

sample() definition from class "StableDiffusionProcessingTxt2Img" in modules/processing.py#L647

Sampler is created in the beginning and added to "p", so that the right sampler's sample() function can be called later

If "highres-fix" is NOT active

First, random latents (tensors of adequate size) are created and assigned to "x"
Then, the sample() function of the chosen sampler is called once (the sampler is picked by an index stored in "p" from the "samplers" object, and was instantiated at the beginning of this section). Its last parameter is an "image conditioning" built from "x" itself, and "x" is the starting latent from which noise (following the "sigmas") will be removed during sampling. This underlines the proximity between txt2img and img2img.
Finally, the "sample" object, result of previous "sample()" function, is returned.

Else (highres-fix is active)

(todo...)

EndIf
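
The branch without highres-fix can be pictured with a one-dimensional toy: start from pure noise scaled by the largest sigma, then walk down the list of sigmas. The update rule is the Euler step used by k-diffusion style samplers; the "denoiser" is a hypothetical perfect model that always predicts the same clean value, and the sigma schedule is made up for the example.

```python
import random


def toy_denoiser(x, sigma, target):
    # Hypothetical perfect denoiser: always predicts the clean value.
    return target


def sample_euler(x, sigmas, target):
    # Walk from the highest noise level down to zero.
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = toy_denoiser(x, sigma, target)
        d = (x - denoised) / sigma        # estimated slope dx/dsigma
        x = x + d * (sigma_next - sigma)  # Euler step to the next noise level
    return x


rng = random.Random(0)
sigmas = [10.0, 5.0, 2.0, 1.0, 0.0]  # made-up schedule ending at zero noise
target = 0.5                         # the "clean image", a single number here
x = rng.gauss(0.0, 1.0) * sigmas[0]  # random latent scaled by the first sigma

result = sample_euler(x, sigmas, target)
print(result)
```

With a real model the denoiser's prediction changes at every step and depends on the prompt, which is why more steps usually give a cleaner result.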

img2img() in modules/img2img.py

sd_samplers.create_sampler_with_index() is defined in modules/sd_samplers.py

There are 15 samplers of class "KDiffusionSampler", and 2 of class "VanillaStableDiffusionSampler", to choose from.
The main takeaway from that part is that some noise is eventually removed from "x" (the image in latent space), using the positive prompt (conditional conditioning) and the negative prompt (unconditional conditioning).
To be continued...
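
How the two conditionings from the takeaway above are combined can be sketched with the classifier-free guidance formula. This is a minimal illustration on plain lists of numbers, not the webui's code: guided_prediction is a hypothetical helper, and "cond" / "uncond" stand for the model's predictions under the positive and the negative prompt.

```python
def guided_prediction(cond, uncond, cfg_scale):
    # Classifier-free guidance: start from the negative-prompt prediction
    # and move toward the positive-prompt prediction, scaled by cfg_scale.
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]    # toy prediction with the positive prompt
uncond = [0.0, 1.0]  # toy prediction with the negative prompt

print(guided_prediction(cond, uncond, 1.0))
print(guided_prediction(cond, uncond, 7.0))
```

A scale of 1 reproduces the positive-prompt prediction unchanged; larger values push the result further away from what the negative prompt describes.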

Ehplodor · 2022-11-18T13:21:14Z

todo : go down the rabbit's hole -> txt2img() and img2img()

Ehplodor · 2022-11-18T13:28:28Z

@Extraltodeus I do miss comments too. However, I have to admit that at this point it is almost self-explanatory. Maybe because I'm accustomed to the UI. I'll see what comes next when I try to make sense of txt2img and img2img :-)

Ehplodor · 2022-11-18T14:47:48Z

I have some doubt about my understanding of scripts execution from txt2img()...
