Commit 48e4ff5
update overview
1 parent 7c78fb1 commit 48e4ff5

2 files changed: +63 −54 lines changed

docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -89,6 +89,8 @@
 - sections:
   - local: modular_diffusers/developer_guide
     title: Developer Guide
+  - local: modular_diffusers/overview
+    title: Overview
 - sections:
   - local: using-diffusers/cogvideox
     title: CogVideoX

docs/source/en/modular_diffusers/overview.md

Lines changed: 61 additions & 54 deletions
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
 
 # Overview
 
-The Modular Diffusers Framework consist of three main components
+The Modular Diffusers Framework consists of three main components:
 
 ## ModularPipelineBlocks
 
@@ -23,35 +23,37 @@ Pipeline blocks are the fundamental building blocks of the Modular Diffusers sys
 - [`AutoPipelineBlocks`](TODO)
 
 
-Each block defines:
-
-**Specifications:**
-- Inputs: User-provided parameters that the block expects
-- Intermediate inputs: Variables from other blocks that this block needs
-- Intermediate outputs: Variables this block produces for other blocks to use
-- Components: Models and processors the block requires (e.g., UNet, VAE, scheduler)
-
-**Computation:**
-- `__call__` method: Defines the actual computational steps within the block
-
-Pipeline blocks are essentially **"definitions"** - they define the specifications and computational steps for a pipeline, but are not runnable until converted into a `ModularPipeline` object.
-
-All blocks interact with a global `PipelineState` object that maintains the pipeline's state throughout execution.
-
-### Load/save a custom `ModularPipelineBlocks`
-
-You can load a custom pipeline block from a hub repository directly
-
+To use a `ModularPipelineBlocks` officially supported in 🧨 Diffusers:
 ```py
-from diffusers import ModularPipelineBlocks
-diffdiff_block = ModularPipelineBlocks.from_pretrained(repo_id, trust_remote_code=True)
+>>> from diffusers.modular_pipelines.stable_diffusion_xl import StableDiffusionXLTextEncoderStep
+>>> text_encoder_block = StableDiffusionXLTextEncoderStep()
 ```
 
-to save, and publish to a hub repository
+Each [`ModularPipelineBlocks`] defines its requirements for components, configs, inputs, intermediate inputs, and outputs. You'll see that this text encoder block uses text encoders and tokenizers as well as a guider component. It takes user inputs such as `prompt` and `negative_prompt`, and returns a list of conditional text embeddings.
 
-```py
-diffdiff_block.save(repo_id)
 ```
+>>> text_encoder_block
+StableDiffusionXLTextEncoderStep(
+  Class: PipelineBlock
+  Description: Text Encoder step that generate text_embeddings to guide the image generation
+  Components:
+      text_encoder (`CLIPTextModel`)
+      text_encoder_2 (`CLIPTextModelWithProjection`)
+      tokenizer (`CLIPTokenizer`)
+      tokenizer_2 (`CLIPTokenizer`)
+      guider (`ClassifierFreeGuidance`)
+  Configs:
+      force_zeros_for_empty_prompt (default: True)
+  Inputs:
+      prompt=None, prompt_2=None, negative_prompt=None, negative_prompt_2=None, cross_attention_kwargs=None, clip_skip=None
+  Intermediates:
+      - outputs: prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds
+)
+```
+
+Pipeline blocks are essentially **"definitions"** - they define the specifications and computational steps for a pipeline. However, they do not contain any model states, and are not runnable until converted into a `ModularPipeline` object.
+
+Read more about how to write your own `ModularPipelineBlocks` [here](TODO)
 
 ## PipelineState & BlockState
 
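The state-passing idea behind `PipelineState` can be sketched with a toy model. This is plain Python with hypothetical names, not the actual diffusers API: each block declares what it consumes and produces, reads its inputs from a shared state, and writes its outputs back for later blocks.

```python
# Toy sketch of the PipelineState idea (hypothetical names, not the
# actual diffusers API): blocks read inputs from a shared state object
# and write their intermediate outputs back for later blocks to consume.

class PipelineState:
    """A mutable bag of values shared by every block in the pipeline."""
    def __init__(self, **user_inputs):
        self.values = dict(user_inputs)

    def get(self, name, default=None):
        return self.values.get(name, default)

    def set(self, name, value):
        self.values[name] = value


class EncodeBlock:
    # Declares what it consumes and produces (cf. inputs / intermediate outputs).
    inputs = ["prompt"]
    outputs = ["prompt_embeds"]

    def __call__(self, state):
        prompt = state.get("prompt")
        state.set("prompt_embeds", f"embeds({prompt})")  # stand-in for a real encoder


class DenoiseBlock:
    inputs = ["prompt_embeds", "num_inference_steps"]
    outputs = ["latents"]

    def __call__(self, state):
        embeds = state.get("prompt_embeds")
        steps = state.get("num_inference_steps", 10)
        state.set("latents", f"latents({embeds}, steps={steps})")


state = PipelineState(prompt="a cat", num_inference_steps=15)
for block in (EncodeBlock(), DenoiseBlock()):
    block(state)

print(state.get("latents"))  # → latents(embeds(a cat), steps=15)
```

The point of the sketch: blocks never call each other directly; they only agree on the names of intermediate values in the shared state, which is what makes them recomposable.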
@@ -67,15 +69,9 @@ You typically don't need to manually create or manage these state objects. The `
 
 `ModularPipeline` is the main interface to create and execute pipelines in the Modular Diffusers system.
 
-### Create a `ModularPipeline`
+### Modular Repo
 
-Each `ModularPipelineBlocks` has an `init_pipeline` method that can initialize a `ModularPipeline` object based on its component and configuration specifications.
-
-```py
->>> pipeline = blocks.init_pipeline(pretrained_model_name_or_path)
-```
-
-`ModularPipeline` only works with modular repositories, so make sure `pretrained_model_name_or_path` points to a modular repo (you can see an example [here](https://huggingface.co/YiYiXu/modular-diffdiff)).
+`ModularPipeline` only works with modular repositories. You can find an example modular repo [here](https://huggingface.co/YiYiXu/modular-diffdiff).
 
 The main differences from standard diffusers repositories are:
 
@@ -93,7 +89,7 @@ In standard `model_index.json`, each component entry is a `(library, class)` tup
 In `modular_model_index.json`, each component entry contains 3 elements: `(library, class, loading_specs {})`
 
 - `library` and `class`: Information about the actual component loaded in the pipeline at the time of saving (can be `None` if not loaded)
-- **`loading_specs`**: A dictionary containing all information required to load this component, including `repo`, `revision`, `subfolder`, `variant`, and `type_hint`
+- `loading_specs`: A dictionary containing all information required to load this component, including `repo`, `revision`, `subfolder`, `variant`, and `type_hint`
 
 ```py
 "text_encoder": [
@@ -114,7 +110,16 @@ In `modular_model_index.json`, each component entry contains 3 elements: `(libra
 
 2. Cross-Repository Component Loading
 
-Unlike standard repositories where components must be in subfolders within the same repo, modular repositories can fetch components from different repositories based on the `loading_specs` dictionary. In our example above, the `text_encoder` component will be fetched from the "text_encoder" folder in `stabilityai/stable-diffusion-xl-base-1.0` while other components come from different repositories.
+Unlike standard repositories where components must be in subfolders within the same repo, modular repositories can fetch components from different repositories based on the `loading_specs` dictionary. For example, the `text_encoder` component will be fetched from the "text_encoder" folder in `stabilityai/stable-diffusion-xl-base-1.0` while other components come from different repositories.
+
+### Create a `ModularPipeline` from `ModularPipelineBlocks`
+
+Each `ModularPipelineBlocks` has an `init_pipeline` method that can initialize a `ModularPipeline` object based on its component and configuration specifications.
+
+```py
+>>> pipeline = blocks.init_pipeline(pretrained_model_name_or_path)
+```
 
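The division of labor behind `init_pipeline` can be sketched in plain Python. Every name here is hypothetical, not the diffusers implementation: the blocks only *declare* which components they need, and the pipeline created from them stays empty until components are explicitly loaded.

```python
# Toy sketch of the init_pipeline idea (hypothetical names, not the
# real diffusers implementation): the block declares component specs;
# the pipeline built from it loads nothing until asked.

class ToyBlocks:
    expected_components = ["unet", "vae", "text_encoder"]

    def init_pipeline(self, repo_id):
        # Hand the specs to a pipeline object; no weights are touched here.
        return ToyPipeline(repo_id, self.expected_components)


class ToyPipeline:
    def __init__(self, repo_id, expected):
        self.repo_id = repo_id
        self.expected = expected
        self.components = {}  # nothing loaded yet

    def load_components(self, names=None):
        # Pretend-load: in the real library this step would fetch weights
        # according to each component's loading_specs.
        for name in names or self.expected:
            self.components[name] = f"<{name} from {self.repo_id}>"


pipeline = ToyBlocks().init_pipeline("my-org/my-modular-repo")
print(sorted(pipeline.components))   # [] — empty until load_components is called
pipeline.load_components(names=["unet"])
print(sorted(pipeline.components))   # ['unet']
```

This mirrors the two-phase flow described in this doc: create the pipeline from block specifications first, then load all or some components on demand.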
 <Tip>
@@ -135,7 +140,6 @@ You can read more about Components Manager [here](TODO)
 
 </Tip>
 
-
 Unlike `DiffusionPipeline`, you need to explicitly load model components using `load_components`:
 
 ```py
@@ -155,49 +159,52 @@ You can partially load specific components using the `component_names` argument,
 
 </Tip>
 
-### Execute a `ModularPipeline`
+### Load a `ModularPipeline` from hub
 
-The API to run the `ModularPipeline` is very similar to how you would run a regular `DiffusionPipeline`:
+You can create a `ModularPipeline` from a HuggingFace Hub repository with the `from_pretrained` method, as long as it's a modular repo:
 
 ```py
->>> image = pipeline(prompt="a cat", num_inference_steps=15, output="images")[0]
+pipeline = ModularPipeline.from_pretrained(repo_id, components_manager=..., collection=...)
 ```
 
-There are a few key differences though:
-1. You can also pass a `PipelineState` object directly to the pipeline instead of individual arguments
-2. If you do not specify the `output` argument, it returns the `PipelineState` object
-3. You can pass a list as `output`, e.g. `pipeline(... output=["images", "latents"])` will return a dictionary containing both the generated image and the final denoised latents
+Loading custom code is also supported:
 
-Under the hood, `ModularPipeline`'s `__call__` method is a wrapper around the pipeline blocks' `__call__` method: it creates a `PipelineState` object and populates it with user inputs, then returns the output to the user based on the `output` argument. It also ensures that all pipeline-level config and components are exposed to all pipeline blocks by preparing and passing a `components` input.
+```py
+diffdiff_pipeline = ModularPipeline.from_pretrained(repo_id, trust_remote_code=True, ...)
+```
 
-### Load a `ModularPipeline` from hub
+Similar to the `init_pipeline` method, the modular pipeline will not load any components automatically, so you will have to call `load_components` to explicitly load the components you need.
 
-You can directly load a `ModularPipeline` from a HuggingFace Hub repository, as long as it's a modular repo
-
-```py
-pipeine = ModularPipeline.from_pretrained(repo_id, components_manager=..., collection=...)
-```
+### Execute a `ModularPipeline`
 
-Loading custom code is also supported, just pass a `trust_remote_code=True` argument:
+The API to run the `ModularPipeline` is very similar to how you would run a regular `DiffusionPipeline`:
 
 ```py
-diffdiff_pipeline = ModularPipeline.from_pretrained(repo_id, trust_remote_code=True, ...)
+>>> image = pipeline(prompt="a cat", num_inference_steps=15, output="images")[0]
 ```
 
-The ModularPipeine created with `from_pretrained` method also would not load any components and you would have to call `load_components` to explicitly load components you need.
+There are a few key differences though:
+1. You can also pass a `PipelineState` object directly to the pipeline instead of individual arguments
+2. If you do not specify the `output` argument, it returns the `PipelineState` object
+3. You can pass a list as `output`, e.g. `pipeline(..., output=["images", "latents"])` will return a dictionary containing both the generated image and the final denoised latents
+
+Under the hood, `ModularPipeline`'s `__call__` method is a wrapper around the pipeline blocks' `__call__` method: it creates a `PipelineState` object and populates it with user inputs, then returns the output to the user based on the `output` argument. It also ensures that all pipeline-level config and components are exposed to all pipeline blocks by preparing and passing a `components` input.
 
 ### Save a `ModularPipeline`
 
-to save a `ModularPipeline` and publish it to hub
+To save a `ModularPipeline` and publish it to hub:
 
 ```py
 pipeline.save_pretrained("YiYiXu/modular-loader-t2i", push_to_hub=True)
 ```
 
 <Tip>
 
-We do not automatically save custom code and share it on hub for you, please read more about how to share your custom pipeline on hub [here](TODO: ModularPipeline/CustomCode)
+We do not automatically save custom code and share it on hub for you. Please read more about how to share your custom pipeline on hub [here](TODO: ModularPipeline/CustomCode)
 
 </Tip>
 
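The `output`-argument dispatch described in the execution section above can be sketched in plain Python. All names here are hypothetical stand-ins, not the diffusers implementation: populate a state with user inputs, run the blocks, then return either the whole state, one value, or a dict of requested values.

```python
# Toy sketch of ModularPipeline.__call__'s dispatch on `output`
# (hypothetical names, not the diffusers implementation): fill a state
# with user inputs, run the blocks, then return the full state, a
# single value, or a dict of requested values.

def run_pipeline(blocks, output=None, **user_inputs):
    state = dict(user_inputs)
    for block in blocks:
        block(state)                      # each block mutates the shared state
    if output is None:
        return state                      # no `output`: return the full state
    if isinstance(output, str):
        return state[output]              # single name: return that value
    return {name: state[name] for name in output}  # list: dict of values


def fake_denoise(state):
    state["latents"] = "final-latents"

def fake_decode(state):
    state["images"] = ["image-of-" + state["prompt"]]


blocks = [fake_denoise, fake_decode]
print(run_pipeline(blocks, prompt="a cat", output="images"))
# → ['image-of-a cat']
print(run_pipeline(blocks, prompt="a cat", output=["images", "latents"]))
# → {'images': ['image-of-a cat'], 'latents': 'final-latents'}
```

Omitting `output` returns the whole state dict, mirroring difference 2 in the list above; passing a list mirrors difference 3.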