Commit 195926b

DN6 and sayakpaul authored
Update Chroma Docs (huggingface#11753)

* update
* update

Co-authored-by: Sayak Paul <[email protected]>
1 parent 85a916b commit 195926b

3 files changed: +62 -34 lines changed

docs/source/en/api/pipelines/chroma.md

Lines changed: 49 additions & 17 deletions
@@ -27,9 +27,36 @@ Chroma can use all the same optimizations as Flux.
 
 </Tip>
 
-## Inference (Single File)
+## Inference
 
-The `ChromaTransformer2DModel` supports loading checkpoints in the original format. This is also useful when trying to load finetunes or quantized versions of the models that have been published by the community.
+The Diffusers version of Chroma is based on the [`unlocked-v37`](https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors) version of the original model, which is available in the [Chroma repository](https://huggingface.co/lodestones/Chroma).
+
+```python
+import torch
+from diffusers import ChromaPipeline
+
+pipe = ChromaPipeline.from_pretrained("lodestones/Chroma", torch_dtype=torch.bfloat16)
+pipe.enable_model_cpu_offload()
+
+prompt = [
+    "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
+]
+negative_prompt = ["low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"]
+
+image = pipe(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    generator=torch.Generator("cpu").manual_seed(433),
+    num_inference_steps=40,
+    guidance_scale=3.0,
+    num_images_per_prompt=1,
+).images[0]
+image.save("chroma.png")
+```
+
+## Loading from a single file
+
+To use updated model checkpoints that are not in the Diffusers format, you can use the `ChromaTransformer2DModel` class to load the model from a single file in the original format. This is also useful when trying to load finetunes or quantized versions of the models that have been published by the community.
 
 The following example demonstrates how to run Chroma from a single file.
 
@@ -38,34 +65,39 @@ Then run the following example
 ```python
 import torch
 from diffusers import ChromaTransformer2DModel, ChromaPipeline
-from transformers import T5EncoderModel
 
-bfl_repo = "black-forest-labs/FLUX.1-dev"
+model_id = "lodestones/Chroma"
 dtype = torch.bfloat16
 
-transformer = ChromaTransformer2DModel.from_single_file("https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v35.safetensors", torch_dtype=dtype)
-
-text_encoder = T5EncoderModel.from_pretrained(bfl_repo, subfolder="text_encoder_2", torch_dtype=dtype)
-tokenizer = T5Tokenizer.from_pretrained(bfl_repo, subfolder="tokenizer_2", torch_dtype=dtype)
-
-pipe = ChromaPipeline.from_pretrained(bfl_repo, transformer=transformer, text_encoder=text_encoder, tokenizer=tokenizer, torch_dtype=dtype)
+transformer = ChromaTransformer2DModel.from_single_file("https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors", torch_dtype=dtype)
 
+pipe = ChromaPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=dtype)
 pipe.enable_model_cpu_offload()
 
-prompt = "A cat holding a sign that says hello world"
+prompt = [
+    "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
+]
+negative_prompt = ["low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"]
+
 image = pipe(
-    prompt,
-    guidance_scale=4.0,
-    output_type="pil",
-    num_inference_steps=26,
-    generator=torch.Generator("cpu").manual_seed(0)
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    generator=torch.Generator("cpu").manual_seed(433),
+    num_inference_steps=40,
+    guidance_scale=3.0,
 ).images[0]
 
-image.save("image.png")
+image.save("chroma-single-file.png")
 ```
 
 ## ChromaPipeline
 
 [[autodoc]] ChromaPipeline
   - all
   - __call__
+
+## ChromaImg2ImgPipeline
+
+[[autodoc]] ChromaImg2ImgPipeline
+  - all
+  - __call__
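
Note on the single-file example in `chroma.md`: `model_id` and the checkpoint URL passed to `from_single_file` both point at the same Hub repository, `lodestones/Chroma`. The sketch below shows how the URL breaks down into repository, revision, and filename. `split_hub_file_url` is a hypothetical helper written for this note, not a Diffusers or `huggingface_hub` function. It assumes the `<owner>/<repo>/blob|resolve/<revision>/<path>` URL layout and a revision name without slashes.

```python
from urllib.parse import urlparse


def split_hub_file_url(url: str) -> dict:
    """Split a Hub file URL into repo id, revision, and filename.

    Hypothetical helper for illustration only; not part of Diffusers.
    Assumes the path looks like <owner>/<repo>/blob|resolve/<revision>/<file path>
    and that the revision contains no slashes.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"not a recognised Hub file URL: {url}")
    return {
        "repo_id": "/".join(parts[:2]),
        "revision": parts[3],
        "filename": "/".join(parts[4:]),
    }


# Checkpoint URL taken from the updated docs in this commit.
ckpt_path = "https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors"
info = split_hub_file_url(ckpt_path)
print(info["repo_id"])   # lodestones/Chroma
print(info["revision"])  # main
print(info["filename"])  # chroma-unlocked-v37.safetensors
```

With this breakdown, pointing the example at a different community checkpoint only means changing the filename part of the URL, while `model_id` continues to supply the remaining pipeline components.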

src/diffusers/pipelines/chroma/pipeline_chroma.py

Lines changed: 9 additions & 8 deletions
@@ -52,20 +52,21 @@
 >>> import torch
 >>> from diffusers import ChromaPipeline, ChromaTransformer2DModel
 
+>>> model_id = "lodestones/Chroma"
 >>> ckpt_path = "https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors"
 >>> transformer = ChromaTransformer2DModel.from_single_file(ckpt_path, torch_dtype=torch.bfloat16)
->>> text_encoder = AutoModel.from_pretrained("black-forest-labs/FLUX.1-schnell", subfolder="text_encoder_2")
->>> tokenizer = AutoTokenizer.from_pretrained("black-forest-labs/FLUX.1-schnell", subfolder="tokenizer_2")
->>> pipe = ChromaImg2ImgPipeline.from_pretrained(
-...     "black-forest-labs/FLUX.1-schnell",
+>>> pipe = ChromaPipeline.from_pretrained(
+...     model_id,
 ...     transformer=transformer,
-...     text_encoder=text_encoder,
-...     tokenizer=tokenizer,
 ...     torch_dtype=torch.bfloat16,
 ... )
 >>> pipe.enable_model_cpu_offload()
->>> prompt = "A cat holding a sign that says hello world"
->>> negative_prompt = "low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"
+>>> prompt = [
+...     "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
+... ]
+>>> negative_prompt = [
+...     "low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"
+... ]
 >>> image = pipe(prompt, negative_prompt=negative_prompt).images[0]
 >>> image.save("chroma.png")
 ```

src/diffusers/pipelines/chroma/pipeline_chroma_img2img.py

Lines changed: 4 additions & 9 deletions
@@ -51,26 +51,21 @@
 ```py
 >>> import torch
 >>> from diffusers import ChromaTransformer2DModel, ChromaImg2ImgPipeline
->>> from transformers import AutoModel, Autotokenizer
 
+>>> model_id = "lodestones/Chroma"
 >>> ckpt_path = "https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors"
->>> transformer = ChromaTransformer2DModel.from_single_file(ckpt_path, torch_dtype=torch.bfloat16)
->>> text_encoder = AutoModel.from_pretrained("black-forest-labs/FLUX.1-schnell", subfolder="text_encoder_2")
->>> tokenizer = AutoTokenizer.from_pretrained("black-forest-labs/FLUX.1-schnell", subfolder="tokenizer_2")
 >>> pipe = ChromaImg2ImgPipeline.from_pretrained(
-...     "black-forest-labs/FLUX.1-schnell",
+...     model_id,
 ...     transformer=transformer,
-...     text_encoder=text_encoder,
-...     tokenizer=tokenizer,
 ...     torch_dtype=torch.bfloat16,
 ... )
 >>> pipe.enable_model_cpu_offload()
->>> image = load_image(
+>>> init_image = load_image(
 ...     "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
 ... )
 >>> prompt = "a scenic fantasy landscape with a river and mountains in the background, vibrant colors, detailed, high resolution"
 >>> negative_prompt = "low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"
->>> image = pipe(prompt, image=image, negative_prompt=negative_prompt).images[0]
+>>> image = pipe(prompt, image=init_image, negative_prompt=negative_prompt).images[0]
 >>> image.save("chroma-img2img.png")
 ```
 """
