
Commit 6f313e8

doc change
1 parent 7e637d6 commit 6f313e8

File tree: 3 files changed (+26 −6 lines)


docs/source/en/api/loaders/single_file.md

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,8 @@ The [`~loaders.FromSingleFileMixin.from_single_file`] method allows you to load:
 ## Supported pipelines
 
 - [`CogVideoXPipeline`]
+- [`CogVideoXImageToVideoPipeline`]
+- [`CogVideoXVideoToVideoPipeline`]
 - [`StableDiffusionPipeline`]
 - [`StableDiffusionImg2ImgPipeline`]
 - [`StableDiffusionInpaintPipeline`]

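For reference, the pipelines added to this list can be loaded through the single-file loader roughly as in the sketch below. This is a minimal sketch, not part of the commit: the checkpoint path is a placeholder, and the `torch_dtype` argument is an assumption about a commonly used `from_single_file` option.

```python
import torch
from diffusers import CogVideoXImageToVideoPipeline

# Placeholder path: substitute a real CogVideoX image-to-video single-file checkpoint.
ckpt_path = "path/to/cogvideox-5b-i2v.safetensors"

# Load the pipeline from a single checkpoint file, as now listed under "Supported pipelines".
pipe = CogVideoXImageToVideoPipeline.from_single_file(ckpt_path, torch_dtype=torch.bfloat16)
pipe.to("cuda")
```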
docs/source/en/api/pipelines/cogvideox.md

Lines changed: 23 additions & 5 deletions
@@ -44,10 +44,13 @@ First, load the pipeline:
 
 ```python
 import torch
-from diffusers import CogVideoXPipeline
-from diffusers.utils import export_to_video
-
-pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b").to("cuda")
+from diffusers import CogVideoXPipeline, CogVideoXImageToVideoPipeline
+from diffusers.utils import export_to_video, load_image
+pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b").to("cuda") # or "THUDM/CogVideoX-2b"
+```
+If you are using the image-to-video pipeline, load it as follows:
+```python
+pipe = CogVideoXImageToVideoPipeline.from_pretrained("THUDM/CogVideoX-5b-I2V").to("cuda") # Image-to-Video pipeline
 ```
 
 Then change the memory layout of the pipelines `transformer` component to `torch.channels_last`:
@@ -56,7 +59,7 @@ Then change the memory layout of the pipelines `transformer` component to `torch
 pipe.transformer.to(memory_format=torch.channels_last)
 ```
 
-Finally, compile the components and run inference:
+Compile the components and run inference:
 
 ```python
 pipe.transformer = torch.compile(pipeline.transformer, mode="max-autotune", fullgraph=True)
@@ -66,6 +69,21 @@ prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wood
 video = pipe(prompt=prompt, guidance_scale=6, num_inference_steps=50).frames[0]
 ```
 
+If you are using the image-to-video pipeline, you can use the following code to generate a video from an image:
+
+```python
+image = load_image("image_of_panda.jpg")
+prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance."
+video = pipe(prompt=prompt, image=image, guidance_scale=6, num_inference_steps=50).frames[0]
+```
+
+To save the video, use the following code:
+
+```python
+export_to_video(video, "panda_video.mp4")
+```
+
+
 The [benchmark](https://gist.github.com/a-r-r-o-w/5183d75e452a368fd17448fcc810bd3f) results on an 80GB A100 machine are:
 
 ```

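Taken together, the snippets added above amount to the image-to-video flow sketched below. This is a consolidated sketch rather than part of the commit: the input image path is a placeholder, and `pipe` is used consistently where the surrounding guide text passes `pipeline.transformer` to `torch.compile`.

```python
import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Load the image-to-video pipeline named in the doc change.
pipe = CogVideoXImageToVideoPipeline.from_pretrained("THUDM/CogVideoX-5b-I2V").to("cuda")

# Optional speed-up steps from the guide: channels_last layout plus torch.compile.
pipe.transformer.to(memory_format=torch.channels_last)
pipe.transformer = torch.compile(pipe.transformer, mode="max-autotune", fullgraph=True)

image = load_image("image_of_panda.jpg")  # placeholder input image
prompt = "A panda, dressed in a small, red jacket and a tiny hat, plays a guitar in a serene bamboo forest."

video = pipe(prompt=prompt, image=image, guidance_scale=6, num_inference_steps=50).frames[0]
export_to_video(video, "panda_video.mp4")
```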
scripts/convert_cogvideox_to_diffusers.py

Lines changed: 1 addition & 1 deletion
@@ -244,7 +244,7 @@ def get_args():
 text_encoder_id = "google/t5-v1_1-xxl"
 tokenizer = T5Tokenizer.from_pretrained(text_encoder_id, model_max_length=TOKENIZER_MAX_LENGTH)
 text_encoder = T5EncoderModel.from_pretrained(text_encoder_id, cache_dir=args.text_encoder_cache_dir)
-# Apparently, the conversion does not work any more without this :shrug:
+# Apparently, the conversion does not work anymore without this :shrug:
 for param in text_encoder.parameters():
     param.data = param.data.contiguous()

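The loop shown in context copies every T5 parameter into contiguous memory. The commit does not say why this became necessary; one plausible reason (an assumption, not stated in the source) is that serialization backends such as safetensors reject non-contiguous tensors. The toy sketch below only illustrates what `.contiguous()` changes.

```python
import torch

# A transposed view shares storage with the original tensor and is not contiguous.
weight = torch.randn(8, 4).t()
print(weight.is_contiguous())  # False

# .contiguous() copies the data into a fresh, contiguous buffer,
# which is what the conversion script's loop does for each parameter.
weight = weight.contiguous()
print(weight.is_contiguous())  # True
```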