README.md: 20 additions & 7 deletions
@@ -22,8 +22,15 @@ Experience the CogVideoX-5B model online at <a href="https://huggingface.co/spac
## Project Updates
-- 🔥🔥 **News**: ```2024/10/13```: A more cost-effective fine-tuning framework for `CogVideoX-5B` that works with a single 4090 GPU, [cogvideox-factory](https://github.com/a-r-r-o-w/cogvideox-factory), has been released. It supports fine-tuning with multiple resolutions. Feel free to use it!
-- 🔥 **News**: ```2024/10/10```: We have updated our technical report. Please click [here](https://arxiv.org/pdf/2408.06072) to view it. More training details and a demo have been added. To see the demo, click [here](https://yzy-thu.github.io/CogVideoX-demo/).- 🔥 **News**: ```2024/10/09```: We have publicly released the [technical documentation](https://zhipu-ai.feishu.cn/wiki/DHCjw1TrJiTyeukfc9RceoSRnCh) for CogVideoX fine-tuning on Feishu, further increasing distribution flexibility. All examples in the public documentation can be fully reproduced.
+- 🔥🔥 **News**: ```2024/10/13```: A more cost-effective fine-tuning framework for `CogVideoX-5B` that works with a single
+4090 GPU, [cogvideox-factory](https://github.com/a-r-r-o-w/cogvideox-factory), has been released. It supports
+fine-tuning with multiple resolutions. Feel free to use it!
+- 🔥 **News**: ```2024/10/10```: We have updated our technical report. Please
+click [here](https://arxiv.org/pdf/2408.06072) to view it. More training details and a demo have been added. To see
+the demo, click [here](https://yzy-thu.github.io/CogVideoX-demo/).- 🔥 **News**: ```2024/10/09```: We have publicly
+released the [technical documentation](https://zhipu-ai.feishu.cn/wiki/DHCjw1TrJiTyeukfc9RceoSRnCh) for CogVideoX
+fine-tuning on Feishu, further increasing distribution flexibility. All examples in the public documentation can be
+fully reproduced.
- 🔥 **News**: ```2024/9/19```: We have open-sourced the CogVideoX series image-to-video model **CogVideoX-5B-I2V**.
This model can take an image as a background input and generate a video combined with prompt words, offering greater
controllability. With this, the CogVideoX series models now support three tasks: text-to-video generation, video
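The I2V entry above only names the new model, so a minimal usage sketch may help. It is not part of the diff; it assumes the `diffusers` integration of CogVideoX (specifically `CogVideoXImageToVideoPipeline` and the `THUDM/CogVideoX-5b-I2V` checkpoint), and the image path, prompt, and sampling settings are placeholders rather than recommended values.

```python
import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Load the image-to-video checkpoint in bf16 to keep memory manageable.
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # offload idle submodules to CPU between steps
pipe.vae.enable_tiling()         # tile the VAE decode to reduce peak VRAM

# Placeholder inputs: the image acts as the background/first frame,
# and the prompt steers the generated motion and content.
image = load_image("input_frame.png")
prompt = "The camera slowly pans across the scene while leaves drift in the wind."

video = pipe(
    image=image,
    prompt=prompt,
    num_frames=49,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]

export_to_video(video, "output.mp4", fps=8)
```

CPU offload and VAE tiling trade speed for lower peak memory, which is typically what lets the 5B model run on a single consumer GPU.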
@@ -295,10 +302,16 @@ works have already been adapted for CogVideoX, and we invite everyone to use the
is a fine-tuned model based on CogVideoX, specifically designed for interior design.
-+[xDiT](https://github.com/xdit-project/xDiT): xDiT is a scalable inference engine for Diffusion Transformers (DiTs)
-on multiple GPU Clusters. xDiT supports real-time image and video generations services.
-+[cogvideox-factory](https://github.com/a-r-r-o-w/cogvideox-factory): A cost-effective
-fine-tuning framework for CogVideoX, compatible with the `diffusers` version model. Supports more resolutions, and fine-tuning CogVideoX-5B can be done with a single 4090 GPU.
++[xDiT](https://github.com/xdit-project/xDiT): xDiT is a scalable inference engine for Diffusion Transformers (DiTs)
+on multiple GPU Clusters. xDiT supports real-time image and video generations services.
++[cogvideox-factory](https://github.com/a-r-r-o-w/cogvideox-factory): A cost-effective
+fine-tuning framework for CogVideoX, compatible with the `diffusers` version model. Supports more resolutions, and
+fine-tuning CogVideoX-5B can be done with a single 4090 GPU.
++[CogVideoX-Interpolation](https://github.com/feizc/CogvideX-Interpolation): A pipeline based on the modified CogVideoX
+structure, aimed at providing greater flexibility for keyframe interpolation generation.
++[DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio): DiffSynth Studio is a diffusion engine. It has
+restructured the architecture, including text encoders, UNet, VAE, etc., enhancing computational performance while
+maintaining compatibility with open-source community models. The framework has been adapted for CogVideoX.
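The cogvideox-factory and DiffSynth-Studio entries both lean on compatibility with the `diffusers`-format CogVideoX weights. As a rough illustration of what that compatibility allows, the sketch below loads a hypothetical LoRA from a fine-tuning run into the stock text-to-video pipeline; the LoRA path, adapter name, and prompt are placeholders, and it assumes the fine-tune was exported in a `diffusers`-loadable LoRA format.

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# Base text-to-video model from the hub.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16
)

# Hypothetical output of a fine-tuning run; swap in your own LoRA directory.
pipe.load_lora_weights("path/to/cogvideox-lora", adapter_name="my-finetune")
pipe.enable_model_cpu_offload()

frames = pipe(
    prompt="A timelapse of a modern living room being furnished, interior-design style.",
    num_frames=49,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]

export_to_video(frames, "finetuned_sample.mp4", fps=8)
```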
## Project Structure
@@ -365,7 +378,7 @@ This folder contains some tools for model conversion / caption generation, etc.
+[llm_flux_cogvideox](tools/llm_flux_cogvideox/llm_flux_cogvideox.py): Automatically generate videos using an
open-source local large language model + Flux + CogVideoX.
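The tool entry above describes a three-stage chain: a local open-source LLM expands a short prompt, Flux renders a first frame from the expanded description, and CogVideoX animates that frame. The sketch below is only an outline of that idea, not the tool's actual code; the LLM and Flux model IDs, prompts, and resolution are placeholder assumptions, and the final stage reuses the image-to-video pipeline shown earlier.

```python
import torch
from transformers import pipeline as hf_pipeline
from diffusers import FluxPipeline, CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video

# 1) Expand a terse user prompt with a local LLM (model ID is a placeholder).
llm = hf_pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
seed_prompt = "a lighthouse in a storm"
expanded = llm(
    f"Rewrite this as one richly detailed scene description for a video model: {seed_prompt}",
    max_new_tokens=120,
    return_full_text=False,
)[0]["generated_text"]

# 2) Render a first frame with Flux from the expanded description.
flux = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
flux.enable_model_cpu_offload()
first_frame = flux(expanded, height=480, width=720, num_inference_steps=28).images[0]

# 3) Animate the frame with CogVideoX image-to-video.
i2v = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
)
i2v.enable_model_cpu_offload()
video = i2v(image=first_frame, prompt=expanded, num_frames=49, guidance_scale=6.0).frames[0]

export_to_video(video, "llm_flux_cogvideox_demo.mp4", fps=8)
```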