
Commit ae4f418

Multi GPUs API && Control preprocess nodes && Fix merge_lora bug in sequential_cpu_offload (#145)
1 parent e90bc58 commit ae4f418


108 files changed: +12,559 −45 lines



README.md

Lines changed: 17 additions & 4 deletions
@@ -111,6 +111,19 @@ We need about 60GB available on disk (for saving weights), please check!
#### b. Weights
We recommend placing the [weights](#model-zoo) along the specified paths:

+**Via ComfyUI**:
+Put the models into the ComfyUI weights folder `ComfyUI/models/Fun_Models/`:
+```
+📦 ComfyUI/
+├── 📂 models/
+│   └── 📂 Fun_Models/
+│       ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
+│       ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
+│       ├── 📂 Wan2.1-Fun-14B-InP/
+│       └── 📂 Wan2.1-Fun-1.3B-InP/
+```
+
+**Via the project's own Python scripts or UI**:
```
📦 models/
├── 📂 Diffusion_Transformer/
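If you prefer to fetch a model programmatically rather than copy it by hand, a minimal sketch with `huggingface_hub` is shown below. The repo id `alibaba-pai/Wan2.1-Fun-1.3B-InP` and the target folder are illustrative assumptions; substitute whichever model you actually need from the model zoo.

```python
# Hypothetical sketch: download one Fun model into the ComfyUI weights folder.
# The repo id is an assumption for illustration, not taken from this commit.
from huggingface_hub import snapshot_download  # pip install huggingface_hub

snapshot_download(
    repo_id="alibaba-pai/Wan2.1-Fun-1.3B-InP",                  # assumed Hugging Face repo id
    local_dir="ComfyUI/models/Fun_Models/Wan2.1-Fun-1.3B-InP",  # matches the tree above
)
```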
@@ -182,10 +195,10 @@ We'd better place the [weights](#model-zoo) along the specified path:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>
@@ -268,10 +281,10 @@ Resolution-512
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>

README_ja-JP.md

Lines changed: 17 additions & 4 deletions
@@ -111,6 +111,19 @@ Linux details:
#### b. Weights
We recommend placing the [weights](#model-zoo) along the specified paths:

+**Via ComfyUI**:
+Put the models into the ComfyUI weights folder `ComfyUI/models/Fun_Models/`:
+```
+📦 ComfyUI/
+├── 📂 models/
+│   └── 📂 Fun_Models/
+│       ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
+│       ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
+│       ├── 📂 Wan2.1-Fun-14B-InP/
+│       └── 📂 Wan2.1-Fun-1.3B-InP/
+```
+
+**Via the project's own Python scripts or UI**:
```
📦 models/
├── 📂 Diffusion_Transformer/
@@ -182,10 +195,10 @@ Linux details:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>
@@ -268,10 +281,10 @@ Linux details:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>

README_zh-CN.md

Lines changed: 17 additions & 4 deletions
@@ -109,6 +109,19 @@ Linux details:
#### b. Weight placement
We recommend placing the [weights](#model-zoo) along the specified paths:

+**Via ComfyUI**:
+Put the models into the ComfyUI weights folder `ComfyUI/models/Fun_Models/`:
+```
+📦 ComfyUI/
+├── 📂 models/
+│   └── 📂 Fun_Models/
+│       ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
+│       ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
+│       ├── 📂 Wan2.1-Fun-14B-InP/
+│       └── 📂 Wan2.1-Fun-1.3B-InP/
+```
+
+**Via the project's own Python scripts or UI**:
```
📦 models/
├── 📂 Diffusion_Transformer/
@@ -180,10 +193,10 @@ Linux details:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>
@@ -266,10 +279,10 @@ Resolution-512
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>

comfyui/README.md

Lines changed: 15 additions & 5 deletions
@@ -1,5 +1,5 @@
-# ComfyUI CogVideoX-Fun
-Easily use CogVideoX-Fun and Wan2.1-Fun inside ComfyUI!
+# ComfyUI VideoX-Fun
+Easily use VideoX-Fun and Wan2.1-Fun inside ComfyUI!

- [Installation](#1-installation)
- [Node types](#node-types)
@@ -12,23 +12,31 @@ Easily use CogVideoX-Fun and Wan2.1-Fun inside ComfyUI!
TBD

#### Option 2: Install manually
-The CogVideoX-Fun repository needs to be placed at `ComfyUI/custom_nodes/CogVideoX-Fun/`.
+The VideoX-Fun repository needs to be placed at `ComfyUI/custom_nodes/VideoX-Fun/`.

```
cd ComfyUI/custom_nodes/

# Git clone the VideoX-Fun repository itself
-git clone https://github.com/aigc-apps/CogVideoX-Fun.git
+git clone https://github.com/aigc-apps/VideoX-Fun.git

# Git clone the video output node
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git

-cd CogVideoX-Fun/
+cd VideoX-Fun/
python install.py
```

### 2. Download models into `ComfyUI/models/Fun_Models/`

+### 3. (Optional) Download preprocess weights into `ComfyUI/custom_nodes/Fun_Models/Third_Party/`
+In addition to the Fun models' weights, if you want to use the control preprocess nodes, download the preprocess weights into `ComfyUI/custom_nodes/Fun_Models/Third_Party/`:
+
+```
+remote_onnx_det = "https://huggingface.co/yzd-v/DWPose/resolve/main/yolox_l.onnx"
+remote_onnx_pose = "https://huggingface.co/yzd-v/DWPose/resolve/main/dw-ll_ucoco_384.onnx"
+remote_zoe = "https://huggingface.co/lllyasviel/Annotators/resolve/main/ZoeD_M12_N.pt"
+```
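The three URLs above can be fetched with a short script; a minimal sketch is shown below. The destination file names (taken from the URL basenames) are an assumption, since the preprocess nodes may look for specific names.

```python
# Hypothetical sketch: download the preprocess weights listed above.
# Destination file names are assumed to be the URL basenames.
import os
import urllib.request

target_dir = "ComfyUI/custom_nodes/Fun_Models/Third_Party"
urls = [
    "https://huggingface.co/yzd-v/DWPose/resolve/main/yolox_l.onnx",
    "https://huggingface.co/yzd-v/DWPose/resolve/main/dw-ll_ucoco_384.onnx",
    "https://huggingface.co/lllyasviel/Annotators/resolve/main/ZoeD_M12_N.pt",
]

os.makedirs(target_dir, exist_ok=True)
for url in urls:
    dest = os.path.join(target_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)  # simple blocking download
```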
#### i. Wan2.1-Fun

V1.0:
@@ -142,6 +150,8 @@ You can run a demo using the following photo:
### iv. Control Video Generation
Our user interface is shown as follows; this is the [json](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control.json):

+To facilitate usage, we have added several JSON configurations that automatically process input videos into the necessary control videos. These include [canny processing](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control_canny.json), [pose processing](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control_pose.json), and [depth processing](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control_depth.json).
+
![Workflow Diagram](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control.jpg)

You can run a demo using the following video:
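For orientation, the canny workflow above amounts to converting each frame of the input video into an edge map before it reaches the control branch. A minimal per-frame sketch with OpenCV follows; the thresholds (100, 200) and file path are illustrative assumptions, not values read from the workflow JSON.

```python
# Illustrative sketch of canny control preprocessing, not the node's exact code.
import cv2

cap = cv2.VideoCapture("input.mp4")   # hypothetical input video
control_frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    edges = cv2.Canny(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 100, 200)
    control_frames.append(cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR))  # 3-channel control frame
cap.release()
```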
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# This folder is modified from https://github.com/Mikubill/sd-webui-controlnet
# Openpose
# Original from CMU https://github.com/CMU-Perceptual-Computing-Lab/openpose
# 2nd Edited by https://github.com/Hzzone/pytorch-openpose
# 3rd Edited by ControlNet
# 4th Edited by ControlNet (added face and correct hands)

import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

import torch
import numpy as np
from . import util
from .wholebody import Wholebody


def draw_pose(poses, H, W):
    # Render body, hand, and face keypoints for every detected pose onto a black canvas.
    canvas = np.zeros(shape=(H, W, 3), dtype=np.uint8)

    for pose in poses:
        canvas = util.draw_bodypose(canvas, pose.body.keypoints)

        canvas = util.draw_handpose(canvas, pose.left_hand)
        canvas = util.draw_handpose(canvas, pose.right_hand)

        canvas = util.draw_facepose(canvas, pose.face)
    return canvas


class DWposeDetector:
    # Thin wrapper around the DWPose whole-body estimator that returns a rendered pose map.
    def __init__(self, onnx_det, onnx_pose):
        self.pose_estimation = Wholebody(onnx_det, onnx_pose)

    def __call__(self, oriImg):
        oriImg = oriImg.copy()
        H, W, C = oriImg.shape
        with torch.no_grad():
            keypoints_info = self.pose_estimation(oriImg)
            return draw_pose(
                Wholebody.format_result(keypoints_info),
                H,
                W,
            )
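A usage sketch for the detector above follows. The import path is hypothetical (the new file's location isn't shown in this view), the ONNX paths assume the weights downloaded in the ComfyUI README step 3, and the call returns an H x W x 3 uint8 canvas with body, hand, and face keypoints drawn on black.

```python
# Hypothetical usage; adjust the import to wherever this module lives in the repo.
import cv2
from dwpose import DWposeDetector  # assumed module path

detector = DWposeDetector(
    onnx_det="Third_Party/yolox_l.onnx",           # assumed local paths for the
    onnx_pose="Third_Party/dw-ll_ucoco_384.onnx",  # preprocess weights downloaded earlier
)
image = cv2.imread("person.jpg")   # HxWx3 BGR uint8
pose_map = detector(image)         # HxWx3 uint8 pose rendering, same size as the input
cv2.imwrite("pose_control.png", pose_map)
```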
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
import cv2
import numpy as np


def nms(boxes, scores, nms_thr):
    """Single class NMS implemented in Numpy."""
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= nms_thr)[0]
        order = order[inds + 1]

    return keep


def multiclass_nms(boxes, scores, nms_thr, score_thr):
    """Multiclass NMS implemented in Numpy. Class-aware version."""
    final_dets = []
    num_classes = scores.shape[1]
    for cls_ind in range(num_classes):
        cls_scores = scores[:, cls_ind]
        valid_score_mask = cls_scores > score_thr
        if valid_score_mask.sum() == 0:
            continue
        else:
            valid_scores = cls_scores[valid_score_mask]
            valid_boxes = boxes[valid_score_mask]
            keep = nms(valid_boxes, valid_scores, nms_thr)
            if len(keep) > 0:
                cls_inds = np.ones((len(keep), 1)) * cls_ind
                dets = np.concatenate(
                    [valid_boxes[keep], valid_scores[keep, None], cls_inds], 1
                )
                final_dets.append(dets)
    if len(final_dets) == 0:
        return None
    return np.concatenate(final_dets, 0)


def demo_postprocess(outputs, img_size, p6=False):
    # Decode raw YOLOX outputs into absolute (cx, cy, w, h) boxes on the input grid.
    grids = []
    expanded_strides = []
    strides = [8, 16, 32] if not p6 else [8, 16, 32, 64]

    hsizes = [img_size[0] // stride for stride in strides]
    wsizes = [img_size[1] // stride for stride in strides]

    for hsize, wsize, stride in zip(hsizes, wsizes, strides):
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
        grids.append(grid)
        shape = grid.shape[:2]
        expanded_strides.append(np.full((*shape, 1), stride))

    grids = np.concatenate(grids, 1)
    expanded_strides = np.concatenate(expanded_strides, 1)
    outputs[..., :2] = (outputs[..., :2] + grids) * expanded_strides
    outputs[..., 2:4] = np.exp(outputs[..., 2:4]) * expanded_strides

    return outputs


def preprocess(img, input_size, swap=(2, 0, 1)):
    # Letterbox-resize the image to the detector input size, padding with gray (114).
    if len(img.shape) == 3:
        padded_img = np.ones((input_size[0], input_size[1], 3), dtype=np.uint8) * 114
    else:
        padded_img = np.ones(input_size, dtype=np.uint8) * 114

    r = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])
    resized_img = cv2.resize(
        img,
        (int(img.shape[1] * r), int(img.shape[0] * r)),
        interpolation=cv2.INTER_LINEAR,
    ).astype(np.uint8)
    padded_img[: int(img.shape[0] * r), : int(img.shape[1] * r)] = resized_img

    padded_img = padded_img.transpose(swap)
    padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)
    return padded_img, r


def inference_detector(session, oriImg, detect_classes=[0]):
    # Run the YOLOX detector via ONNX Runtime or OpenCV DNN and return xyxy boxes
    # (in original image coordinates) for the requested classes.
    input_shape = (640, 640)
    img, ratio = preprocess(oriImg, input_shape)

    input = img[None, :, :, :]
    if "InferenceSession" in type(session).__name__:
        input_name = session.get_inputs()[0].name
        output = session.run(None, {input_name: input})
    else:
        outNames = session.getUnconnectedOutLayersNames()
        session.setInput(input)
        output = session.forward(outNames)

    predictions = demo_postprocess(output[0], input_shape)[0]

    boxes = predictions[:, :4]
    scores = predictions[:, 4:5] * predictions[:, 5:]

    boxes_xyxy = np.ones_like(boxes)
    boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.
    boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
    boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
    boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
    boxes_xyxy /= ratio
    dets = multiclass_nms(boxes_xyxy, scores, nms_thr=0.45, score_thr=0.1)
    if dets is None:
        return None
    final_boxes, final_scores, final_cls_inds = dets[:, :4], dets[:, 4], dets[:, 5]
    isscore = final_scores > 0.3
    iscat = np.isin(final_cls_inds, detect_classes)
    isbbox = [i and j for (i, j) in zip(isscore, iscat)]
    final_boxes = final_boxes[isbbox]
    return final_boxes
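A minimal sketch of driving `inference_detector` with an ONNX Runtime session is shown below; the weight path is an assumption (the YOLOX detector from the ComfyUI README), and `detect_classes=[0]` keeps only the person class, matching the default above.

```python
# Hypothetical usage of the detection utilities above with onnxruntime.
import cv2
import onnxruntime as ort

session = ort.InferenceSession("Third_Party/yolox_l.onnx")  # assumed weight path
image = cv2.imread("person.jpg")                            # HxWx3 BGR uint8
boxes = inference_detector(session, image, detect_classes=[0])
if boxes is not None:
    for x1, y1, x2, y2 in boxes:                            # xyxy boxes in original image coordinates
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
    cv2.imwrite("detections.png", image)
```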
