
Commit ae4f418

Multi GPUs API && Control preprocess nodes && Fix merge_lora bug in sequential_cpu_offload (#145)
1 parent e90bc58 commit ae4f418


108 files changed: +12,559 −45 lines



README.md

Lines changed: 17 additions & 4 deletions
@@ -111,6 +111,19 @@ We need about 60GB available on disk (for saving weights), please check!
#### b. Weights
We recommend placing the [weights](#model-zoo) along the specified paths:

+**Via ComfyUI**:
+Put the models into the ComfyUI weights folder `ComfyUI/models/Fun_Models/`:
+```
+📦 ComfyUI/
+├── 📂 models/
+│   └── 📂 Fun_Models/
+│       ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
+│       ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
+│       ├── 📂 Wan2.1-Fun-14B-InP/
+│       └── 📂 Wan2.1-Fun-1.3B-InP/
+```
+
+**Via the project's own Python scripts or UI**:
```
📦 models/
├── 📂 Diffusion_Transformer/
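If you prefer to fetch a model programmatically rather than copy it by hand, a minimal sketch with `huggingface_hub` is shown below. The repo id `alibaba-pai/Wan2.1-Fun-1.3B-InP` and the target folder are illustrative assumptions; substitute whichever model you actually need from the model zoo.

```python
# Hypothetical sketch: download one Fun model into the ComfyUI weights folder.
# The repo id is an assumption for illustration, not taken from this commit.
from huggingface_hub import snapshot_download  # pip install huggingface_hub

snapshot_download(
    repo_id="alibaba-pai/Wan2.1-Fun-1.3B-InP",                  # assumed Hugging Face repo id
    local_dir="ComfyUI/models/Fun_Models/Wan2.1-Fun-1.3B-InP",  # matches the tree above
)
```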
@@ -182,10 +195,10 @@ We'd better place the [weights](#model-zoo) along the specified path:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>
@@ -268,10 +281,10 @@ Resolution-512
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>

README_ja-JP.md

Lines changed: 17 additions & 4 deletions
@@ -111,6 +111,19 @@ Linux details:
#### b. Weights
We recommend placing the [weights](#model-zoo) along the specified paths:

+**Via ComfyUI**:
+Put the models into the ComfyUI weights folder `ComfyUI/models/Fun_Models/`:
+```
+📦 ComfyUI/
+├── 📂 models/
+│   └── 📂 Fun_Models/
+│       ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
+│       ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
+│       ├── 📂 Wan2.1-Fun-14B-InP/
+│       └── 📂 Wan2.1-Fun-1.3B-InP/
+```
+
+**Via the project's own Python scripts or UI**:
```
📦 models/
├── 📂 Diffusion_Transformer/
@@ -182,10 +195,10 @@ Linux details:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>
@@ -268,10 +281,10 @@ Linux details:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>

README_zh-CN.md

Lines changed: 17 additions & 4 deletions
@@ -109,6 +109,19 @@ Linux details:
#### b. Weight placement
We recommend placing the [weights](#model-zoo) along the specified paths:

+**Via ComfyUI**:
+Put the models into the ComfyUI weights folder `ComfyUI/models/Fun_Models/`:
+```
+📦 ComfyUI/
+├── 📂 models/
+│   └── 📂 Fun_Models/
+│       ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
+│       ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
+│       ├── 📂 Wan2.1-Fun-14B-InP/
+│       └── 📂 Wan2.1-Fun-1.3B-InP/
+```
+
+**Via the project's own Python scripts or UI**:
```
📦 models/
├── 📂 Diffusion_Transformer/
@@ -180,10 +193,10 @@ Linux details:
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>
@@ -266,10 +279,10 @@ Resolution-512
<video src="https://github.com/user-attachments/assets/53002ce2-dd18-4d4f-8135-b6f68364cabd" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/fce43c0b-81fa-4ab2-9ca7-78d786f520e6" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/a1a07cf8-d86d-4cd2-831f-18a6c1ceee1d" width="100%" controls autoplay loop></video>
</td>
<td>
-<video src="https://github.com/user-attachments/assets/b208b92c-5add-4ece-a200-3dbbe47b93c3" width="100%" controls autoplay loop></video>
+<video src="https://github.com/user-attachments/assets/3224804f-342d-4947-918d-d9fec8e3d273" width="100%" controls autoplay loop></video>
</td>
<tr>
<td>

comfyui/README.md

Lines changed: 15 additions & 5 deletions
@@ -1,5 +1,5 @@
-# ComfyUI CogVideoX-Fun
-Easily use CogVideoX-Fun and Wan2.1-Fun inside ComfyUI!
+# ComfyUI VideoX-Fun
+Easily use VideoX-Fun and Wan2.1-Fun inside ComfyUI!

- [Installation](#1-installation)
- [Node types](#node-types)
@@ -12,23 +12,31 @@ Easily use CogVideoX-Fun and Wan2.1-Fun inside ComfyUI!
TBD

#### Option 2: Install manually
-The CogVideoX-Fun repository needs to be placed at `ComfyUI/custom_nodes/CogVideoX-Fun/`.
+The VideoX-Fun repository needs to be placed at `ComfyUI/custom_nodes/VideoX-Fun/`.

```
cd ComfyUI/custom_nodes/

# Git clone the VideoX-Fun repository itself
-git clone https://github.com/aigc-apps/CogVideoX-Fun.git
+git clone https://github.com/aigc-apps/VideoX-Fun.git

# Git clone the video output node
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git

-cd CogVideoX-Fun/
+cd VideoX-Fun/
python install.py
```

### 2. Download models into `ComfyUI/models/Fun_Models/`

+### 3. (Optional) Download preprocess weights into `ComfyUI/custom_nodes/Fun_Models/Third_Party/`
+In addition to the Fun models' weights, if you want to use the control preprocess nodes, download the preprocess weights into `ComfyUI/custom_nodes/Fun_Models/Third_Party/`:
+
+```
+remote_onnx_det = "https://huggingface.co/yzd-v/DWPose/resolve/main/yolox_l.onnx"
+remote_onnx_pose = "https://huggingface.co/yzd-v/DWPose/resolve/main/dw-ll_ucoco_384.onnx"
+remote_zoe = "https://huggingface.co/lllyasviel/Annotators/resolve/main/ZoeD_M12_N.pt"
+```
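The three URLs above can be fetched with a short script; a minimal sketch is shown below. The destination file names (taken from the URL basenames) are an assumption, since the preprocess nodes may look for specific names.

```python
# Hypothetical sketch: download the preprocess weights listed above.
# Destination file names are assumed to be the URL basenames.
import os
import urllib.request

target_dir = "ComfyUI/custom_nodes/Fun_Models/Third_Party"
urls = [
    "https://huggingface.co/yzd-v/DWPose/resolve/main/yolox_l.onnx",
    "https://huggingface.co/yzd-v/DWPose/resolve/main/dw-ll_ucoco_384.onnx",
    "https://huggingface.co/lllyasviel/Annotators/resolve/main/ZoeD_M12_N.pt",
]

os.makedirs(target_dir, exist_ok=True)
for url in urls:
    dest = os.path.join(target_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)  # simple blocking download
```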
#### i. Wan2.1-Fun

V1.0:
@@ -142,6 +150,8 @@ You can run a demo using the following photo:
### iv. Control Video Generation
Our user interface is shown as follows; this is the [json](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control.json):

+To facilitate usage, we have added several JSON configurations that automatically process input videos into the necessary control videos. These include [canny processing](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control_canny.json), [pose processing](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control_pose.json), and [depth processing](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control_depth.json).
+
![Workflow Diagram](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/wan_fun/asset/v1.0/wan2.1_fun_workflow_v2v_control.jpg)

You can run a demo using the following video:
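For orientation, the canny workflow above amounts to converting each frame of the input video into an edge map before it reaches the control branch. A minimal per-frame sketch with OpenCV follows; the thresholds (100, 200) and file path are illustrative assumptions, not values read from the workflow JSON.

```python
# Illustrative sketch of canny control preprocessing, not the node's exact code.
import cv2

cap = cv2.VideoCapture("input.mp4")   # hypothetical input video
control_frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    edges = cv2.Canny(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 100, 200)
    control_frames.append(cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR))  # 3-channel control frame
cap.release()
```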
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# This folder is modified from https://github.com/Mikubill/sd-webui-controlnet
# Openpose
# Original from CMU https://github.com/CMU-Perceptual-Computing-Lab/openpose
# 2nd Edited by https://github.com/Hzzone/pytorch-openpose
# 3rd Edited by ControlNet
# 4th Edited by ControlNet (added face and correct hands)

import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

import torch
import numpy as np
from . import util
from .wholebody import Wholebody


def draw_pose(poses, H, W):
    # Render body, hand, and face keypoints for every detected pose onto a black canvas.
    canvas = np.zeros(shape=(H, W, 3), dtype=np.uint8)

    for pose in poses:
        canvas = util.draw_bodypose(canvas, pose.body.keypoints)

        canvas = util.draw_handpose(canvas, pose.left_hand)
        canvas = util.draw_handpose(canvas, pose.right_hand)

        canvas = util.draw_facepose(canvas, pose.face)
    return canvas


class DWposeDetector:
    # Thin wrapper around the DWPose whole-body estimator that returns a rendered pose map.
    def __init__(self, onnx_det, onnx_pose):
        self.pose_estimation = Wholebody(onnx_det, onnx_pose)

    def __call__(self, oriImg):
        oriImg = oriImg.copy()
        H, W, C = oriImg.shape
        with torch.no_grad():
            keypoints_info = self.pose_estimation(oriImg)
            return draw_pose(
                Wholebody.format_result(keypoints_info),
                H,
                W,
            )
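A usage sketch for the detector above follows. The import path is hypothetical (the new file's location isn't shown in this view), the ONNX paths assume the weights downloaded in the ComfyUI README step 3, and the call returns an H x W x 3 uint8 canvas with body, hand, and face keypoints drawn on black.

```python
# Hypothetical usage; adjust the import to wherever this module lives in the repo.
import cv2
from dwpose import DWposeDetector  # assumed module path

detector = DWposeDetector(
    onnx_det="Third_Party/yolox_l.onnx",           # assumed local paths for the
    onnx_pose="Third_Party/dw-ll_ucoco_384.onnx",  # preprocess weights downloaded earlier
)
image = cv2.imread("person.jpg")   # HxWx3 BGR uint8
pose_map = detector(image)         # HxWx3 uint8 pose rendering, same size as the input
cv2.imwrite("pose_control.png", pose_map)
```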
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
import cv2
import numpy as np


def nms(boxes, scores, nms_thr):
    """Single class NMS implemented in Numpy."""
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= nms_thr)[0]
        order = order[inds + 1]

    return keep


def multiclass_nms(boxes, scores, nms_thr, score_thr):
    """Multiclass NMS implemented in Numpy. Class-aware version."""
    final_dets = []
    num_classes = scores.shape[1]
    for cls_ind in range(num_classes):
        cls_scores = scores[:, cls_ind]
        valid_score_mask = cls_scores > score_thr
        if valid_score_mask.sum() == 0:
            continue
        else:
            valid_scores = cls_scores[valid_score_mask]
            valid_boxes = boxes[valid_score_mask]
            keep = nms(valid_boxes, valid_scores, nms_thr)
            if len(keep) > 0:
                cls_inds = np.ones((len(keep), 1)) * cls_ind
                dets = np.concatenate(
                    [valid_boxes[keep], valid_scores[keep, None], cls_inds], 1
                )
                final_dets.append(dets)
    if len(final_dets) == 0:
        return None
    return np.concatenate(final_dets, 0)


def demo_postprocess(outputs, img_size, p6=False):
    # Decode raw YOLOX outputs into absolute (cx, cy, w, h) boxes on the input grid.
    grids = []
    expanded_strides = []
    strides = [8, 16, 32] if not p6 else [8, 16, 32, 64]

    hsizes = [img_size[0] // stride for stride in strides]
    wsizes = [img_size[1] // stride for stride in strides]

    for hsize, wsize, stride in zip(hsizes, wsizes, strides):
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
        grids.append(grid)
        shape = grid.shape[:2]
        expanded_strides.append(np.full((*shape, 1), stride))

    grids = np.concatenate(grids, 1)
    expanded_strides = np.concatenate(expanded_strides, 1)
    outputs[..., :2] = (outputs[..., :2] + grids) * expanded_strides
    outputs[..., 2:4] = np.exp(outputs[..., 2:4]) * expanded_strides

    return outputs


def preprocess(img, input_size, swap=(2, 0, 1)):
    # Letterbox-resize the image to the detector input size, padding with gray (114).
    if len(img.shape) == 3:
        padded_img = np.ones((input_size[0], input_size[1], 3), dtype=np.uint8) * 114
    else:
        padded_img = np.ones(input_size, dtype=np.uint8) * 114

    r = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])
    resized_img = cv2.resize(
        img,
        (int(img.shape[1] * r), int(img.shape[0] * r)),
        interpolation=cv2.INTER_LINEAR,
    ).astype(np.uint8)
    padded_img[: int(img.shape[0] * r), : int(img.shape[1] * r)] = resized_img

    padded_img = padded_img.transpose(swap)
    padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)
    return padded_img, r


def inference_detector(session, oriImg, detect_classes=[0]):
    # Run the YOLOX detector via ONNX Runtime or OpenCV DNN and return xyxy boxes
    # (in original image coordinates) for the requested classes.
    input_shape = (640, 640)
    img, ratio = preprocess(oriImg, input_shape)

    input = img[None, :, :, :]
    if "InferenceSession" in type(session).__name__:
        input_name = session.get_inputs()[0].name
        output = session.run(None, {input_name: input})
    else:
        outNames = session.getUnconnectedOutLayersNames()
        session.setInput(input)
        output = session.forward(outNames)

    predictions = demo_postprocess(output[0], input_shape)[0]

    boxes = predictions[:, :4]
    scores = predictions[:, 4:5] * predictions[:, 5:]

    boxes_xyxy = np.ones_like(boxes)
    boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.
    boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
    boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
    boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
    boxes_xyxy /= ratio
    dets = multiclass_nms(boxes_xyxy, scores, nms_thr=0.45, score_thr=0.1)
    if dets is None:
        return None
    final_boxes, final_scores, final_cls_inds = dets[:, :4], dets[:, 4], dets[:, 5]
    isscore = final_scores > 0.3
    iscat = np.isin(final_cls_inds, detect_classes)
    isbbox = [i and j for (i, j) in zip(isscore, iscat)]
    final_boxes = final_boxes[isbbox]
    return final_boxes
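A minimal sketch of driving `inference_detector` with an ONNX Runtime session is shown below; the weight path is an assumption (the YOLOX detector from the ComfyUI README), and `detect_classes=[0]` keeps only the person class, matching the default above.

```python
# Hypothetical usage of the detection utilities above with onnxruntime.
import cv2
import onnxruntime as ort

session = ort.InferenceSession("Third_Party/yolox_l.onnx")  # assumed weight path
image = cv2.imread("person.jpg")                            # HxWx3 BGR uint8
boxes = inference_detector(session, image, detect_classes=[0])
if boxes is not None:
    for x1, y1, x2, y2 in boxes:                            # xyxy boxes in original image coordinates
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
    cv2.imwrite("detections.png", image)
```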
