|
1 | | -# MiKaPo: AI Pose Picker for MikuMikuDance |
| 1 | +# MiKaPo: Real-time MMD Motion Capture |
2 | 2 |
|
3 | | -> **🎉 NEW PROJECT ALERT!** Check out [**PoPo**](https://popo.love) - Transform text into MMD poses with AI! No more manual bone adjustments - just type "shy smile while waving" and watch the magic happen ✨ |
| 3 | +A web-based tool that enables real-time motion capture for MikuMikuDance (MMD) models. |
4 | 4 |
|
5 | | -<img width="300px" alt="demo_pose" src="./logo.jpg" /> |
| 5 | +## Overview |
6 | 6 |
|
7 | | -[MiKaPo](https://mikapo.amyang.dev) is a **Web-based tool** that poses MMD models from video input in real-time. Welcome feature requests and PRs! |
| 7 | +[MiKaPo](https://mikapo.amyang.dev) transforms video input into real-time MMD model poses by detecting 3D landmarks and converting them to bone rotations. The core technical challenge lies in accurately mapping world-space 3D landmarks from MediaPipe to MMD bone quaternion rotations, accounting for MMD's specific bone coordinate system and directional conventions. |
8 | 8 |
|
9 | | -<img width="400px" alt="demo_pose" src="./demo1.gif" /> |
10 | | -<img width="400px" alt="demo_face" src="./demo2.gif" /> |
11 | | -<img width="400px" alt="demo_img" src="./demo3.png" /> |
| 9 | +**MiKaPo 2.0** introduces a completely rewritten solver built on hierarchical bone transformations and migrates the app from Vite to Next.js for improved performance and maintainability.
12 | 10 |
|
13 | | -## Tech Stack |
| 11 | + |
| 12 | + |
14 | 13 |
|
15 | | -- 3D key points detection: [Mediapipe](https://ai.google.dev/edge/mediapipe/solutions/vision/pose_landmarker/web_js) |
16 | | -- 3D scene: [Babylon.js](https://www.babylonjs.com/) |
17 | | -- MMD model viewer: [babylon-mmd](https://github.com/noname0310/babylon-mmd) |
18 | | -- Web framework: [Vite+React](https://vitejs.dev/) |
19 | | -- Models are from [aplaybox](https://aplaybox.com/en/mmd-models/). |
| 14 | +## Related Project |
20 | 15 |
|
21 | | -## Features |
| 16 | +Check out [**PoPo**](https://popo.love) - AI-powered text-to-MMD pose generation. Transform natural language descriptions into MMD poses instantly. |
22 | 17 |
|
23 | | -- [x] Pose detection |
24 | | -- [x] Face detection |
25 | | -- [x] Hand detection (experimental) |
26 | | -- [x] Rust-WASM based pose-to-quaternion solver |
27 | | -- [x] 360-degree background selection |
28 | | -- [x] Video, image upload |
29 | | -- [x] Webcam input |
30 | | -- [x] Model selection |
31 | | -- [x] Ollama support ([electron version](https://github.com/AmyangXYZ/MiKaPo-Electron)) |
32 | | -- [x] VMD import/export (to export a valid VMD file, you must record at least one motion) |
33 | | -- [x] MMD editor: bone, material, mesh edit |
| 18 | +## Key Features |
34 | 19 |
|
35 | | -## Hint |
| 20 | +- **Real-time pose detection** using MediaPipe Pose |
| 21 | +- **Face and hand tracking** for comprehensive motion capture |
| 22 | +- **Multiple input sources**: webcam, video files, and image uploads |
| 23 | +- **Live MMD model rendering** with synchronized bone animations |
36 | 24 |
|
37 | | -- Let your browser use dedicated GPU for better performance. |
| 25 | +_Legacy features from v1.0 (VMD export, bone manipulation, 360° scene environment) are not yet available in 2.0 and will return in future updates._
38 | 26 |
|
39 | | -## Project Setup |
| 27 | +## Technical Stack |
40 | 28 |
|
41 | | -```sh |
42 | | -npm install |
43 | | -``` |
| 29 | +- **3D Pose Detection**: [MediaPipe Pose Landmarker](https://ai.google.dev/edge/mediapipe/solutions/vision/pose_landmarker/web_js) |
| 30 | +- **3D Graphics Engine**: [Babylon.js](https://www.babylonjs.com/) |
| 31 | +- **MMD Integration**: [babylon-mmd](https://github.com/noname0310/babylon-mmd) |
| 32 | +- **Web Framework**: [Next.js](https://nextjs.org/) |
44 | 33 |
|
45 | | -### Compile and Hot-Reload for Development |
| 34 | +## Core Challenge |
46 | 35 |
|
47 | | -```sh |
48 | | -npm run dev |
49 | | -``` |
| 36 | +The primary technical challenge is converting world-space 3D landmarks into MMD bone quaternion rotations. This requires:
50 | 37 |
|
51 | | -### Type-Check, Compile and Minify for Production |
| 38 | +- Converting MediaPipe's coordinate system to MMD's bone space |
| 39 | +- Handling MMD's unique bone direction conventions |
| 40 | +- Computing accurate quaternion rotations for smooth animations |
| 41 | +- Maintaining temporal consistency across frames |
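The README does not say how temporal consistency is maintained. One common option, shown here as a minimal sketch and not as MiKaPo's actual method, is to blend each newly solved bone rotation with the previous frame's rotation using normalized linear interpolation. The `Quat` type, the `smoothRotation` name, and the `alpha` parameter are all hypothetical.

```typescript
// Hypothetical minimal quaternion type; a real project would likely use Babylon.js's Quaternion.
type Quat = { x: number; y: number; z: number; w: number }

// Blend the previous frame's rotation toward the newly solved one.
// alpha = 0 keeps the previous rotation, alpha = 1 jumps straight to the new one.
function smoothRotation(prev: Quat, next: Quat, alpha: number): Quat {
  // q and -q encode the same rotation; flip `next` if needed so the blend takes the short way round.
  const d = prev.x * next.x + prev.y * next.y + prev.z * next.z + prev.w * next.w
  const sign = d < 0 ? -1 : 1

  // Component-wise linear interpolation.
  const x = prev.x + alpha * (sign * next.x - prev.x)
  const y = prev.y + alpha * (sign * next.y - prev.y)
  const z = prev.z + alpha * (sign * next.z - prev.z)
  const w = prev.w + alpha * (sign * next.w - prev.w)

  // Renormalize so the result is a valid unit rotation again.
  const len = Math.hypot(x, y, z, w)
  return { x: x / len, y: y / len, z: z / len, w: w / len }
}

// Usage: blend halfway between "no rotation" and a 90-degree turn about the z axis.
const s = Math.SQRT1_2
const blended = smoothRotation({ x: 0, y: 0, z: 0, w: 1 }, { x: 0, y: 0, z: s, w: s }, 0.5)
console.log(blended)
```

A smaller `alpha` gives steadier but laggier motion, so the value is a trade-off between jitter and responsiveness.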
52 | 42 |
|
53 | | -```sh |
54 | | -npm run build |
55 | | -``` |
| 43 | +## Technical Solution |
| 44 | + |
| 45 | +The solver implements a hierarchical transformation approach that maps MediaPipe's world-space landmarks to MMD bone rotations: |
| 46 | + |
| 47 | +```typescript |
| 48 | +// Key Algorithm Pseudocode |
| 49 | +function solveBoneRotation(boneName: string, landmarkName: string, targetLandmarkName: string, parentChain: string[]): Quaternion {
| 50 | + // 1. Get world-space landmarks from MediaPipe |
| 51 | + const worldLandmark = getMediaPipeLandmark(landmarkName) |
| 52 | + const worldTarget = getMediaPipeLandmark(targetLandmarkName) |
| 53 | + |
| 54 | + // 2. Build full parent bone hierarchy chain (not just immediate parent) |
| 55 | + const fullParentQuat = parentChain.reduce( |
| 56 | + (acc, parent) => acc.multiply(boneStates[parent].rotation), |
| 57 | + Quaternion.Identity() |
| 58 | + ) |
56 | 59 |
|
57 | | -### Lint with [ESLint](https://eslint.org/) |
| 60 | + // 3. Transform world landmarks to parent's local space |
| 61 | + const parentMatrix = Matrix.FromQuaternion(fullParentQuat).invert() |
| 62 | + const localLandmark = Vector3.TransformCoordinates(worldLandmark, parentMatrix) |
| 63 | + const localTarget = Vector3.TransformCoordinates(worldTarget, parentMatrix) |
58 | 64 |
|
59 | | -```sh |
60 | | -npm run lint |
| 65 | + // 4. Calculate bone direction in local space |
| 66 | + const boneDirection = localTarget.subtract(localLandmark).normalize() |
| 67 | + |
| 68 | + // 5. Look up the MMD bone's default A-pose reference direction
| 69 | + const mmdReferenceDirection = getMMDDefaultDirection(boneName)
| 70 | +
| 71 | + // 6. Compute quaternion rotation from reference to current direction
| 72 | + return Quaternion.FromUnitVectors(mmdReferenceDirection, boneDirection)
| 73 | +} |
| 74 | + |
| 75 | +// Example: Left wrist transformation chain |
| 76 | +// Parent hierarchy: upper_body → left_arm → left_elbow → left_wrist |
| 77 | +// Each bone's rotation is computed in its parent's local space |
61 | 78 | ``` |
| 79 | + |
| 80 | +This approach ensures accurate bone rotations by: |
| 81 | + |
| 82 | +- **Hierarchical Transformation**: Each bone is solved in its full parent chain's local space |
| 83 | +- **MMD A-Pose Alignment**: Reference directions match MMD's default bone orientations |
| 84 | +- **Coordinate System Conversion**: Converts landmarks from MediaPipe's world-space coordinate system into MMD's bone space
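The final step of the pseudocode, rotating a reference direction onto the measured bone direction, can be sketched without any library. This is an illustrative shortest-arc construction, not necessarily the one MiKaPo or Babylon.js uses. The `Vec3` and `Quat` types and the `fromUnitVectors` and `rotate` helpers are hypothetical names defined here so the example is self-contained.

```typescript
// Hypothetical minimal types so the sketch runs without Babylon.js.
type Vec3 = { x: number; y: number; z: number }
type Quat = { x: number; y: number; z: number; w: number }

const dot = (a: Vec3, b: Vec3): number => a.x * b.x + a.y * b.y + a.z * b.z
const cross = (a: Vec3, b: Vec3): Vec3 => ({
  x: a.y * b.z - a.z * b.y,
  y: a.z * b.x - a.x * b.z,
  z: a.x * b.y - a.y * b.x,
})
const normalize = (v: Vec3): Vec3 => {
  const len = Math.hypot(v.x, v.y, v.z)
  return { x: v.x / len, y: v.y / len, z: v.z / len }
}

// Shortest-arc quaternion that rotates unit vector `from` onto unit vector `to`.
function fromUnitVectors(from: Vec3, to: Vec3): Quat {
  const d = dot(from, to)
  if (d < -1 + 1e-6) {
    // Opposite vectors: turn 180 degrees about any axis perpendicular to `from`.
    const helper: Vec3 = Math.abs(from.x) < 0.9 ? { x: 1, y: 0, z: 0 } : { x: 0, y: 1, z: 0 }
    const axis = normalize(cross(from, helper))
    return { x: axis.x, y: axis.y, z: axis.z, w: 0 }
  }
  // (from x to, 1 + from . to) is proportional to the half-angle quaternion.
  const c = cross(from, to)
  const w = 1 + d
  const len = Math.hypot(c.x, c.y, c.z, w)
  return { x: c.x / len, y: c.y / len, z: c.z / len, w: w / len }
}

// Rotate v by unit quaternion q using v' = v + w*t + qv x t, with t = 2 * (qv x v).
function rotate(q: Quat, v: Vec3): Vec3 {
  const qv: Vec3 = { x: q.x, y: q.y, z: q.z }
  const c = cross(qv, v)
  const t: Vec3 = { x: 2 * c.x, y: 2 * c.y, z: 2 * c.z }
  const u = cross(qv, t)
  return { x: v.x + q.w * t.x + u.x, y: v.y + q.w * t.y + u.y, z: v.z + q.w * t.z + u.z }
}

// Usage: a reference direction along +x and a measured bone direction along +y.
const reference: Vec3 = { x: 1, y: 0, z: 0 }
const boneDirection: Vec3 = normalize({ x: 0, y: 2, z: 0 })
const rotation = fromUnitVectors(reference, boneDirection)
console.log(rotate(rotation, reference)) // approximately { x: 0, y: 1, z: 0 }
```

A single direction vector fixes only two of a bone's three rotational degrees of freedom, so a construction like this leaves the twist around the bone's own axis undetermined.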