Add Camera Pose op #894

Merged
HYLcool merged 10 commits into main from dev/camera_pose_op
Feb 11, 2026

@Qirui-jiao
Collaborator

  • video_camera_pose_mapper: Extract camera poses with MegaSaM and MoGe-2.
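
For readers new to the term, a camera pose is commonly written as a 4x4 rigid transform built from a rotation and a translation. The sketch below only illustrates that representation. The PR description does not state which convention or output format video_camera_pose_mapper uses, so the camera-to-world convention and every helper name here (`make_pose`, `invert_pose`, `transform_point`, `yaw_rotation`) are assumptions for illustration, not the operator's API.

```python
import math


def make_pose(rotation, translation):
    """Build a 4x4 matrix [R | t; 0 0 0 1] from a 3x3 rotation and a 3-vector.

    Assumption: the pose maps camera coordinates to world coordinates.
    """
    pose = [list(rotation[i]) + [translation[i]] for i in range(3)]
    pose.append([0.0, 0.0, 0.0, 1.0])
    return pose


def invert_pose(pose):
    """Invert a rigid transform: the inverse rotation is R^T, the inverse translation is -R^T t."""
    r_t = [[pose[j][i] for j in range(3)] for i in range(3)]
    t = [pose[i][3] for i in range(3)]
    new_t = [-sum(r_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return make_pose(r_t, new_t)


def transform_point(pose, point):
    """Apply the pose to a 3D point: R @ p + t."""
    return [
        sum(pose[i][k] * point[k] for k in range(3)) + pose[i][3]
        for i in range(3)
    ]


def yaw_rotation(angle_rad):
    """Rotation about the y axis, used here only to get a non-trivial example."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
```

Mapping a point into world coordinates with a pose and then applying `invert_pose` of the same pose should return the original point, which is a convenient sanity check when comparing pose conventions between tools.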

@Qirui-jiao added the enhancement (New feature or request), dj:multimodal (issues/PRs about multimodal data processing), and dj:op (issues/PRs about some specific OPs) labels on Jan 27, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @Qirui-jiao, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the video processing functionalities within the data_juicer library by introducing several new mappers. These additions enable advanced tasks such as extracting camera poses, performing static camera calibration using state-of-the-art models like DeepCalib and MoGe-2, and undistorting video footage. The changes aim to provide more robust tools for handling and preparing video data, particularly for applications requiring precise camera information.

Highlights

  • New Video Mappers: Introduced video_camera_calibration_static_deepcalib_mapper, video_camera_calibration_static_moge_mapper, video_undistort_mapper, and video_camera_pose_mapper to enhance video processing capabilities.
  • Camera Pose Extraction: Added a dedicated mapper (video_camera_pose_mapper) to extract camera poses using MegaSaM and MoGe-2 models.
  • Static Camera Calibration: Implemented mappers for static camera calibration using both DeepCalib and MoGe-2, providing options for different accuracy levels.
  • Video Undistortion: Introduced a mapper to undistort raw videos using provided camera intrinsics and distortion coefficients.
  • Configuration and Integration: Updated config_all.yaml and __init__.py to integrate these new mappers and added new MetaKeys in constant.py for better metadata management.
  • Model Preparation Utilities: Enhanced model_utils.py with functions to prepare DeepCalib and MoGe models.
  • Dependency Fix: Corrected a chumpy package installation string in video_hand_reconstruction_mapper.py.
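
As a rough illustration of what undistorting with camera intrinsics and distortion coefficients involves, the sketch below inverts a two-coefficient radial distortion model for a single pixel. It is a minimal sketch under stated assumptions: the summary does not say which distortion model or library video_undistort_mapper uses, the function names and parameters (`fx`, `fy`, `cx`, `cy`, `k1`, `k2`) are hypothetical, and a real implementation would more likely call a library routine on whole frames.

```python
def distort_pixel(u, v, fx, fy, cx, cy, k1, k2):
    """Apply radial distortion to an ideal (undistorted) pixel.

    Assumed model: x_d = x * (1 + k1 * r^2 + k2 * r^4) in normalized coordinates.
    """
    x = (u - cx) / fx
    y = (v - cy) / fy
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale * fx + cx, y * scale * fy + cy


def undistort_pixel(u, v, fx, fy, cx, cy, k1, k2, iterations=20):
    """Invert the radial model with fixed-point iteration.

    The distortion has no closed-form inverse, so we start from the
    distorted point and repeatedly divide by the current scale estimate.
    This converges quickly when the distortion is mild.
    """
    xd = (u - cx) / fx
    yd = (v - cy) / fy
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x * fx + cx, y * fy + cy
```

Distorting a pixel and then undistorting it with the same parameters should return the starting pixel up to numerical tolerance, and the principal point `(cx, cy)` is unaffected because its radius is zero.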

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request introduces several new video processing mappers to data_juicer, including VggtMapper, VideoCameraCalibrationStaticDeepcalibMapper, VideoCameraCalibrationStaticMogeMapper, VideoUndistortMapper, and VideoWholeBodyPoseEstimationMapper. These mappers enable functionalities such as extracting camera pose and depth information, calibrating static cameras, undistorting videos, and estimating whole-body poses. The code also includes necessary dependency installations and model preparations. Review comments suggest adding checks for the existence of specific keys in the sample metadata, version compatibility for opencv-contrib-python, error handling for camera parameter initialization and file concatenation, and ensuring submodule updates succeed in VideoCameraPoseMapper.
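
One of the review suggestions above, checking that required keys exist in the sample metadata before using them, could look roughly like the following. This is a sketch only: the metadata field name `"meta"`, the key names `camera_intrinsics` and `distortion_coefficients`, and both helper functions are hypothetical placeholders, not identifiers taken from the PR or from data_juicer.

```python
# Hypothetical key names, used only for illustration.
REQUIRED_KEYS = ("camera_intrinsics", "distortion_coefficients")


def missing_meta_keys(sample, required_keys=REQUIRED_KEYS, meta_field="meta"):
    """Return the required keys that are absent from the sample's metadata.

    A missing or None metadata field is treated as an empty dict, so the
    check itself never raises.
    """
    meta = sample.get(meta_field) or {}
    return [key for key in required_keys if key not in meta]


def require_meta_keys(sample, required_keys=REQUIRED_KEYS, meta_field="meta"):
    """Return the metadata dict, or fail early with a descriptive error."""
    missing = missing_meta_keys(sample, required_keys, meta_field)
    if missing:
        raise KeyError(f"sample is missing required metadata keys: {missing}")
    return sample[meta_field]
```

Failing early with the list of missing keys tends to be easier to debug than a bare `KeyError` raised deep inside the processing code. Whether an operator should raise or skip the sample instead is a design choice this sketch does not settle.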

@Qirui-jiao
Collaborator Author

Updates on Feb 7:

  • Merge Main.

@Qirui-jiao changed the title from "[WIP] Add Camera Pose op" to "Add Camera Pose op" on Feb 6, 2026
@HYLcool merged commit 25d9d49 into main on Feb 11, 2026
5 checks passed
@github-project-automation bot moved this from Todo to Done in data-juicer on Feb 11, 2026
@yxdyc deleted the dev/camera_pose_op branch on February 12, 2026 09:50
Labels

  • dj:multimodal: issues/PRs about multimodal data processing
  • dj:op: issues/PRs about some specific OPs
  • enhancement: New feature or request

2 participants