This repository provides code for working with the UFactory xArm manipulators in the Stanford MSL. It includes:

- Model training and fine-tuning support (data preparation, training scripts, etc.) for:
  - GR00T N1.5
  - ACT (Action Chunking with Transformers)
  - Diffusion Policy
- Data collection and visualization utilities using ROS, RealSense, and VRPN for demonstration recording.
- A modified version of the official xArm Python SDK, adapted for integration with MSL workflows:
  - xArm-MSL-ROS
PS: Trained model checkpoints for the MSL xArm can be found at: Groot Checkpoints
Use my GR00T repo, which differs from the original GR00T repo:
- Custom config specifying action representation and action horizon
Inside Docker:

```
dockerrunxarm
cd /xArm-MSL-ROS/msl_scripts/groot_data_prepare
python convert_data_for_groot.py
python generate_info_json.py
python generate_episode_task_json.py
python generate_modality_json.py
```

IMPORTANT NOTES:

- The gripper value is scaled to 0–1 by `convert_data_for_groot.py`; it is no longer 0–850 (the raw value from the xArm API).
- The state/action input format is: `[orientation (6D), position, gripper (0–1)]`
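As a minimal sketch of this convention (the function name and clipping behavior below are assumptions for illustration, not the actual code in `convert_data_for_groot.py`), the 10-dimensional state/action vector can be assembled like this:

```python
GRIPPER_MAX = 850.0  # raw gripper range reported by the xArm API

def to_groot_state(orientation_6d, position, gripper_raw):
    """Assemble [orientation (6D), position (3D), gripper (0-1)]."""
    gripper = min(max(gripper_raw / GRIPPER_MAX, 0.0), 1.0)
    return list(orientation_6d) + list(position) + [gripper]

state = to_groot_state([1, 0, 0, 0, 1, 0], [0.3, 0.0, 0.2], 425.0)
# len(state) == 10, state[-1] == 0.5
```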
Outside Docker:

```
conda activate gr00t
```

Start fine-tuning:

```
python scripts/gr00t_finetune.py \
    --dataset-path ~/bags/msl_bags/converted_groot_data_absolute \
    --output-dir ~/bags/msl_bags/groot_checkpoints/xarm_pick_place_absolute_pose_run6_batch_16_horizon_100 \
    --data-config xarm_dualcam_h100 \
    --embodiment-tag oxe_droid \
    --num-gpus 1 \
    --no-tune_diffusion_model \
    --max-steps 200000 \
    --batch-size 16
```
To resume fine-tuning:

- Edit the file `/home/xarm/anaconda3_outside_docker/envs/gr00t/lib/python3.10/site-packages/transformers/trainer.py` and change this line:

  ```
  checkpoint_rng_state = torch.load(rng_file, weights_only=True)
  ```

  to:

  ```
  checkpoint_rng_state = torch.load(rng_file, weights_only=False)
  ```

- Add the `--resume` flag:
```
python scripts/gr00t_finetune.py \
    --dataset-path ~/bags/msl_bags/converted_groot_data_absolute \
    --output-dir ~/bags/msl_bags/groot_checkpoints/xarm_pick_place_absolute_pose_run6_batch_16_horizon_100 \
    --data-config xarm_dualcam_h100 \
    --embodiment-tag oxe_droid \
    --num-gpus 1 \
    --no-tune_diffusion_model \
    --max-steps 200000 \
    --batch-size 16 \
    --resume
```

Inside Docker:

```
python extract_images_poses_from_bags.py
```

NOTE: State/action inputs here are in the format: `[position, orientation (6D), gripper (0–850)]`
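Note that this ordering and gripper scale differ from the GROOT training format above. A minimal sketch of the reordering/rescaling (illustrative only; the function name is an assumption, not code from the repo):

```python
def raw_to_groot(raw):
    """Reorder a raw [position (3), orientation (6D), gripper (0-850)]
    vector into the GROOT training layout
    [orientation (6D), position (3), gripper (0-1)]."""
    position, orientation, gripper = raw[0:3], raw[3:9], raw[9]
    return orientation + position + [gripper / 850.0]

raw = [0.3, 0.0, 0.2,  1, 0, 0, 0, 1, 0,  425.0]
# raw_to_groot(raw) -> [1, 0, 0, 0, 1, 0, 0.3, 0.0, 0.2, 0.5]
```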
Outside Docker:

```
python eval_groot_xarm_pick_place.py
```

3D plot:

```
python trajectory_plot_3D_rollout_groot.py
```

Similarity plot: edit `trajectory_similarity_plot_3D_rollout_groot_act.py`, set `for_act = False`, then run:

```
python trajectory_similarity_plot_3D_rollout_groot_act.py
```

Evaluate a checkpoint on the converted dataset:

```
python scripts/eval_policy.py \
    --model_path ~/bags/msl_bags/groot_checkpoints/xarm_pick_place_absolute_pose/checkpoint-25000 \
    --data_config xarm_dualcam_h100 \
    --dataset_path ~/bags/msl_bags/converted_groot_data \
    --embodiment_tag oxe_droid \
    --video_backend torchvision_av \
    --modality_keys single_arm gripper \
    --plot
```

PS: Trained model checkpoints for the MSL xArm can be found at: ACT Checkpoints
Use my ACT repo, which differs from the original ACT repo:
- Action representation = end-effector pose + gripper (10-dim) instead of joint-space (14-dim)
- Updated data loader for custom observations
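The 10-dimensional action concatenates end-effector position (3), orientation (6), and gripper (1). Assuming the standard continuous 6D rotation parameterization (the first two columns of the rotation matrix) — an assumption about this repo, not stated above — the full rotation can be recovered by Gram–Schmidt, sketched here:

```python
import math

def rotation_from_6d(r6):
    """Recover the three rotation-matrix columns from a 6D orientation
    via Gram-Schmidt (assumes the 6D vector holds the first two columns)."""
    a1, a2 = r6[0:3], r6[3:6]
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    b1 = [x / norm(a1) for x in a1]                     # normalize column 1
    d = sum(x * y for x, y in zip(b1, a2))
    u2 = [x - d * y for x, y in zip(a2, b1)]            # remove b1 component
    b2 = [x / norm(u2) for x in u2]                     # normalize column 2
    b3 = [b1[1] * b2[2] - b1[2] * b2[1],                # b3 = b1 x b2
          b1[2] * b2[0] - b1[0] * b2[2],
          b1[0] * b2[1] - b1[1] * b2[0]]
    return b1, b2, b3

# Identity example: [1,0,0, 0,1,0] yields third column [0, 0, 1]
```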
Inside Docker:

```
dockerrunxarm
cd /xArm-MSL-ROS/msl_scripts/act_data_prepare
python convert_data_for_act.py
python generate_act_config.py
```

IMPORTANT NOTES:

- Gripper values are NOT scaled; they remain 0–850.
- Input format: `[position, orientation (6D), gripper (0–850)]`
Outside Docker:

```
conda activate aloha
cd act
```

Train:
```
python imitate_episodes.py \
    --task_name xarm_pick_place \
    --ckpt_dir /home/xarm/bags/msl_bags/act_checkpoints/absolute_action_run3 \
    --policy_class ACT \
    --chunk_size 100 \
    --batch_size 16 \
    --num_epochs 50000 \
    --lr 1e-5 \
    --dim_feedforward 3200 \
    --hidden_dim 512 \
    --seed 0 \
    --kl_weight 0
```

To resume from a checkpoint, add:

```
--resume_ckpt_path /home/xarm/bags/msl_bags/act_checkpoints/absolute_action/policy_epoch_30600_seed_0.ckpt
```

To train with relative actions:

- Set the relative flag in `convert_data_for_act.py`
- Update `--ckpt_dir`
- Edit `constants.py` → `REAL_DATASET_DIR` to point to the relative dataset
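A minimal sketch of what a relative-action conversion typically does (illustrative only; the actual logic lives in `convert_data_for_act.py` and also handles orientation and the gripper):

```python
def to_relative_actions(abs_positions):
    """Convert a sequence of absolute end-effector positions into
    per-step deltas, so the policy predicts motions instead of poses."""
    return [
        [c - p for c, p in zip(curr, prev)]
        for prev, curr in zip(abs_positions, abs_positions[1:])
    ]

# to_relative_actions([[0,0,0], [1,0,0], [1,2,0]]) -> [[1,0,0], [0,2,0]]
```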
Inside Docker:

```
python extract_images_poses_from_bags.py
```

Outside Docker:

```
python imitate_episodes.py \
    --eval \
    --task_name xarm_pick_place \
    --ckpt_dir /home/xarm/bags/msl_bags/act_checkpoints/absolute_action_run4 \
    --policy_class ACT \
    --chunk_size 100 \
    --batch_size 16 \
    --num_epochs 50000 \
    --lr 1e-5 \
    --dim_feedforward 3200 \
    --hidden_dim 512 \
    --seed 0 \
    --kl_weight 0 \
    --inference_dataset_dir /home/xarm/bags/msl_bags/IMPORTANT-distribution-pick-and-place-raw-bags-30/extracted_images_and_pose
```

3D plot:

```
python trajectory_plot_3D_rollout_act.py
```

Similarity plot: edit `trajectory_similarity_plot_3D_rollout_groot_act.py`, set `for_act = True`, then run:

```
python trajectory_similarity_plot_3D_rollout_groot_act.py
```

Start Docker:
```
dockerrunxarm
```

If the container already exists and fails to start:

```
dockercommitxarm && dockerpushxarm && dockerrunxarmwithremove
```

Start tmux and open multiple windows:

```
tmux
Ctrl+b c   # create a new window
Ctrl+b n   # switch to the next window
```

In separate windows:
- Start ROS:

  ```
  roscore
  ```

- Launch the cameras:

  ```
  cd xArm-MSL-ROS/
  roslaunch msl_scripts/launch_two_realsense_cameras.launch
  ```

  (Optional: use rqt to check the camera feed, then close the viewer.)

Important: one person should remain at the robot's safety switch.

Enable the robot at: http://192.168.1.219:18333
```
cd diffusion_policy/
conda-init
conda activate robodiff
python eval_xarm_msl_ros.py
```

Outside Docker:

```
dockercommitxarm
dockerpushxarm
docker rm xarm
```

```
cd ~/xArm-MSL-ROS
roslaunch msl_scripts/launch_two_realsense_cameras.launch
roslaunch vrpn_client_ros sample.launch
python ./xArm-Python-SDK/example/wrapper/xarm6/xarm_msl_ros.py
```

Check pose estimates:

```
rostopic echo /vrpn_client_node/drone1/pose
```

Visualize data:

```
rqt_image_view
```

Record demo data:
```
rosbag record -o xarm_demo \
    /robot_end_effector_pose \
    /wrist_camera/color/image_raw \
    /wrist_camera/color/camera_info \
    /fixed_camera/color/camera_info \
    /fixed_camera/color/image_raw \
    /tf /tf_static
```

See README.md in the xArm-Python-SDK subfolder for more details.
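When post-processing these bags, pose and image messages arrive at different rates and must be aligned by timestamp. A minimal nearest-timestamp matcher is sketched below (an assumption about the approach; the repo's extraction scripts may align topics differently):

```python
import bisect

def match_nearest(pose_stamps, image_stamps):
    """For each image timestamp, return the index of the nearest pose
    timestamp. pose_stamps must be sorted in ascending order."""
    matches = []
    for t in image_stamps:
        i = bisect.bisect_left(pose_stamps, t)
        # Candidate neighbors: the entry just before and at the insertion point
        candidates = [j for j in (i - 1, i) if 0 <= j < len(pose_stamps)]
        matches.append(min(candidates, key=lambda j: abs(pose_stamps[j] - t)))
    return matches

# match_nearest([0.0, 0.1, 0.2], [0.04, 0.19]) -> [0, 2]
```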