Author Contact: [email protected]
This directory contains an implementation of Diffusion Policy for robotic manipulation tasks. Diffusion Policy is a powerful behavior cloning method that models action sequences with diffusion models. This implementation includes receding horizon optimization and supports both MLP and U-Net architectures. The basic PyBullet robot arm environments (trained with reinforcement learning) are adapted from https://github.com/Shimly-2/DRL-on-robot-arm.git.
- Receding Horizon Optimization: Predicts action sequences and executes them step by step
- Flexible Network Architecture: Supports both MLP and U-Net architectures
- Multi-modal Inputs: Supports both vector states and image observations
- HER Integration: Compatible with Hindsight Experience Replay
- Detailed Logging: Comprehensive denoising process visualization
```bash
python main.py train_reach_with_DiffusionPolicy \
    --num_diffusion_steps=100 \
    --num_inference_steps=50 \
    --diffusion_lr=0.0003 \
    --horizon_steps=16 \
    --action_horizon=8
```
```bash
python main.py train_reach_with_DiffusionPolicy_CNN \
    --network_type=unet \
    --prediction_type=epsilon \
    --beta_schedule=squaredcos_cap_v2 \
    --horizon_steps=16 \
    --action_horizon=8
```
| Parameter | Default | Description |
|---|---|---|
| `num_diffusion_steps` | 100 | Number of forward diffusion steps (noise addition) |
| `num_inference_steps` | 50 | Number of reverse inference steps (denoising) |
| `diffusion_lr` | 0.0003 | Learning rate for the diffusion model |
| `beta_schedule` | "squaredcos_cap_v2" | Noise schedule: "linear" or "squaredcos_cap_v2" |
| `prediction_type` | "epsilon" | Prediction target: "epsilon" (noise) or "sample" (clean action) |
| `ema_decay` | 0.995 | Exponential moving average decay for model parameters |
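The option names above (`beta_schedule`, `prediction_type`, `clip_sample`) match the options of diffusers' `DDPMScheduler`. Assuming the implementation wraps that scheduler, a minimal configuration sketch with the defaults from the table:

```python
# Sketch only: assumes the repo wraps diffusers' DDPMScheduler,
# whose option names match the parameters in the table above.
from diffusers import DDPMScheduler

noise_scheduler = DDPMScheduler(
    num_train_timesteps=100,            # num_diffusion_steps: forward noising steps
    beta_schedule="squaredcos_cap_v2",  # cosine noise schedule
    prediction_type="epsilon",          # the network predicts the injected noise
    clip_sample=True,                   # clip predicted samples during denoising
)

# Inference can use fewer reverse steps than training used forward steps:
noise_scheduler.set_timesteps(50)       # num_inference_steps
```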
| Parameter | Default | Description |
|---|---|---|
| `horizon_steps` | 16 | Prediction horizon length: how many steps to predict at once |
| `action_horizon` | 8 | Number of actions actually executed, typically half of `horizon_steps` |
| Parameter | Default | Description |
|---|---|---|
| `network_type` | "mlp" | Network type: "mlp" or "unet" |
| `clip_sample` | True | Whether to clip predicted samples |
```bash
# Reduce inference steps for faster execution
python main.py train_reach_with_DiffusionPolicy \
    --num_diffusion_steps=50 \
    --num_inference_steps=20 \
    --horizon_steps=8 \
    --action_horizon=4
```
```bash
# Increase steps for better quality
python main.py train_reach_with_DiffusionPolicy \
    --num_diffusion_steps=200 \
    --num_inference_steps=100 \
    --horizon_steps=32 \
    --action_horizon=16
```
```bash
# CNN version for camera observations
python main.py train_reach_with_DiffusionPolicy_CNN \
    --network_type=unet \
    --num_diffusion_steps=100 \
    --num_inference_steps=50 \
    --diffusion_lr=0.0001
```
- Principle: Predict action sequences of length `horizon_steps`, but only execute the first `action_horizon` actions (see the sketch below)
- Advantage: Provides long-term planning while maintaining real-time execution
- Configuration: Typically set `action_horizon = horizon_steps // 2`
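To make the loop concrete, here is a minimal sketch of receding horizon execution; `policy.sample_actions` and the `env` API are placeholders, not this repo's exact interfaces:

```python
# Hypothetical receding horizon loop; `policy` and `env` are placeholders.
horizon_steps = 16
action_horizon = 8  # typically horizon_steps // 2

obs = env.reset()
done = False
while not done:
    # Denoise a full action sequence of length horizon_steps...
    action_seq = policy.sample_actions(obs, horizon=horizon_steps)
    # ...but execute only the first action_horizon actions, then replan
    # from the new observation.
    for action in action_seq[:action_horizon]:
        obs, reward, done, info = env.step(action)
        if done:
            break
```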
- MLP: Suitable for low-dimensional vector inputs, fast training
- U-Net: Suitable for high-dimensional image inputs, better feature extraction
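For intuition, a minimal state-conditioned MLP noise predictor; the class name, layer sizes, and timestep encoding are illustrative, not the repo's code:

```python
import torch
import torch.nn as nn

class MLPDenoiser(nn.Module):
    """Illustrative MLP that predicts the noise in a flattened action
    sequence, conditioned on the state and diffusion timestep."""
    def __init__(self, state_dim, action_dim, horizon_steps, hidden_dim=256):
        super().__init__()
        in_dim = state_dim + action_dim * horizon_steps + 1  # +1 for timestep
        out_dim = action_dim * horizon_steps
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.Mish(),
            nn.Linear(hidden_dim, hidden_dim), nn.Mish(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, noisy_actions, timesteps, state):
        # noisy_actions: (B, horizon_steps * action_dim); timesteps: (B,)
        # A crude scalar timestep encoding; real implementations typically
        # use sinusoidal embeddings.
        t = timesteps.float().unsqueeze(-1) / 100.0
        return self.net(torch.cat([noisy_actions, t, state], dim=-1))
```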
- Linear: Simple linear noise increase, suitable for quick experiments
- Cosine: More gradual noise increase, typically yields better results
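The difference is easy to inspect numerically. A sketch (again assuming diffusers' `DDPMScheduler`) that prints how much signal survives at each timestep under the two schedules:

```python
from diffusers import DDPMScheduler

for schedule in ("linear", "squaredcos_cap_v2"):
    s = DDPMScheduler(num_train_timesteps=100, beta_schedule=schedule)
    # alphas_cumprod is the fraction of signal remaining after t steps;
    # the cosine schedule decays it more gradually early on.
    print(schedule, s.alphas_cumprod[::25])
```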
- Epsilon: Predict noise, more stable training
- Sample: Predict clean actions directly, more intuitive
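A sketch of one epsilon-prediction training step; `model`, `noise_scheduler`, and the `batch` layout are placeholders consistent with the tables above, not the repo's exact code:

```python
import torch
import torch.nn.functional as F

# Hypothetical single training step with prediction_type="epsilon".
actions = batch["actions"]  # (B, horizon_steps, action_dim), placeholder batch
noise = torch.randn_like(actions)
timesteps = torch.randint(
    0, noise_scheduler.config.num_train_timesteps,
    (actions.shape[0],), device=actions.device,
)

# Forward diffusion: corrupt the clean action sequence with scheduled noise.
noisy_actions = noise_scheduler.add_noise(actions, noise, timesteps)

# With "epsilon" the network regresses the injected noise; with "sample"
# the target would be the clean `actions` instead.
pred = model(noisy_actions, timesteps, batch["obs"])
loss = F.mse_loss(pred, noise)
loss.backward()  # followed by optimizer.step() / optimizer.zero_grad()
```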
The system automatically logs the following metrics:
- `diffusion_loss`: Training loss
- `avg_return`: Average episode return
- `success_rate`: Task success rate
- Detailed denoising process logs
- Memory Issues
  - Reduce `horizon_steps` and `action_horizon`
  - Use smaller batch sizes
- Training Instability
  - Try `prediction_type="epsilon"`
  - Reduce the learning rate
  - Increase `ema_decay`
- Slow Inference
  - Reduce `num_inference_steps`
  - Use `network_type="mlp"` for simple tasks
For fast development:
- Use MLP for vector inputs
- Reduce diffusion steps during development
- Use smaller horizons for simple tasks

For best quality:
- Use U-Net for image inputs
- Increase diffusion and inference steps
- Use the cosine noise schedule
- Enable EMA for stable inference (see the sketch below)
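EMA keeps a slowly moving copy of the weights and uses that copy for inference. A minimal sketch of the update with the `ema_decay=0.995` default from the table above (illustrative, not the repo's code):

```python
import copy

import torch

@torch.no_grad()
def ema_update(ema_model, model, decay=0.995):
    """Move each EMA parameter a small step toward the live parameter."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

# ema_model = copy.deepcopy(model)  # initialize once before training, then
# call ema_update(ema_model, model) after every optimizer step and sample
# actions from ema_model at evaluation time.
```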
- Parameter Selection
  - Start with default parameters
  - Gradually adjust based on task complexity
  - Monitor success rate and convergence
- Network Choice
  - MLP: For vector states (positions, velocities)
  - U-Net: For image observations
- Horizon Setting
  - Longer horizons: Better for planning tasks
  - Shorter horizons: Better for reactive tasks
- Training Strategy
  - Use HER for sparse reward tasks
  - Monitor both loss and success rate
  - Save models with high success rates
Run the demo program to see all features:
```bash
# Full demo
python demo_diffusion.py

# Only demo feature capabilities
python demo_diffusion.py --demo features

# Only demo training process
python demo_diffusion.py --demo training
```
Q: What if the generated actions are unstable?
A: Try increasing `num_inference_steps` or using a larger `ema_decay`.

Q: What if training does not converge?
A: Adjust the learning rate `diffusion_lr`, or try the "sample" prediction type.

Q: What if training runs out of memory?
A: Reduce `batch_size`, `horizon_steps`, or `hidden_dim`.

Q: What if image inputs cause shape errors?
A: Check the input image dimensions and ensure (C, H, W) format.
- Start with the quick configuration and validate that training runs
- Gradually increase complexity
- Compare different noise schedules
- Adjust horizon parameters to observe effects
- Monitor training loss and success rate