CTP implements a number of machine learning algorithms that can be used in different ways.
General procedure:
- create a dataset or run an RL algorithm
- fit a model
- test
We need some data to work with. Let's start by making a small data set using both joint data and RGB images.
rosrun costar_bullet start --robot ur5 --task blocks --agent task \
--features multi --save -i 10 --data_file small.npz --seed 1
When we want to create a bigger data set, we can do so easily with:
rosrun costar_bullet start --robot ur5 --task blocks --agent task \
--features multi --save -i 100 --data_file big.npz --seed 1
This second one will include 100 trials of the robot picking and placing random blocks, as per the "blocks" task.
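Before training on a data file, it can help to check what it actually contains. The following is a minimal sketch using only `numpy.load`. The array names stored in a real CTP file are not listed here, so the example writes a stand-in file with hypothetical keys (`features`, `arm`, `gripper`) and hypothetical shapes, then reports whatever it finds.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    summary = {}
    with np.load(path) as data:
        for name in data.files:
            array = data[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


# Stand-in file: these keys and shapes are hypothetical, not CTP's real layout.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(
    demo_path,
    features=np.zeros((10, 64, 64, 3), dtype=np.uint8),
    arm=np.zeros((10, 6), dtype=np.float32),
    gripper=np.zeros((10, 1), dtype=np.float32),
)

demo_summary = summarize_npz(demo_path)
for name, (shape, dtype) in sorted(demo_summary.items()):
    print(name, shape, dtype)
```

Pointing `summarize_npz` at small.npz or big.npz instead of the stand-in file should list the arrays that were really saved.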
So, what do these arguments mean?
- --save tells us we should save a dataset
- --features multi says we should use camera input, arm joint state, and gripper state
- --robot ur5 says we should use the UR5 with the 2-finger Robotiq gripper
- --task blocks says we are interested in performing the blocks task
- --agent task says we want an "expert" task model to generate data rather than a human or RL algorithm
- -i 10 says we want to run for 10 iterations, i.e. on 10 different random environments
- --seed 1 tells us to initialize the numpy random number generator at each iteration, in order to generate a consistent set of examples. The randomizer is initialized to (seed+iteration_num) before environment setup.
- --no_ssm disables the spatial softmax algorithm in the model. The network performs similarly with spatial softmax enabled and disabled, with a very small quantitative difference at the pixel level and very little qualitative difference.
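The per-iteration seeding rule for --seed can be illustrated with a short sketch. Only the (seed+iteration_num) rule comes from the description of the flag; the function name `sample_environment` and the block-position sampling are hypothetical stand-ins for whatever CTP does during environment setup.

```python
import numpy as np


def sample_environment(seed, iteration_num, num_blocks=3):
    """Draw random block positions for one iteration after reseeding NumPy.

    The (seed + iteration_num) rule is taken from the --seed description;
    the sampling itself is a hypothetical stand-in for environment setup.
    """
    np.random.seed(seed + iteration_num)
    return np.random.uniform(-0.5, 0.5, size=(num_blocks, 2))


# Two independent "runs" with the same seed.
run_a = [sample_environment(1, i) for i in range(10)]
run_b = [sample_environment(1, i) for i in range(10)]

# Same seed: each iteration reproduces the same environment across runs.
same = all(np.array_equal(a, b) for a, b in zip(run_a, run_b))
# Different iterations within one run still get different environments.
distinct = not np.array_equal(run_a[0], run_a[1])
print(same, distinct)
```

Reseeding at every iteration, rather than once at startup, means iteration k looks the same regardless of how many random draws earlier iterations consumed.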
Examples of preconfigured training commands can be found in costar_plan/commands.
The most straightforward approach we can take to getting a robot that will act on our environment is behavioral cloning. As long as we have a Task Agent that can generate decent policies, this is a reasonable starting point.
To fit such a model, run:
rosrun costar_bullet start --robot ur5 --agent null --features multi --load \
--model ff_regression --si 5 -i 1000 --batch_size 64 --data_file big.npz
To try it out, run:
rosrun costar_bullet start --robot ur5 --task blocks --agent ff \
--features multi -i 1000 --model ff_regression --gui
Note that the model architecture (ff_regression) and the agent that executes it (ff) are specified separately.
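At its core, behavioral cloning is supervised regression from observed states to the expert's actions. The sketch below shows that idea with a linear least-squares fit in NumPy. It is not the ff_regression network: the synthetic expert (`true_gain`), the state and action dimensions, and the helper `cloned_policy` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expert: a fixed linear controller (3 state dims -> 2 action dims)
# with a little noise, standing in for data recorded from a task agent.
true_gain = np.array([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]])
states = rng.normal(size=(500, 3))
expert_actions = states @ true_gain + 0.01 * rng.normal(size=(500, 2))

# Behavioral cloning: fit a map from state to expert action by least squares.
learned_gain, *_ = np.linalg.lstsq(states, expert_actions, rcond=None)


def cloned_policy(state):
    """Predict the action the expert would likely take in this state."""
    return state @ learned_gain


test_state = np.array([1.0, -1.0, 0.5])
print("cloned:", cloned_policy(test_state))
print("expert:", test_state @ true_gain)
```

A neural network replaces the linear map when the inputs are images and joint states, but the training signal is the same: match the expert's recorded action for each recorded observation.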
This is a simple approach: we train the network to map an input image directly to an output image.
rosrun costar_bullet start --robot ur5 --agent null --features multi --load \
--model autoencoder --data_file small.npz --si 5 -i 1000 --batch_size 64
Above is an example of what this might look like after about 300 iterations. That's not very long, but we can still clearly see the robot's shape, and the net is starting to learn about block color.
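To make the autoencoder idea concrete, here is a minimal sketch of a linear autoencoder in NumPy. It is not the CTP autoencoder model. It relies on the known result that, under squared error, the best linear encoder and decoder span the top principal components, so the weights come from an SVD instead of gradient training. The synthetic 8x8 "images" and the 4-dimensional code size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 flattened 8x8 "images" lying near a 4-dim subspace.
latent = rng.normal(size=(200, 4))
basis = rng.normal(size=(4, 64))
images = latent @ basis + 0.01 * rng.normal(size=(200, 64))

mean = images.mean(axis=0)
centered = images - mean

# Top right-singular vectors give the optimal linear encoder/decoder weights.
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = vt[:4]  # shape (4, 64)


def encode(x):
    """Compress flattened images to a 4-dimensional code."""
    return (x - mean) @ components.T


def decode(z):
    """Reconstruct flattened images from their codes."""
    return z @ components + mean


reconstruction = decode(encode(images))
mse = float(np.mean((reconstruction - images) ** 2))
print("reconstruction MSE:", mse)
```

A convolutional autoencoder trained on camera frames follows the same encode-then-decode pattern, with nonlinear layers in place of the two matrix products.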
rosrun costar_bullet start --robot ur5 --agent null --features multi --load \
--model unsupervised --data_file big.npz --si 5 -i 1000 --batch_size 64
GANs are a popular approach these days, and luckily it's pretty easy to train one with CTP:
rosrun costar_bullet start --robot ur5 --agent null --features multi --load \
--model gan --data_file small.npz --si 5 -i 1000 --batch_size 64 \
--optimizer nadam --lr 0.001
This will load our small data set and run for 1000 iterations, using the Nadam optimizer (Adam with Nesterov momentum) at a learning rate of 0.001. Go ahead and play with the settings.
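For readers unfamiliar with Nadam, the sketch below shows one commonly cited simplified form of the update (Adam's bias-corrected moments plus a Nesterov-style look-ahead on the momentum term), applied to a toy quadratic. The function `nadam_minimize` is hypothetical, and deep learning libraries may differ in details such as momentum scheduling, so treat this as an illustration and not as the optimizer CTP invokes.

```python
import numpy as np


def nadam_minimize(grad_fn, x0, lr=0.001, beta1=0.9, beta2=0.999,
                   eps=1e-8, steps=1000):
    """Minimize a function from its gradient using a simplified Nadam update."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)  # running mean of gradients (momentum)
    v = np.zeros_like(x)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction, as in Adam
        v_hat = v / (1 - beta2 ** t)
        # Nesterov look-ahead: blend corrected momentum with the current gradient.
        look_ahead = beta1 * m_hat + (1 - beta1) * g / (1 - beta1 ** t)
        x -= lr * look_ahead / (np.sqrt(v_hat) + eps)
    return x


# Toy problem: minimize the squared distance to a fixed target.
target = np.array([1.0, -2.0])


def loss(x):
    return float(np.sum((x - target) ** 2))


def grad(x):
    return 2.0 * (x - target)


start = np.zeros(2)
solution = nadam_minimize(grad, start, lr=0.01, steps=3000)
initial_loss, final_loss = loss(start), loss(solution)
print(initial_loss, final_loss)
```

The toy run uses a larger learning rate than the 0.001 in the command above so that it converges in a few thousand steps.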
These are software libraries that might be worth a look.
[For developers] Libraries referenced, but not needed as prerequisites:
- Tensorflow-Reinforce
- Guided Policy Search - traj_opt directly included
- KeRLym - referenced during Trainer implementation
TODO
