Working GAIL, pretrain RL models and hotfix for A2C with continuous actions
- fixed various bugs in GAIL
- added scripts to generate dataset for GAIL
- added tests for GAIL + data for Pendulum-v0
- removed unused `utils` file in DQN folder
- fixed a bug in A2C where actions were cast to `int32` even in the continuous case
- added additional logging to A2C when Monitor wrapper is used
- changed logging for PPO2: do not display NaN when reward info is not present
- changed default value of A2C lr schedule
- removed behavior cloning script
- added `pretrain` method to base class, in order to use behavior cloning on all models
- fixed `close()` method for DummyVecEnv.
- added support for Dict spaces in DummyVecEnv and SubprocVecEnv. (@AdamGleave)
- added support for arbitrary multiprocessing start methods and added a warning that SubprocVecEnv is not thread-safe by default. (@AdamGleave)
- added support for Discrete actions for GAIL
- fixed deprecation warning for tf: replaces `tf.to_float()` by `tf.cast()`
- fixed bug in saving and loading ddpg model when using normalization of obs or returns (@tperol)
- changed DDPG default buffer size from 100 to 50000.
- fixed a bug in `ddpg.py` in `combined_stats` for eval. Computed mean on `eval_episode_rewards` and `eval_qs` (@keshaviyengar)
- fixed a bug in `setup.py` that would error on non-GPU systems without TensorFlow installed
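
To illustrate the A2C hotfix listed above, here is a minimal sketch of why an unconditional integer cast breaks continuous actions. It is not the library's code: `prepare_actions` and its `continuous` flag are hypothetical names, and plain Python lists stand in for the action arrays.

```python
from typing import List, Sequence, Union

Number = Union[int, float]


def prepare_actions(actions: Sequence[Number], continuous: bool) -> List[Number]:
    """Hypothetical helper: convert sampled actions before stepping the env."""
    if continuous:
        # Continuous (Box) action spaces: keep the floating-point values.
        return [float(a) for a in actions]
    # Discrete action spaces: actions are indices, so an integer cast is safe.
    return [int(a) for a in actions]


sampled = [0.73, -1.42, 1.98]

# Unconditional integer cast (the kind of bug described above):
# the fractional part is truncated toward zero.
buggy = [int(a) for a in sampled]

# Cast only when the action space is discrete.
fixed = prepare_actions(sampled, continuous=True)

print(buggy)
print(fixed)
```

With the unconditional cast, an environment with a continuous action space such as Pendulum-v0 would only ever receive whole-number actions, so the policy could not express fine-grained control.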
Welcome to @AdamGleave who joins the maintainer team.