Measure-Valued Derivatives in Reinforcement Learning

Accompanying code for the paper "An Empirical Analysis of Measure-Valued Derivatives for Policy Gradients", submitted to IJCNN 2021.

Installation

Install MuJoCo by following the instructions at https://github.com/openai/mujoco-py?tab=readme-ov-file#install-mujoco

Add the following line to your .bashrc:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin

Install everything with

bash setup.sh
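
Before running the experiments, it can help to confirm that the export line added to .bashrc has taken effect in the current shell. The following is a minimal sketch using only the Python standard library; the helper name mujoco_on_library_path is hypothetical and not part of this repository, and the check only assumes the default MuJoCo location used in the export line above.

```python
import os
from pathlib import Path


def mujoco_on_library_path(env=None, home=None):
    """Return True if $HOME/.mujoco/mujoco210/bin is listed in LD_LIBRARY_PATH.

    Hypothetical helper for illustration; `env` and `home` can be overridden
    so the check is testable without touching the real environment.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    expected = home / ".mujoco" / "mujoco210" / "bin"
    # LD_LIBRARY_PATH is a colon-separated list; skip empty entries.
    entries = [e for e in env.get("LD_LIBRARY_PATH", "").split(":") if e]
    return any(Path(e) == expected for e in entries)


if __name__ == "__main__":
    print("MuJoCo bin directory on LD_LIBRARY_PATH:", mujoco_on_library_path())
```

If this prints False, open a new shell or source .bashrc again before launching the experiments.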

Run the experiments

Test functions

python scripts/episodic/launch_episodic_test_functions.py

LQR

python scripts/lqr_pg/launch_exp_oracle_pg_lqr.py
python scripts/lqr_pg/launch_exp_oracle_pg_lqr_error.py
python scripts/lqr_pg/launch_exp_oracle_pg_lqr_error_training.py

Off-policy

python scripts/off_policy/launch_exp_ddpg.py
python scripts/off_policy/launch_exp_sac.py
python scripts/off_policy/launch_exp_sac_extra_samples.py
python scripts/off_policy/launch_exp_sac_mvd.py
python scripts/off_policy/launch_exp_sac_sf.py
python scripts/off_policy/launch_exp_sac_sf_extra_samples.py
python scripts/off_policy/launch_exp_td3.py

On-policy

python scripts/on_policy/launch_exp_tree_mvd_lunarlander.py
python scripts/on_policy/launch_exp_tree_mvd_pendulum.py
python scripts/on_policy/launch_exp_tree_mvd_room.py
python scripts/on_policy/launch_exp_trustregion_lunarlander.py
python scripts/on_policy/launch_exp_trustregion_pendulum.py
python scripts/on_policy/launch_exp_trustregion_room.py

Plot the results

mkdir -p out
python plots/plot_test_functions.py
python plots/plot_lqr.py
python plots/plot_lqr_error.py
python plots/plot_lqr_error_training.py
python plots/plot_off_policy.py
python plots/plot_on_policy.py

Check the plots in the out directory.
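
To see which files the plotting scripts produced, a short listing can be convenient. This is a minimal sketch, assuming only that the plots are written somewhere under the out directory; the helper name list_outputs is hypothetical, and no particular file names or formats are assumed.

```python
from pathlib import Path


def list_outputs(out_dir="out"):
    """Return the sorted relative paths of all files under `out_dir`.

    Hypothetical helper for illustration; returns an empty list if the
    directory does not exist.
    """
    root = Path(out_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    for name in list_outputs():
        print(name)
```

An empty listing suggests that the plotting scripts have not been run yet or that they wrote their output elsewhere.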
