Skip to content

lelis-research/Sparse-Policies

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

376 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Training

You can train PPO and PPO+GRU models on these environments:

  • Karel Tasks: Karel_stair_climber, Karel_maze, Karel_top_off, Karel_four_corner, Karel_harvester
  • Cartpole Tasks: Cartpole, CartpoleEasy
  • Parallel Park: car
  • Quad: Quad, QuadPO

To train models:
python src/scripts/train_ppo.py \
--env_id Karel_stair_climber \
--game_width 12 \
--game_height 12 \
--max_steps 50 \
--num_steps 500 \
--sparse_reward \  # for tasks with sparse reward only
--hidden_size 32 \  # hidden size of the actor
--total_timesteps 2_000_000 \
--num_envs 1 \  # num envs to run in parallel
--num_minibatches 1 \
--ppo_type original \  # set gru if you want to train PPO+GRU
--all_initial_confs \  # only for Karel_stair_climber, Karel_maze, Karel_top_off. for Karel_four_corner and Karel_harvester use --multi_initial_confs
--l1_lambda 0.0 \
--learning_rate 0.001 \
--ent_coef 0.1 \
--clip_coef 0.2 \
--exp_name nam _of_the_experiment

To train 30 seeds in parallel add --multiprocessing.

To train the wide maze (in the paper, wide maze is referred to as SparseMaze, but in the implementation, Sparse is used for sparse reward.):

--env_id Karel_maze --wide_maze

Evaluation

Evaluation script for karel tasks:

python src/scripts/evaluate_on_bigger_grid.py \
--task_name stair_climber \
--game_width 12 \   # details of the model you want to load
--game_height 12 \   # details of the model you want to load
--max_steps 100 \   # for evaluation
--sparse_reward \   # details of the model you want to load
--model_seed 0 \   # details of the model you want to load
--karel_seed 9 \    # initial conf seed, for reproducability
--hidden_size 32 \   # details of the model you want to load
--ppo_type original \   # details of the model you want to load
--game_width_eval 12 \  # grid width that you want to test on
--game_height_eval 12 \  # grid height that you want to test on
--learning_rate 0.0001 \   # details of the model you want to load
--ent_coef 0.1 \   # details of the model you want to load
--clip_coef 0.2 \   # details of the model you want to load
--time 1738007119 \  # timestep of the model
--record_video  # to record a video of it

You can use evaluate_cartpole.py to evaluate Cartpole tasks, and evaluate_car_quad.py for Parallel park and Quad tasks.

About

No description or website provided.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors