ProcgenOOD is an extension of the standard Procgen benchmark environment that evaluates agents out-of-distribution (OOD) along configurable level generation variables. These predefined variables are called holdout types, and each corresponds to the random generation of an asset or value within the Procgen games. See Holdout Types below for the specific types and the games that support each.
Like the original, ProcgenOOD contains 16 procedurally generated gym environments that run efficiently.
This README describes our extension and changes to the original repository, along with basic installation and usage information to fit most users' needs. For more information on the original game descriptions and known issues, please refer to the original Procgen README.
NOTE: The associated training repository containing the (future) paper's experimental results will be open-sourced later.
In the Procgen benchmark, algorithms are trained on a fixed number of level seeds (typically 500) and then evaluated by uniformly sampling level seeds from
- Evaluation levels are sampled independently (IID) from the same distribution as training levels, and
- There is no direct way to control the level generation to test algorithms' specific capabilities.
Approaches that help RL performance IID often do not transfer to OOD settings, even under small distribution shifts. Because RL algorithms inherently learn from non-stationary targets and are typically deployed via sim2real transfer, we believe that evaluating OOD is far more valuable to the research community.
The following instructions assume you have conda installed. If you do not have conda, you can install it from Miniconda.
cd /path/to/your/clone/of/procgen-ood
conda env create -n procgen_ood -f environment.yml
conda activate procgen_ood
pip install -e . # install the package in editable mode
Verify the installation by running the following command:
python -m procgen.interactive --env-name coinrun
Holdout types can be used to withhold certain aspects of the level generation during training and/or evaluation. The holdout types are predefined and can be set using the --train-holdout-type and --eval-holdout-type command line arguments, or through the environment options in code.
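As a rough illustration of the command line arguments named in this README, the sketch below parses them with argparse. The parser itself is hypothetical: the flag names come from this README, but the defaults, the choices list, and the unit_interval helper are assumptions that mirror the documented environment options, not the repository's actual script.

```python
import argparse

# Holdout types named in this README.
HOLDOUT_TYPES = ["agent", "enemy", "platform", "background", "all", "none"]


def unit_interval(text):
    """Parse a float and require it to lie in [0, 1] (hypothetical helper)."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{value} is not in [0, 1]")
    return value


def build_parser():
    # Hypothetical parser; the real training script may define these flags differently.
    parser = argparse.ArgumentParser(description="Holdout flag sketch")
    parser.add_argument("--env-name", default="coinrun")
    parser.add_argument("--num-levels", type=int, default=0)
    parser.add_argument("--train-holdout-type", choices=HOLDOUT_TYPES, default=None)
    parser.add_argument("--eval-holdout-type", choices=HOLDOUT_TYPES, default="none")
    parser.add_argument("--train-holdout-frac", type=unit_interval, default=None)
    parser.add_argument("--eval-holdout-frac", type=unit_interval, default=0.0)
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--train-holdout-type", "background", "--train-holdout-frac", "0.5"]
)
print(args.train_holdout_type, args.train_holdout_frac, args.eval_holdout_type)
# prints: background 0.5 none
```

Validating the fraction at parse time is one design choice among several; the check could equally live in the environment constructor.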
In addition to holdout types, a holdout fraction value is supported to provide finer control over the amount of held-out data. These settings can be applied during training and/or evaluation. The predefined holdout types are:
- "agent" - Hold out player sprites.
- "enemy" - Hold out enemy sprites.
- "platform" - Hold out platforming level difficulties.
- "background" - Hold out background images.
- "all" - Hold out all supported types (see following table).
- "none" - Disable hold out. Along with --num-levels=500, this replicates the original Procgen benchmark.
# | Game | agent | enemy | platform | background | all |
---|---|---|---|---|---|---|
1 | bigfish | | ✔ | | ✔ | ✔ |
2 | bossfight | ✔ | ✔ | ✔ | ✔ | |
3 | caveflyer | ✔ | ✔ | |||
4 | climber | ✔ | ✔ | ✔ | ✔ | |
5 | coinrun | ✔ | ✔ | ✔ | ✔ | ✔ |
6 | dodgeball | ✔ | ✔ | ✔ | ||
7 | fruitbot | ✔ | ✔ | |||
8 | heist | ✔ | ✔ | ✔ | ||
9 | jumper | ✔ | ✔ | |||
10 | leaper | ✔ | ✔ | ✔ | ||
11 | maze | ✔ | ✔ | |||
12 | miner | ✔ | ✔ | |||
13 | ninja | ✔ | ✔ | ✔ | ||
14 | starpilot | ✔ | ✔ | |||
NOTE: The behavior of holdout type "all" is game specific! Holdout type all independently samples all other supported types.
- E.g., coinrun with holdout type "all" will independently sample each of ["background", "agent", "enemy", "platform"] variables using the accompanying --[train/eval]-holdout-frac 0.1 argument. In contrast, bigfish only supports randomizing over "enemy" & "background".
- As seen in the table above, chaser and plunder do not support any holdout types.
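The game-specific behavior of "all" can be sketched as a lookup over each game's supported types. This is only an illustration: the SUPPORTED mapping and resolve_holdout_types are hypothetical names, the mapping lists just the games whose support is stated in the prose above, and raising an error for an unsupported type is an assumption about how such a request might be handled.

```python
# Entries taken from the prose of this README only; other games are omitted
# here rather than guessed.
SUPPORTED = {
    "coinrun": ["background", "agent", "enemy", "platform"],
    "bigfish": ["enemy", "background"],
    "chaser": [],
    "plunder": [],
}


def resolve_holdout_types(game, holdout_type):
    """Expand a holdout type into the concrete variables to hold out (sketch)."""
    supported = SUPPORTED[game]
    if holdout_type == "none":
        return []
    if holdout_type == "all":
        # "all" means every type the game supports, each sampled independently.
        return list(supported)
    if holdout_type not in supported:
        # Assumed behavior: reject types the game does not support.
        raise ValueError(f"{game} does not support holdout type {holdout_type!r}")
    return [holdout_type]


print(resolve_holdout_types("coinrun", "all"))
# prints: ['background', 'agent', 'enemy', 'platform']
print(resolve_holdout_types("bigfish", "all"))
# prints: ['enemy', 'background']
print(resolve_holdout_types("chaser", "all"))
# prints: []
```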
These options from the base environment are relevant for testing OOD with this codebase:
- env_name - Name of environment, or comma-separated list of environment names to instantiate as each env in the VecEnv.
- num_levels=0 - The number of unique levels that can be generated. Set to 0 to use unlimited levels.
- start_level=0 - The lowest seed that will be used to generate levels. start_level and num_levels fully specify the set of possible levels.
- debug=False - Set to True to use the debug build if building from source.
- debug_mode=0 - A useful flag that's passed through to procgen envs. Use however you want during debugging.
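To make the interaction between start_level and num_levels concrete, the sketch below computes the set of level seeds they describe. It assumes the seeds form the contiguous range starting at start_level, which is one reading of the option descriptions above; level_seed_range is a hypothetical helper, not part of the package.

```python
def level_seed_range(start_level=0, num_levels=0):
    """Return the finite range of level seeds, or None when levels are unlimited."""
    if num_levels < 0:
        raise ValueError("num_levels must be non-negative")
    if num_levels == 0:
        # num_levels=0 means unlimited levels, so there is no finite set to list.
        return None
    # Assumption: seeds are the contiguous block beginning at start_level.
    return range(start_level, start_level + num_levels)


# The typical training setup mentioned earlier: 500 fixed level seeds.
train_seeds = level_seed_range(start_level=0, num_levels=500)
print(train_seeds[0], train_seeds[-1], len(train_seeds))
# prints: 0 499 500

# Unlimited levels, as in num_levels=0.
print(level_seed_range(num_levels=0))
# prints: None
```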
The following options are new and pertain to OOD holdout types:
- eval_holdout_type="none" - Predefined type to hold out during evaluation.
- train_holdout_type=None - Predefined type to hold out during training.
- eval_holdout_frac=0.0 - During evaluation, withhold this fraction of the specified holdout type. Value must be in [0,1].
- train_holdout_frac=None - During training, withhold this fraction of the specified holdout type. Value must be in [0,1].
- holdout_sampling_mode="extrapolate" - Withholding ranges from well-ordered random variables can either be at the high end ("extrapolate", default) or somewhere in the middle ("interpolate"). There is no distinction for categorical variables (e.g., background assets).
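The difference between the two sampling modes can be illustrated on a small ordered list of values. This is a minimal sketch under stated assumptions: split_ordered is a hypothetical helper, the number of held-out items is taken as the rounded fraction of the list, and "interpolate" is modeled as a centered block, whereas the README only says the withheld range lies somewhere in the middle.

```python
def split_ordered(values, frac, mode="extrapolate"):
    """Split ordered values into (kept, held_out) lists (illustrative sketch)."""
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must be in [0, 1]")
    ordered = sorted(values)
    n_held = int(round(frac * len(ordered)))
    if mode == "extrapolate":
        # Hold out the high end of the ordered values.
        start = len(ordered) - n_held
    elif mode == "interpolate":
        # Assumption: hold out a block centered in the middle.
        start = (len(ordered) - n_held) // 2
    else:
        raise ValueError(f"unknown mode {mode!r}")
    held_out = ordered[start:start + n_held]
    kept = ordered[:start] + ordered[start + n_held:]
    return kept, held_out


difficulties = list(range(10))  # stand-in for a well-ordered level variable

print(split_ordered(difficulties, 0.2, "extrapolate"))
# prints: ([0, 1, 2, 3, 4, 5, 6, 7], [8, 9])
print(split_ordered(difficulties, 0.2, "interpolate"))
# prints: ([0, 1, 2, 3, 6, 7, 8, 9], [4, 5])
```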
Here's how to set the options:
import gym
env = gym.make(
    "procgen:procgen-coinrun-v0",
    num_levels=0,  # use all level seeds
    train_holdout_type="background",
    train_holdout_frac=0.5,  # hold out ~50% of backgrounds during training
    eval_holdout_type="none",  # eval on full level distribution
    eval_holdout_frac=0.0,
    # eval_holdout_frac=0.5,  # equivalent: with type "none", frac is ignored
)
NOTE: Since the gym environment is adapted from a gym3 environment, early calls to reset() are disallowed and the render() method does not do anything.
- To render the environment, pass render_mode="human" to the constructor, which will send render_mode="rgb_array" to the environment constructor and wrap it in a gym3.ViewerWrapper.
- If you just want the frames instead of the window, pass render_mode="rgb_array".
This project contains two different licenses for different parts of the code:
- The original code, which was forked from Procgen, is licensed under the MIT license. You can find the MIT license in the LICENSE-MIT file.
- All modifications and additions made by Kevin Corder, Song Park, DEVCOM Army Research Laboratory, and/or Parsons Corporation are licensed under the CC0 1.0 Universal license. See the LICENSE-CC0 file for details.
To cite this project in your work, please use the following BibTeX:
(Publication in progress)