Commit 101e6f6

Merge pull request #136 from edbeeching/sb3_example_docs_update
Update ADV_STABLE_BASELINES_3.md to add instructions on using the command line arguments with the example file.
2 parents 2c348c8 + 1caf770 commit 101e6f6

1 file changed

+50
-26
lines changed

docs/ADV_STABLE_BASELINES_3.md

Lines changed: 50 additions & 26 deletions
@@ -46,42 +46,66 @@ While the default options for sb3 work reasonably well. You may be interested in
4646

4747
We recommend taking the [sb3 example](https://github.com/edbeeching/godot_rl_agents/blob/main/examples/stable_baselines3_example.py) and modifying it to match your needs.
4848

49-
This example exposes more parameter for the user to configure, such as `--speedup` to run the environment faster than realtime and the `--n_parallel` to launch several instances of the game executable in order to accelerate training (not available for in-editor training).
49+
The example exposes more parameters for the user to configure, such as `--speedup` to run the environment faster than real time and `--n_parallel` to launch several instances of the game executable in order to accelerate training (not available for in-editor training).
5050

51+
To use the example script, first navigate in the console/terminal to the directory containing the downloaded script, and then try some of the example use cases below:
5152

52-
```python
53-
import argparse
54-
55-
from godot_rl.wrappers.stable_baselines_wrapper import StableBaselinesGodotEnv
56-
from stable_baselines3 import PPO
53+
### Train a model in editor:
54+
```bash
55+
python stable_baselines3_example.py
56+
```
5757

58-
# To download the env source and binary:
59-
# 1. gdrl.env_from_hub -r edbeeching/godot_rl_BallChase
60-
# 2. chmod +x examples/godot_rl_BallChase/bin/BallChase.x86_64
58+
### Train a model using an exported environment:
59+
```bash
60+
python stable_baselines3_example.py --env_path=path_to_executable
61+
```
62+
Note that, in order to accelerate training, the exported environment will not be rendered.
63+
If you want to display it, add the `--viz` argument.
6164

65+
### Train an exported environment using 4 environment processes:
66+
```bash
67+
python stable_baselines3_example.py --env_path=path_to_executable --n_parallel=4
68+
```
6269

63-
parser = argparse.ArgumentParser(allow_abbrev=False)
64-
parser.add_argument(
65-
"--env_path",
66-
# default="envs/example_envs/builds/JumperHard/jumper_hard.x86_64",
67-
default=None,
68-
type=str,
69-
help="The Godot binary to use, do not include for in editor training",
70-
)
70+
### Train an exported environment using 8 times speedup:
71+
```bash
72+
python stable_baselines3_example.py --env_path=path_to_executable --speedup=8
73+
```
7174

72-
parser.add_argument("--speedup", default=1, type=int, help="whether to speed up the physics in the env")
73-
parser.add_argument("--n_parallel", default=1, type=int, help="whether to speed up the physics in the env")
75+
### Set an experiment directory and name:
76+
You can optionally set an experiment directory and name to override the default. When saving checkpoints, you need to use a unique directory or name for each run (more about that below).
77+
```bash
78+
python stable_baselines3_example.py --experiment_dir="experiments" --experiment_name="experiment1"
79+
```
7480

75-
args, extras = parser.parse_known_args()
81+
### Train a model for 100_000 steps then save and export the model
82+
The exported .onnx model can be used by the Godot sync node to run inference from Godot directly, while the saved .zip model can be used to resume training later or run inference from the example script by adding `--inference`.
83+
```bash
84+
python stable_baselines3_example.py --timesteps=100_000 --onnx_export_path=model.onnx --save_model_path=model.zip
85+
```
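
The underscore-separated values used above (such as `100_000`) are readable by Python's `int()`, so an `argparse` option declared with `type=int` accepts them unchanged. The sketch below illustrates this under the assumption that the example script declares `--timesteps` that way; the `parse_timesteps` helper is hypothetical and is not part of the example script.

```python
import argparse


def parse_timesteps(argv):
    """Parse a --timesteps flag the way an int-typed argparse option would.

    Hypothetical helper for illustration only: it assumes the example
    script declares --timesteps with type=int.
    """
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--timesteps", type=int, required=True,
                        help="number of environment steps to train for")
    args = parser.parse_args(argv)
    return args.timesteps


# int() accepts underscores between digits, so both spellings are equivalent.
first_run = parse_timesteps(["--timesteps=100_000"])
resumed_run = parse_timesteps(["--timesteps=100000"])

# Total steps after training once and then resuming for the same amount.
print(first_run + resumed_run)
```

If the script's option is declared differently (for example as a string), the underscore form may need to be replaced with plain digits.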
7686

87+
### Resume training from a saved .zip model
88+
This will load the previously saved model.zip, and resume training for another 100 000 steps, so the saved model will have been trained for 200 000 steps in total.
89+
Note that the console log will display the `total_timesteps` for the last training session only, so it will show `100000` instead of `200000`.
90+
```bash
91+
python stable_baselines3_example.py --timesteps=100_000 --save_model_path=model_200_000_total_steps.zip --resume_model_path=model.zip
92+
```
7793

78-
env = StableBaselinesGodotEnv(env_path=args.env_path, show_window=True, n_parallel=args.n_parallel, speedup=args.speedup)
94+
### Save periodic checkpoints
95+
You can save periodic checkpoints and later resume training from any checkpoint using the same command-line argument as above, or run inference on any checkpoint just like with the saved model.
96+
Note that you need to use a unique `experiment_name` or `experiment_dir` for each run so that checkpoints from one run won't overwrite checkpoints from another run.
97+
Alternatively, you can remove the folder containing checkpoints from a previous run if you don't need them anymore.
7998

80-
model = PPO("MultiInputPolicy", env, ent_coef=0.0001, verbose=2, n_steps=32, tensorboard_log="logs/sb3")
81-
model.learn(200000)
99+
For example, train for a total of 2 000 000 steps, saving a checkpoint every 50 000 steps:
82100

83-
print("closing env")
84-
env.close()
101+
```bash
102+
python stable_baselines3_example.py --experiment_name=experiment1 --timesteps=2_000_000 --save_checkpoint_frequency=50_000
103+
```
85104

105+
Checkpoints will be saved to `logs\sb3\experiment1_checkpoints` in the above case; the location is determined by `--experiment_dir` and `--experiment_name`.
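
As a rough illustration of how the checkpoint location follows from the two arguments, the sketch below rebuilds the path shown above. The `checkpoint_dir` helper is hypothetical, and both the `logs/sb3` default and the `_checkpoints` suffix are inferred from the example path rather than taken from the script itself.

```python
from pathlib import Path


def checkpoint_dir(experiment_name, experiment_dir="logs/sb3"):
    """Return the folder where checkpoints for a run would be stored.

    Hypothetical helper: the default directory and the "_checkpoints"
    suffix are assumptions inferred from the example path in the text.
    """
    return Path(experiment_dir) / f"{experiment_name}_checkpoints"


run_a = checkpoint_dir("experiment1")
run_b = checkpoint_dir("experiment2")

# Distinct experiment names give distinct folders, so the checkpoints
# of one run cannot overwrite those of another.
print(run_a.as_posix())
print(run_a != run_b)
```

Checking whether the returned folder already exists before starting a run is one simple way to avoid overwriting earlier checkpoints.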
86106

87-
```
107+
### Run inference on a saved model for 100_000 steps
108+
You can run inference on a model that was previously saved using either `--save_model_path` or `--save_checkpoint_frequency`.
109+
```bash
110+
python stable_baselines3_example.py --timesteps=100_000 --resume_model_path=model.zip --inference
111+
```
