Demo video: `best_policy_highway_eval_checkpoint_190000_episode_4.mp4`
A research repository that integrates RACER (Risk-sensitive Actor Critic with Epistemic Robustness) with a continuous-action highway-env to train a distributional Soft Actor-Critic (SAC). The original RACER implementation by Kyle Stachowicz is available at: https://github.com/kylestach/epistemic-rl-release.
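Since the repository targets a continuous-action variant of highway-env, the environment must be configured accordingly. A minimal sketch of such a configuration is shown below; the exact keys follow highway-env's documented config schema, but the specific observation choice here is illustrative rather than taken from this repo:

```python
# Illustrative highway-env configuration for continuous control.
# The "ContinuousAction" type replaces the default discrete meta-actions;
# the observation settings are an assumption, not this repo's exact config.
config = {
    "action": {"type": "ContinuousAction"},   # continuous steering/throttle
    "observation": {"type": "Kinematics"},    # per-vehicle kinematic features
    "duration": 40,                           # episode length in seconds
}

# With highway-env and gymnasium installed, the environment could be built as:
# import gymnasium
# env = gymnasium.make("highway-v0", config=config)
```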
- Python 3.11 (tested)
- See `requirements.txt` for the full dependency list
Create and activate a virtual environment, then install the dependencies:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Start training with the included script:

```bash
python -m racer.scripts.train_highway
```

During training, the agent's average speed should reach approximately 35 m/s by around 30k steps.
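The distributional SAC critic models a return distribution rather than a scalar Q-value. A common way to do this is quantile regression with a quantile Huber loss; the sketch below is a minimal NumPy illustration of that loss for a single transition, not the repository's actual implementation:

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, td_target, kappa=1.0):
    """Illustrative quantile-regression loss (as in QR-DQN-style critics).

    pred_quantiles: (N,) array of predicted return quantiles.
    td_target: scalar bootstrapped return target.
    kappa: Huber threshold.
    """
    n = len(pred_quantiles)
    # Midpoint quantile fractions tau_i = (i + 0.5) / N
    taus = (np.arange(n) + 0.5) / n
    u = td_target - pred_quantiles  # TD errors per quantile
    # Huber loss: quadratic near zero, linear in the tails
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    # Asymmetric weighting pushes each quantile toward its fraction tau_i
    weights = np.abs(taus - (u < 0).astype(float))
    return float((weights * huber / kappa).mean())
```

When the predicted quantiles already match the target, the loss is zero; any mismatch is penalized asymmetrically so that each output learns a distinct quantile of the return distribution.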
Comparing standard SAC against the RACER distributional SAC, the distributional variant converges faster and reaches better final performance:
