1 parent 4e5b5b7, commit b0ac32e
ml-agents/mlagents/trainers/ppo/trainer.py
@@ -254,7 +254,7 @@ def create_torch_policy(
             behavior_spec,
             self.trainer_settings,
             condition_sigma_on_obs=False,  # Faster training for PPO
-            separate_critic=behavior_spec.action_spec.is_continuous(),
+            separate_critic=True,  # Match network architecture with TF
         )
         return policy
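The behavioral effect of this one-line change can be sketched in isolation. The snippet below is a minimal illustration, not ML-Agents code: `ActionSpecStub`, `separate_critic_before`, `separate_critic_after`, and `describe_layout` are hypothetical names introduced here, and the layout descriptions assume that `separate_critic=True` gives the critic its own network body rather than one shared with the actor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSpecStub:
    """Hypothetical stand-in for an action spec; not the ML-Agents class."""

    continuous_size: int = 0
    discrete_branches: tuple = ()

    def is_continuous(self) -> bool:
        # Treated as continuous only when there are no discrete branches.
        return self.continuous_size > 0 and len(self.discrete_branches) == 0


def separate_critic_before(spec: ActionSpecStub) -> bool:
    """Value passed before the commit: depends on the action space."""
    return spec.is_continuous()


def separate_critic_after(spec: ActionSpecStub) -> bool:
    """Value passed after the commit: always True, whatever the action space."""
    return True


def describe_layout(separate_critic: bool) -> str:
    """Assumed meaning of the flag, for illustration only."""
    if separate_critic:
        return "actor and critic each have their own network body"
    return "actor and critic share one network body"


if __name__ == "__main__":
    specs = {
        "continuous": ActionSpecStub(continuous_size=2),
        "discrete": ActionSpecStub(discrete_branches=(3,)),
    }
    for name, spec in specs.items():
        before = describe_layout(separate_critic_before(spec))
        after = describe_layout(separate_critic_after(spec))
        print(f"{name}: before = {before}; after = {after}")
```

Under these assumptions, only policies with discrete actions are affected: continuous-action policies already received `separate_critic=True` before the commit.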