1 parent c5bc5ce commit a5efb4e
stable_baselines3/rsppo/rsppo.py
@@ -255,7 +255,7 @@ def train(self) -> None:
                 entropy_losses.append(entropy_loss.item())

-                loss = policy_loss + self.ent_coef * entropy_loss + self.vf_coef * value_loss
+                loss = policy_loss + self.vf_coef * value_loss

                 # Calculate approximate form of reverse KL Divergence for early stopping
                 # see issue #417: https://github.com/DLR-RM/stable-baselines3/issues/417
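The effect of this change can be sketched with plain numbers: before the commit, the total loss includes an entropy bonus weighted by `ent_coef`; after it, only the policy and value terms remain. The values below are hypothetical stand-ins for the tensors computed during a training step, chosen only to illustrate the arithmetic.

```python
# Hypothetical per-batch values; names mirror those in the diff above.
policy_loss = 0.5
value_loss = 2.0
entropy_loss = -0.1   # SB3 stores negative entropy, so adding it rewards exploration
ent_coef = 0.01       # assumed coefficient values, not taken from this commit
vf_coef = 0.5

# Loss before this commit: policy + entropy bonus + value terms.
loss_before = policy_loss + ent_coef * entropy_loss + vf_coef * value_loss

# Loss after this commit: the entropy term is dropped entirely.
loss_after = policy_loss + vf_coef * value_loss

print(loss_before, loss_after)
```

Note that `entropy_losses` is still appended to for logging; only the optimization objective loses the entropy term, so the policy no longer receives an explicit exploration bonus.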